Smalltalk/X Webserver

Documentation of class 'HTML::RichTextExtractor':

Class: RichTextExtractor (in HTML)

Inheritance
Description
Instance protocol
Examples

Inheritance:

   Object
   |
   +--HTML::Visitor
      |
      +--HTML::TextExtractor
         |
         +--HTML::RichTextExtractor
            |
            +--HTML::HTMLToTextConverter

Package: stx:goodies/webServer/htmlTree

Category: Net-Documents-HTML-Utilities

Version: rev: 1.6 date: 2022/07/05 18:10:28; user: cg; file: HTML__RichTextExtractor.st directory: goodies/webServer/htmlTree; module: stx stc-classLibrary: htmlTree

Description:

A tool to extract the rich text of some HTML
(either a constructed tree or the output of a parser).

Can be used to extract text (i.e. strings with emphasis) for display in
a tooltip or similar.
Ignores everything except <B>...</B> and <UL>...</UL>.

CAVEAT:
    I am not sure whether this implementation is good enough for
    other uses in its current state
    (we may have to handle special cases such as <PRE>...</PRE> or
    text within form elements to make this really correct).

COPYRIGHT (c) 2014 by eXept Software AG
             All Rights Reserved

This software is furnished under a license and may be used
only in accordance with the terms of that license and with the
inclusion of the above copyright notice.   This software may not
be provided or otherwise made available to, or used by, any
other person.  No title to or ownership of the software is
hereby transferred.

Instance protocol:

accessing

isBold
isItalic
isUnderline

initialization

initialize: (comment from inherited method)
allow for a subclass to have this already initialized

visiting

appendString: aString
visitBold: aBold: A 'bold' gets visited.
visitEm: el: An 'em' gets visited.
visitItalic: aItalic: (comment from inherited method)
An italic gets visited.
visitString: aString
visitUnderlined: anUnderline: (comment from inherited method)
An underlined gets visited.

Examples:

     |document|

     "parse first, then extract from the resulting document tree"
     document := HTML::HTMLParser parseText:'<h1>Hello <b>World</b></h1>'.
     Transcript showCR:(HTML::RichTextExtractor extractTextFromDocument:document).

     "extract directly from an HTML string"
     Transcript showCR:(HTML::RichTextExtractor extractTextFromHtmlString:'<h1>Hello <b>World</b></h1>').

     "bold and italic emphasis in the same string"
     Transcript showCR:(HTML::RichTextExtractor extractTextFromHtmlString:'<h1>Hello <b>World</b> and <i>italic</i></h1>').
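
Because the description speaks of strings with emphasis, it can help to look at
the characters and the emphasis separately. The following is a minimal sketch,
not taken from the class itself: it assumes that the extractor answers an ST/X
Text object and that this object responds to #string and #emphasisAt:. Neither
assumption is confirmed by this page, and the variable name is illustrative.

     |text|

     text := HTML::RichTextExtractor extractTextFromHtmlString:'Hello <b>World</b>'.

     "assumption: the result is a Text; #string answers the plain characters"
     Transcript showCR:text string.

     "assumption: #emphasisAt: answers the emphasis at a character index;
      index 7 is the 'W' if the extracted string is 'Hello World'"
     Transcript showCR:(text emphasisAt:7) printString.

If the extractor answers a plain String for input without any emphasis, the
second message would have to be guarded, for example by checking the
result's class first.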

ST/X 7.7.0.0; WebServer 1.702 at 20f6060372b9.unknown:8081; Sat, 26 Jul 2025 12:13:02 GMT