Smalltalk/X Webserver

Documentation of class 'HTML::RichTextExtractor':

Class: RichTextExtractor (in HTML)


Inheritance:

   Object
   |
   +--HTML::Visitor
      |
      +--HTML::TextExtractor
         |
         +--HTML::RichTextExtractor
            |
            +--HTML::HTMLToTextConverter

Package:
stx:goodies/webServer/htmlTree
Category:
Net-Documents-HTML-Utilities
Version:
rev: 1.4 date: 2018/06/05 03:30:56
user: cg
file: HTML__RichTextExtractor.st directory: goodies/webServer/htmlTree
module: stx stc-classLibrary: htmlTree
Author:
Claus Gittinger

Description:


A tool to extract the rich text of some HTML
(either from a constructed tree or from one produced by a parser).

Can be used to extract text (i.e. strings with emphasis) for display in
a tooltip or similar.
Ignores all markup except the emphasis elements: bold (<B>...</B>),
italic (<I>...</I>) and underline (<U>...</U>).

CAVEAT:
    I am not sure if this implementation is good enough for
    other uses in its current state
    (we may have to handle special cases such as <PRE>...</PRE> or
    text within form elements to make this really correct).
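
A minimal sketch of the tooltip use mentioned above. The HTML fragment is a hypothetical example, and it is an assumption (not confirmed by this page) that the result answers to #string by returning its characters without emphasis. The entry point extractTextFromHtmlString: is the one used in the Examples section.

```smalltalk
|html richText|

"hypothetical tooltip source; any small HTML fragment will do"
html := 'Press <b>Enter</b> to <i>confirm</i>'.

"class-side entry point, as used in the Examples section of this page"
richText := HTML::RichTextExtractor extractTextFromHtmlString:html.

"show which class the extractor answers (a string with emphasis is expected)"
Transcript showCR:richText class name.

"assumption: #string answers the plain characters, dropping the emphasis"
Transcript showCR:richText string.
```

Keeping the emphasis in the extracted text, rather than flattening it to a plain string, is what lets a tooltip show the bold and italic parts as such.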


Related information:

    HTML::HTMLToTextConverter
    HTML::HTMLTextExtractor

Instance protocol:

accessing
o  isBold

o  isItalic

o  isUnderline

initialization
o  initialize

visiting
o  appendString: aString

o  visitBold: aBold
A 'bold' gets visited.

o  visitItalic: aItalic

o  visitString: aString

o  visitUnderlined: anUnderline


Examples:


     |document|

     "parse first, then extract the rich text from the resulting document tree"
     document := HTML::HTMLParser parseText:'<h1>Hello <b>World</b></h1>'.
     Transcript showCR:(HTML::RichTextExtractor extractTextFromDocument:document).

     "or extract directly from an HTML string"
     Transcript showCR:(HTML::RichTextExtractor extractTextFromHtmlString:'<h1>Hello <b>World</b></h1>').
     Transcript showCR:(HTML::RichTextExtractor extractTextFromHtmlString:'<h1>Hello <b>World</b> and <i>italic</i></h1>').


ST/X 7.2.0.0; WebServer 1.670 at bd0aa1f87cdd.unknown:8081; Sun, 27 Nov 2022 08:53:59 GMT