Class: RichTextExtractor (in HTML)
Object
|
+--HTML::Visitor
   |
   +--HTML::TextExtractor
      |
      +--HTML::RichTextExtractor
         |
         +--HTML::HTMLToTextConverter
- Package: stx:goodies/webServer/htmlTree
- Category: Net-Documents-HTML-Utilities
- Version: rev: 1.4, date: 2018/06/05 03:30:56
- user: cg
- file: HTML__RichTextExtractor.st, directory: goodies/webServer/htmlTree
- module: stx, stc-classLibrary: htmlTree
- Author: Claus Gittinger
A tool to extract the rich text of some HTML
(either from a constructed tree or from a parser).
Can be used to extract text (i.e. strings with emphasis) for display in
a tooltip or similar.
Ignores everything except <B>...</B>, <I>...</I> and <U>...</U>.
CAVEAT:
I am not sure whether this implementation is good enough for
other uses in its current state
(we may have to handle special cases such as <PRE>...</PRE> or
text within form elements to make this really correct).
HTML::HTMLToTextConverter
HTML::HTMLTextExtractor
accessing
- isBold
- isItalic
- isUnderline
initialization
- initialize
visiting
- appendString: aString
- visitBold: aBold
  A 'bold' gets visited.
- visitItalic: aItalic
- visitString: aString
- visitUnderlined: anUnderline
"Parse the HTML first, then extract the rich text from the resulting document:"
|document|
document := HTML::HTMLParser parseText:'<h1>Hello <b>World</b></h1>'.
Transcript showCR:(HTML::RichTextExtractor extractTextFromDocument:document).

"Extract the rich text directly from an HTML string:"
Transcript showCR:(HTML::RichTextExtractor extractTextFromHtmlString:'<h1>Hello <b>World</b></h1>').

"An HTML string with both bold and italic emphasis:"
Transcript showCR:(HTML::RichTextExtractor extractTextFromHtmlString:'<h1>Hello <b>World</b> and <i>italic</i></h1>').