Class: HTMLParser
Object
|
+--HTMLParser
- Package:
- stx:libhtml
- Category:
- System-Documentation
- Version:
- rev: 1.98 date: 2019/06/04 10:02:12
- user: cg
- file: HTMLParser.st directory: libhtml
- module: stx stc-classLibrary: libhtml
- Author:
- Claus Gittinger
Notice & Warning:
This HTML markup framework and the corresponding parser
started as a quick hack (in the 1990s) when replacing a buggy Mosaic
X-widget with an HTML viewer written in Smalltalk.
Its goals were to be fast enough for typical uses, not too memory hungry,
and to provide the functionality required to display simple help documents.
It was NOT meant to become a full-featured web-browser replacement.
We plan to replace all uses of this parser with the newer HTML::HTMLParser,
which generates a better DOM representation.
This framework is still in use as the document viewer inside ST/X,
and is supported to the extent needed to display simple online help documents and HTML tooltips.
However, there are no plans to enhance it further or to spend more time on its maintenance.
If you need more sophisticated HTML/DOM/doc functionality, you may want to use either
the HTMLTree framework or one of the free frameworks found in the goodies folder.
Instances of this class are used to read HTML documents
and build a collection (linked list) of markup elements for simple online help documents.
This markup collection can be displayed using the HTMLDocumentView
or printed by the HTMLDocumentPrinter.
Related information:
HTMLMarkup
HTMLDocumentView
HTMLDocumentPainter
HTMLDocumentPrinter
Class protocol:
accessing
-
ampersandEscapes
-
backward compatibility only
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
mathAmpersandEscapes
-
backward compatibility only
** This is an obsolete interface - do not use it (it may vanish in future versions) **
initialization
-
initialize
-
save space by reusing common strings (empty lines and single spaces)
usage example(s):
AmpersandEscapes := nil.
HTMLParser initialize
MathAmpersandEscapes := nil.
HTMLParser initialize
parsing
-
parseText: aStringOrStream
-
parse aStringOrStream and answer the parsed document
usage example(s):
self parseText:'hello world - this is easy'
self parseText:'hello < world > - this is easy'
self parseText:'hello world this is easy'
self parseText:'hello this is easy'
self parseText:' this is easy'
self
parseText:('../../doc/online/english/TOP.html'
asFilename contentsOfEntireFile asString)
self
parseText:('../../doc/online/english/TOP.html'
asFilename readStream)
-
parseText: aStringOrStream characterEncoding: anEncodingString
-
parse aStringOrStream. The encoding of the character set is specified by anEncodingString
(e.g. #utf8 or 'iso8859-1').
Answer the parsed document
usage example(s):
self
parseText:('../../doc/online/english/TOP.html'
asFilename contentsOfEntireFile asString) characterEncoding:#utf8
Instance protocol:
accessing
-
characterEncoding: aString
-
set the character set / encoding for the following text
error reporting
-
infoMessage: msg
-
scanning
-
ampersandEscape
-
parse an ampersand escape; the '&' has already been read.
-
ampersandEscape: aString
-
return a new string, containing the ampersand escape character.
Expects aString to NOT contain the initial ampersand.
usage example(s):
(HTMLParser new) ampersandEscape:'lt'
(HTMLParser new) ampersandEscape:'ouml'
(HTMLParser new) ampersandEscape:'#32'
(HTMLParser new) parseText:'hello &alpha; &beta; &gamma; normal'
(HTMLParser new) parseText:'hello
-
ampersandEscapeString
-
parse an ampersand escape; the '&' has already been read.
Return the escape string.
-
extractMetaInformationFrom: element
-
<mime-type> ; charset=
-
finishTextBlock
-
finish a scanned textBlock; add it to the markup list
-
parseMarkup
-
parse '<' and return a markup element
-
parseText: aStringOrStream
-
parse some string, return a list of markups
usage example(s):
(HTMLParser new) parseText:'hello world - this is easy'
(HTMLParser new) parseText:'helloworld - this is easy'
(HTMLParser new) parseText:'hello < world > - this is easy'
(HTMLParser new) parseText:'hello world this is easy'
(HTMLParser new) parseText:'hello this is easy'
(HTMLParser new) parseText:' this is easy'
(HTMLParser new)
parseText:('../../doc/online/english/TOP.html'
asFilename contentsOfEntireFile asString)
(HTMLParser new)
parseText:('../../doc/online/english/TOP.html'
asFilename readStream)
(HTMLParser new)
parseText:('../../doc/online/english/programming/viewintro.html'
asFilename contentsOfEntireFile asString)
-
parseText: aStringOrStream withBindings: metaBindings
-
parse some string, return a list of HTMLMarkups.
Ampersand variables (e.g. &url) are expanded as given in the
metaBindings dictionary.
(this seems to be non-standard HTML, but is used in HotJava).
The destination is only required for scripts, which may want to access
the document very early.
usage example(s):
(HTMLParser new) parseText:'hello world - this is easy'
(HTMLParser new) parseText:'hello < world > - this is easy'
(HTMLParser new) parseText:'hello world this is easy'
(HTMLParser new) parseText:'hello this is easy'
(HTMLParser new) parseText:' this is easy'
(HTMLParser new)
parseText:('../../doc/online/english/TOP.html'
asFilename contentsOfEntireFile asString)
(HTMLParser new)
parseText:('../../doc/online/english/programming/viewintro.html'
asFilename contentsOfEntireFile asString)
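The usage examples above do not exercise the bindings argument. A minimal sketch of passing one, assuming metaBindings is a Dictionary keyed by the variable name without the leading ampersand. The key format, the sample text and the example URL are assumptions for illustration, not taken from the class source:

```smalltalk
"sketch only: the key format ('url' as a String, without the '&')
 and the sample text are assumptions, not confirmed by the class source"
|bindings document|

bindings := Dictionary new.
bindings at:'url' put:'http://www.example.com/'.

document := (HTMLParser new)
    parseText:'see &url for details'
    withBindings:bindings.
document inspect
```

If the variable is not expanded, the parser may expect a different key type (for example a Symbol) or a different escape spelling; check the method source before relying on this.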
-
parseText: aStringOrStream withBindings: metaBindings for: aDestination
-
parse some string, return a list of HTMLMarkups.
Ampersand variables (e.g. &url) are expanded as given in the
metaBindings dictionary.
(this seems to be non-standard HTML, but is used in HotJava).
The destination is only required for scripts, which may want to access
the document very early.
usage example(s):
(HTMLParser new) parseText:'hello world - this is easy'
(HTMLParser new) parseText:'hello < world > - this is easy'
(HTMLParser new) parseText:'hello world this is easy'
(HTMLParser new) parseText:'hello this is easy'
(HTMLParser new) parseText:' this is easy'
(HTMLParser new)
parseText:('../../doc/online/english/TOP.html'
asFilename contentsOfEntireFile asString)
(HTMLParser new)
parseText:('../../doc/online/english/programming/viewintro.html'
asFilename contentsOfEntireFile asString)
-
startNewTextBlock
-
scripts
-
parseJavaScriptFrom: scriptStream
-
HTML
-
parseSmalltalkScriptFrom: scriptStream
-
-
script: element
-
a <script> TAG was encountered.
check for the language (which defaults to javaScript) and dispatch
to a script language handler.
-
script_javascript: element
-
a <script language=javaScript> TAG was encountered.
parse the script, and construct the scriptObject
-
script_smalltalkscript: element
-
a <script language=smalltalkScript> TAG was encountered.
parse the script, and construct the scriptObject (which has the methods in
its anonymous class)
Examples:
|p in document|
p := HTMLParser new.
in := '../../doc/online/english/TOP.html' asFilename readStream.
document := p parseText:in.
in close.
document inspect
|
|v|
v := HTMLDocumentView new openAndWait.
v homeDocument:'../../doc/online/english/TOP.html'.
|
|top v|
top := StandardSystemView extent:200@500.
v := HVScrollableView for:HTMLDocumentView miniScrollerH:true in:top.
v origin:0.0@ 0.0 corner:1.0@1.0.
top openAndWait.
v homeDocument:'../../doc/online/english/TOP.html'.
|
|v document|
v := HTMLDocumentView new openAndWait.
document := (HTMLParser new)
parseText:('../../doc/online/english/programming/viewintro.html'
asFilename readStream).
v document:document.
ST/X 7.2.0.0; WebServer 1.670 at bd0aa1f87cdd.unknown:8081; Thu, 28 Mar 2024 21:57:10 GMT