Class: Visitor (in HTML)
Object
|
+--HTML::Visitor
   |
   +--HTML::Encoder
   |
   +--HTML::TextExtractor
- Package: stx:goodies/webServer/htmlTree
- Category: Net-Documents-HTML-TreeBuilder
- Version: rev: 1.41 date: 2019/07/13 12:20:40
- user: cg
- file: HTML__Visitor.st directory: goodies/webServer/htmlTree
- module: stx stc-classLibrary: htmlTree
This is an abstract framework class that dispatches on the HTML tag.
At a minimum, subclasses must redefine visitString: and visitElement:.
Any of the other visitXXX: methods can also be redefined to receive the dispatch for that tag directly.
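As a rough illustration of that minimum, the following sketch defines a visitor that only counts elements and text pieces. The class name ElementCounter, its instance variables, and the inline markup are made up for this sketch. The class-definition message, compile:, and passing a read stream to parseText: mirror the example at the end of this page. It assumes the inherited visitXXX: methods forward to visitElement: when they are not redefined.

```smalltalk
|parser document counterClass counter|

"Parse a small inline document (the markup is made up for this sketch)."
parser := HTML::HTMLParser new.
document := parser parseText:'<html><body><p>one</p><p>two</p></body></html>' readStream.

Smalltalk withoutUpdatingChangesDo:[
    "ElementCounter is a hypothetical class name."
    counterClass := HTML::Visitor
        subclass:#ElementCounter
        instanceVariableNames:'elementCount stringCount'
        classVariableNames:''
        poolDictionaries:''
        category:''.

    "Required: count the element, then descend into its children."
    counterClass compile:'visitElement:anElement
    elementCount := (elementCount ifNil:[0]) + 1.
    self visitSubelementsOf:anElement'.

    "Required: count the text piece; there is nothing to descend into."
    counterClass compile:'visitString:aString
    stringCount := (stringCount ifNil:[0]) + 1'.

    "Accessors answering 0 if nothing has been counted yet."
    counterClass compile:'elementCount
    ^ elementCount ifNil:[0]'.
    counterClass compile:'stringCount
    ^ stringCount ifNil:[0]'.
].

counter := counterClass new.
counter visit:document.
Transcript showCR:'elements: ', counter elementCount printString.
Transcript showCR:'strings: ', counter stringCount printString.
```

The exact counts depend on how the parser builds the tree (for example, whether it adds implied elements), so no expected output is given here.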
interface-visiting-BlockLevel
-
visitAddress: anAddress
-
An address gets visited.
-
visitArea: anArea
-
-
visitBlockQuote: aBlockQuote
-
A blockQuote gets visited.
-
visitCenter: aParagraph
-
A 'center' gets visited.
-
visitDiv: aDiv
-
A div gets visited.
-
visitEmbed: anEmbeddedDocument
-
-
visitHeading1: aHeading1
-
A heading level one gets visited.
Rubbish: this is never called; instead visitHeading: is called for all levels
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
visitHeading2: aHeading2
-
A heading level two gets visited.
Rubbish: this is never called; instead visitHeading: is called for all levels
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
visitHeading3: aHeading3
-
A heading level three gets visited.
Rubbish: this is never called; instead visitHeading: is called for all levels
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
visitHeading4: aHeading4
-
A heading level four gets visited.
Rubbish: this is never called; instead visitHeading: is called for all levels
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
visitHeading5: aHeading5
-
A heading level five gets visited.
Rubbish: this is never called; instead visitHeading: is called for all levels
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
visitHeading6: aHeading6
-
A heading level six gets visited.
Rubbish: this is never called; instead visitHeading: is called for all levels
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
visitHeading: aHeading
-
A heading gets visited.
-
visitHorizontalRule: aHorizontalRule
-
A horizontal rule gets visited.
-
visitParagraph: aParagraph
-
A paragraph gets visited.
-
visitPre: aPre
-
A pre gets visited.
interface-visiting-FontStyle
-
visitBig: aBig
-
A 'big' gets visited.
-
visitBold: aBold
-
A 'bold' gets visited.
-
visitItalic: anItalic
-
An italic gets visited.
-
visitSmall: aSmall
-
A 'small' gets visited.
-
visitStrong: anElement
-
-
visitStyleElement: anElement
-
-
visitUnderlined: aSmall
-
An underlined gets visited.
interface-visiting-Form
-
visitButton: aButton
-
A button gets visited.
-
visitForm: aForm
-
A form gets visited.
-
visitFormElement: anElement
-
-
visitInput: anInput
-
An input gets visited.
-
visitLabel: anInput
-
A label gets visited.
-
visitOption: anOption
-
An option gets visited.
-
visitSelect: anSelect
-
A select gets visited.
-
visitTextArea: anInput
-
A textarea gets visited.
interface-visiting-Frame
-
visitFrame: aFrame
-
A frame gets visited.
-
visitFrameset: aFrameset
-
A frameset gets visited.
-
visitIFrame: anIFrame
-
-
visitNoFrames: aNoFramesElement
-
A noframes element gets visited.
-
visitNoScript: aNoScriptElement
-
interface-visiting-Head
-
visitBase: aBase
-
-
visitLink: aLink
-
A link gets visited.
-
visitMeta: aMeta
-
A meta element gets visited.
-
visitScript: aScript
-
A script gets visited.
-
visitStyle: aStyle
-
A style element gets visited.
-
visitTitle: aTitle
-
A title gets visited.
interface-visiting-Inline
-
visitAnchor: anAnchor
-
An anchor gets visited.
-
visitBreak: aLineBreak
-
A line break gets visited.
-
visitCode: anImage
-
A code gets visited.
-
visitImage: anImage
-
An image gets visited.
-
visitNoBr: anImage
-
A noBr gets visited.
-
visitSpan: anImage
-
A span gets visited.
interface-visiting-List
-
visitComment: anItem
-
A comment gets visited.
-
visitListItem: aListItem
-
A list item gets visited.
-
visitOrderedList: anOrderedList
-
An ordered list gets visited.
-
visitProcessingInstruction: anItem
-
A processing instruction gets visited.
-
visitUnorderedList: anUnorderedList
-
An unordered list gets visited.
interface-visiting-ListDefinition
-
visitDefinitionDescription: aDefinitionDescription
-
A definition description gets visited.
-
visitDefinitionList: aList
-
A definition list gets visited.
-
visitDefinitionTerm: aTerm
-
A definition term gets visited.
interface-visiting-Special
-
visitDocumentType: aDocumentType
-
A document type gets visited.
The document type is not a real HTML element.
It only holds the version string.
Nothing is done here.
interface-visiting-Table
-
visitCaption: aCaption
-
A table caption gets visited.
-
visitCol: aCol
-
A col gets visited.
-
visitColgroup: aColgroup
-
A colgroup gets visited.
-
visitTable: aTable
-
A table gets visited.
-
visitTableBody: aTableBody
-
A table body gets visited.
-
visitTableDataCell: aTableDataCell
-
A table data cell gets visited.
-
visitTableFoot: aTableFoot
-
A table foot gets visited.
-
visitTableHead: aTableHead
-
A table head gets visited.
-
visitTableHeaderCell: aTableHeaderCell
-
A table header cell gets visited.
-
visitTableRow: aTableRow
-
A table row gets visited.
interface-visiting-TopLevel
-
visitBody: aBody
-
A body gets visited.
-
visitDocument: aDocument
-
A document gets visited.
-
visitDummyContainer: aContainer
-
The container has no tag; however, its children are treated as usual.
-
visitHead: aHead
-
A head gets visited.
required-visiting
-
visitElement: anElement
-
Default method for all html elements.
To be defined in subclasses.
** This method raises an error - it must be redefined in concrete classes **
-
visitString: aString
-
Default method for all html text pieces.
To be defined in subclasses.
** This method raises an error - it must be redefined in concrete classes **
visiting
-
visit: anObject
-
Visit an object.
-
visitSubelement: aSubElement
-
Visit a subelement.
Hook for redefinition.
Some subclasses may want to do something special
before or after visiting a subelement.
See Encoder for an example.
-
visitSubelementsOf: anElement
-
Visit the subelements of an element.
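The visitSubelement: hook could be used, for instance, to track nesting depth. The sketch below assumes that visitSubelementsOf: sends visitSubelement: once per child, which this page does not state explicitly. The class name DepthTracer and its instance variables are hypothetical.

```smalltalk
|tracerClass|

Smalltalk withoutUpdatingChangesDo:[
    "DepthTracer is a hypothetical class name."
    tracerClass := HTML::Visitor
        subclass:#DepthTracer
        instanceVariableNames:'depth maxDepth'
        classVariableNames:''
        poolDictionaries:''
        category:''.

    "The two required methods: just walk the tree."
    tracerClass compile:'visitElement:anElement
    self visitSubelementsOf:anElement'.
    tracerClass compile:'visitString:aString'.

    "The hook: bookkeeping before and after the inherited behaviour."
    tracerClass compile:'visitSubelement:aSubElement
    depth := (depth ifNil:[0]) + 1.
    maxDepth := (maxDepth ifNil:[0]) max:depth.
    super visitSubelement:aSubElement.
    depth := depth - 1'.

    "Accessor answering 0 if no subelement has been visited yet."
    tracerClass compile:'maxDepth
    ^ maxDepth ifNil:[0]'.
].
```

After sending visit: with a parsed document to an instance of tracerClass, maxDepth would answer the deepest nesting level reached through the hook. Calling super keeps the inherited traversal intact, so the bookkeeping only wraps it.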
    ElementTypes := nil.
    HTMLParser initializeElementTypes

    |p in document visitorClass visitor|

    p := HTML::HTMLParser new.
    in := '../../../exept/expecco/projects/not_delivered/buggyWebShopDemo/selenium_tests/buggyWebshop_bestellung'
            asFilename readStream.
    document := p parseText:in.
    in close.

    Smalltalk withoutUpdatingChangesDo:[
        visitorClass := HTML::Visitor
            subclass:#TestVisitor
            instanceVariableNames:''
            classVariableNames:''
            poolDictionaries:''
            category:''.
        visitorClass compile:'visitElement:anElement self visitSubelementsOf:anElement'.
        visitorClass compile:'visitString:aString'.
        visitorClass compile:'visitTableRow:aTR
            aTR children size > 1 ifTrue:[
                |cmdTD argTDs cmd args|

                cmdTD := aTR children first.
                argTDs := aTR children copyFrom:2.
                cmd := cmdTD children first.
                args := argTDs collect:[:td | td children firstIfEmpty:nil].
                args := args copyTo:(args findLast:[:arg | arg notNil]).
                Transcript show:cmd; show:'' ''; showCR:args.
            ]
        '.
    ].
    visitor := visitorClass new.
    visitor visit:document.