Class: HtmlSanitizer
Object
|
+--HtmlSanitizer
- Package:
- stx:goodies/webServer
- Category:
- Net-Communication-HTTP-Server
- Version:
- rev: 1.7 date: 2011/02/09 14:15:37
- user: stefan
- file: HtmlSanitizer.st directory: goodies/webServer
- module: stx stc-classLibrary: webServer
- Author:
- james (james@kokxnix)
A port of http://quickwired.com/kallahar/smallprojects/php_xss_filter_function.php.
The function takes a string and sanitizes it so that it is safe from cross-site scripting (XSS) attacks.
It adds <x> to dangerous keywords and removes non-viewable characters.
The main function of the class is #sanitizeHtml:.
[instance variables:]
[class variables:]
api
-
sanitizeHtml: aString
-
Return an HTML string in which all potential cross-site scripting attacks are disabled.
This version does not take Unicode into account.
usage example(s):
self sanitizeHtml:(self encodeToHtmlHex:'äabcdef').
self sanitizeHtml:''.
self sanitizeHtml:''.
|
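To illustrate why a sanitizer of this kind has to deal with encoded input, here is a small Python sketch. It is not the Smalltalk implementation; `naive_filter` is a hypothetical name and the payload is made up. It shows a plain keyword check being bypassed by a decimal character reference until the input is decoded first.

```python
import html

def naive_filter(text: str) -> bool:
    """Hypothetical check: True if the keyword 'javascript' is visible."""
    return "javascript" in text.lower()

# 'j' is written as a decimal character reference (106 is the code of 'j').
payload = "&#106;avascript:alert(1)"

# The raw payload hides the keyword from a plain substring check ...
hidden = naive_filter(payload)
# ... but a browser decodes the reference, so the check has to decode it too.
revealed = naive_filter(html.unescape(payload))
```

Here `hidden` is `False` and `revealed` is `True`, which is why decoding and keyword handling belong together.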
encoding
-
encodeToHtmlDec: aStringOrStream
-
Encode aStringOrStream using HTML decimal encoding.
usage example(s):
self encodeToHtmlDec:'123'
|
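A minimal Python sketch of HTML decimal encoding, assuming every character becomes a numeric character reference of the form `&#NNN;`. The exact output format of encodeToHtmlDec: is not stated here, and `encode_to_html_dec` is a hypothetical name.

```python
import html

def encode_to_html_dec(text: str) -> str:
    """Encode every character as a decimal character reference (assumed format)."""
    return "".join(f"&#{ord(ch)};" for ch in text)

encoded = encode_to_html_dec("123")   # '1' is 49, '2' is 50, '3' is 51
decoded = html.unescape(encoded)      # decoding restores the input
```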
-
encodeToHtmlHex: aStringOrStream
-
Encode aStringOrStream using HTML hexadecimal encoding.
usage example(s):
self encodeToHtmlHex:'123'
|
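The hexadecimal variant can be sketched the same way in Python, assuming references of the form `&#xHH;` with lowercase hex digits. Whether encodeToHtmlHex: pads, uppercases or terminates its output like this is an assumption; `encode_to_html_hex` is a hypothetical name.

```python
import html

def encode_to_html_hex(text: str) -> str:
    """Encode every character as a hexadecimal character reference (assumed format)."""
    return "".join(f"&#x{ord(ch):x};" for ch in text)

encoded = encode_to_html_hex("123")   # '1' is 0x31, '2' is 0x32, '3' is 0x33
decoded = html.unescape(encoded)      # decoding restores the input
```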
-
encodeToUrlHex: aString
-
Encode aString using URL hex encoding.
usage example(s):
self encodeToUrlHex:'123'
|
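A Python sketch of URL hex encoding, assuming every byte is written as `%HH`, including characters that would not normally need escaping. The byte encoding (UTF-8 here) and the name `encode_to_url_hex` are assumptions; the Smalltalk method may treat non-ASCII input differently.

```python
from urllib.parse import unquote

def encode_to_url_hex(text: str) -> str:
    """Percent-encode every UTF-8 byte of the input (assumed behaviour)."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))

encoded = encode_to_url_hex("123")
decoded = unquote(encoded)            # decoding restores the input
```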
helpers
-
recursivelyDecodeHtml: aStringOrStream
-
Recursively decode all encoded characters.
-
recursivelyDecodeHtml: aStringOrStream do: aTwoArgCallbackBlock
-
Recursively decode the hexadecimal and decimal encodings in an HTML string.
The aTwoArgCallbackBlock decides how and what to decode: it receives a character
and its encoding as arguments and returns a string, which replaces the encoded
character in the result.
return: <String>
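The callback-driven decoding described above can be sketched in Python as follows. This is an illustration of the idea, not the Smalltalk code: the names, the `'hex'`/`'dec'` labels, the optional semicolon and the `max_rounds` guard are all assumptions.

```python
import re

# Hexadecimal (&#x6A;) or decimal (&#106;) reference, semicolon optional.
_REFERENCE = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));?")

def recursively_decode_html(text, callback, max_rounds=10):
    """Decode references until the text stops changing.

    callback(char, encoding) returns the replacement string;
    encoding is 'hex' or 'dec' (hypothetical labels).
    """
    def replace(match):
        hex_digits, dec_digits = match.groups()
        if hex_digits is not None:
            code, encoding = int(hex_digits, 16), "hex"
        else:
            code, encoding = int(dec_digits), "dec"
        if code > 0x10FFFF:          # not a valid code point: leave untouched
            return match.group(0)
        return callback(chr(code), encoding)

    for _ in range(max_rounds):      # guard against callbacks that re-encode
        decoded = _REFERENCE.sub(replace, text)
        if decoded == text:
            break
        text = decoded
    return text

# Doubly encoded input: '&#38;' decodes to '&', which exposes '&#x6A;' ('j').
plain = recursively_decode_html("&#38;#x6A;avascript", lambda char, encoding: char)
```

Decoding in rounds matters because one pass over the doubly encoded input only yields `&#x6A;avascript`; the second pass produces `javascript`.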
-
sanitizeKeywordsAndRemoveIllegalSpaces: aString
-
Add <x> to dangerous keywords and remove the characters with ASCII values 9 (tab), 10 (lf) and 13 (cr).
usage example(s):
self sanitizeKeywordsAndRemoveIllegalSpaces:''.
|
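A simplified Python sketch of this step. The keyword list is illustrative, the position at which `<x>` is inserted (after the second character) is an assumption, and the sketch strips the three characters from the whole string; the class keeps its own keyword list and may behave differently in each of these points.

```python
import re

# Illustrative keyword list (assumption); the class maintains its own.
KEYWORDS = ["javascript", "script", "onload", "iframe"]

def sanitize_keywords_and_remove_illegal_spaces(text: str) -> str:
    # Remove tab (9), line feed (10) and carriage return (13).
    text = text.translate({9: None, 10: None, 13: None})
    # Try longer keywords first so 'javascript' wins over 'script'.
    ordered = sorted(KEYWORDS, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered), re.IGNORECASE)
    # Break each keyword by inserting <x> after its second character (assumed position).
    return pattern.sub(lambda m: m.group(0)[:2] + "<x>" + m.group(0)[2:], text)

# A tab hidden inside the keyword is removed before the keyword is broken up.
result = sanitize_keywords_and_remove_illegal_spaces('<a href="java\tscript:alert(1)">')
```

Removing the control characters before matching is what lets the keyword pattern see `javascript` in input such as `java<TAB>script`.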
initialization
-
emptyOrTNRWhitespaceAttacksRegex
-
A regex that matches the empty string or the hexadecimal or
decimal encodings of cr, lf and tab.
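One way such a pattern could look, written as a Python sketch. Leading zeros and an optional semicolon are assumptions about what should be tolerated; the names are hypothetical and the actual Smalltalk regex is not shown here.

```python
import re

# Hex references for tab (9), lf (a) and cr (d), or decimal references
# for 9, 10 and 13; leading zeros and the semicolon are optional (assumption).
ENCODED_TNR = r"&#[xX]0*(?:9|[aA]|[dD]);?|&#0*(?:9|10|13);?"

# Either one such encoding or nothing at all (the empty string).
EMPTY_OR_TNR = re.compile(rf"(?:{ENCODED_TNR})?")

def is_empty_or_tnr(text: str) -> bool:
    """True if the whole text is empty or a single encoded tab, lf or cr."""
    return EMPTY_OR_TNR.fullmatch(text) is not None
```

With this sketch, `is_empty_or_tnr("")`, `is_empty_or_tnr("&#x0A;")` and `is_empty_or_tnr("&#13;")` hold, while `is_empty_or_tnr("&#x41;")` (an encoded `A`) does not.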
-
initializeSanitizedKeywordsNerfedKeywordsAndPatterns
-
-
keywordRegex
-
-
nerfedKeywords
-
Keywords that have been changed to avoid a possible XSS security threat.
-
sanitizeKeywords
-
Keywords that are considered a possible XSS attack vector.