Smalltalk/X Webserver

Documentation of class 'HtmlSanitizer':

Class: HtmlSanitizer


Inheritance:

   Object
   |
   +--HtmlSanitizer

Package:
stx:goodies/webServer
Category:
Net-Communication-HTTP-Server
Version:
rev: 1.14 date: 2024/04/22 17:42:40
user: stefan
file: HtmlSanitizer.st directory: goodies/webServer
module: stx stc-classLibrary: webServer

Description:


A port of http://quickwired.com/kallahar/smallprojects/php_xss_filter_function.php.

The function takes a string and sanitizes it so that it is safe from cross-site scripting (XSS) attacks.
It adds <x> to dangerous keywords and removes non-viewable characters.

The main entry point of the class is #sanitizeHtml:.


[instance variables:]

[class variables:]

copyright

COPYRIGHT (c) 2007 by eXept Software AG
All Rights Reserved

This software is furnished under a license and may be used only in accordance with the terms of that license and with the inclusion of the above copyright notice. This software may not be provided or otherwise made available to, or used by, any other person. No title to or ownership of the software is hereby transferred.

Class protocol:

api
o  sanitizeHtml: aString
return an HTML string in which all potential cross-site scripting attacks are disabled.
This version does not take Unicode into account.

Usage example(s):

      self sanitizeHtml:(self encodeToHtmlHex:'äabcdef').  
      self sanitizeHtml:''.  
      self sanitizeHtml:''.  
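
As a rough illustration of calling the main entry point from a workspace, the following sketch passes a made-up piece of untrusted markup to #sanitizeHtml: and prints whatever comes back. It assumes the stx:goodies/webServer package is loaded; the input string is purely hypothetical, and the exact output depends on the keyword list of the installed version, so none is shown here.

```smalltalk
"Assumption: stx:goodies/webServer is loaded, so HtmlSanitizer is available."
| untrusted sanitized |

"Hypothetical untrusted input, e.g. taken from a form field"
untrusted := '<script>alert(1)</script>'.

"Class-side call, as in the usage examples above (there 'self' is the class)"
sanitized := HtmlSanitizer sanitizeHtml: untrusted.

"Show the result on the Transcript for inspection"
Transcript showCR: sanitized.
```

Since the method does not take Unicode into account, it is probably wise to inspect the result for non-ASCII input before relying on it.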

encoding
o  encodeToHtmlDec: aStringOrStream
marked as obsolete by exept MBP at 07-03-2024

** This is an obsolete interface - do not use it (it may vanish in future versions) **

o  encodeToHtmlHex: aStringOrStream
marked as obsolete by exept MBP at 07-03-2024

** This is an obsolete interface - do not use it (it may vanish in future versions) **

o  encodeToUrlHex: aString
encode a String using URL hex encoding

Usage example(s):

      self encodeToUrlHex:'123'

helpers
o  recursivelyDecodeHtml: aStringOrStream
recursively decode all characters

o  recursivelyDecodeHtml: aStringOrStream do: aTwoArgCallbackBlock
Recursively decode the hexadecimal and decimal character encodings in an
HTML string. The aTwoArgCallbackBlock decides how and what to decode: it is
called with the decoded Char and the encoding as arguments, and the string
it returns replaces the encoded character.

return: <String>
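
A minimal sketch of how the callback variant might be used is given below. The input string is hypothetical, and the concrete type of the encoding argument is not documented on this page, so the block only prints it. The block answers the decoded character as a string, which should amount to a plain decode; answering a different string would substitute that string instead.

```smalltalk
"Assumption: stx:goodies/webServer is loaded, so HtmlSanitizer is available."
| input decoded |

"Hypothetical input: the letter 'a', once hex-encoded and once decimal-encoded"
input := '&#x61;&#97;'.

decoded := HtmlSanitizer
    recursivelyDecodeHtml: input
    do: [:char :encoding |
        "Log each decoded character together with its (undocumented) encoding object"
        Transcript showCR: 'decoded ' , char printString , ' from ' , encoding printString.
        "Answer the replacement string: here simply the character itself"
        char asString].

Transcript showCR: decoded.
```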

o  sanitizeKeywordsAndRemoveIllegalSpaces: aString
add <x> to dangerous keywords and remove the characters with ASCII values 9 (tab), 10 (LF) and 13 (CR)

Usage example(s):

     self sanitizeKeywordsAndRemoveIllegalSpaces:''.     
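
To make the whitespace handling more concrete, the sketch below hides a tab character inside a keyword and hands the string to the helper. That 'javascript' is among the sanitized keywords is an assumption based on the PHP original, not something this page states; the input is hypothetical and the exact result is deliberately not shown.

```smalltalk
"Assumption: stx:goodies/webServer is loaded, so HtmlSanitizer is available."
| input result |

"Hypothetical input: a tab (ASCII 9) hidden inside the word 'javascript'"
input := 'java' , (String with: Character tab) , 'script:alert(1)'.

result := HtmlSanitizer sanitizeKeywordsAndRemoveIllegalSpaces: input.

"According to the description, the tab should be gone and dangerous keywords marked with <x>"
Transcript showCR: result.
```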

initialization
o  emptyOrTNRWhitespaceAttacksRegex
a regex that matches the empty string or the hexadecimal or
decimal encodings of CR, LF and tab

o  initializeSanitizedKeywordsNerfedKeywordsAndPatterns

o  keywordRegex

o  nerfedKeywords
keywords that are changed to avoid a possible XSS security threat.

o  sanitizeKeywords
keywords that are considered a possible XSS attack vector



ST/X 7.7.0.0; WebServer 1.702 at 20f6060372b9.unknown:8081; Thu, 02 Jan 2025 14:37:17 GMT