Documentation of class 'CharacterEncoderImplementations::ISO10646_to_XMLUTF8':

Class: ISO10646_to_XMLUTF8 (in CharacterEncoderImplementations)


Inheritance:

   Object
   |
   +--CharacterEncoder
      |
      +--CharacterEncoderImplementations::VariableBytesEncoder
         |
         +--CharacterEncoderImplementations::ISO10646_to_UTF8
            |
            +--CharacterEncoderImplementations::ISO10646_to_XMLUTF8

Package:
stx:libbasic
Category:
Collections-Text-Encodings
Version:
rev: 1.4 date: 2018/01/04 00:12:44
user: stefan
file: CharacterEncoderImplementations__ISO10646_to_XMLUTF8.st directory: libbasic
module: stx stc-classLibrary: libbasic
Author:
Jan Vrany <jan.vrany@fit.cvut.cz>

Description:


This encoder encodes characters as UTF-8, emitting only characters
that may occur in an XML document.

Not all Unicode characters are valid in XML, whatever encoding
is used. For reference, see

  https://www.w3.org/TR/xml/#NT-Char

Invalid characters are replaced by ReplacementCharacter,
which defaults to $?.
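
The validity rule referenced above can be sketched as a plain Smalltalk block. This is only an illustration of the Char production in the W3C XML 1.0 recommendation, not the code of this class; the name isValidXMLChar is hypothetical.

```smalltalk
"Sketch of the XML 1.0 Char production:
   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
 isValidXMLChar is a hypothetical name, not part of the class."
| isValidXMLChar |

isValidXMLChar := [:codePoint |
    (codePoint = 16r9)
        or:[ (codePoint = 16rA)
        or:[ (codePoint = 16rD)
        or:[ (codePoint between:16r20 and:16rD7FF)
        or:[ (codePoint between:16rE000 and:16rFFFD)
        or:[ codePoint between:16r10000 and:16r10FFFF ]]]]] ].

isValidXMLChar value:16r41.      "true:  an ordinary letter"
isValidXMLChar value:16r0.       "false: NUL is not an XML character"
isValidXMLChar value:16rD800.    "false: surrogate code point"
isValidXMLChar value:16rFFFF.    "false: outside [#xE000-#xFFFD]"
```

A character whose code point fails such a test is the kind that the encoder replaces by ReplacementCharacter.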


[instance variables:]

[class variables:]


Related information:

    https://www.w3.org/TR/xml/#NT-Char

Class protocol:

initialization
o  initialize
Invoked at system start or when the class is dynamically loaded.

queries
o  isValidXMLunicode: codePoint
Returns true if the given codePoint (an Integer, not a Character)
is a valid XML Unicode code point.
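
A possible way to call this query from a workspace is shown below. It assumes only the class-side selector listed above; the variable name encoderClass is hypothetical, and the comments describe what the XML Char production suggests rather than verified output.

```smalltalk
"Assumes the class-side query isValidXMLunicode: documented above.
 The argument must be an Integer code point."
| encoderClass |

encoderClass := CharacterEncoderImplementations::ISO10646_to_XMLUTF8.

encoderClass isValidXMLunicode:16r41.           "an ordinary letter, allowed by the Char production"
encoderClass isValidXMLunicode:16rFFFF.         "not allowed; the usage examples show it encoded as '?'"
encoderClass isValidXMLunicode:$A codePoint.    "convert a Character to its code point first"
```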


Instance protocol:

encoding & decoding
o  encodeCharacter: aUnicodeCharacter on: aStream
Write the UTF-8 representation of aUnicodeCharacter to aStream.
The output contains only valid XML Unicode characters;
an invalid character is replaced by ReplacementCharacter.
For details, please see

https://www.w3.org/TR/xml/#NT-Char

o  encodeString: aUnicodeString
Return the UTF-8 representation of aUnicodeString.
The resulting string contains only valid XML Unicode
characters. Invalid characters are replaced by
ReplacementCharacter. For details, please see

https://www.w3.org/TR/xml/#NT-Char
usage example(s):
     (self encodeString:'hello') asByteArray                             #[104 101 108 108 111]
     (self encodeString:(Character value:16r40) asString) asByteArray    #[64]
     (self encodeString:(Character value:16r7F) asString) asByteArray    #[127]
     (self encodeString:(Character value:16r80) asString) asByteArray    #[194 128]
     (self encodeString:(Character value:16rFF) asString) asByteArray    #[195 191]
     (self encodeString:(Character value:16r100) asString) asByteArray   #[196 128]
     (self encodeString:(Character value:16r200) asString) asByteArray   #[200 128]
     (self encodeString:(Character value:16r400) asString) asByteArray   #[208 128]
     (self encodeString:(Character value:16r800) asString) asByteArray   #[224 160 128]
     (self encodeString:(Character value:16r1000) asString) asByteArray  #[225 128 128]
     (self encodeString:(Character value:16r2000) asString) asByteArray  #[226 128 128]
     (self encodeString:(Character value:16r4000) asString) asByteArray  #[228 128 128]
     (self encodeString:(Character value:16r8000) asString) asByteArray  #[232 128 128]
     (self encodeString:(Character value:16rFFFF) asString)             '?'
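
The byte sequences in the examples above follow from the standard UTF-8 layout. The block below is a minimal sketch of that layout for code points up to 16rFFFF; utf8Bytes is a hypothetical name, it is not the encoder's own code, and it performs no XML validity check or replacement.

```smalltalk
"Sketch of the UTF-8 byte layout for code points up to 16rFFFF.
 utf8Bytes is a hypothetical block, not part of the class."
| utf8Bytes |

utf8Bytes := [:codePoint | | s |
    s := WriteStream on:(ByteArray new:3).
    (codePoint < 16r80)
        ifTrue:[
            "1 byte: 0xxxxxxx"
            s nextPut:codePoint ]
        ifFalse:[
            (codePoint < 16r800)
                ifTrue:[
                    "2 bytes: 110xxxxx 10xxxxxx"
                    s nextPut:(16rC0 bitOr:(codePoint bitShift:-6)).
                    s nextPut:(16r80 bitOr:(codePoint bitAnd:16r3F)) ]
                ifFalse:[
                    "3 bytes: 1110xxxx 10xxxxxx 10xxxxxx"
                    s nextPut:(16rE0 bitOr:(codePoint bitShift:-12)).
                    s nextPut:(16r80 bitOr:((codePoint bitShift:-6) bitAnd:16r3F)).
                    s nextPut:(16r80 bitOr:(codePoint bitAnd:16r3F)) ] ].
    s contents ].

utf8Bytes value:16r7F.      "#[127]"
utf8Bytes value:16r80.      "#[194 128]"
utf8Bytes value:16r800.     "#[224 160 128]"
utf8Bytes value:16r8000.    "#[232 128 128]"
```

Code points from 16r10000 upward need a fourth byte and are left out of this sketch.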

o  encodeString: aUnicodeString on: aStream
Write the UTF-8 representation of aUnicodeString to aStream.
The output contains only valid XML Unicode characters;
invalid characters are replaced by ReplacementCharacter.
For details, please see

https://www.w3.org/TR/xml/#NT-Char

queries
o  nameOfEncoding



ST/X 7.1.0.0; WebServer 1.663 at exept.de:8081; Fri, 20 Jul 2018 06:34:12 GMT