Class: ISO10646_to_UTF8 (in CharacterEncoderImplementations)
Object
|
+--CharacterEncoder
|
+--CharacterEncoderImplementations::TwoByteEncoder
|
+--CharacterEncoderImplementations::ISO10646_to_UTF8
- Package:
- stx:libbasic
- Category:
- Collections-Text-Encodings
- Version:
- rev: 1.16 date: 2009/09/22 09:08:09
- user: fm
- file: Encoder_ISO10646_to_UTF8.st directory: libbasic
- module: stx stc-classLibrary: libbasic
encoding & decoding
-
decode: aCode
-
-
decodeString: aStringOrByteCollection
-
Given a string in UTF-8 encoding,
return a new string containing the same characters in a 16-bit (or wider) encoding.
Returns either a normal String, a TwoByteString or a FourByteString instance.
Only useful when reading from external sources.
Handles characters of up to 30 bits only.
If you work a lot with UTF-8 encoded text files,
this is a first-class candidate for a primitive.
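As a rough illustration of what decoding involves at the byte level, the following workspace sketch decodes a single UTF-8 sequence by hand. It is not the code of decodeString:. All variable names are made up for the example, no validation is done (a stray continuation byte in first position is not rejected), and only sequences of up to four bytes are covered, whereas the class itself states that it handles characters of up to 30 bits.

```smalltalk
|bytes lead count codePoint|

"UTF-8 bytes of the Euro sign U+20AC (example input, not from the class)"
bytes := #(16rE2 16r82 16rAC).
lead := bytes at:1.

"The leading 1 bits of the first byte give the sequence length;
 the remaining low bits are the top bits of the code point."
lead < 16r80
    ifTrue:[ count := 1. codePoint := lead ]
    ifFalse:[
        lead < 16rE0
            ifTrue:[ count := 2. codePoint := lead bitAnd:16r1F ]
            ifFalse:[
                lead < 16rF0
                    ifTrue:[ count := 3. codePoint := lead bitAnd:16r0F ]
                    ifFalse:[ count := 4. codePoint := lead bitAnd:16r07 ]]].

"Each continuation byte (10xxxxxx) contributes its low 6 bits."
2 to:count do:[:i |
    codePoint := (codePoint bitShift:6) bitOr:((bytes at:i) bitAnd:16r3F)
].

"codePoint is now 16r20AC and count is 3"
Transcript showCR:codePoint printString.
```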
-
encode: aCode
-
-
encodeString: aUnicodeString
-
Return the UTF-8 representation of aUnicodeString.
The resulting string is only useful for storing in an external file,
not for use inside ST/X.
If you work a lot with UTF-8 encoded text files,
this is a first-class candidate for a primitive.
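To make the byte layout concrete, here is a minimal workspace sketch that encodes one code point into UTF-8 bytes by hand. It is an illustration only, not the implementation of encodeString:. The variable names are hypothetical, and the sketch is limited to code points below 16r10000 (one to three bytes).

```smalltalk
|codePoint bytes|

codePoint := 16r015B.    "U+015B, the character ś (example input)"
bytes := OrderedCollection new.

codePoint < 16r80 ifTrue:[
    "one byte: 0xxxxxxx"
    bytes add:codePoint
] ifFalse:[
    codePoint < 16r800 ifTrue:[
        "two bytes: 110xxxxx 10xxxxxx"
        bytes add:(16rC0 bitOr:(codePoint bitShift:-6)).
        bytes add:(16r80 bitOr:(codePoint bitAnd:16r3F))
    ] ifFalse:[
        "three bytes: 1110xxxx 10xxxxxx 10xxxxxx (valid only below 16r10000)"
        bytes add:(16rE0 bitOr:(codePoint bitShift:-12)).
        bytes add:(16r80 bitOr:((codePoint bitShift:-6) bitAnd:16r3F)).
        bytes add:(16r80 bitOr:(codePoint bitAnd:16r3F))
    ]
].

"bytes now holds 16rC5 16r9B"
Transcript showCR:bytes printString.
```

Because the result is a sequence of bytes rather than characters, it is suited to being written to a file or socket and not to further text processing.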
queries
-
private bytesToReadFor: firstByte
-
-
characterSize: charOrcodePoint
-
return the number of bytes required to encode charOrcodePoint in UTF-8
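The byte count depends only on the magnitude of the code point. The block below is a sketch of that mapping for the original ISO 10646 form of UTF-8, which allows sequences of up to six bytes. The block name utf8Size is made up for this example, and whether characterSize: returns exactly these values for every input is an assumption that has not been checked against the source.

```smalltalk
|utf8Size|

"Hypothetical helper: number of UTF-8 bytes for a non-negative code point."
utf8Size := [:codePoint |
    codePoint < 16r80
        ifTrue:[1]
        ifFalse:[ codePoint < 16r800
            ifTrue:[2]
            ifFalse:[ codePoint < 16r10000
                ifTrue:[3]
                ifFalse:[ codePoint < 16r200000
                    ifTrue:[4]
                    ifFalse:[ codePoint < 16r4000000
                        ifTrue:[5]
                        ifFalse:[6]]]]]].

Transcript showCR:(utf8Size value:$a codePoint) printString.   "1"
Transcript showCR:(utf8Size value:16r015B) printString.        "2"
Transcript showCR:(utf8Size value:16r20AC) printString.        "3"
```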
-
nameOfEncoding
-
stream support
-
readNext: charactersToRead charactersFrom: stream
-
-
readNextCharacterFrom: aStream
-
Encoding (unicode to utf8):
ISO10646_to_UTF8 encodeString:'hello'.
Decoding (utf8 to unicode):
|t|
t := ISO10646_to_UTF8 encodeString:'Helloś'.
ISO10646_to_UTF8 decodeString:t.