Class: TwoByteString
Object
|
+--Collection
   |
   +--SequenceableCollection
      |
      +--ArrayedCollection
         |
         +--UninterpretedBytes
            |
            +--CharacterArray
               |
               +--TwoByteString
                  |
                  +--BIG5EncodedString
                  |
                  +--GBEncodedString
                  |
                  +--JISEncodedString
                  |
                  +--KSCEncodedString
                  |
                  +--TwoByteSymbol
                  |
                  +--Unicode16String
- Package: stx:libbasic
- Category: Collections-Text
- Version: rev: 1.57 date: 2024/03/05 11:54:58
- user: cg
- file: TwoByteString.st directory: libbasic
- module: stx stc-classLibrary: libbasic
TwoByteStrings are like strings, but store 16 bits per character.
Their integration into the system is not yet complete ...
copyright
COPYRIGHT (c) 1993 by Claus Gittinger
All Rights Reserved
This software is furnished under a license and may be used
only in accordance with the terms of that license and with the
inclusion of the above copyright notice. This software may not
be provided or otherwise made available to, or used by, any
other person. No title to or ownership of the software is
hereby transferred.
initialization
-
initialize
-
initialize the class - private
instance creation
-
basicNew: anInteger
-
return a new empty string with anInteger number of characters
-
uninitializedNew: anInteger
-
return a new empty string with anInteger characters
accessing
-
basicAt: index
-
return the character at position index, an Integer
- reimplemented here since we return 16-bit characters
-
basicAt: index put: aCharacter
-
store the argument, aCharacter at position index, an Integer.
Returns aCharacter (sigh).
- reimplemented here since we store 16-bit characters
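A minimal sketch of the 16-bit storage described above: it stores a character whose code point does not fit into 8 bits and reads it back. The variable names and the chosen code point (16r0394, Δ) are illustrative assumptions, not taken from the class source.

```smalltalk
|s delta|
s := Unicode16String new:3.
delta := Character value:16r0394.   "Δ, needs more than 8 bits"
s basicAt:1 put:delta.              "answers the stored character, not the receiver"
(s basicAt:1) codePoint.            "the code point that was just stored"
```

Because basicAt:put: answers the argument rather than the receiver, cascades that expect the string back should end with yourself.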
-
intValAt: index
-
return the int16 at position index, an Integer
-
unsignedInt16At: index
-
return the short at position index, an Integer
-
unsignedShortAt: index
-
marked as obsolete by Stefan Vogel at 23-Jul-2021
** This is an obsolete interface - do not use it (it may vanish in future versions) **
filling and replacing
-
from: start to: stop put: aCharacter
-
fill part of the receiver with aCharacter.
- reimplemented here for speed
Usage example(s):
(Unicode16String new:10) from:1 to:10 put:$a
(Unicode16String new:20) from:10 to:20 put:$b
(Unicode16String new:20) from:1 to:10 put:$c
(Unicode16String new:100) from:2 to:99 put:$c
(Unicode16String new:10) from:0 to:9 put:$a
(Unicode16String new:10) from:1 to:11 put:$a
-
replaceFrom: start to: stop with: aStringOrBytes startingAt: repStart
-
replace the characters from index start through index stop (both Integers)
with characters from aStringOrBytes, starting at repStart.
Return the receiver.
If aStringOrBytes is an ExternalBytes, two-byte values are copied from it.
- reimplemented here for speed
Usage example(s):
'hello world' asUnicode16String replaceFrom:1 to:5 with:'123456' startingAt:2
'hello world' asUnicode16String replaceFrom:1 to:5 with:'123456' asUnicode16String startingAt:2
'hello world' asUnicode16String replaceFrom:1 to:0 with:'123456' startingAt:2
'hello' asUnicode16String replaceFrom:1 to:6 with:'123456' startingAt:2
'hello world' asUnicode16String replaceFrom:1 to:1 with:'123456' startingAt:2
'hello world' asUnicode16String replaceFrom:1 to:5 with:'123456' asByteArray startingAt:2
'hello world' asUnicode16String replaceFrom:1 to:5 with:'123456' asExternalBytes startingAt:2
queries
-
bitsPerCharacter
-
return the number of bits each character has.
Here, 16 is returned (storing double byte characters).
-
bytesPerCharacter
-
return the number of bytes each character has.
Here, 2 is returned (storing double byte characters).
Usage example(s):
'abc' asTwoByteString bytesPerCharacter => 2
'abc' asTwoByteString bytesPerCharacterNeeded => 1
-
characterSize
-
answer the size in bits of my largest character (actually only 7, 8 or 16)
Usage example(s):
'hello world' asUnicode16String characterSize
'hello worldüäö' asUnicode16String characterSize
'a' asUnicode16String characterSize
'ü' asUnicode16String characterSize
'aa' asUnicode16String characterSize
'aü' asUnicode16String characterSize
'aaa' asUnicode16String characterSize
'aaü' asUnicode16String characterSize
'aaaü' asUnicode16String characterSize
'aaaa' asUnicode16String characterSize
'aaaaü' asUnicode16String characterSize
-
containsNon7BitAscii
-
return true, if the underlying string contains 8-bit characters (or wider)
(i.e. if it is non-ascii)
Usage example(s):
'hello world' asUnicode16String containsNon7BitAscii
'hello worldüäö' asUnicode16String containsNon7BitAscii
'ü' asUnicode16String containsNon7BitAscii
'aü' asUnicode16String containsNon7BitAscii
'aaü' asUnicode16String containsNon7BitAscii
'aaaü' asUnicode16String containsNon7BitAscii
'aaaaü' asUnicode16String containsNon7BitAscii
'aaaaa' asUnicode16String containsNon7BitAscii
-
containsNon8BitElements
-
return true, if the underlying string contains characters wider than 8 bits
Usage example(s):
'hello worldüäö' asUnicode16String containsNon8BitElements
'hello worldΔüäö' asUnicode16String containsNon8BitElements
'Δ' asUnicode16String containsNon8BitElements
'aΔ' asUnicode16String containsNon8BitElements
'aaΔ' asUnicode16String containsNon8BitElements
'aaaΔ' asUnicode16String containsNon8BitElements
'aaaaΔ' asUnicode16String containsNon8BitElements
'aaaaa' asUnicode16String containsNon8BitElements
-
isWideString
-
true if I require more than one byte per character
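The description above does not say whether the answer depends on the storage class or on the characters actually stored. The following sketch, using only selectors documented on this page, lets the two cases be compared side by side; the variable names are illustrative and no particular result is claimed for isWideString.

```smalltalk
|narrow wide|
narrow := 'abc' asUnicode16String.   "only 7-bit characters, stored in 16 bits each"
wide := 'aΔ' asUnicode16String.      "contains a character above 16rFF"
narrow isWideString.
wide isWideString.
narrow containsNon8BitElements.      "for comparison with isWideString"
wide containsNon8BitElements.
```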
-
occurrencesOf: aCharacter
-
count the occurrences of the argument, aCharacter in myself
- reimplemented here for speed
Usage example(s):
'hello world' occurrencesOf:$a
'hello world' occurrencesOf:$w
'hello world' occurrencesOf:$l
'hello world' occurrencesOf:$x
'hello world' occurrencesOf:1
Time millisecondsToRun:[
|s|
s := 'abcdefghijklmn' asUnicode16String.
1000000 timesRepeat:[ s occurrencesOf:$x ]
].
"timings in ms: 60 60 60 70 (untuned: 670 710 740)"
testing
-
isSingleByteCollection
-
return true, if the receiver has access methods for bytes;
i.e. #at: and #at:put: access a byte and are equivalent to #byteAt: and #byteAt:put:,
and #replaceFrom:to: is equivalent to #replaceBytesFrom:to:.
false is returned here since at: returns 2-byte characters and not bytes
- the method is redefined from UninterpretedBytes.
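As a small illustration of the distinction drawn above, the sketch below queries a two-byte string and fetches one element. The variable name is illustrative, and the commented results follow from the descriptions on this page rather than from a recorded run.

```smalltalk
|wide|
wide := 'abc' asUnicode16String.
wide isSingleByteCollection.   "false, as described above"
wide at:1.                     "a Character ($a), not a byte value"
wide bytesPerCharacter.        "2"
```

Code that copies raw bytes can use this query to decide whether element-wise access and byte-wise access are interchangeable for a given receiver.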