Smalltalk/X Webserver

Documentation of class 'String':

Class: String


Inheritance:

   Object
   |
   +--Collection
      |
      +--SequenceableCollection
         |
         +--ArrayedCollection
            |
            +--UninterpretedBytes
               |
               +--CharacterArray
                  |
                  +--String
                     |
                     +--ISO8859L1String
                     |
                     +--ImmutableString
                     |
                     +--JavaScriptEnvironment::String
                     |
                     +--Symbol

Package:
stx:libbasic
Category:
Collections-Text
Version:
rev: 1.452 date: 2019/08/10 15:12:42
user: cg
file: String.st directory: libbasic
module: stx stc-classLibrary: libbasic
Author:
Claus Gittinger

Description:


Strings are ByteArrays storing Characters.

Strings are somewhat kludgy: to allow for easy handling by C functions,
one 0-byte is always appended at the end. It is not counted
in the string's size, and is not accessible from the Smalltalk level.
This guarantees that a Smalltalk string can always be passed safely to a
C or system API function
(of course, this does not protect against nonsensical contents).

You cannot add any instvars to String, since the run time system & compiler
create literal strings and know that strings have no named instvars.
If you really need strings with instVars, you have to create a subclass
of String (the access functions defined here can handle this).
A little warning though: not all Smalltalk systems allow subclassing String,
so your program may become unportable if you do so.

Strings have an implicit (assumed) encoding of ISO-8859-1.
For strings with other encodings, either keep the encoding separately,
or use instances of encodedString.

Be careful when using the 0-byte in a String. This is not prohibited, but
the implementations of some String methods use C functions and may
therefore yield unexpected results (e.g. compareWith:collating:) when
processing a String containing the 0-byte.


Related information:

    Text
    StringCollection
    TwoByteString
    JISEncodedString
    Symbol

Class protocol:

Compatibility-Dolphin
o  lineDelimiter
Dolphin compatibility: answer CR LF

Compatibility-Squeak
o  cr
return a string consisting of the cr-Character

usage example(s):

and all cr's are really returns (instead of nl's).

o  crlf
return a string consisting of the cr-lf Characters

o  lf
return a string consisting of the lf Character

o  return
return a string consisting of the cr-Character

o  space
return a string consisting of a single space Character

o  stringHash: aString initialHash: speciesHash
for squeak compatibility only; this is NOT the same hash as my instances use

o  tab
return a string consisting of the tab-Character

Javascript support
o  fromCharCode: code
( an extension from the stx:libjavascript package )
return a string consisting of a single character, given its code

o  js_new: argument
( an extension from the stx:libjavascript package )
redefinable JS-new:

instance creation
o  basicNew: anInteger
return a new empty string with anInteger characters.
In contrast to other smalltalks, this returns a string filled
with spaces (instead of a string filled with 0-bytes).
This makes much more sense, in that a freshly created string
can be directly used as separator or for formatting.

o  new: n
return a new empty string with n characters.
In contrast to other smalltalks, this returns a string filled
with spaces (instead of a string filled with 0-bytes).
This makes much more sense, in that a freshly created string
can be directly used as separator or for formatting.

Redefined here with exactly the same code as in Behavior for
better performance.
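
A minimal workspace sketch of the behaviour described above. The comments restate what the description implies; they are not measured output.

```smalltalk
|s|
s := String new:3.
s size.        "3; the terminating 0-byte is not counted"
s = '   '.     "true if, as described, the new string is filled with spaces"
```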

o  readFrom: aStreamOrString onError: exceptionBlock
read & return the next String from the (character-)stream aStream;
skipping all whitespace first; return the value of exceptionBlock,
if no string can be read. The sequence of characters as read from the
stream must be one as stored via storeOn: or storeString.

usage example(s):

     String readFrom:('''hello world''' readStream)
     String readFrom:('''hello '''' world''' readStream)
     String readFrom:('1 ''hello'' ' readStream)
     String readFrom:('1 ''hello'' ' readStream) onError:['foobar']

o  uninitializedNew: anInteger
return a new string with anInteger characters but undefined contents.
Use this, if the string is filled anyway with new data, for example, if
used as a stream buffer.

usage example(s):

     String uninitializedNew:100

queries
o  defaultPlatformClass
dummy for ST-80 compatibility

o  isBuiltInClass
return true if this class is known by the run-time-system.
Here, true is returned for myself, false for subclasses.


Instance protocol:

Compatibility - Squeak
o  asSwikiLink
( an extension from the stx:goodies/webServer/comanche package )

o  skipDelimiters: delimiters startingAt: start
( an extension from the stx:goodies/webServer/comanche package )
Answer the index of the character within the receiver, starting at start, that does NOT match one of the delimiters. If the receiver does not contain any of the delimiters, answer size + 1. Assumes the delimiters to be a non-empty string.

o  squeakAsInteger
( an extension from the stx:goodies/webServer/comanche package )
Answer the Integer created by interpreting the receiver as the string representation of an integer. Answer nil if no digits, else find the first digit and then all consecutive digits after that

o  translateWith: aTable
( an extension from the stx:goodies/webServer/comanche package )
'Hallo' translateWith:(String withAll: (Character allCharacters collect: [:c | c asLowercase]))

o  trimNullAndStar
( an extension from the stx:goodies/webServer/comanche package )
' * string *** ' -------> 'string'

o  unescapePercents
( an extension from the stx:goodies/webServer/comanche package )
change each %XY substring to the character with ASCII value XY in hex. This is the opposite of #encodeForHTTP
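
A short sketch of the mapping described above, assuming the stx:goodies/webServer/comanche extension is loaded. The comments only spell out which character each escape stands for.

```smalltalk
"%20 is hex 20, i.e. a space"
'hello%20world' unescapePercents.

"%25 is hex 25, i.e. the percent character itself"
'100%25' unescapePercents.
```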

Compatibility-Squeak
o  asEnglishPlural
( an extension from the stx:libcompat package )
Answer the plural of the receiver. Assumes the receiver is an English noun.
For a more comprehensive algorithm please refer to ''An Algorithmic Approach
to English Pluralization'' by Damian Conway.

o  deepFlattenInto: stream
( an extension from the stx:libcompat package )

o  piecesCutWhere: aBlock
( an extension from the stx:libcompat package )
Evaluate aBlock for successive pairs of the receiver elements,
breaking the receiver into pieces between elements where
the block evaluated to true, and return an OrderedCollection of
those pieces.

usage example(s):

'A sentence. Another sentence... Yet another sentence.'
		piecesCutWhere: [:each :next | each = $. and: [next = Character space]]

o  piecesCutWhereCamelCase
( an extension from the stx:libcompat package )
Breaks apart words written in camel case.

It's not simply using piecesCutWhere: because we want
to also deal with abbreviations and thus we need to
decide based on three characters, not just on two:
('FOOBar') piecesCutWhereCamelCase asArray = #('FOO' 'Bar').
('FOOBar12AndSomething') piecesCutWhereCamelCase asArray = #('FOO' 'Bar' '12' 'And' 'Something')

o  replaceSuffix: suffix with: replacement
( an extension from the stx:libcompat package )

o  withInternetLineEndings
( an extension from the stx:libcompat package )
generate a copy with all cr's replaced by crnl

Compatibility-VW5.4
o  asByteString
( an extension from the stx:libcompat package )

o  asGUID
( an extension from the stx:libcompat package )
return self as a GUID (or UUID if not present)

usage example(s):

     '{EAB22AC0-30C1-11CF-A7EB-0000C05BAE0B}' asGUID

accessing
o  at: index
return the character at position index, an Integer.
Reimplemented here to avoid the additional at:->basicAt: send
(which we can do here, since at: is obviously not redefined in a subclass).
This method is the same as basicAt:.

o  at: index put: aCharacter
store the argument, aCharacter at position index, an Integer.
Return aCharacter (sigh).
Reimplemented here to avoid the additional at:put:->basicAt:put: send
(but only for Strings, since subclasses may redefine basicAt:put:).
This method is the same as basicAt:put:.
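
Because at:put: returns the stored character rather than the receiver, the result of the send is usually not the string you want. A small sketch (the variable names are illustrative only):

```smalltalk
|s r|
s := 'abc' copy.          "work on a copy, not on the string literal itself"
r := s at:1 put:$X.
r.                        "$X, the stored character"
s.                        "'Xbc'"

"in a cascade, end with yourself to get the string back"
(String new:3) at:1 put:$a; at:2 put:$b; at:3 put:$c; yourself.    "'abc'"
```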

o  basicAt: index
return the character at position index, an Integer
- reimplemented here since we return characters

o  basicAt: index put: aCharacter
store the argument, aCharacter at position index, an Integer.
Returns aCharacter (sigh).
- reimplemented here since we store characters

o  first
return the first character.
Reimplemented here for speed

usage example(s):

     'abc' first
     '' first

character searching
o  identityIndexOf: aCharacter
return the index of the first occurrence of the argument, aCharacter
in the receiver or 0 if not found - reimplemented here for speed.

usage example(s):

     'hello world' identityIndexOf:(Character space)
     'hello world' identityIndexOf:$d
     'hello world' identityIndexOf:1
     #[0 0 1 0 0] asString identityIndexOf:(Character value:1)
     #[0 0 1 0 0] asString identityIndexOf:(Character value:0)

o  identityIndexOf: aCharacter startingAt: index
return the index of the first occurrence of the argument, aCharacter
in the receiver or 0 if not found - reimplemented here for speed.

usage example(s):

     'hello world' identityIndexOf:(Character space)
     'hello world' identityIndexOf:$d
     'hello world' identityIndexOf:1
     #[0 0 1 0 0] asString identityIndexOf:(Character value:1)
     #[0 0 1 0 0] asString identityIndexOf:(Character value:0)

o  includes: aCharacter
return true, if the receiver includes aCharacter.
- redefined here for speed

usage example(s):

     'hello world' includes:$l
     'hello world' includes:$W

     |s|
     s := String new:1024.
     s atAllPut:$a.
     s at:512 put:(Character space).
     Time millisecondsToRun:[
	1000000 timesRepeat:[ s includes:(Character space) ]
     ]

     timing (ms):
         bcc        OSX (2007 powerbook)
                    110

o  includesAny: aCollection
return true, if the receiver includes any of the characters in the
argument, aCollection.
- redefined for speed if the argument is a String; especially optimized,
if the searched collection has less than 6 characters.

usage example(s):

     'hello world' includesAny:'abcd'
     'hello world' includesAny:'xyz'
     'hello world' includesAny:'xz'
     'hello world' includesAny:'od'
     'hello world' includesAny:'xd'
     'hello world' includesAny:'dx'
     'hello world' includesAny:(Array with:$a with:$b with:$d)
     'hello world' includesAny:(Array with:$x with:$y)
     'hello world' includesAny:(Array with:1 with:2)

     |s|
     s := String new:1000 withAll:$a.
     Time millisecondsToRun:[
	1000000 timesRepeat:[
	    s includesAny:'12'
	]
     ].  "measured (ms): 540 680 550 850 890 850"

     |s|
     s := String new:2000 withAll:$a.
     Time millisecondsToRun:[
	1000000 timesRepeat:[
	    s includesAny:'12'
	]
     ].  "measured (ms): 1030 1060 1650 1690"

     |s|
     s := 'hello world'.
     Time millisecondsToRun:[
	1000000 timesRepeat:[
	    s includesAny:'12'
	]
     ].  "measured (ms): 70 60"

o  indexOf: aCharacter startingAt: start
return the index of the first occurrence of the argument, aCharacter
in myself starting at start, anInteger or 0 if not found;
- reimplemented here for speed

usage example(s):

     'hello world' indexOf:$0 startingAt:1
     'hello world' indexOf:$l startingAt:1
     'hello world' indexOf:$l startingAt:5
     'hello world' indexOf:$d startingAt:5
     #[0 0 1 0 0] asString indexOf:(Character value:1) startingAt:1
     #[0 0 1 0 0] asString indexOf:(Character value:0) startingAt:3

     '1234567890123456a' indexOf:$a
     '1234567890123456a' indexOf:$b

     |s|
     s := '12345678901234b'.
     self assert:(s indexOf:$x) == 0.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 2.
     self assert:(s indexOf:$3) == 3.
     self assert:(s indexOf:$4) == 4.
     self assert:(s indexOf:$5) == 5.
     self assert:(s indexOf:$0) == 10.
     self assert:(s indexOf:$b) == 15.

     |s|
     s := ''.
     self assert:(s indexOf:$1) == 0.
     s := '1'.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 0.
     s := '12'.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 2.
     self assert:(s indexOf:$3) == 0.
     s := '123'.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 2.
     self assert:(s indexOf:$3) == 3.
     self assert:(s indexOf:$4) == 0.
     s := '1234'.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 2.
     self assert:(s indexOf:$3) == 3.
     self assert:(s indexOf:$4) == 4.
     self assert:(s indexOf:$5) == 0.
     s := '12345'.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 2.
     self assert:(s indexOf:$3) == 3.
     self assert:(s indexOf:$4) == 4.
     self assert:(s indexOf:$5) == 5.
     self assert:(s indexOf:$6) == 0.
     s := '123456'.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 2.
     self assert:(s indexOf:$3) == 3.
     self assert:(s indexOf:$4) == 4.
     self assert:(s indexOf:$5) == 5.
     self assert:(s indexOf:$6) == 6.
     self assert:(s indexOf:$7) == 0.
     s := '1234567'.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 2.
     self assert:(s indexOf:$3) == 3.
     self assert:(s indexOf:$4) == 4.
     self assert:(s indexOf:$5) == 5.
     self assert:(s indexOf:$6) == 6.
     self assert:(s indexOf:$7) == 7.
     self assert:(s indexOf:$8) == 0.
     s := '12345678'.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 2.
     self assert:(s indexOf:$3) == 3.
     self assert:(s indexOf:$4) == 4.
     self assert:(s indexOf:$5) == 5.
     self assert:(s indexOf:$6) == 6.
     self assert:(s indexOf:$7) == 7.
     self assert:(s indexOf:$8) == 8.
     self assert:(s indexOf:$9) == 0.
     s := '123456789'.
     self assert:(s indexOf:$1) == 1.
     self assert:(s indexOf:$2) == 2.
     self assert:(s indexOf:$3) == 3.
     self assert:(s indexOf:$4) == 4.
     self assert:(s indexOf:$5) == 5.
     self assert:(s indexOf:$6) == 6.
     self assert:(s indexOf:$7) == 7.
     self assert:(s indexOf:$8) == 8.
     self assert:(s indexOf:$9) == 9.

     self assert:(s indexOf:$0) == 0.
     self assert:(s indexOf:$b) == 0.

     |s|
     s := String new:1024.
     s atAllPut:$a.
     s at:512 put:(Character space).
     Time millisecondsToRun:[
	1000000 timesRepeat:[ s indexOf:(Character space) ]
     ]

     timing (ms):
                              bcc     OSX (2007 powerbook)
         v1: normal          1763
             +unroll         2340
             memsrch !       3308      90
         v2:                 1045     150

o  indexOfAny: aCollectionOfCharacters startingAt: start
return the index of the first occurrence of any character in aCollectionOfCharacters,
in myself starting at start, anInteger or 0 if not found;
- reimplemented here for speed if aCollectionOfCharacters is a string.

usage example(s):

     'hello world' indexOfAny:'eoa' startingAt:1
     'hello world' indexOfAny:'eoa' startingAt:6
     'hello world' indexOfAny:'AOE' startingAt:1
     'hello world' indexOfAny:'o' startingAt:6
     'hello world' indexOfAny:'o' startingAt:6
     'hello world§' indexOfAny:'#§$' startingAt:6

o  indexOfControlCharacterStartingAt: start
return the index of the next control character;
that is a character with asciiValue < 32.
Return 0 if none is found.

usage example(s):

     'hello world'             indexOfControlCharacterStartingAt:1
     'hello world\foo' withCRs indexOfControlCharacterStartingAt:1
     '1\' withCRs indexOfControlCharacterStartingAt:1
     '1\' withCRs indexOfControlCharacterStartingAt:2

o  indexOfNonSeparatorStartingAt: start
return the index of the next non-whiteSpace character, 0 if none found
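
A small sketch of the expected results, assuming the start index is inclusive, as with the other startingAt: searches in this class:

```smalltalk
'   hello' indexOfNonSeparatorStartingAt:1.    "4, the index of $h"
'   hello' indexOfNonSeparatorStartingAt:5.    "5, the character at the start index already qualifies"
'     ' indexOfNonSeparatorStartingAt:1.       "0, only whitespace"
```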

o  indexOfSeparatorStartingAt: start
return the index of the next separator (whitespace) character; 0 if none found

usage example(s):

     123456789012
    'hello world ' indexOfSeparatorStartingAt:1 -> 6
    'hello world ' indexOfSeparatorStartingAt:3 -> 6
    'hello world ' indexOfSeparatorStartingAt:7 -> 12
    'hello world' indexOfSeparatorStartingAt:7  -> 0
    'helloworld' indexOfSeparatorStartingAt:1   -> 0

o  occurrencesOf: aCharacter
count the occurrences of the argument, aCharacter in myself
- reimplemented here for speed

usage example(s):

     'hello world' occurrencesOf:$a
     'hello world' occurrencesOf:$w
     'hello world' occurrencesOf:$l
     'hello world' occurrencesOf:$x
     'hello world' occurrencesOf:1
     Time millisecondsToRun:[
	1000000 timesRepeat:[ 'abcdefghijklmn' occurrencesOf:$x ]
     ].  "measured (ms): 219 203 156 203 204 204 219 172 187 187 141"

comparing
o  < aString
Compare the receiver with the argument and return true if the
receiver is less than the argument. Otherwise return false.
No national variants are honored; use after: for this.
In contrast to ST-80, case differences are NOT ignored, thus
'foo' < 'Foo' will return false.
This may change.

o  = aString
Compare the receiver with the argument and return true if the
receiver is equal to the argument. Otherwise return false.
This compare is case-sensitive (i.e. 'Foo' is NOT = 'foo').
Use sameAs: to compare with case ignored.

usage example(s):

     'foo' = 'Foo'
     'foo' sameAs: 'Foo'
     #[0 0 1 0 0] asString = #[0 0 1 0 0] asString

o  > aString
Compare the receiver with the argument and return true if the
receiver is greater than the argument. Otherwise return false.
No national variants are honored; use after: for this.
In contrast to ST-80, case differences are NOT ignored, thus
'foo' > 'Foo' will return true.
This may change.
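
The case-sensitive ordering of < and > can be tried in a workspace. This is a sketch of the results implied by the descriptions, where lowercase letters have higher codes than uppercase ones:

```smalltalk
'foo' < 'Foo'.     "false: $f (code 102) sorts after $F (code 70)"
'Foo' < 'foo'.     "true"
'foo' > 'Foo'.     "true"
'abc' < 'abd'.     "true"
```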

o  compareCaselessWith: aString
Compare the receiver against the argument, ignoring case.
Return 1 if the receiver is greater, 0 if equal and -1 if less than the argument.

usage example(s):

     'aaa' compareCaselessWith:'aaaa'    -> -1
     'aaaa' compareCaselessWith:'aaa'    -> 1

     'aaaa' compareCaselessWith:'aaaA'   -> 0
     'aaaA' compareCaselessWith:'aaaa'   -> 0
     'aaaAB' compareCaselessWith:'aaaa'  -> 1
     'aaaaB' compareCaselessWith:'aaaA'  -> 1
     'aaaa' compareCaselessWith:'aaaAB'  -> -1
     'aaaA' compareCaselessWith:'aaaaB'  -> -1
     'aaaa' compareCaselessWith:'aaax'   -> -1
     'aaaa' compareCaselessWith:'aaaX'   -> -1

o  compareCollatingWith: aString
Compare the receiver with the argument and return 1 if the receiver is
greater, 0 if equal and -1 if less than the argument in a sorted list.
The comparison is language specific, depending on the value of
LC_COLLATE, which is in the shell environment.

usage example(s):

     'hallo' compareWith:'hällo'
     'hbllo' compareWith:'hällo'

     'hallo' compareCollatingWith:'hällo'
     'hbllo' compareCollatingWith:'hällo'

o  compareWith: aString
Compare the receiver with the argument and return 1 if the receiver is
greater, 0 if equal and -1 if less than the argument.
This comparison is based on the elements' codepoints -
i.e. upper/lowercase & national characters are NOT treated specially.
'foo' compareWith: 'Foo' will return 1,
while 'foo' sameAs:'Foo' will return true.
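
A brief sketch of the three-way result and of one way it might be used in a sort block. The sort block is only an illustration, not something this class prescribes:

```smalltalk
'foo' compareWith:'Foo'.    "1"
'Foo' compareWith:'foo'.    "-1"
'foo' compareWith:'foo'.    "0"

"sorting by codepoint order: Apple, apple, pear"
#('pear' 'Apple' 'apple') asSortedCollection:[:a :b | (a compareWith:b) <= 0].
```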

o  compareWith: aString collating: collatingBoolean
Compare the receiver with the argument and return 1 if the receiver is
greater, 0 if equal and -1 if less than the argument.
If collatingBoolean is true, the comparison is based on the
current setting of LC_COLLATE in the locale (which is set in the shell environment);
otherwise, it is a simple string compare
based on the elements' codepoints,
i.e. upper/lowercase & national characters are NOT treated specially:
'foo' compareWith: 'Foo' will return 1,
while 'foo' sameAs:'Foo' will return true.

o  endsWith: aStringOrChar
return true, if the receiver ends with something, aStringOrChar.
If aStringOrChar is an empty string, true is returned

usage example(s):

     'hello world' endsWith:'world'
     'hello world' endsWith:'earth'
     'hello world' endsWith:$d
     'hello world' endsWith:$e
     '' endsWith:$d
     'hello world' endsWith:#($r $l $d)
     'hello world' endsWith:''

o  hash
return an integer useful as a hash-key.
This default method uses whichever hash algorithm
is used in the ST/X VM (which is actually fnv-1a)

usage example(s):

     'a' hash
     'ab' hash = 'ab' asUnicode16String hash

o  hash_dragonBook
return an integer useful as a hash-key.
This method implements the dragon-book algorithm (aho, ullman).

o  hash_fnv1a
return an integer useful as a hash-key.
This method uses the fnv-1a algorithm
(which is actually a pretty good one).
Notice: this returns a 31bit value;
even on 64bit CPUs, only small 4-byte hash values are returned
(so hash values are independent of the architecture)

usage example(s):

     'a' hash_fnv1a
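
For orientation, here is a sketch of the textbook 32-bit FNV-1a loop written in plain Smalltalk. It is not the ST/X implementation: how hash_fnv1a reduces the value to 31 bits is not stated here, so the result of this sketch may differ from what 'a' hash_fnv1a answers.

```smalltalk
|h|
h := 16r811C9DC5.                                 "32-bit FNV offset basis"
'a' do:[:ch |
    h := h bitXor:ch asInteger.                   "xor the byte in first ..."
    h := (h * 16r01000193) bitAnd:16rFFFFFFFF     "... then multiply by the FNV prime, modulo 2^32"
].
h.                                                "16rE40C292C for the textbook 32-bit variant"
```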

o  hash_fnv1a_64
return an integer useful as a hash-key.
This method uses the fnv-1a algorithm
(which is actually a pretty good one).
Notice: this returns 64 bit hashvalues

usage example(s):

     '' hash_fnv1a_64
     'a' hash_fnv1a_64
     '77kepQFQ8Kl' hash_fnv1a_64

o  hash_java
return an integer useful as a hash-key.
This method uses the same algorithm as used in
the java virtual machine (which is actually a bad one).

usage example(s):

     'a' hash_java

o  hash_sdbm
return an integer useful as a hash-key.
This method implements the sdbm algorithm.
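
As a rough illustration, the commonly published sdbm recurrence can be written as follows. This is a sketch of the general algorithm, truncated to 32 bits as an assumption; the word size and any final masking used by hash_sdbm are not documented here.

```smalltalk
|h|
h := 0.
'abc' do:[:ch |
    "h := ch + (h << 6) + (h << 16) - h, kept within 32 bits"
    h := (ch asInteger + (h bitShift:6) + (h bitShift:16) - h) bitAnd:16rFFFFFFFF
].
h.
```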

o  levenshteinTo: aString s: substWeight k: kbdTypoWeight c: caseWeight i: insrtWeight d: deleteWeight
parametrized levenshtein. The arguments are the costs for
substitution, keyboard typo, case-change, insertion and deletion of a character.

usage example(s):

     'ocmprt' levenshteinTo:'computer'
     'computer' levenshteinTo:'computer'
     'ocmputer' levenshteinTo:'computer'
     'cmputer' levenshteinTo:'computer'
     'computer' levenshteinTo:'cmputer'
     'computer' levenshteinTo:'vomputer'
     'computer' levenshteinTo:'bomputer'
     'Computer' levenshteinTo:'computer'

o  sameAs: aString
Compare the receiver with the argument like =, but ignore case differences.
Return true or false.

usage example(s):

     'hello' sameAs:'hello'
     'hello' sameAs:'Hello'
     'hello' sameAs:''
     '' sameAs:'Hello'
     'hello' sameAs:'hellO'
     'hello' sameAs:'Hellx'

o  startsWith: aStringOrChar
return true, if the receiver starts with something, aStringOrChar.
If the argument is empty, true is returned.
Notice, that this is similar to, but slightly different from VW's and Squeak's beginsWith:,
which are both inconsistent w.r.t. an empty argument.

usage example(s):

     'hello world' startsWith:'hello'
     'hello world' startsWith:'hella'
     'hello world' startsWith:'hi'
     'hello world' startsWith:$h
     'hello world' startsWith:$H
     'hello world' startsWith:(Character value:16rFF00)
     'hello world' startsWith:60
     'hello world' startsWith:#($h $e $l)
     'hello world' startsWith:''

o  ~= aString
Compare the receiver with the argument and return true if the
receiver is not equal to the argument. Otherwise return false.
This compare is case-sensitive (i.e. 'Foo' is NOT = 'foo').
Actually, there is no need to redefine this method here;
the inherited default (the negation of =) works ok.
However, this may be heavily used and the redefinition saves an
extra message send.

converting
o  asAsciiZ
if the receiver does not end with a 0-valued character, return a copy of it,
with an additional 0-character. Otherwise return the receiver. This is sometimes
needed when a string has to be passed to C, which needs 0-terminated strings.
Notice, that all singleByte strings are already 0-terminated in ST/X, whereas wide
strings are not.

usage example(s):

     'abc' asAsciiZ
     'abc' asWideString asAsciiZ

o  asByteArray
return a new ByteArray with the receiver's elements.
This redefined method is faster than Collection>>#asByteArray

usage example(s):

     'fooBar' asByteArray.

o  asDenseUnicodeString
return the receiver as single-byte, double byte or 4-byte unicode string,
depending on the number of bits required to hold all characters in myself.
Use this to extract non-wide parts from a wide string,
i.e. after a substring has been copied out of a wide string

o  asExternalBytes
return a 0-terminated externalBytes collection containing
my characters.
The returned collection is safe from being garbage collected;
i.e. it can be handed to a C-function, and must
(either there or here) be freed explicitly or unprotectedFromGC

o  asExternalBytesUnprotected
Like asExternalBytes, but does not register the bytes, so
they may be GARBAGE-COLLECTED!

o  asHttpResponseTo: request
( an extension from the stx:goodies/webServer/comanche package )

o  asImmutableCollection
return a write-protected copy of myself

o  asImmutableString
return a write-protected copy of myself

o  asLowercase
a tuned version for Strings with size < 255. Some apps call this very heavily.
We can do this for 8-bit strings, since the mapping is well known and lowercase chars
fit in one byte also.

usage example(s):

	'Hello WORLD' asLowercase
	(String new:300) asLowercase
	#utf8 asLowercase

o  asSingleByteString
I am a string

o  asSingleByteStringIfPossible
I am a single-byte string

o  asSingleByteStringReplaceInvalidWith: replacementCharacter
return the receiver converted to a 'normal' string,
with invalid characters replaced by replacementCharacter.
Can be used to convert from 16-bit strings to 8-bit strings
and replace characters above code-255 with some replacement.
Dummy here, because I am already a single byte string.

o  asSymbol
Return a unique symbol with the name taken from the receiver's characters.

usage example(s):

     'hello' asSymbol

o  asSymbolIfInterned
If a symbol with the receiver's characters is already known, return it. Otherwise, return nil.
This can be used to query for an existing symbol and is the same as:
self knownAsSymbol ifTrue:[self asSymbol] ifFalse:[nil]
but slightly faster, since the symbol lookup operation is only
performed once.

usage example(s):

     'hello' asSymbolIfInterned
     'fooBarBaz' asSymbolIfInterned

o  beImmutable
make myself write-protected

o  utf16Encoded
UTF-16 encoding is the same as UCS-2 (Unicode16String)

o  withTabsExpanded: numSpaces
return a string with the characters of the receiver where all tabulator characters
are expanded into spaces (assuming numSpaces-col tabs).
Notice: if the receiver does not contain any tabs, it is returned unchanged;
otherwise a new string is returned.
This does handle multiline strings.
Rewritten for speed - because this is very heavily used when reading
big files in the FileBrowser (and therefore speeds up fileReading considerably).
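
A small workspace sketch of the behaviour described above. The comments describe the usual tab-stop convention and are not measured output.

```smalltalk
|s|
s := 'a' , (String with:Character tab) , 'b'.
s withTabsExpanded:8.                  "a, then spaces up to the next tab stop, then b"
s withTabsExpanded:4.                  "same, with tab stops every 4 columns"
'no tabs here' withTabsExpanded:8.     "no tabs, so the receiver itself is returned"
```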

copying
o  , aStringOrCharacter
return the concatenation of myself and the argument, aStringOrCharacter as a String.
- reimplemented here for speed

usage example(s):

     'hello' , ' world' asImmutableString
     'hello ' , #world
     'hello ' , $w
     #[0 0 0 1] asString, #[0 0 0 2 0] asString

o  concatenate: string1 and: string2
return the concatenation of myself and the arguments, string1 and string2.
This is equivalent to self , string1 , string2
- generated by compiler when such a construct is detected

o  concatenate: string1 and: string2 and: string3
return the concatenation of myself and the string arguments.
This is equivalent to self , string1 , string2 , string3
- generated by compiler when such a construct is detected
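
The equivalence stated for both concatenate forms can be checked directly. This sketch just compares the two spellings:

```smalltalk
('a' concatenate:'b' and:'c') = ('a' , 'b' , 'c').                 "true"
('a' concatenate:'b' and:'c' and:'d') = ('a' , 'b' , 'c' , 'd').   "true"
```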

o  copy
return a copy of the receiver

o  copyFrom: start
return a new collection consisting of the receiver's elements from index start to the end of the collection.
This method will always return a string, even if the receiver
is a subclass-instance. This might change if there is a need.
- reimplemented here for speed

usage example(s):

	'12345' copyFrom:3
	'12345678' copyFrom:9 -> empty string
	'12345678' copyFrom:0 -> error

o  copyFrom: start to: stop
return the substring starting at index start, anInteger and ending
at stop, anInteger. This method will always return a string, even
if the receiver is a subclass-instance. This might change if there is a need.
- reimplemented here for speed

usage example(s):

	'12345678' copyFrom:3 to:7
	'12345678' copyFrom:3 to:3
	'12345678' copyFrom:3 to:2 -> empty string

	'12345678' copyFrom:9 to:9 -> error
	'12345678' copyFrom:3 to:9 -> error
	'12345678' copyFrom:0 to:8 -> error

	(Unicode16String with:(Character value:16r220) with:$a with:$b with:(Character value:16r221) with:(Character value:16r222))
	    copyFrom:2 to:3
	((Unicode16String with:(Character value:16r220) with:$a with:$b with:(Character value:16r221) with:(Character value:16r222))
	    copyFrom:2 to:3) asSingleByteString

o  copyWith: aCharacter
return a new string containing the receiver's characters
and the single new character, aCharacter.
This is different from concatenation, which expects another string
as argument, but equivalent to copy-and-addLast.
Reimplemented here for more speed

usage example(s):

     '1234567' copyWith:$8
     '1234567' copyWith:(Character value:16r220)

o  deepCopy
return a copy of the receiver

usage example(s):

     could be an instance of a subclass which needs deepCopy
     of its named instvars ...

o  deepCopyUsing: aDictionary postCopySelector: postCopySelector
return a deep copy of the receiver - reimplemented to be a bit faster

o  shallowCopy
return a copy of the receiver

o  simpleDeepCopy
return a copy of the receiver

filling & replacing
o  atAllPut: aCharacter
replace all elements with aCharacter
- reimplemented here for speed

usage example(s):

     (String new:10) atAllPut:$*
     String new:10 withAll:$*

o  from: start to: stop put: aCharacter
fill part of the receiver with aCharacter.
- reimplemented here for speed

usage example(s):

     (String new:10) from:1 to:10 put:$a
     (String new:20) from:10 to:20 put:$b
     (String new:20) from:1 to:10 put:$c
     (String new:20) from:1 to:10 put:$c 
     (String new:100) from:2 to:99 put:$c 

o  replaceAll: oldCharacter with: newCharacter
replace all oldCharacters by newCharacter in the receiver.

Notice: This operation modifies the receiver, NOT a copy;
therefore the change may affect all others referencing the receiver.

usage example(s):

     'helloWorld' copy replaceAll:$o with:$O
     'helloWorld' copy replaceAll:$d with:$*
     'helloWorld' copy replaceAll:$h with:$*

o  replaceFrom: start to: stop with: aString startingAt: repStart
replace the characters starting at index start, anInteger and ending
at stop, anInteger with characters from aString starting at repStart.
Return the receiver.

- reimplemented here for speed
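
A short sketch of how the four arguments fit together. The receiver is modified in place, so the example works on a copy of the literal:

```smalltalk
|s|
s := 'hello world' copy.
s replaceFrom:1 to:5 with:'HELLO' startingAt:1.       "s is now 'HELLO world'"
s replaceFrom:7 to:11 with:'xxWORLD' startingAt:3.    "s is now 'HELLO WORLD'"
```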

o  withoutSeparators
return a string containing the chars of myself
without leading and trailing whitespace.
If there is no whitespace, the receiver is returned.
Notice, this is different from String>>withoutSpaces.

usage example(s):

     'hello' withoutSeparators
     '    hello' withoutSeparators
     '    hello ' withoutSeparators
     '    hello  ' withoutSeparators
     '    hello   ' withoutSeparators
     '    hello    ' withoutSeparators
     '        ' withoutSeparators

o  withoutSpaces
return a string containing the characters of myself
without leading and trailing spaces.
If there are no spaces, the receiver is returned unchanged.
Notice, this is different from String>>withoutSeparators.

usage example(s):

     '    hello' withoutSpaces
     '    hello ' withoutSpaces
     '    hello  ' withoutSpaces
     '    hello   ' withoutSpaces
     '    hello    ' withoutSpaces
     '        ' withoutSpaces
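
The difference between withoutSpaces and withoutSeparators only shows with whitespace other than the space character. A sketch using tabs, with the comments stating what the two descriptions imply:

```smalltalk
|s|
s := (String with:Character tab) , ' hello ' , (String with:Character tab).
s withoutSeparators.    "'hello': tabs and spaces both count as whitespace"
s withoutSpaces.        "contents unchanged: the outermost characters are tabs, not spaces"
```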

printing & storing
o  _errorPrint
Do not use this in user code.
Print the receiver on standard error.
This method does NOT (on purpose) use the stream classes and
will therefore work even in case of emergency during early startup
or in a crash situation (MiniDebugger).

o  _errorPrintCR
Do not use this in user code.
Print the receiver on standard error, followed by a cr.
This method does NOT (on purpose) use the stream classes and
will therefore work even in case of emergency during early startup
or in a crash situation (MiniDebugger).

o  _print
Do not use this in user code.
Print the receiver on standard output.
This method does NOT (on purpose) use the stream classes and
will therefore work even in case of emergency during early startup
or in a crash situation (MiniDebugger).

o  _printCR
Do not use this in user code.
Print the receiver on standard output, followed by a cr.
This method does NOT (on purpose) use the stream classes and
will therefore work even in case of emergency during early startup
or in a crash situation (MiniDebugger).

o  errorPrint
print the receiver on standard error, if the global Stderr is nil;
otherwise, fall back to the inherited errorPrint, which sends the string to
the Stderr stream or to a logger.
Redefined to be able to print during early startup,
when the stream classes have not yet been initialized (i.e. Stderr is nil).

usage example(s):

      'hello world' asUnicode16String errorPrint
      (Character value:356) asString errorPrint
      'Bönnigheim' errorPrint
      'Bönnigheim' asUnicodeString errorPrint

o  errorPrintCR
print the receiver on standard error, followed by a cr,
if the global Stderr is nil; otherwise, fall back to the inherited errorPrintCR,
which sends the string to the Stderr stream or to a logger.
Redefined to be able to print during early startup,
when the stream classes have not yet been initialized (i.e. Stderr is nil).

o  lowLevelErrorPrint
Do not use this in user code.
Print the receiver on standard error.
This method does NOT (on purpose) use the stream classes and
will therefore work even in case of emergency during early startup
or in a crash situation (MiniDebugger).

o  lowLevelErrorPrintCR
Do not use this in user code.
Print the receiver on standard error, followed by a cr.
This method does NOT (on purpose) use the stream classes and
will therefore work even in case of emergency during early startup
or in a crash situation (MiniDebugger).

o  lowLevelPrint
Do not use this in user code.
Print the receiver on standard output.
This method does NOT (on purpose) use the stream classes and
will therefore work even in case of emergency during early startup
or in a crash situation (MiniDebugger).

o  lowLevelPrintCR
Do not use this in user code.
Print the receiver on standard output, followed by a cr.
This method does NOT (on purpose) use the stream classes and
will therefore work even in case of emergency during early startup
or in a crash situation (MiniDebugger).

o  print
print the receiver on standard output, if the global Stdout is nil;
otherwise, fall back to the inherited print,
which sends the string to the Stdout stream.
Redefined to be able to print during early startup,
when the stream classes have not yet been initialized (i.e. Stdout is nil).

o  printCR
print the receiver on standard output, followed by a cr,
if the global Stdout is nil; otherwise, fall back to the inherited printCR,
which sends the string to the Stdout stream.
Redefined to be able to print during early startup,
when the stream classes have not yet been initialized (i.e. Stdout is nil).

o  printfPrintString: formatString
non-standard but sometimes useful.
Return a printed representation of the receiver as specified by formatString,
which is defined by printf.

If you use this, be aware that the format string must be correct and must use a string conversion such as %s.

This method is NONSTANDARD and may be removed without notice.
WARNING: this goes directly to the C printf function and may therefore be inherently unsafe.
Please use the printf: method, which is both safe
and completely implemented in Smalltalk.

usage example(s):

     'hello' printfPrintString:'%%s -> %s'
     (String new:900) printfPrintString:'%%s -> %s'
     'hello' printfPrintString:'%%10s -> %10s'
     'hello' printfPrintString:'%%-10s -> %-10s'
     'hello' printfPrintString:'%%900s -> %900s'
     'hello' printfPrintString:'%%-900s -> %-900s'

o  storeOn: aStream
put the storeString of myself onto aStream

o  storeString
return a String for storing myself
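
A minimal sketch (not part of the original comment), assuming the usual Smalltalk
literal syntax, in which the result is enclosed in single quotes and any embedded
quote is doubled:

usage example(s):

     'hello' storeString
     'hello' storeString size      "7 - five characters plus the two enclosing quotes"
     'it''s' storeString size      "7 - four characters, one doubled quote, two enclosing quotes"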

queries
o  basicSize
return the number of characters in myself.
Redefined here to exclude the 0-byte at the end.

o  bitsPerCharacter
return the number of bits each character has.
Here, 8 is returned (storing single byte characters).

o  bytesPerCharacter
return the number of bytes each character has.
Here, 1 is returned (storing single byte characters).

o  bytesPerCharacterNeeded
return the actual underlying string's required bytesPerCharacter
(i.e. checks if all characters really need that depth)

o  characterSize
answer the size in bits of my largest character (actually only 7 or 8)

usage example(s):

     'hello world' characterSize
     'hello world' asUnicode16String characterSize
     ('hello world' , (Character value:16r88) asString) characterSize

o  containsNon7BitAscii
return true, if the underlying string contains 8-bit characters (or wider)
(i.e. if it is non-ascii)

usage example(s):

     'hello world' containsNon7BitAscii
     'hello world' asTwoByteString containsNon7BitAscii
     ('hello world' , (Character value:16r88) asString) containsNon7BitAscii

o  containsNon8BitElements
return true, if the underlying string contains elements larger than a single byte

o  isBlank
return true, if the receiver's size is 0 or if it contains only spaces.
Q: should we care for whiteSpace in general here ?
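
A minimal sketch (not part of the original comment) of the behavior described above;
the literals are arbitrary:

usage example(s):

     '' isBlank           "true - size is 0"
     '    ' isBlank       "true - only spaces"
     '  a  ' isBlank      "false"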

o  isEmpty
return true if the receiver is empty (i.e. if size == 0)
Redefined here for performance

o  isWideString
true if I require more than one byte per character

o  knownAsSymbol
return true, if there is a symbol with same characters in the
system.
Can be used to check for the existence of a symbol without creating one

usage example(s):

     'hello' knownAsSymbol
     'fooBarBaz' knownAsSymbol

o  notEmpty
return true if the receiver is not empty (i.e. if size ~~ 0)
Redefined here for performance

o  size
return the number of characters in myself.
Reimplemented here to avoid the additional size->basicSize send
(which we can do here, since size is obviously not redefined in a subclass).
This method is the same as basicSize.
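
A minimal sketch (not part of the original comment) showing that neither method
counts the terminating 0-byte:

usage example(s):

     'hello' size           "5"
     'hello' basicSize      "5"
     '' size                "0"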

o  stringSpecies

o  utf8DecodedMaxBytes
return the number of characters needed when this string is
decoded from UTF-8.

** This is an obsolete interface - do not use it (it may vanish in future versions) **

o  utf8DecodedSize
return the number of characters needed when this string is
decoded from UTF-8.

usage example(s):

     'hello world' utf8DecodedSize
     'ä' utf8Encoded utf8DecodedSize
     'äΣΔΨӕἤῴ' utf8Encoded utf8DecodedSize

sorting & reordering
o  reverseFrom: startIndex to: endIndex
in-place reverse the characters of the string.
WARNING: this is a destructive operation, which modifies the receiver.
Please use reversed (with a d) for a functional version.

usage example(s):

     '1234567890' copy reverseFrom:2 to:5
     '1234567890' copy reverse
     '1234567890' copy reversed

     |t|
     t := '1234567890abcdefghijk' copy.
     t reverseFrom:1 to:10.
     t reverseFrom:11 to:t size.
     t reverseFrom:1 to:t size.
     t

     |t|
     t := '1234567890abcdefghijk' copy.
     t reverseFrom:1 to:2.
     t reverseFrom:3 to:t size.
     t reverseFrom:1 to:t size.
     t

substring searching
o  caseInsensitiveIndexOfSubCollection: aSubString startingAt: startIndex ifAbsent: exceptionValue
naive search fallback (non-BM).
Private method to speed up case-insensitive searches

usage example(s):

     'abcdefg' caseInsensitiveIndexOfSubCollection:'abc' startingAt:1 ifAbsent:nil
     'abcdefg' caseInsensitiveIndexOfSubCollection:'bcd' startingAt:1 ifAbsent:nil
     'abcdefg' caseInsensitiveIndexOfSubCollection:'cde' startingAt:1 ifAbsent:nil
     'abcabcg' caseInsensitiveIndexOfSubCollection:'abc' startingAt:2 ifAbsent:nil

     'ABCDEFG' caseInsensitiveIndexOfSubCollection:'abc' startingAt:1 ifAbsent:nil
     'ABCDEFG' caseInsensitiveIndexOfSubCollection:'Abc' startingAt:1 ifAbsent:nil
     'ABCDEFG' caseInsensitiveIndexOfSubCollection:'aBC' startingAt:1 ifAbsent:nil
     'ABCDEFG' caseInsensitiveIndexOfSubCollection:'ABC' startingAt:1 ifAbsent:nil

     'ABCDEFG' caseInsensitiveIndexOfSubCollection:'a' startingAt:1 ifAbsent:nil
     'ABCDEFG' caseInsensitiveIndexOfSubCollection:'A' startingAt:1 ifAbsent:nil

     'ABCDEFG' caseInsensitiveIndexOfSubCollection:'bcd' startingAt:1 ifAbsent:nil
     'ABCDEFG' caseInsensitiveIndexOfSubCollection:'cde' startingAt:1 ifAbsent:nil
     'ABCABCG' caseInsensitiveIndexOfSubCollection:'abc' startingAt:2 ifAbsent:nil

     '1234567890' caseInsensitiveIndexOfSubCollection:'abc' startingAt:1 ifAbsent:nil
     '1234567890' caseInsensitiveIndexOfSubCollection:'123' startingAt:1 ifAbsent:nil

o  indexOfSubCollection: aSubString startingAt: startIndex ifAbsent: exceptionValue caseSensitive: caseSensitive
redefined as primitive for maximum speed (BM, i.e. Boyer-Moore).
Compared to the strstr libc function, on my machine,
BM is faster for caseSensitive compares above around 8.5 searched characters.
For much longer searched strings, BM is much faster: 5 times as fast for 20 characters.
For caseInsensitive compares, strstr was found to be slower than caseInsensitiveIndexOf.
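
A minimal sketch (not part of the original comment), modelled on the examples of
caseInsensitiveIndexOfSubCollection: above. The literals are arbitrary, and 0 is
passed as the ifAbsent: argument purely for illustration:

usage example(s):

     'hello world' indexOfSubCollection:'world' startingAt:1 ifAbsent:0 caseSensitive:true     "-> 7"
     'hello world' indexOfSubCollection:'WORLD' startingAt:1 ifAbsent:0 caseSensitive:false    "-> 7"
     'hello world' indexOfSubCollection:'WORLD' startingAt:1 ifAbsent:0 caseSensitive:true
     'hello world' indexOfSubCollection:'o' startingAt:6 ifAbsent:0 caseSensitive:true         "-> 8"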

testing
o  isLiteral
return true, if the receiver can be used as a literal constant in ST syntax
(i.e. can be used in constant arrays)

o  isSingleByteString
returns true only for strings and immutable strings.
Use this instead of foo isMemberOf:String or foo class == String

tracing
o  traceInto: aRequestor level: level from: referrer
double dispatch into tracer, passing my type implicitly in the selector



ST/X 7.2.0.0; WebServer 1.670 at bd0aa1f87cdd.unknown:8081; Fri, 19 Apr 2024 21:08:32 GMT