|
Class: Set
Object
|
+--Collection
|
+--Set
|
+--Dictionary
|
+--IdentitySet
|
+--OrderedSet
|
+--PluggableSet
- Package:
- stx:libbasic
- Category:
- Collections-Unordered
- Version:
- rev:
1.179
date: 2024/04/24 09:00:37
- user: cg
- file: Set.st directory: libbasic
- module: stx stc-classLibrary: libbasic
a Set is a collection where each element occurs at most once.
The inclusion test is done using equality (=) for comparison;
see IdentitySet for sets using identity compare.
Keep in mind, that a regular Set therefore treats 3.0 and 3 as equal
and therefore:
(Set with:3.0) includes:3
will return true (since 3.0 and 3 are equal).
In contrast, an IdentitySet will return false, because 3.0 and 3 are not
identical.
Sets use hashing for fast access, this access is considerably faster,
if a good hash-number is returned by the elements.
Notice that the default hash (Object>>hash) is not perfect; due to
the implementation of hash-keys in ST/X, increased hash collisions
are to be expected for large sets (say: > 20000 element).
If your objects are heavyly used in sets or dictionaries, and you need
big collections, your instances may provide a better hash values.
Performance hints:
If only symbols or smallIntegers are used as keys,
use an instance of IdentitySet for slightly better performance,
since both hashing and comparison is faster.
If you have a rough idea how big the set is going to grow,
create it using #new: instead of #new. Even if the size given is a
poor guess (say half of the real size), there is some 20-30% performance
win to expect, since many resizing operations of the set are avoided
(resizing is expensive, as the set does a rehash).
Examples:
|s|
s := Set new.
s add:'hello'.
s add:'world'.
s add:#foo.
s add:1.2345678.
s add:'hello'.
s printCR.
's size -> ' print. s size printCR.
'(s includes:''hello'') -> ' print. (s includes:'hello') printCR.
'(s includes:#foo) -> ' print. (s includes:#foo) printCR.
'(s includes:''foo'') -> ' print. (s includes:'foo') printCR.
'(s includes:#bar) -> ' print. (s includes:#bar) printCR.
copyrightCOPYRIGHT (c) 1991 by Claus Gittinger
All Rights Reserved
This software is furnished under a license and may be used
only in accordance with the terms of that license and with the
inclusion of the above copyright notice. This software may not
be provided or otherwise made available to, or used by, any
other person. No title to or ownership of the software is
hereby transferred.
initialization
-
initialize
-
initialize the Set class
Usage example(s):
instance creation
-
decodeFromLiteralArray: anArray
-
create & return a new instance from information encoded in anArray.
-
new
-
return a new empty Set
-
new: anInteger
-
return a new empty Set with space for anInteger elements
-
newWithCapacity: size
-
return a new empty Collection with capacity for n elements.
queries
-
goodSizeFrom: arg
-
return a good array size for the given argument.
Returns the next prime after arg, since prime sizes are good for hashing.
utilities
-
rehashAllSubInstances
-
rehash all sets & dictionaries.
Useful utility if hash/identityHash method was changed
of some object which is known to be kept in a set
Usage example(s):
Set rehashAllSubInstances
|
Compatibility-ST80
-
initialIndexFor: hashKey boundedBy: length
-
for ST-80 compatibility only; it is (currently) not used in this
implementation of sets. Therefore, in ST/X it does not make sense
to redefine it. (which may be a bad design decision, but slightly
improves performance, by avoiding an extra message send ...)
Compatibility-Squeak
-
flatCollect: aBlock
( an extension from the stx:libcompat package )
-
Evaluate aBlock for each of the receiver's elements and answer the
list of all resulting values flatten one level.
Assumes that aBlock returns some kind of collection for each element.
Equivalent to the lisp's mapcan,
Notice that this is different from gather, which recurses deeper into elements.
Usage example(s):
#((1 2) (3 4) (5 6) (7 8 (9))) flatCollect:[:el | el]
#((1 2) (3 4) (5 6) (7 8) (9)) gather:[:el | el]
|
-
like: anObject
-
Answer an object in the receiver that is equal to anObject,
nil if no such object is found.
Relies heavily on hash properties (i.e. that two object's hashes are equal
if the two object compare equal)
Usage example(s):
(Set withAll:#(10.0 20.0 30.0 40.0)) like:20
|
accessing
-
addFirst: anObject
-
add the argument, anObject to the receiver.
If the receiver is ordered, the new element will be added at the beginning.
An error is raised here - it does not make sense for unordered collections
-
at: index
-
report an error: at: is not allowed for Sets
-
at: index put: anObject
-
report an error: at:put: is not allowed for Sets
-
elementAt: anObject
-
return the element, if contained in the set.
If there is none, report an error.
This may seem confusing at first - however, it is useful with
non-identitysets, to find an existing element, for a
given equal (but not identical) object.
This is the same functionality as provided by the goodies/KeyedSet goody.
Notice, that for dictionary sub-instances, the equal key is returned
-
elementAt: anObject ifAbsent: exceptionBlock
-
return the element, if contained in the set.
If there is none, return the result from evaluating exceptionBlock.
This may seem confusing at first - however, it is useful with
non-identitysets, to find an existing element, for a
given equal (but not identical) object.
This is the same functionality as provided by the goodies/KeyedSet goody.
Notice, that for dictionary sub-instances, the equal key is returned
-
removeLast
-
remove the last element from the receiver.
Return the removed element.
An error is raised here - it does not make sense for unordered collections
-
reverseDo: aBlock
-
evaluate the argument, aBlock for each element in reverse order.
An error is raised here - it does not make sense for unordered collections
adding & removing
-
add: anElement
-
Add the argument, anElement to the receiver (if not already included).
Return the added element.
WARNING: do not add elements while iterating over the receiver.
Iterate over a copy to do this.
-
addOrReplace: anObject
-
add the argument, anObject to the receiver.
If it is already included, replace it by anObject.
Return nil, if anObject was not present in the receiver,
otherwise the element that has been replaced.
Notice: this sounds like a useless operation, but because sets compare by equality,
the old object may be replaced by another object when equeal, but not identical.
For example, a string by a symbol or a float by an integer...
WARNING: do not add elements while iterating over the receiver.
Iterate over a copy to do this.
Usage example(s):
Note that 1 is replaced by 1.0:
self new
addOrReplace:1;
addOrReplace:2;
addOrReplace:nil;
addOrReplace:1.0;
yourself
|
-
clearContents
-
remove all elements from the receiver, but do not resize the underlying keyArray.
Returns the receiver.
Similar to removeAll, but might behave better,
if the receiver is to be filled again afterwards.
-
remove: oldObjectArg ifAbsent: exceptionBlock
-
remove the first occurrence of oldObject from the collection and return it.
If it was not in the collection return the value of exceptionBlock.
Notice, that the returned object could be non-identical to the argument
(although it will always be equal).
WARNING: do not remove elements while iterating over the receiver.
See #safeRemove: to do this.
-
removeAll
-
remove all elements from the receiver and resize (shrink) the underlying container.
Returns the receiver.
Similar to clearContents, which but might behave better,
if the receiver is to be filled again afterwards.
Usage example(s):
self new removeAll capacity
|
-
removeIdentical: oldObjectArg ifAbsent: exceptionBlock
-
remove oldObject from the collection and return it.
If it was not in the collection return the value of exceptionBlock.
Uses identity compare (==) to search for an occurrence.
WARNING: do not remove elements while iterating over the receiver.
See #safeRemoveIdentical:ifAbsent: to do this.
-
replaceAll: oldObject with: newObject
-
replace all oldObjects by newObject in the receiver.
-
safeRemove: oldObject
-
remove the element, oldObject from the collection.
Return the element
(could be non-identical to oldObject, since I hash on equality, not on identity).
If it was not in the collection return nil.
In contrast to #remove:, this does not resize the underlying collection
and therefore does NOT rehash & change the elements order.
Therefor this can be used while enumerating the receiver,
which is not possible if #remove: is used.
WARNING: since no resizing is done, the physical amount of memory used
by the container remains the same, although the logical size shrinks.
You may want to manually resize the receiver using #emptyCheck.
(after the loop)
-
safeRemove: oldObjectArg ifAbsent: exceptionValueProvider
-
remove the element, oldObject from the collection.
Return the element
(could be non-identical to oldObject, since I hash on equality, not on identity).
If it was not in the collection return the value of exceptionValueProvider.
In contrast to #remove:, this does not resize the underlying collection
and therefore does NOT rehash & change the elements order.
Therefore this can be used while enumerating the receiver,
which is not possible if #remove: is used.
WARNING: since no resizing is done, the physical amount of memory used
by the container remains the same, although the logical size shrinks.
You may want to manually resize the receiver using #emptyCheck.
(after the loop)
-
safeRemoveIdentical: oldObjectArg ifAbsent: exceptionBlock
-
remove the element, oldObject from the collection.
Return the element
If it was not in the collection return the value of exceptionValueProvider.
In contrast to #remove:, this does not resize the underlying collection
and therefore does NOT rehash & change the elements order.
Therefore this can be used while enumerating the receiver,
which is not possible if #remove: is used.
WARNING: since no resizing is done, the physical amount of memory used
by the container remains the same, although the logical size shrinks.
You may want to manually resize the receiver using #emptyCheck.
(after the loop)
-
saveRemove: oldObject
-
bad spelling - kept for backward compatibility (2014-06-04)
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
saveRemove: oldObjectArg ifAbsent: exceptionValueProvider
-
bad spelling - kept for backward compatibility (2014-06-04)
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
testAndAdd: keyArg
-
Test, if the element is present in the receiver.
Answer true, if the element was already in the collection, false otherwise.
If the element does not exist, add it to the collection.
This saves an extra #findKeyOrNil: search against sending #includes: and add:.
WARNING: do not add elements while iterating over the receiver.
Iterate over a copy to do this.
comparing
-
= anObject
-
return true, if the argument is a Set containing the same elements
as I do
Usage example(s):
#(1 2 3 4 5) asSet = #(2 3 4 5 1) asSet
#(nil 1 2 3 4 5) asSet = #(2 3 4 5 1) asSet
#(1 2 3 4 5) asSet = #(2 3 4 5 1.0) asSet
#(1 2 3 4 5) asSet = #(2 3 4 5 'one') asSet
|
Usage example(s):
|d1 d2|
d1 := Dictionary new.
d2 := Dictionary new.
d1 at:1 put:'one'.
d1 at:'one' put:1.
d1 at:2 put:#two.
d1 at:1 put:'one'.
d1 at:'one' put:1.
d1 at:2 put:#two.
d2 at:1 put:'one'.
d2 at:'one' put:1.
d2 at:2 put:#two.
d2 at:1 put:'one'.
d2 at:'one' put:1.
d2 at:2 put:#two.
d1 = d2
|
Usage example(s):
|d1 d2|
d1 := Dictionary new.
d2 := Dictionary new.
d1 at:1 put:'uno'.
d1 at:'one' put:1.
d1 at:2 put:#two.
d1 at:1 put:'one'.
d1 at:'one' put:1.
d1 at:2 put:#two.
d2 at:1 put:'one'.
d2 at:'one' put:1.
d2 at:2 put:#two.
d2 at:1 put:'one'.
d2 at:'one' put:1.
d2 at:2 put:#two.
d1 = d2
|
-
hash
-
return a hash key for the receiver
Usage example(s):
this hash is stupid - but for larger collections, the hashing
time can become much bigger than the time lost in added probing.
Time will show ...
Notice & warning:
if the #= method is ever changed to compare non-dictionaries equal,
the code below must be changed to assert that the same hash-value is
still returned.
(which may be hard to accomplish)
|
Usage example(s):
|d|
d := Dictionary new.
d at:1 put:'one'.
d at:'one' put:1.
d at:2 put:#two.
d at:'two' put:2.
d hash
|
Usage example(s):
|d|
d := Dictionary new.
d at:1 put:'uno'.
d at:'one' put:1.
d at:2 put:#two.
d at:'two' put:2.
d hash
|
converting
-
asNewSet
-
make sure to return a unique new set
Usage example(s):
|s|
s := #(1 2 3 4) asSet.
self assert:(s ~~ s asNewSet).
self assert:(s = s asNewSet).
|
-
asSet
-
return the receiver as a Set
copying-private
-
postCopy
-
have to copy the keyArray too
enumerating
-
do: aBlock
-
perform the block for all members in the collection.
Notice: the order of evaluation is not defined (and may appear random, actually)
WARNING: do not add/remove elements while iterating over the receiver.
Iterate over a copy to do this.
-
keysAndValuesCollect: aTwoArgBlock
-
for each key-value pair in the receiver, evaluate the argument, aBlock
and return a collection with the results.
See also:
#associationsCollect: (which passes keys->value pairs)
#collect: (which only passes values)
This is much like #associationsCollect:, but aBlock gets the
key and value as two separate arguments.
#associationsCollect: is a bit slower.
WARNING: do not add/remove elements while iterating over the receiver.
Iterate over a copy to do this.
Usage example(s):
#(1 2 3) keysAndValuesCollect:[:k :v|
e'key: {k} value: {v}'.
].
|ages|
ages := Dictionary new
at:'cg' put:37;
at:'ca' put:33;
at:'sv' put:36;
at:'tk' put:28;
yourself.
ages keysAndValuesCollect:[:name :age |
name , '''s age is ' , age printString]
|
initialization
-
initialize
-
this is defined to make 'self basicNew initialize' in subclasses work
-
initializeForCapacity: minSize
-
initialize the contents array (for at least minSize slots)
and set tally to zero.
The size is increased to the next prime for better hashing behavior.
misc ui support
-
inspectorClass
( an extension from the stx:libtool package )
-
redefined to use SetInspector
(instead of the default Inspector).
obsolete set operations
-
+ aCollection
-
Kept for backward compatibility.
Use #union: instead, to isolate arithmethic and set operations
** This is an obsolete interface - do not use it (it may vanish in future versions) **
-
- aCollection
-
Kept for backward compatibility.
Use #\ instead, to isolate arithmethic and set operations
** This is an obsolete interface - do not use it (it may vanish in future versions) **
private
-
find: key ifAbsent: aBlock
-
Look for the key in the receiver. If it is found, return
the index of the slot containing the key, otherwise
return the value of evaluating aBlock.
-
findIdentical: key ifAbsent: aBlock
-
Look for the key in the receiver. If it is found, return
the index of the slot containing the key, otherwise
return the value of evaluating aBlock.
-
findKeyOrNil: key
-
Look for the key in the receiver.
If it is found, return the index of the first unused slot.
Grow the receiver, if key was not found, and no unused slots were present
Warning: an empty slot MUST be filled by the sender - it is only to be sent
by at:put: / add: - like methods.
-
findKeyOrNilOrDeletedEntry: key
-
Look for the key in the receiver.
If it is found, return the index of the first unused slot.
Grow the receiver, if key was not found, and no unused slots were present.
The answer is the index into the keyArray where the (keyArray at:index)
may contain:
nil - an empty slot
DeletedEntry - an empty slot, but preceeded and followed by non-empty
slots with keys hashing to the same value (hash collisions)
key - key is already present in the slot.
-
findNil: key
-
Look for the next slot usable for key.
WARNING:
This method assumes that key is not already in the receiver
AND that keyArray does not have previously removed entries
AND that there is an empty slot.
To be used ONLY while growing/rehashing to enter elements into a fresh
collection - if any of the above conditions is not met, the method
loops forever.
-
hashFor: aKey
-
return the arguments hash value.
Redefined in subclasses, which use a different comparison (i.e. identity compare)
-
initialIndexForKey: aKey
-
return an initial index given a key.
-
keyArray
-
-
keyContainerOfSize: n
-
return a container for keys of size n.
Extracted to make life of weak subclasses easier ...
-
rehash
-
rehash is done by re-adding all elements to a new empty set.
Rehash is needed after a binaryRead, for example.
private-grow & shrink
-
grow
-
change the number of element slots of the collection to a useful
new size
Usage example(s):
(self new:300) capacity
(self new:600) capacity
(self new:300) grow capacity
|
-
grow: newSize
-
change the number of element slots of the collection - to do this,
we have to rehash (which is done by re-adding all elements to a new
empty set).
-
possiblyGrow
-
check if collection is full (after an add); grow if so.
Definition of 'full' is currently: 'filled more than 75% (i.e. 3/4th)'
-
possiblyShrink
-
check if the receiver has become too empty (after a remove)
and shrink if it makes sense.
Definition of 'too empty' is: 'filled less than 12.5% (i.e. 1/8th)'
-
shrinkToZero
-
the tally went down to zero.
throw away any big container and start anew.
Usage example(s):
(Set new:50) shrinkToZero capacity
|
queries
-
capacity
-
return the number of elements, that the receiver is prepared to take w.o. resizing.
Notice, that Sets do automatically resize as required,
so knowing the capacity is of no real use.
-
collisionCount
-
return the number of key collisions in the set.
There is a collision if two keys hash to the same value.
Usage example(s):
self allSubInstances
collect:[:each| each collisionCount -> each]
thenSelect:[:each| each key > 0]
|
-
collisionsFor: key
-
Return the number of searches - 1 required for key
-
includes: anObject
-
return true if the argument anObject is in the receiver
-
isEmpty
-
return true if the receiver is empty
-
isEmptyOrNil
-
return true if the receiver is empty
-
maxChainLength
-
return the number of the maximum chain length in the set.
This is the worst case overhead when accessing a key.
-
notEmpty
-
return true if the receiver is not empty
-
notEmptyOrNil
-
return true if the receiver is not empty
-
occurrencesOf: anObject
-
return the number of occurrences of anObject in the receiver.
As I am a Set, this can only return 0 or 1.
-
size
-
return the number of set elements
searching
-
findFirst: aBlock ifNone: exceptionValue
-
find the index of the first element, for which evaluation of the argument, aBlock returns true;
return its index or the value from exceptionValue if none detected.
This is much like #detect:ifNone:, however, here an INDEX is returned,
while #detect:ifNone: returns the element.
Sets do not have indices.
-
findLast: aBlock ifNone: exceptionValue
-
Sets do not have indices.
testing
-
isFixedSize
-
return true if the receiver cannot grow - this will vanish once
Arrays and Strings learn how to grow ...
-
isOrdered
-
return true, if the receiver's elements are ordered.
Redefined to return false here, because the order of keys (and values in dictionaries)
may change due to rehashing, when elements are added/removed
visiting
-
acceptVisitor: aVisitor with: aParameter
-
dispatch for visitor pattern; send #visitSet:with: to aVisitor.
EmptySlot
NilKey
|