eXept Software AG Logo

Smalltalk/X Webserver

Documentation of class 'Set':

Home

everywhere
www.exept.de
for:
[back]

Class: Set


Inheritance:

   Object
   |
   +--Collection
      |
      +--Set
         |
         +--Dictionary
         |
         +--IdentitySet
         |
         +--OrderedSet

Package:
stx:libbasic
Category:
Collections-Unordered
Version:
rev: 1.105 date: 2010/02/26 10:53:48
user: cg
file: Set.st directory: libbasic
module: stx stc-classLibrary: libbasic
Author:
Claus Gittinger

Description:


a Set is a collection where each element occurs at most once.
The inclusion test is done using = for comparison; 
see IdentitySet for sets using identity compare.
Keep in mind, that a regular Set therefore treats 3.0 and 3 as equal
and therefore:
    (Set with:3.0) includes:3
will return true (since 3.0 and 3 are equal).
In contrast, an IdentitySet will return false, because 3.0 and 3 are not
identical.

Sets use hashing for fast access, this access is considerably faster,
if a good hash-number is returned by the elements.

Notice that the default hash (Object>>hash) is not perfect; due to
the implementation of hash-keys in ST/X, increased hash collisions
are to be expected for large sets (say: > 20000 element). 
If your objects are heavyly used in sets or dictionaries, and you need
big collections, your instances may provide a better hash values.

Performance hints: 
  If only symbols or smallIntegers are used as keys, 
  use an instance of IdentitySet for slightly better performance, 
  since both hashing and comparison is faster.

  If you have a rough idea how big the set is going to grow,
  create it using #new: instead of #new. Even if the size given is a
  poor guess (say half of the real size), there is some 20-30% performance
  win to expect, since many resizing operations of the set are avoided.

Examples:

    |s|
    s := Set new.
    s add:'hello'.
    s add:'world'.
    s add:#foo.
    s add:1.2345678.
    s add:'hello'.

    s printCR.
    's size -> ' print. s size printCR.
    '(s includes:''hello'') -> ' print. (s includes:'hello') printCR.
    '(s includes:#foo)    -> ' print. (s includes:#foo) printCR.
    '(s includes:''foo'')   -> ' print. (s includes:'foo') printCR.
    '(s includes:#bar)    -> ' print. (s includes:#bar) printCR.


Class protocol:

initialization
o  initialize
initialize the Set class

instance creation
o  decodeFromLiteralArray: anArray
create & return a new instance from information encoded in anArray.

o  new
return a new empty Set

o  new: anInteger
return a new empty Set with space for anInteger elements

queries
o  goodSizeFrom: arg
return a good array size for the given argument.
Returns the next prime after arg, since prime sizes are good for hashing.

utilities
o  rehashAllSubInstances
rehash all sets & dictionaries.
Useful utility if hash/identityHash method was changed
of some object which is known to be kept in a set


Instance protocol:

Compatibility-ST80
o  initialIndexFor: hashKey boundedBy: length
for ST-80 compatibility only; it is (currently) not used in this
implementation of sets. Therefore, in ST/X it does not make sense
to redefine it. (which may be a bad design decision, but slightly
improves performance, by avoiding an extra message send ...)

Compatibility-Squeak
o  copyWithout: elementToSkip
return a new collection consisting of a copy of the receiver, with
ALL elements equal to elementToSkip are left out.
No error is reported, if elementToSkip is not in the collection.

o  copyWithoutAll: aCollection

o  like: anObject
Answer an object in the receiver that is equal to anObject,
nil if no such object is found.
Relies heavily on hash properties (i.e. that two object's hashes are equal
if the two object compare equal)

accessing
o  addFirst: anObject
add the argument, anObject to the receiver.
If the receiver is ordered, the new element will be added at the beginning.
An error is raised here - it does not make sense for unordered collections

o  at: index
report an error: at: is not allowed for Sets

o  at: index put: anObject
report an error: at:put: is not allowed for Sets

o  elementAt: anObject
return the element, if contained in the set.
If there is none, report an error.
This may seem confusing at first - however, it is useful with
non-identitysets, to find an existing element, for a
given equal (but not identical) object.
This is the same functionality as provided by the goodies/KeyedSet goody.

o  elementAt: anObject ifAbsent: exceptionBlock
return the element, if contained in the set.
If there is none, return the result from evaluating exceptionBlock.
This may seem confusing at first - however, it is useful with
non-identitysets, to find an existing element, for a
given equal (but not identical) object.
This is the same functionality as provided by the goodies/KeyedSet
goody.

o  removeLast
remove the last element from the receiver.
Return the removed element.
An error is raised here - it does not make sense for unordered collections

o  reverseDo: aBlock
evaluate the argument, aBlock for each element in reverse order.
An error is raised here - it does not make sense for unordered collections

adding & removing
o  add: anObject
add the argument, anObject to the receiver.

WARNING: do not add elements while iterating over the receiver.
Iterate over a copy to do this.

o  remove: oldObject ifAbsent: exceptionBlock
remove oldObject from the collection and return it.
If it was not in the collection return the value of exceptionBlock.
Notice, that the returned object could be non-identical to the argument
(although it will always be equal).

WARNING: do not remove elements while iterating over the receiver.
See #saveRemove: to do this.

o  removeAll
remove all elements from the receiver. Returns the receiver.

o  saveRemove: oldObject
remove the element, oldObject from the collection.
Return the element
(could be non-identical to oldObject, since I hash on equality, not on identity).
If it was not in the collection return nil.

In contrast to #remove:, this does not resize the underlying collection
and therefore does NOT rehash & change the elements order.
Therefor this can be used while enumerating the receiver,
which is not possible if #remove: is used.

WARNING: since no resizing is done, the physical amount of memory used
by the container remains the same, although the logical size shrinks.
You may want to manually resize the receiver using #emptyCheck.
(after the loop)

o  testAndAdd: anObject
add the argument, anObject to the receiver.
Answer true, if the element did already exist in the collection,
false otherwise.

WARNING: do not add elements while iterating over the receiver.
Iterate over a copy to do this.

binary storage
o  readBinaryContentsFrom: stream manager: manager
must rehash after reload

comparing
o  = aCollection
return true, if the argument is a Set containing the same elements
as I do

o  hash
return a hash key for the receiver

copying
o  postCopy
have to copy the keyArray too

enumerating
o  do: aBlock
perform the block for all members in the collection.

WARNING: do not add/remove elements while iterating over the receiver.
Iterate over a copy to do this.

inspecting
o  inspectorClass
redefined to use SetInspector
(instead of the default Inspector).

obsolete set operations
o  + aCollection
Kept for backward compatibility.
Use #union: instead, to isolate arithmethic and set operations

** This is an obsolete interface - do not use it (it may vanish in future versions) **

o  - aCollection
Kept for backward compatibility.
Use #\ instead, to isolate arithmethic and set operations

** This is an obsolete interface - do not use it (it may vanish in future versions) **

private
o  find: key ifAbsent: aBlock
Look for the key in the receiver. If it is found, return
the index of the slot containing the key, otherwise
return the value of evaluating aBlock.

o  findKeyOrNil: key
Look for the key in the receiver.
If it is found, return the index of the first unused slot.
Grow the receiver, if key was not found, and no unused slots were present

o  findNil: key
Look for the next slot usable for key.
WARNING:
This method assumes that key is not already in the receiver
AND that keyArray does not have previously removed entries
AND that there is an empty slot.
To be used ONLY while growing/rehashing to enter elements into a fresh
collection - if any of the above conditions is not met, the method
loops forever.

o  hashFor: aKey
return the arguments hash value.
Redefined in subclasses, which use a different comparison (i.e. identity compare)

o  initialIndexForKey: aKey
return an initial index given a key.

o  invalidElementError
this is send when a nil is used with a Set.
For now this is just an info-Message. Will become an error sometimes.

o  keyArray

o  keyContainerOfSize: n
return a container for keys of size n.
Extracted to make life of weak subclasses easier ...

o  rehash
rehash is done by re-adding all elements to a new empty set.
Rehash is needed after a binaryRead, for example.

o  rehashFrom: startIndex
rehash elements starting at index - after a remove.
Notice: due to the new implementation of remove,
this is no longer needed

o  setTally: count
initialize the contents array (for at least count slots)
and set tally to zero.
The size is increased to the next prime for better hashing behavior.

o  slotIsEmpty: aSlotValue
only redefined in weak subclasses, since they treat a 0-value
as being empty

private-grow & shrink
o  emptyCheck
check if the receiver has become too empty (after a remove)
and shrink if it makes sense.
Definition of 'too empty' is 'filled less than 12.5% (i.e. 1/8th)'

o  fullCheck
check if collection is full (after an add); grow if so.
Definition of 'full' is currently: 'filled more than 75% (i.e. 3/4th)'

o  grow
change the number of element slots of the collection to a useful
new size

o  grow: newSize
change the number of element slots of the collection - to do this,
we have to rehash (which is done by re-adding all elements to a new
empty set).

queries
o  collisionCount
return the number of key collisions in the set.
There is a collision if two keys hash to the same value.

o  collisionsFor: key
Return the number of searches - 1 required for key

o  maxChainLength
return the number of the maximum chain length in the set.
This is the worst case overhead when accessing a key.

o  size
return the number of set elements

testing
o  capacity
return the number of elements, that the receiver is
prepared to take.
Not used by the system; added for ST-80 compatibility.

o  includes: anObject
return true if the argument anObject is in the receiver

o  isEmpty
return true if the receiver is empty

o  isFixedSize
return true if the receiver cannot grow - this will vanish once
Arrays and Strings learn how to grow ...

o  notEmpty
return true if the receiver is not empty

o  occurrencesOf: anObject
return the number of occurrences of anObject in the receiver.
As I am a Set, this can only return 0 or 1.

o  sameContentsAs: aCollection
answer true, if all the elements in self and aCollection
are common. This is not defined as #=, since we cannot redefine #hash
for aCollection.

visiting
o  acceptVisitor: aVisitor with: aParameter


Private classes:

    EmptySlot


ST/X 6.1.1; WebServer 1.620 at exept:8081; Wed, 23 May 2012 21:01:39 GMT