|
|
Class: SortedCollection
Object
|
+--Collection
|
+--SequenceableCollection
|
+--OrderedCollection
|
+--SortedCollection
- Package:
- stx:libbasic
- Category:
- Collections-Sequenceable
- Version:
- rev:
1.69
date: 2009/09/28 15:45:59
- user: cg
- file: SortedCollection.st directory: libbasic
- module: stx stc-classLibrary: libbasic
- Author:
- Claus Gittinger
I keep my elements sorted. The sort order is defined by a sortblock,
a two-argument block which, when given two elements of the collection,
should return true if the element given as first arg has to come before the
element given as second arg.
Equal elements may occur multiple times.
SortedCollection uses quickSort to resort and a binary search when adding/removing elements.
Because insertion/removal may require that remaining elements have to
be shifted within the container, adding many individual elements may be done faster
by creating a completely new collection from the unsorted elements.
(see examples)
A sortBlock of [:a :b | a < b] defines ascending sort-order,
while [:a :b | a > b] defines descening order.
The default sortBlock for SortedCollections is the first one.
Compatibility Warning:
VW seems to use a default sortBlock which compares a<=b, wheras ST/X uses a<b.
That means, that elements which are compared MUST understand #< in ST/X
while the minumum protocol is #<= in VW.
This may be changed in a future release of ST/X
(it is not yet, to not confuse existing applications ...
... be aware, that the sortBlock has an effect on a few algorithms
found here; especially #indexForInserting is critical.)
Compatibility-Dolphin
-
sortBlock: aBlock withAll: aCollection
-
helpers
-
binarySearch: aSortedArray for: anObject
-
search for an existing element; return true if found.
by checking if the element at the returned index is the one we look for.
Uses a binarySearch since we can depend on the elements being on sorted order.
The returned index is a physical one, for accessing contentsArray.
This is only to be used for arrays which are known to be sorted.
-
binarySearch: aSortedArray forIndexOf: anObject
-
search the index at which to insert anObject.
Can also be used to search for an existing element
by checking if the element at the returned index is the one we look for.
Uses a binarySearch since we can depend on the elements being on sorted order.
The returned index is a physical one, for accessing contentsArray.
This is only to be used for arrays which are known to be sorted.
-
binarySearch: aSortedArray from: firstIndex to: lastIndex forIndexOf: anObject usingSortBlock: sortBlock
-
search the index at which to insert anObject.
Can also be used to search for an existing element
by checking if the element at the returned index is the one we look for.
Uses a binarySearch since we can depend on the elements being on sorted order.
The returned index is a physical one, for accessing contentsArray.
This is only to be used for arrays which are known to be sorted.
initialization
-
initialize
-
setup the default sortBlock.
Use #<, since this is the base method in Magnitude.
instance creation
-
forStrings
-
this is supposed to return a sortedCollection, which sorts using
the current locales collating sequence. For now, simply use a
normal string compare.
This will change
-
forStrings: size
-
this is supposed to return a sortedCollection, which sorts using
the current locales collating sequence. For now, simply use a
normal string compare.
This will change
-
new
-
return a new sortedCollection, the sorting is done using
a compare for a < b, in ascending order
-
new: size
-
return a new sortedCollection with preallocated size.
The sorting is done using a compare for a < b, in ascending order
-
sortBlock: aBlock
-
return a new sortedCollection, whe the sort order is defined
by aBlock.
This must be a two-argument block which returns true if its arg1 has to come before
its arg2 in the collection.
-
withAll: aCollection sortBlock: aBlock
-
initialize from aCollection and set the sort-block
accessing
-
largest: n
-
return the n largest elements
-
max
-
Return the largest element.
-
median
-
Return the middle element, or as close as we can get.
-
min
-
Return the smallest element.
adding & removing
-
add: anObject
-
add the argument, anObject at the proper place in the receiver.
Returns the argument, anObject (sigh).
-
addAll: aCollection
-
add all elements of the argument, aCollection to the receiver.
Returns the argument, aCollection (sigh).
-
mergeSorted: aSortedCollection
-
add all elements of the argument, aSortedCollection to the receiver.
This leads to an error, if the argument is not sorted.
This should be used when a number of elements is to be added.
Notice, that quicksort degenerates to a veeery slow bubble-like
sort when a collection of already sorted elements is given to it.
Therefore, adding chunks is better done by sorting the chunk and merging
it in, instead of resorting the receiver.
blocked
-
add: newObject after: oldObject
-
catch this - its not allowed for sortedCollections
-
add: newObject before: oldObject
-
catch this - its not allowed for sortedCollections
-
addFirst: anObject
-
catch this - its not allowed for sortedCollections
-
addLast: anObject
-
catch this - its not allowed for sortedCollections
converting
-
asSortedCollection
-
return the receiver as a sorted collection
-
asSortedCollection: aSortBlock
-
return the receiver as a sorted collection, using aSortBlock.
Return the receiver, if its a sortedCollection and the sortBlock
is the same as the argument-sortBlock
copying
-
copyEmpty: size
-
return a copy of the receiver with no elements, but
the same size. This method has been be redefined to
preserve the sortBlock.
-
postCopyFrom: anOriginal
-
sent after a copy or when a new collection species has been created.
The new collection should have the same sortBlock as the original.
enumerating
-
collect: aBlock
-
evaluate the argument, aBlock for every element in the collection
and return a collection of the results. Redefined to return an OrderedCollection;
see X3J20 spec. (SortedCollection>>collect: should return an OrderedCollection)
instance protocol
-
sort: aSortBlock
-
redefined from superclass to change the sortBlock
-
sortBlock
-
return the block used for sorting
-
sortBlock: aSortBlock
-
change the sort criteria for a sorted collection, resort the elements of
the collection, and return the receiver. The argument, aSortBlock must
be a two-argument block which returns true if its arg1 has to come before
its arg2 in the collection.
private
-
indexForInserting: anObject
-
search the index at which to insert anObject.
Can also be used to search for an existing element
by checking if the element at the returned index is the one we look for.
Uses a binarySearch since we can depend on the elements being on sorted order.
The returned index is a physical one, for accessing contentsArray.
-
setSortBlock: aSortBlock
-
set the sortblock without resorting - private only
queries
-
isSorted
-
return true. if my elements are sorted
-
isSortedBy: aBlock
-
return true, if my elements are sorted (already) by the given criterion (sortBlock).
-
isSortedCollection
-
return true. if I am a sorted collection
searching
-
after: anObject ifAbsent: exceptionBlock
-
return the element after the argument, anObject; or nil if there is none.
If anObject is contained multiple times in the collection, return the
the first element which is non-equal to anObject.
If the receiver does not contain anObject, return the result from evaluating
exceptionBlock.
-
before: anObject ifAbsent: exceptionBlock
-
return the element before the argument, anObject; or nil if there is none.
If the receiver does not contain anObject, return the result from evaluating
exceptionBlock.
-
indexOf: anElement
-
Return the index of anElement within the receiver.
If the receiver does not contain anElement, return 0.
-
indexOf: anElement ifAbsent: exceptionBlock
-
Return the index of anElement within the receiver.
If the receiver does not contain anElement,
return the result of evaluating the argument, exceptionBlock.
testing
-
includes: anObject
-
return true, if the argument, anObject is in the collection.
Redefined, since due to being sorted, the inclusion check can
be done with log-n compares i.e. much faster.
-
occurrencesOf: anObject
-
return how many times the argument, anObject is in the collection.
Redefined, since due to being sorted, the range of checked objects
can be limited i.e. it can be done much faster.
Uses #= (i.e. equality) compare.
when many elements are to be added, it may be better to add them
all en-bloeque instead of individually.
The reason is that for each individual #add:, the contents has to be
shifted, to create an empty slot for the new element.
timing example:
|o rnd|
o := SortedCollection new.
rnd := Random new.
10000 timesRepeat:[
o add:rnd next.
]
takes 1365 ms on a P5 (admitted: this is a fast machine ;-)
Times are a'changing:
Just came around 2005: on my 1.5Ghz laptop, this now takes 260ms...
an another comment 2008: on a 1.8Ghz laptop, it takes now 105ms...
In contrast:
|o rnd|
o := OrderedCollection new.
rnd := Random new.
10000 timesRepeat:[
o add:rnd next.
].
o := o asSortedCollection
takes 383 ms on the same machine.
2005: on my 1.5Ghz laptop, this now takes 100ms
2008: on a 1.8Ghz laptop, this now takes 47ms
If you already have a big sortedCollection at hand, adding multiple
items may be better done with #addAll:, which resorts all elements, if
the number of added items is more than some threshold number.
However, the break-even point where bulk-adding is faster depends
on the machine ... (and ST/X version ;-).
adding elements in order:
|c|
Time millisecondsToRun:[
10 timesRepeat:[
|c|
c := SortedCollection new.
(1 to:100000) do:[:e | c add:e].
]
].
(5.4.1: 2031 2187)
(5.4.3: 484 516)
|c|
c := SortedCollection new.
(1 to:100000) do:[:e | c add:e].
self assert:(c asBag = (1 to:100000) asBag).
adding elements in reverse order:
Time millisecondsToRun:[
10 timesRepeat:[
|c|
c := SortedCollection new.
(1 to:100000) reverseDo:[:e | c add:e].
]
].
(5.4.1: 201969)
(5.4.3: 1609 1766)
|c|
c := SortedCollection new.
(1 to:100000) reverseDo:[:e | c add:e].
self assert:(c asBag = (1 to:100000) asBag).
adding elements in random order:
|toAdd|
toAdd := (1 to:100000) asOrderedCollection randomShuffle.
Time millisecondsToRun:[
10 timesRepeat:[
|c|
c := SortedCollection new.
toAdd do:[:e | c add:e].
]
].
(5.4.1: 108484)
(5.4.3: 75734)
|c|
c := SortedCollection new.
(1 to:100000) asOrderedCollection randomShuffle do:[:e | c add:e].
self assert:(c asBag = (1 to:100000) asBag).
|