eXept Software AG Logo

Smalltalk/X Webserver

Documentation of class 'SortedCollection':

Home

Documentation
www.exept.de
Everywhere
for:
[back]

Class: SortedCollection


Inheritance:

   Object
   |
   +--Collection
      |
      +--SequenceableCollection
         |
         +--OrderedCollection
            |
            +--SortedCollection
               |
               +--ParseTreeIndex

Package:
stx:libbasic
Category:
Collections-Sequenceable
Version:
rev: 1.106 date: 2023/11/24 08:26:35
user: cg
file: SortedCollection.st directory: libbasic
module: stx stc-classLibrary: libbasic

Description:


I keep my elements sorted. The sort order is defined by a sortblock,
a two-argument block which, when given two elements of the collection,
should return true if the element given as first arg has to come before the
element given as second arg.

Equal elements may occur multiple times.

SortedCollection uses quickSort to resort and a binary search when adding/removing elements.
Because insertion/removal may require that remaining elements have to
be shifted within the container, adding many individual elements will be faster above a
particular number of added elements by creating a completely new collection from the
unsorted elements.
(see examples)

A sortblock, given arguments a,b should return true, if a is to be shown before b.
A sortBlock of [:a :b | a < b] defines ascending sort-order,
while [:a :b | a > b] defines descening order.
The default sortBlock for SortedCollections is the first one (i.e. sorting in ascending order).

Compatibility Warning:
    VW seems to use a default sortBlock which compares a<=b, whereas ST/X uses a<b.
    That means, that elements which are compared MUST understand #< in ST/X
    while the minumum protocol is #<= in VW.
    This may be changed in a future release of ST/X
    (it is not yet, to not confuse existing applications ...
     ... be aware, that the sortBlock has an effect on a few algorithms
     found here; especially #indexForInserting is critical.)

[complexity:]
     access by index:                O(1)          
     insertion at either end:        O(1) mostly    
     removal at either end:          O(1) mostly
     insertion in the middle:        O(n)    
     removal in the middle:          O(n)
     searching:                      O(log(n))      uses binary search
     min/max:                        O(1)

[caveat:]
    a balanced tree may be a better choice, as inserting may be slow when
    elements have to be shifted. See the performance measurements in AATree.

copyright

COPYRIGHT (c) 1993 by Claus Gittinger All Rights Reserved This software is furnished under a license and may be used only in accordance with the terms of that license and with the inclusion of the above copyright notice. This software may not be provided or otherwise made available to, or used by, any other person. No title to or ownership of the software is hereby transferred.

Class protocol:

Compatibility-Dolphin
o  sortBlock: aBlock withAll: aCollection

initialization
o  initialize
setup the default sortBlock.
Use #<, since this is the base method in Magnitude.

Usage example(s):

     SortedCollection initialize

instance creation
o  forNaturalSortedStrings
this returns a sortedCollection, which sorts strings naturallay,
i.e. numbers in the strings are compared numerical.
It is supposed to sorts using the current locales collating sequence.
For now, simply use a normal natural string compare.
This will change

o  forNaturalSortedStrings: size
this returns a sortedCollection, which sorts strings naturallay,
i.e. numbers in the strings are compared numerical.
It is supposed to sorts using the current locales collating sequence.
For now, simply use a normal natural string compare.
This will change

o  forStrings
this is supposed to return a sortedCollection, which sorts using
the current locales collating sequence. For now, simply use a
normal string compare.
This will change

o  forStrings: size
this is supposed to return a sortedCollection, which sorts using
the current locales collating sequence. For now, simply use a
normal string compare.
This will change

o  forStrings: size collatedBy: aCollationPolicy
Create & return a SortedCollection that sorts the receiver's
elements according to the specified locales collating policy.
This is currently not really supported - strings are sorted
without caring for the locale.

o  sortBlock: aBlock
return a new sortedCollection, where the sort order is defined by aBlock.
This must be a two-argument block which returns true if its arg1 has to come before
its arg2 in the collection.

o  withAll: aCollection sortBlock: aBlock
initialize from aCollection and set the sort-block

Usage example(s):

     SortedCollection withAll:#(1 2 3 4 5 6 7 8 9 0)
		      sortBlock:[:a :b | a > b]


Instance protocol:

adding & removing
o  add: anObject
add the argument, anObject at the proper place in the receiver.
Returns the argument, anObject.

Usage example(s):

     #(7 3 9 10 99) asSortedCollection add:5; yourself
     #(7 3 9 10 99) asSortedCollection add:1; yourself
     #(7 3 9 10 99) asSortedCollection add:1000; yourself
     #(7 3 9 10 99) asSortedCollection add:1; yourself   
     #(7 3 9 10 99) asSortedCollection add:99.0; yourself
Use <=, to make sure that elements with the same value are added last:    
     (#(7 3 9 10 99) asSortedCollection sortBlock:[:a :b| a <= b]) add:99.0; yourself   
     (#(7 3 9 10 99) asSortedCollection sortBlock:[:a :b| a <= b]) add:9.0; yourself   

o  addAll: aCollection
add all elements of the argument, aCollection to the receiver.
Returns the argument, aCollection.

Usage example(s):

     #(7 3 9 10 99) asSortedCollection addAll:#(77 0 1 16 5); yourself

     ( #(7 3 9 10 99) asSortedCollection:[:a :b | a > b])
         addAll:#(77 0 1 16 5); yourself

     #(7 3 9 10 99) asSortedCollection
        addAll:( #(77 0 1 16 5) asSortedCollection:[:a :b | a > b]); yourself

     (#(7 3 9 10 99) asSortedCollection:[:a :b | a > b])
        addAll:( #(77 0 1 16 5) asSortedCollection:[:a :b | a < b]); yourself

Usage example(s):

     |x e|

     e := (1 to:100) asSortedCollection.
     TimeDuration toRun:[
         10000 timesRepeat:[
            x := SortedCollection new:100.
            x addAll:e. 
         ].
     ].      

o  mergeSorted: aSortedCollection
add all elements of the argument, aSortedCollection to the receiver.
This leads to an error, if the argument is not sorted.

This should be used when a number of elements is to be added.
Notice, that quicksort degenerates to a veeery slow bubble-like
sort when a collection of already sorted elements is given to it.
Therefore, adding chunks is better done by sorting the chunk and merging
it in, instead of resorting the receiver.

aSortedCollection must be sorted by the same sortBlock as myself!

Usage example(s):

     #(7 3 9 10 99) asSortedCollection
        mergeSorted:#(77 0 1 16 5) asSortedCollection

     #(7 3 9 10 99) asSortedCollection
        mergeSorted:#(77 0 3 1 98 99 16 5) asSortedCollection

     #() asSortedCollection
        mergeSorted:#(77 0 3 1 98 99 16 5) asSortedCollection

     #(7 3 9 10 99) asSortedCollection
        mergeSorted:#() asSortedCollection

converting
o  asSortedCollection
return the receiver as a sorted collection

o  asSortedCollection: aSortBlock
return the receiver as a sorted collection, using aSortBlock.
Return the receiver, if it's a sortedCollection and the sortBlock
is the same as the argument - aSortBlock

copying
o  copyEmpty
return a copy of the receiver with no elements.
This method has been be redefined to preserve the sortBlock.

Usage example(s):

      (#(4 7 1 99 -1 17) asSortedCollection:[:a :b| a <= b]) copyEmpty inspect

o  copyEmpty: size
return a copy of the receiver with no elements, but with size initial capacity.
This method has been be redefined to preserve the sortBlock.

o  copyWith: additionalElement
Return a copy of the dictionary that is 1 bigger than the receiver and
including the argument, additionalElement.

Usage example(s):

         (#(4 7 1 99 -1 17) asSortedCollection copyWith:15) inspect

o  postCopyFrom: anOriginal
sent after a copy or when a new collection species has been created.
The new collection should have the same sortBlock as the original.

enumerating
o  collect: aBlock
evaluate the argument, aBlock for every element in the collection
and return a collection of the results. Redefined to return an OrderedCollection;
see X3J20 spec. (SortedCollection>>collect: should return an OrderedCollection)

o  flatCollect: aBlock
Evaluate aBlock for each of the receiver's elements and answer the
list of all resulting values flatten one level. Assumes that aBlock returns some kind
of collection for each element. Equivalent to the lisp's mapcan

initialization
o  initContents: size
setup the receiver-collection to hold size entries

instance protocol
o  sort: aSortBlock
redefined from superclass to change the sortBlock

o  sortBlock
return the block used for sorting

o  sortBlock: aSortBlock
change the sort criteria for a sorted collection, resort the elements of
the collection, and return the receiver. The argument, aSortBlock must
be a two-argument block which returns true if its arg1 has to come before
its arg2 in the collection.

Usage example(s):

     #(9 8 7 6 5 4 3) asSortedCollection
     #(9 8 7 6 5 4 3) asSortedCollection sortBlock:[:a : b | a < b]
     #(9 8 7 6 5 4 3) asSortedCollection sortBlock:[:a : b | a > b]
     #($f $G $z $Y $o $H) asSortedCollection
     #($f $G $z $Y $o $H) asSortedCollection sortBlock:[:a : b | a asUppercase < b asUppercase]

printing & storing
o  storeNamedInstvarsOn: aStream
Trying to store the named instvars of the receiver on aStream;
Nothing is stored here, but check for a non-default sortBlock.
Cannot store Blocks, so raise an error if we try to store an non-default sortBlock.
(returns false, since no expression was generated; which implies that no #yourself is needed)

private
o  indexForInserting: anObject
search the index at which to insert anObject.
Can also be used to search for an existing element
by checking if the element at the returned index is the one we look for.
Uses a binarySearch since we can depend on the elements being in sorted order.
The returned index is a physical one, for accessing contentsArray.

Usage example(s):

     #(1 2 3 4 7 99 1313 981989 898989898) asSortedCollection indexForInserting:50

     #(1.0 2.0 3 4 7 49.0 51.0 99 1313 981989 898989898) asSortedCollection indexForInserting:50

o  setSortBlock: aSortBlock
set the sortblock without resorting - private only

queries
o  includes: anObject
return true, if the argument, anObject is in the collection.
Redefined, since due to being sorted, the inclusion check can
be done with log-n compares i.e. much faster.

Usage example(s):

     #(7 3 9 10 99) asSortedCollection includes:50
     #(7 3 9 10 99) asSortedCollection includes:10

o  isSorted
return true. if my elements are sorted

o  isSortedBy: aBlock
return true, if my elements are sorted (already) by the given criterion (sortBlock).

o  isSortedCollection
return true. if I am a sorted collection

o  occurrencesOf: anObject
return how many times the argument, anObject is in the collection.
Redefined, since due to being sorted, the range of checked objects
can be limited i.e. it can be done much faster.
Uses #= (i.e. equality) compare.

Usage example(s):

     #(7 3 9 10 99) asSortedCollection occurrencesOf:50
     #(7 3 9 10 99) asSortedCollection occurrencesOf:10
     #(7 10 3 10 9 10 10 10 10 10 10 10 10 99) asSortedCollection occurrencesOf:10

o  speciesForCollecting
like species, but used when doing collect operations.
Redefined to return an OrderedCollection;
see X3J20 spec. (SortedCollection>>collect: should return an OrderedCollection)

searching
o  after: anObject ifAbsent: exceptionBlock
return the element after the argument, anObject; or nil if there is none.
If anObject is contained multiple times in the collection, return the
the first element which is non-equal to anObject.
If the receiver does not contain anObject, return the result from evaluating
exceptionBlock.

Usage example(s):

     #(7 3 9 10 99) asSortedCollection after:50
     #(7 3 9 10 99) asSortedCollection after:3
     #(7 3 9 10 99) asSortedCollection after:1
     #(7 3 9 10 99) asSortedCollection after:10
     #(7 3 9 10 99) asSortedCollection after:7
     #(7 3 9 10 99) asSortedCollection after:99
     #(7 10 3 10 9 10 10 99) asSortedCollection after:9
     #(7 10 3 10 9 10 10 99) asSortedCollection after:10

o  before: anObject ifAbsent: exceptionBlock
return the element before the argument, anObject; or nil if there is none.
If the receiver does not contain anObject, return the result from evaluating
exceptionBlock.

Usage example(s):

     #(7 3 9 10 99) asSortedCollection before:50
     #(7 3 9 10 99) asSortedCollection before:3
     #(7 3 9 10 99) asSortedCollection before:1
     #(7 3 9 10 99) asSortedCollection before:10
     #(7 3 9 10 99) asSortedCollection before:7
     #(7 3 9 10 99) asSortedCollection before:99
     #(7 10 3 10 9 10 10 99) asSortedCollection before:9
     #(7 10 3 10 9 10 10 99) asSortedCollection before:10

o  indexOf: anObject
return the index of the anObject or 0 if anObject is not in the collection.
Redefined, since due to being sorted,
this can be done with log-n compares i.e. much faster.

Usage example(s):

     #(7 3 9 10 99) asSortedCollection indexOf:50
     #(7 3 9 10 99) asSortedCollection indexOf:10

     #('aa' 'bb' 'cc' 'dd') asSortedCollection indexOf:'bb'
     #('aa' 'bb' 'cc' 'dd' 'aa' 'bb' 'cc' 'dd') asSortedCollection indexOf:'bb'

     |allSyms indices|
     allSyms := Symbol allInstances asSortedCollection.
     Time millisecondsToRun:[
         indices := allSyms collect:[:el | allSyms indexOf:el].
     ].
     indices = (1 to:allSyms size)

o  indexOf: anElement ifAbsent: exceptionBlock
Return the index of anElement within the receiver.
If the receiver does not contain anElement,
return the result of evaluating the argument, exceptionBlock.

o  largest: n
return a collection containing the n largest elements, largest last.
Raises an exception, if the receiver does not contain at least n elements

Usage example(s):

     #(10 35 20 45 30 5) asSortedCollection largest:3  
     (#(10 35 20 45 30 5) asSortedCollection:[:a :b | a > b]) largest:3    

     #(10 35 20 45 30 5) asSortedCollection largest:6  
     #(10 35 20 45 30 5) asSortedCollection largest:7  

o  max
return the maximum value in the receiver collection,
redefined, since this can be easily computed.
Raises an error, if the receiver is empty.

Usage example(s):

     #(10 35 20 45 30 5) asSortedCollection max 
     #(10 35 20 45 30 5) max 
     #() asSortedCollection max 

o  min
return the minimum value in the receiver collection,
redefined, since this can be easily computed.
Raises an error, if the receiver is empty.

Usage example(s):

     #(10 35 20 45 30 5) asSortedCollection min
     #(10 35 20 45 30 5) min

o  minMax
return the minimum and maximum values in the receiver collection
as a two element array, using #< to compare elements.
Raises an error, if the receiver is empty.

Usage example(s):

     #(10 35 20 45 30 5) asSortedCollection minMax
     #(10 35 20 45 30 5) minMax

o  smallest: n
return a collection containing the n smallest elements, largest last.
Raises an exception, if the receiver does not contain at least n elements

Usage example(s):

     #(10 35 20 45 30 5) asSortedCollection smallest:3  
     (#(10 35 20 45 30 5) asSortedCollection:[:a :b | a > b]) smallest:3    

     #(10 35 20 45 30 5) asSortedCollection smallest:6  
     #(10 35 20 45 30 5) asSortedCollection smallest:7  

statistical functions
o  median
Return the middle element, or as close as we can get.

Usage example(s):

     #(10 35 20 45 30 5) asSortedCollection median
     #(10 35 20 45 30 5) median


Examples:


when many elements are to be added, it may be better to add them
all en-bloeque instead of individually.
The reason is that for each individual #add:, the contents has to be
shifted, to create an empty slot for the new element.

timing example:

    |o rnd|

    o := SortedCollection new.
    rnd := Random new.
    10000 timesRepeat:[
        o add:rnd next.
    ].
    o

takes 1365 ms on a P5 (admitted: this is a fast machine ;-)
Times are a'changing:
    Just came around 2005: on my 1.5Ghz laptop, this now takes 260ms...
    an another comment 2008: on a 1.8Ghz laptop, it takes now 105ms...

In contrast:

    |o rnd|

    o := OrderedCollection new.
    rnd := Random new.
    10000 timesRepeat:[
        o add:rnd next.
    ].
    o := o asSortedCollection

takes 383 ms on the same machine.
    2005: on my 1.5Ghz laptop, this now takes 100ms
    2008: on a 1.8Ghz laptop, this now takes 47ms

If you already have a big sortedCollection at hand, adding multiple
items may be better done with #addAll:, which resorts all elements, if
the number of added items is more than some threshold number.
However, the break-even point where bulk-adding is faster depends
on the machine ... (and ST/X version ;-).


  adding elements in order:

|c|
Time millisecondsToRun:[
    10 timesRepeat:[
        |c|
        c := SortedCollection new.
        (1 to:100000) do:[:e | c add:e].
    ]
].

(5.4.1: 2031 2187)
(5.4.3: 484 516)

|c|
c := SortedCollection new.
(1 to:100000) do:[:e | c add:e].
self assert:(c asBag = (1 to:100000) asBag).

  adding elements in reverse order:

Time millisecondsToRun:[
    10 timesRepeat:[
        |c|
        c := SortedCollection new.
        (1 to:100000) reverseDo:[:e | c add:e].
    ]
].

(5.4.1: 201969)
(5.4.3: 1609 1766)

|c|
c := SortedCollection new.
(1 to:100000) reverseDo:[:e | c add:e].
self assert:(c asBag = (1 to:100000) asBag).

  adding elements in random order:

|toAdd|

toAdd := (1 to:100000) asOrderedCollection randomShuffle.

Time millisecondsToRun:[
    10 timesRepeat:[
        |c|
        c := SortedCollection new.
        toAdd do:[:e | c add:e].
    ]
].

(5.4.1: 108484)
(5.4.3: 75734)

|c|
c := SortedCollection new.
(1 to:100000) asOrderedCollection randomShuffle do:[:e | c add:e].
self assert:(c asBag = (1 to:100000) asBag).


ST/X 7.7.0.0; WebServer 1.702 at 20f6060372b9.unknown:8081; Mon, 18 Nov 2024 02:47:48 GMT