[prev] [up] [next]

How to Write Inefficient Code

Contents

Introduction

This could have been a separate chapter of the previous document, "Hints on Writing Eficient Code", but to emphasize on my frustration, it is presented in a separate paper.

Sometimes I come along code, which brings tears to my eyes. Code which is both almost incomprehensable and unbelievably slow.

Almost always, this is due to a complete misunderstanding of what happens, when collections are getting bigger, and a bad (O(n^2) or worse) algorithm is used.

In this document, I will list "goodies" as I find them, together with (maybe) an explanation of what is going on.

Bad Uses of Collections

Updating/Merging a List

Recently, I encountered the following code, which maintains a list of file names, reads a directory and updates the in-memory list from what it finds in the folder. It was part of a UI which monitors a folder and presents an up-to-date view of its contents:
    ...
    oldListOfFiles := self listOfFiles copy.
    (currentFilenames asSortedCollection: [:f1 :f2| f1 baseName < f2 baseName]) asSet
    do: [:fileName|
	(oldListOfFiles detect: [:fileRow| fileRow fileName = fileName] ifNone: nil) isNil
	ifTrue: [
	    |nearestFileRow r|

	    nearestFileRow := self listOfFiles indexOf: (self listOfFiles detect: [:fileRow| fileRow baseName > fileName baseName] ifNone: nil).
	    nearestFileRow = 0
		ifTrue: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename)]
		ifFalse: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename) beforeIndex: nearestFileRow].
	    monitoring ifTrue: [self selectionOfFile value: r].
	]
    ].
    self listOfFiles reverseDo: [:fileRow|
	(currentFilenames includes: fileRow fileName)
	ifFalse: [self listOfFiles remove: fileRow]
    ]
    ...
Can you understand what is going on?
To be honest: I have problems reading and understanding that code; it is not only inefficient, it is also ugly.

How does the above algorithm behave with 100 files in a folder? What happens, if showing my holiday-image folder, with 4000 jpeg files?

And it really does behave badly!

The first thing to check is, if "self listOfFiles" is expensive.
It could (in theory) read the folder's contents.
However, inspection shows, that it does not; instead it asks a builder for the valueHolder with:

listOfFiles
    |holder|
    (holder := builder bindingAt:#listOfFiles) isNil ifTrue:[
	builder aspectAt:#listOfFiles put:(holder :=  List new).
    ].
    ^ holder
Well, it is called 3 times in the inner loop; with 4000 files, this is 12000 calls to the bindingAt: method.
This results in a Dictionary lookup with O(1) complexity. So no real problem here -although I would remember the list in a local variable.

Tuning this, will only give us a few milliseconds in speedup. So let's care for this later, when the milliseconds do really count and let's go for the big time-eaters first.

Problematic are always non-linear (i.e. square or worse) complexities.

Here is an estimate of the call-counts, for an N element folder, when n+ files have been encountered to be new, and n- have been encountered to be gone:

	...
1       oldListOfFiles := self listOfFiles copy.
1       (currentFilenames asSortedCollection: [:f1 :f2| f1 baseName < f2 baseName]) asSet
1       do: [:fileName|
N           (oldListOfFiles detect: [:fileRow| fileRow fileName = fileName] ifNone: nil) isNil
N           ifTrue: [
n+              |nearestFileRow r|
n+
n+              nearestFileRow := self listOfFiles indexOf: (self listOfFiles detect: [:fileRow| fileRow baseName > fileName baseName] ifNone: nil).
n+              nearestFileRow = 0
n+                  ifTrue: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename)]
n+                  ifFalse: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename) beforeIndex: nearestFileRow].
n+              monitoring ifTrue: [self selectionOfFile value: r].
	    ]
	].
1       self listOfFiles reverseDo: [:fileRow|
N           (currentFilenames includes: fileRow fileName) ifFalse:[
n-              self listOfFiles remove: fileRow
	    ]
	]
	...
n+ and n- are typically small; but there are situations, when they do become large, and these should also be handled gracefully (in case many files are copied from the camera, or bulk-moved to another folder).

Let's break the lines, to see more details on the execution counts:

	...
1       oldListOfFiles := self listOfFiles copy.
1       (currentFilenames asSortedCollection:
NlogN       [:f1 :f2| f1 baseName < f2 baseName])
1           asSet
1       do: [:fileName|
N           (oldListOfFiles
N               detect:
N*N                 [:fileRow| fileRow fileName = fileName] ifNone: nil) isNil
N           ifTrue: [
n+              |nearestFileRow r|
n+
n+              nearestFileRow := self listOfFiles
n+                                  indexOf: (self listOfFiles
n+                                      detect:
N*n+                                        [:fileRow| fileRow baseName > fileName baseName]
				    ifNone: nil).
n+              nearestFileRow = 0
n+                  ifTrue: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename)]
n+                  ifFalse: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename) beforeIndex: nearestFileRow].
n+              monitoring ifTrue: [self selectionOfFile value: r].
	    ]
	].
1       self listOfFiles reverseDo: [:fileRow|
N           (currentFilenames includes: fileRow fileName) ifFalse:[
n-              self listOfFiles remove: fileRow
	    ]
	]
	...
first of all, the bad indentation makes it hard to see that the "detect:ifNone:" messages are obfuscating the intention:
the first wants to see if an element is already present in the oldListOfFiles, the second is even worse: it seems to find the first fileName which is collated after some file, and then searches for the index of it. Actually searching twice within the inner loop.

Rewritig the first "detect:ifNone:" for readability makes it clear, that a set is the collection of choice here; from:

	...
1       oldListOfFiles := self listOfFiles copy.
1       (currentFilenames asSortedCollection:
NlogN       [:f1 :f2| f1 baseName < f2 baseName])
1           asSet
1       do: [:fileName|
N           (oldListOfFiles
N               contains:
N*N                 [:fileRow| fileRow fileName = fileName])
N           ifTrue: [
n+              |nearestFileRow r|
n+
n+              nearestFileRow := self listOfFiles
n+                                  indexOf: (self listOfFiles
n+                                      detect:
N*n+                                        [:fileRow| fileRow baseName > fileName baseName]
				    ifNone: nil).
n+              nearestFileRow = 0
n+                  ifTrue: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename)]
n+                  ifFalse: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename) beforeIndex: nearestFileRow].
n+              monitoring ifTrue: [self selectionOfFile value: r].
	    ]
	].
1       self listOfFiles reverseDo: [:fileRow|
N           (currentFilenames includes: fileRow fileName) ifFalse:[
n-              self listOfFiles remove: fileRow
	    ]
	]
	...
we get:
	...
*N      oldListOfFiles := self listOfFiles copy.
*N      setOfOldListOfFileNames := oldListOfFiles collect:#fileName as:Set.
1       (currentFilenames asSortedCollection:
NlogN       [:f1 :f2| f1 baseName < f2 baseName])
1           asSet
1       do: [:fileName|
N           (setOfOldListOfFileNames includes:fileName)
N           ifTrue: [
n+              |nearestFileRow r|
n+
n+              nearestFileRow := self listOfFiles
n+                                  indexOf: (self listOfFiles
n+                                      detect:
N*n+                                        [:fileRow| fileRow baseName > fileName baseName]
				    ifNone: nil).
n+              nearestFileRow = 0
n+                  ifTrue: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename)]
n+                  ifFalse: [self listOfFiles add: (r := FileRow new fileName: fileName asFilename) beforeIndex: nearestFileRow].
n+              monitoring ifTrue: [self selectionOfFile value: r].
	    ]
	].
1       self listOfFiles reverseDo: [:fileRow|
N           (currentFilenames includes: fileRow fileName) ifFalse:[
n-              self listOfFiles remove: fileRow
	    ]
	]
	...
the "collect:as:" operation is O(N), although it is only called once, but the followup send of "includes:" goes from N*N down to N.

If we can ensure, that only a small number of files will ever be added or removed to the folder, that might already be good enough - however, is that realistic. Noone wants to wait a long time after dumping the photo folder from a camera. incomplete - continued later... ... actually, this code was so bad, it had to be rewritten from scratch.


Copyright © 2016 Claus Gittinger, all rights reserved

<cg at exept.de>

Doc $Revision: 1.5 $ $Date: 2021/03/13 18:24:51 $