Things you can do in Smalltalk
(which are hard or impossible for others)

Introduction

Many problems which are difficult or even impossible to solve in other programming languages are simple or trivial in a highly reflective system such as Smalltalk. Many of the patterns collected and described by the Gang of Four are not needed, or are much simpler, in Lisp or in Smalltalk.
This document shows some examples of this kind. It is not about things you can do in Smalltalk which you can also do in any other language, but about things which are very easily done in Smalltalk but drive you mad in other languages.
Even if you are already a Smalltalker, it may be interesting to read and may give you new ideas.

Updates, Fixing without Downtime (24/7 operation)

This is almost trivial to implement in a system where packages, classes and even individual methods can be added or removed dynamically at any time. Some of our projects are required to operate without downtime - even when updates, patches or other maintenance operations are to be performed. You can add support for that into your product relatively easily. To load and compile patches from a directory into a running program is as simple as:
    'patches' asFilename directoryContentsDo:[:eachFile |
	Smalltalk fileIn:eachFile
    ].
Each patch file may contain a single method, a class or a complete package, either as source or as compiled machine code.

Of course, in addition to new classes, you can add code extensions for any class, even system classes or Object. You can also add code for classes of which instances already exist. (And because there are no invisible primitives in ST/X, you can even dynamically load or update primitive C-coded methods.)
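
A patch loader in a system which must not stop would probably want to survive a broken patch file. The following is only a minimal sketch: it reuses the calls from the example above and assumes that a failing fileIn raises an Error which can be caught with the ANSI-style "on:do:". The log message is made up for illustration.

```smalltalk
"Sketch: load every patch, but keep going if one of them fails.
 Assumption: a failing fileIn raises an Error."
'patches' asFilename directoryContentsDo:[:eachFile |
    [
        Smalltalk fileIn:eachFile
    ] on:Error do:[:ex |
        "report the problem and continue with the next patch file"
        Transcript showCR:('patch failed: ', eachFile printString, ' - ', ex description)
    ]
].
```

With this, one bad file is reported on the Transcript while the remaining patches are still loaded.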

Reflecting on Object References

This may sound strange, but it seems to be an obvious problem for many other systems: "how can you get rid of unwanted object references?". One such example is Firefox which, even after years of development, is still incapable of dealing properly with instantiated JavaScript objects (yes, did you notice that your Firefox's memory footprint is ever-growing with some JavaScript code?).

Smalltalk provides very powerful reflection and manipulation features on the object reference level. For example, it is possible to find all instances of a class:

    SomeClass allInstances
all derived instances:
    SomeClass allSubInstances
and get rid of them:
    SomeClass allInstancesDo:[:inst | inst becomeNil]
A variation of that single line would help Firefox in its tab-remove code:
    SomeClass
	allInstancesDo:[:inst |
	    (inst isOwnedBy:tabToBeClosed) ifTrue:[
		inst becomeNil]]
Be assured that I am totally aware that the above is only a hack to fix bugs which are actually elsewhere. However, quite obviously it seems to be *VERY* hard to fix those bugs in the first place. Therefore, a cleanup action at tab-closing time would help a lot, and in my opinion, a workaround and repair is much better than the current way to handle this kind of memory leak via a "crashing garbage collect".

Iterating over Private Variables (i.e. Instance Variables)

In Paul Graham's paper "Being Popular", we find the following quotation:

In Common Lisp I have often wanted to iterate through the fields of a struct-- to comb out references to a deleted object, for example, or find fields that are uninitialized. I know the structs are just vectors underneath. And yet I can't write a general purpose function that I can call on any struct. I can only access the fields by name, because that's what a struct is supposed to mean.

In Smalltalk, you can access an object's slot via the "instVarAt:", "instVarAt:put:" and "instVarNamed:" messages. The names of the slots are retrieved with "allInstanceVariableNames", the number of slots via "instSize".

Thus, a debugging method to dump *ANY* object's contents could be:

    dump: someObject
	someObject class allInstanceVariableNames
	    doWithIndex:[:name :idx |
		Transcript
		    show:name;
		    show:' is ';
		    showCR:(someObject instVarAt:idx).
	    ]
Of course, you can also write a block (aka function) for this:
    dumper :=
	[:someObject |
	    someObject class allInstanceVariableNames
		doWithIndex:[:name :idx |
		    Transcript
			show:name;
			show:' is ';
			showCR:(someObject instVarAt:idx).
		]
	].
and iterate over a collection of objects to be dumped with:
    objectsToBeDumped do:dumper
to dump the Transcript, try:
    dumper value:Transcript
Please DO NOT use such things in regular code - they should be restricted to debugging and support code. If overused, they may make the program hard to understand, hard to debug and very hard to maintain. Also, you lose many of the IDE's nice and useful help functions (senders, implementors, access-finders etc.).

Generating Code Dynamically

One of the nice features of an IDE being "really" integrated is that the compiler tools are still around at program execution time. This can be useful to create code on the fly - a useful feature both for intelligent programs which learn new tricks, and to provide some scripting facility to the user.

Dynamically generated Blocks

Let's start with a dynamic block (aka a closure or function), created from a string:
    |s b|

    s := '[:a :b | (a squared + b squared) sqrt ]'.
    b := (Block fromString:s).
    Transcript showCR:(b value:3 value:4)
It's a bit of a pity that the internal representation differs so much from the textual one. Things are more coherent in Lisp-like languages. But it is good enough for our needs...

You can (and should) analyze the code for the messages being sent, to make sure that only allowed messages are introduced if the code string originates from a user:

    |s b allMessages|

    s := '[:a :b | (a squared + b squared) sqrt ]'.
    b := (Block fromString:s).
    allMessages := b homeMethod literals.
    Transcript show:'block contains messages: '; showCR:allMessages.
For example, to verify that the user does not inject bad code into a scripting engine:
    |s b codeString allowedMessages|

    allowedMessages := #( + - * / sqrt squared sin cos value ).
    codeString := Dialog request:'Give an expression on a and b.
Use parentheses as in (a*5) + (b sin):'.
    codeString notEmptyOrNil ifTrue:[
	s := '[:a :b | ',codeString,']'.
	b := (Block fromString:s).
	((b homeMethod literals)
	    contains:[:msg |
		(allowedMessages includes:msg) not
	    ])
	ifTrue:[
	    Transcript showCR:'Sorry - the block contains a bad message'.
	] ifFalse:[
	    Transcript show:'The value of "',codeString,'" for a=4,b=5 is '; showCR:(b value:4 value:5).
	].
    ].
Use this as a basis to write your own spread-sheet; if required, write your own parser which adds proper operator precedence, or use the built-in JavaScript parser.

For check-code like the above, the builtin Parser or RBParser frameworks can be used. Then, reflect on the parseTree instead of either source- or byte-code.
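
As a rough sketch of the parse-tree variant: the snippet below assumes that the Refactoring Browser's "RBParser parseExpression:" and the parse node's "sentMessages" are available in your image. This is an assumption, so check your dialect. The whitelist and the two sample expressions are only illustrations.

```smalltalk
|allowed check|

allowed := #( #+ #- #* #/ #sqrt #squared #sin #cos ).

"Answers true if the expression sends only whitelisted messages.
 Assumes RBParser class>>parseExpression: and RBProgramNode>>sentMessages."
check := [:codeString |
    |sent|
    sent := (RBParser parseExpression:codeString) sentMessages.
    (sent detect:[:sel | (allowed includes:sel) not] ifNone:[nil]) isNil
].

Transcript showCR:(check value:'(a squared + b squared) sqrt') printString.
Transcript showCR:(check value:'a inspect') printString.
```

Under those assumptions, the first expression should be accepted and the second rejected, without any of the user's code ever being compiled or executed.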

Dynamically generated Methods

Of course, a class can also learn (and forget) new tricks. Many rule based AI (Artificial Intelligence) algorithms depend upon a system which can learn and enhance itself (which is one reason why Lisp, Prolog and Smalltalk were so popular in these areas). Here, a new method is added to an existing class:
    Number
	compile:'cubed ^ self * self * self'
and can be used immediately as in:
    5 cubed
or with a floating point number, as in
    5.0 cubed
or, it can be forgotten:
    Number removeSelector:#cubed
(retry the above example after the removal, to see that integers really no longer know how to compute volumes...)
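
The learn-and-forget cycle can also be observed from within a program. The following sketch uses "respondsTo:" and "perform:" for that; the selector "doubledForDemo" is made up for this example, to avoid a clash with anything already present in your image.

```smalltalk
"doubledForDemo is a hypothetical selector, used only for this sketch"
Number compile:'doubledForDemo ^ self * 2'.

"the new method is known immediately..."
Transcript showCR:(5 respondsTo:#doubledForDemo) printString.
Transcript showCR:(5 perform:#doubledForDemo) printString.

"...and is gone again after the removal"
Number removeSelector:#doubledForDemo.
Transcript showCR:(5 respondsTo:#doubledForDemo) printString.
```

"perform:" is used here instead of a direct send, so that the snippet can be evaluated as a whole even though the selector does not yet exist when the snippet itself is compiled.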

Dynamically generated Classes

Anonymous classes are not known to anyone, but implement some interface. Let us dynamically generate a class to represent people with a first and a last name, and then create an instance of it (which is shown in an inspector). Notice that you will not find the class in the browser, and that it will be garbage collected automatically when you release the reference to it (by closing the inspector):
    |cls|

    cls := Object
	      subclass:'anonymous'
	      instanceVariableNames:'firstName lastName'
	      classVariableNames:nil
	      poolDictionaries:nil
	      category:nil
	      inEnvironment:nil.
    cls compile:'firstName ^firstName'.
    cls compile:'firstName:s firstName := s'.
    cls compile:'lastName ^lastName'.
    cls compile:'lastName:s lastName := s'.

    ((cls new firstName:'hello') lastName:'world') inspect.
Anonymous classes are very helpful to represent objects as read from external specifications. For example, instances of IDL, XML or ASN1 specified types can be represented as instances of such dynamically created classes. Of course, you would also generate access methods to the individual fields dynamically.

Dynamically generating Code without affecting the ChangeFile/ChangeList

When any of the above examples is executed, a changeList entry is added to both the in-memory changeSet and the external changeFile. This can be turned off, by wrapping the compiling code inside:
    Class withoutUpdatingChangesDo:[
	...
	cls compile:'...'
	...
    ].
... more to be added here ...

Singletons

Because classes are objects whose protocol is defined by the metaclass, overriding the "new" method allows for all kinds of additional functionality. A singleton class is simply one which remembers the very first instance it ever created and returns that again. The best place to remember that instance is a class-instance variable (called "theOneAndOnlyInstance" in the example below).
A corresponding instance creation method could be:
    new
	theOneAndOnlyInstance isNil ifTrue:[
	    theOneAndOnlyInstance := super new.
	].
	^ theOneAndOnlyInstance
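
To try this without the browser, the whole singleton can be set up from a workspace. This is only a sketch: "DemoRegistry" and the category "Demo" are hypothetical names, and the classic class-definition message is assumed (newer dialects may spell it differently).

```smalltalk
|cls|

"DemoRegistry is a hypothetical class name"
cls := Object subclass:#DemoRegistry
          instanceVariableNames:''
          classVariableNames:''
          poolDictionaries:''
          category:'Demo'.

"declare the class-instance variable on the metaclass"
cls class instanceVariableNames:'theOneAndOnlyInstance'.

"install the instance creation method on the class side"
cls class compile:'new
    theOneAndOnlyInstance isNil ifTrue:[
        theOneAndOnlyInstance := super new.
    ].
    ^ theOneAndOnlyInstance'.

"both sends answer the identical object"
Transcript showCR:(cls new == cls new) printString.

"clean up the demo class again"
cls removeFromSystem.
```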

Tracing, Counting, Limiting the Number of Instances

From the above, it should be obvious how all of these features can be implemented, by either redefining the instance creation method or adding extra instance creators.
The fact that we can add our own additional instance creation methods (with a name other than the common "new") is often overlooked by non-Smalltalkers:
    newCounted
	"instanceCount is a class-instance variable; it is still nil before the first call"
	instanceCount := (instanceCount ifNil:[0]) + 1.
	^ self new

A Nice Syntax for Lambda Closures

Lambdas (aka closures) have been known for decades (Scheme and Smalltalk have had them since the very beginning, in the 1970s) to be very useful and powerful. They help to implement many very convenient functions: exceptions, unwinding, callbacks, enumeration of collections, UI-actions, lazy evaluation, and more.

In Smalltalk, closures are called blocks, and are written simply by placing the expression(s) to be closed over in square brackets, possibly with arguments, e.g.:

    b := [ do something ]
or:
    b := [:arg | do something with arg ]
This is much more convenient than writing:
    b = (function(arg) {
	    return do something with(arg);
	});
and even a little more convenient than the modern variant:
    b = arg => do something with(arg);
which needs braces and an explicit "return" as soon as it contains more than one statement.

But Smalltalk blocks can do more than just evaluate an expression: they can also return from their containing method context. For example, consider a method which has to search for something and return the first match, given a collection and a block to do the matching. E.g.:

    findFirstMatch:matchBlock in:aCollection
	aCollection do:[:eachElement |
	    (matchBlock value:eachElement) ifTrue:[
		^ eachElement
	    ]
	].
	^ self proceedableError:'nothing found'.
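There are a number of things to look at in the above code; most notably, the "^" inside the inner block returns from the findFirstMatch:in: method itself, not just from the block. It should also be mentioned that you can write the above code even more tersely and readably in Smalltalk, if you know your collection protocol a bit better: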
    findFirstMatch:matchBlock in:aCollection
	^ aCollection
	    detect:matchBlock
	    ifNone:[ self proceedableError:'nothing found' ]
How much code is needed for that in your favourite programming language?
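
For comparison, here is the standard collection protocol for "first matching element" used directly from a workspace. This is a small sketch with made-up sample data; "detect:ifNone:" answers the element itself, or the value of the second block if nothing matches.

```smalltalk
|numbers|

numbers := #(3 8 15 22).

"the first element greater than 10"
Transcript showCR:(numbers detect:[:n | n > 10] ifNone:[nil]) printString.

"no element matches: the ifNone: block provides the answer"
Transcript showCR:(numbers detect:[:n | n > 100] ifNone:['nothing found']) printString.
```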

Look Ma, No Factory Pattern Needed

The fact that classes are first-class citizens, and can therefore be passed around as arguments or returned as return values, makes the factory pattern almost obsolete. For example, the code to let a view object dynamically decide for itself which class to instantiate for its controller object is as trivial as:
    controller
	^ self controllerClass new
where controllerClass can be as simple as returning the class reference:
    controllerClass
	^ VeryStrictController
or do some fancy decision making:
    controllerClass
	(Time now hour between:18 and:20) ifTrue:[
	    ^ FriendlyControllerForHappyHour
	].
	^ super controllerClass
(where "VeryStrictController" and "FriendlyControllerForHappyHour" would be the classes to instantiate)

Being a fully dynamically typed language ensures that this code will still work unchanged in ten years, when fifty new classes and subclasses for controlling have been added in various parts of a bigger system.
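
That classes are ordinary objects can be seen in a workspace, too. In the following sketch, a block receives a class as its argument and instantiates it, without knowing which class it will get; the three collection classes are arbitrary examples.

```smalltalk
|build instances|

"build answers a new instance of whatever class it is given"
build := [:aClass | aClass new].

"classes are passed around like any other object"
instances := (Array with:OrderedCollection with:Set with:Bag) collect:build.

instances do:[:each | Transcript showCR:each class name].
```

The block plays the role of the factory here; adding a new class later requires no change to it.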

Powerful Exception Handling

The exception handling system in Smalltalk is much more powerful than anything available in C, C++, Java or C#. The one feature which is missing in all of them is called "proceedable exceptions". This means that an exception handler is allowed to perform some operation (or do nothing, if it likes) AND let the program continue execution at the point after the raise.

Some have argued "why would one want to proceed after a raise" - but that's the typical "I am a hammer - everything must be a nail" attitude - or in other words: "if all I can use raising for is to signal non-proceedable hard error situations, why would I want to proceed after an exception?".

A more intelligent approach is to see an exception as "something unexpected happened - can anyone help?", or even: "something unexpected happened - is anyone interested?".

If you look at exceptions from this perspective, it makes sense to send notifications, warnings, progress-information etc. all using the exception mechanism. For example, in Smalltalk/X, all info- and progress notifications are performed by raising a Notification.

Even more convenient is the situation where some object deep down in the calling hierarchy needs additional information to handle an abnormal situation. For example, a compiler might need to know if it is ok to compile code with ugly or old style constructs in it. If the compiler is executed in the interactive IDE, and a user has originated the compile operation, it is convenient to open a little dialog window and ask the user. However, if the compilation is part of a batch operation, and 3000 files are to be compiled, you had better not ask the user for every method. In Smalltalk/X, a so-called Query is used for this: a kind of exception which, if unhandled, proceeds with a default value, but which can be handled to return a value as requested from the user. In all languages without proceedable exceptions, you would have to pass such information down either via arguments along every called function, or by setting global or other static flags, which will later make it hard to use the compiler in a multithreaded environment.
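
The following sketch illustrates the idea with the ANSI-style exception protocol ("signal:", "on:do:", "resume:"), which is assumed to be available in your dialect. It uses the generic Warning and Notification classes rather than the Smalltalk/X Query class mentioned above, and the message texts are made up.

```smalltalk
|result answer|

"1. A proceedable exception: the handler resumes,
    and execution continues right after the signal."
result := [
    |v|
    v := Warning signal:'disk is almost full'.
    'continued, handler answered: ', v printString
] on:Warning do:[:ex |
    ex resume:42
].
Transcript showCR:result.

"2. Query style: code deep down asks, someone further up answers."
answer := [
    Notification signal:'is old style code ok?'
] on:Notification do:[:ex |
    ex resume:true
].
Transcript showCR:answer printString.

"3. Nobody is interested: an unhandled Notification simply answers nil."
Transcript showCR:(Notification signal:'anyone interested?') printString.
```

Note that in the second case no argument or global flag is needed to get the answer down to the place where the question is asked.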

Terminating a Thread

It seems strange, but most language environments are not able to properly terminate a thread; either it is not supported at all, or it is a highly unsafe operation which leaves system resources, locks and other critical state in an undefined or inconsistent state.
This includes old dinosaurs like C++ as well as more recent languages like Python or Java.

It may sound esoteric, but in practice a thread may run into an endless loop if your program allows user code to be executed which does so, or if you have a programming error in a system which should not be stopped as a whole, but where only the affected thread should be stopped (a controlling application or server, with a bug in a less critical print utility or menu function).

In Smalltalk you can interrupt any thread, push a continuation stack frame onto its call chain while it is stopped, and let it execute e.g. an exception raise or a call stack unwind.

Of course, this is a completely safe operation, and the thread will perform all of its unwind (i.e. ensure) operations. And even if any of those are missing, the garbage collector will eventually clean up any leftovers.
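
A minimal sketch of this: a thread which loops forever is terminated from the outside, and its "ensure:" block is still executed. The delay values are arbitrary, and the second wait is only there to give the terminated thread time to unwind, in case your dialect performs the unwinding asynchronously.

```smalltalk
|worker cleanedUp|

cleanedUp := false.

"a runaway thread, as it might result from buggy user code"
worker := [
    [
        [true] whileTrue:[ (Delay forMilliseconds:10) wait ]
    ] ensure:[
        "unwind action; also executed when the thread is terminated"
        cleanedUp := true
    ]
] fork.

(Delay forMilliseconds:100) wait.
worker terminate.
(Delay forMilliseconds:100) wait.

Transcript showCR:cleanedUp printString.
```

If the unwind actions are performed as described above, this shows "true" on the Transcript.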

Powerful Numeric Class Hierarchy

The fact that integer operations cannot overflow and that divisions automatically generate fractions if required is one feature which is often overlooked. Smalltalk has a very powerful, highly polymorphic numeric class hierarchy which is not even approached by most programming languages (Lisp being a notable exception, again).

If you compute values in a 32bit environment, you have to be very careful not to generate incorrect results. The situation is less of a problem with 64bit integers (but it is still present). However, it requires the programmer to always think about the consequences and never forget to write "long long" instead of "int". Programmers must also always know the value range of their integers - something which can be hard if you write a reusable library component and have no control over the incoming values. Another problem is rounding errors due to floating point arithmetic. A programmer always has to think not only about the type ("int" vs. "long" vs. "long long") but also about the values of intermediate results.

To illustrate this, try to evaluate a trivial expression like "a / b * c" with a,b,c being integers. Without much thinking, you'd write in C:

    int a, b, c, result;

    ...
    result = a / b * c;
    ...
(yes, you would write it that way, wouldn't you?)

Now, how about "a=1", "b=2", "c=5"?

Every kid tells you: "well, the result is 5/2 (or 2.5, if you like)".
But not so the advanced programmer; he'd say: "well, in theory. But in the real world, the result is zero, because of integer truncation". Programmers tend to "make virtues out of necessities" and blame themselves for obvious deficiencies of the language system. Real programmers know their system's limitations. BTW, the situation is not better in the C++, Java and C# world.

Let us continue for our amusement; you might think about how to fix this and write:

    int a, b, c, result;

    ...
    result = (int)((float)a / (float)b * c);
    ...
Good try.
But what do you expect as answer, if you give it "a=100000001", "b=100000000", "c=100000000"?

Surprise, surprise: we get "100000000". Well, fair enough; who cares about that little error. Who cares about that lost penny - simply transfer it to my account (yeah, that's what banks usually do)!

Ok, you say, let's get rid of the rounding error, and go back to integer code. To avoid the truncation, let's multiply first:

    int a, b, c, result;

    ...
    result = a * c / b;
    ...
Sounds better. What do we get?

Wow, 19 (with 32bit ints and the usual wrap-around on overflow)!

The problem with these languages is that the code looks ok, but computes something radically different from what you read. The operation which is performed in the above code is actually:

    result = trunc( (int)(unsigned(a * c) mod 4294967296) / b);
In Smalltalk, you can compute such a weird thing - but then you'd also write it down that way!

Conclusion: without a proper numeric class hierarchy, it is VERY difficult (but not impossible) to write mathematically correct computations.
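
For comparison, here are the same computations as they might be typed into a Smalltalk workspace. This is a sketch using the sample values from above; the comments show the results of exact integer and fraction arithmetic.

```smalltalk
"a=1, b=2, c=5: the division answers a Fraction, nothing is truncated"
Transcript showCR:(1 / 2 * 5) printString.                           "(5/2)"
Transcript showCR:(1 / 2 * 5) asFloat printString.                   "2.5"

"a=100000001, b=c=100000000: no rounding error when dividing first..."
Transcript showCR:(100000001 / 100000000 * 100000000) printString.  "100000001"

"...and no overflow when multiplying first"
Transcript showCR:(100000001 * 100000000 / 100000000) printString.  "100000001"

"the intermediate product simply becomes a large integer"
Transcript showCR:(100000001 * 100000000) printString.              "10000000100000000"
```

No casts and no thoughts about value ranges are needed; the expression is written down as "a / b * c" and computes exactly that.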

Finally, for your amusement: Quining - A Program which Generates its own Source

Not really being what we'd call a day-to-day problem, but fun to try: self-reproducing (quining) programs. Here is one possible Smalltalk solution:
|a| a := '[:a | Transcript show:''|a| a := '';showCR:a storeString,''.'';showCR:''(Block readFrom:a) value:a'' ]'.
(Block readFrom:a) value:a


Copyright © 2007-2009 eXept Software AG, all rights reserved

<info@exept.de>

Doc $Revision: 1.15 $ last modified $Date: 2021/03/19 00:17:02 $