Please try to understand them.
It may even be interesting to read if you are a Smalltalker!
The variable is untyped - it can hold a reference to any object.
However, the object itself "knows" what it is.
It knows the class of which it is an instance.
Smalltalk is a dynamically but strongly typed language,
in contrast to e.g. C and C++, which are statically but weakly typed languages
(if you don't see that, think of casting a pointer to another pointer type,
or to an integer or even to an array of 4 characters).
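A minimal JavaScript sketch of the first point (the variable is untyped, but the object knows its class). The class name Point is made up for illustration only:

```javascript
// A variable is untyped: it may refer to any object over its lifetime.
let v = 42;
const first = typeof v;              // "number"
v = [1, 2, 3];
const second = v.constructor.name;   // "Array"

// The object itself knows its class.
class Point {}                       // hypothetical class, not from the text
v = new Point();
const third = v.constructor.name;    // "Point"
const isPoint = v instanceof Point;  // true
```

The variable v never changes its declaration; only the object it refers to does, and each object can be asked for its class.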
Smalltalk is very similar to JavaScript in how they deal with variables and types.
var foo, bar, baz;
var a, b, c;
is equivalent to:
|foo bar baz a b c|
Variable declarations must be placed at the very beginning of a scope (i.e. method or block).
This is in contrast to JavaScript, where variable declarations can also occur later in a function.
However, in JavaScript, a variable declared with "var" is visible in the whole function anyway, so some coding guidelines
require programmers to put those declarations at the beginning, so as not to
confuse readers of the code.
In Smalltalk, the pseudo variable "self" refers to this object. In Java/JavaScript, this is called "this".
As Smalltalk classes are also instances of some (meta-) class,
the above also holds if a class method is called for (i.e. if a message is sent to a class object).
There, "self" refers to the receiving class.
There is no correspondence to this in Java. Even if static functions might look similar to Smalltalk's class methods, they are not: in Smalltalk, class methods can be redefined in subclasses just like ordinary instance methods, and the class method sees the current receiver as "self".
Inside a block (an "inner function"), "self" refers to the receiver of the method in which the block was defined. Thus blocks are actually closures and behave much like inner functions in JavaScript (or lambdas in newer Java versions).
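As a sketch of how "self" inside a block compares to JavaScript: arrow functions are the closest match, because they keep the "this" of the method that created them (a classic "function" expression would get its own "this"). The Counter class below is hypothetical:

```javascript
class Counter {                        // hypothetical class for illustration
  constructor() { this.count = 0; }

  // The arrow function keeps the `this` of the method that created it,
  // much like a Smalltalk block keeps the "self" of its defining method.
  makeIncrementer() {
    return () => { this.count += 1; return this.count; };
  }
}

const counter = new Counter();
const inc = counter.makeIncrementer();
inc();
inc();
// counter.count is now 2, although inc was called without any receiver
```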
So, the JavaScript:
<statement1> ;
<statement2> ;
<statement3> ;
looks in Smalltalk like:
<statement1> .
<statement2> .
<statement3> .
with the last period being optional.
The semicolon has a meaning in Smalltalk, which is described below.
In Smalltalk, to avoid confusion with the equality comparison operators,
assignment is written as ":=", whereas in Java/JavaScript, you would simply write "=".
So:
var a, b, c;
...
a = 5;
b = 4;
c = a;
...
becomes in Smalltalk:
|a b c|
...
a := 5.
b := 4.
c := a.
...
This is actually quite nice,
if you think of how often you mistyped "=" instead of "==" in C/Java/JavaScript.
Because this happens so often, experienced C/Java programmers often write
"if (const == expr)" instead of "if (expr == const)"
to ensure that this kind of typo is caught by the compiler.
...
return ( <someExpression> );
...
is written in Smalltalk as:
...
^ <someExpression>.
...
In Smalltalk, every method invocation returns a value.
If you don't return anything via the above return statement
(i.e. if execution reaches the end of a method's code),
the receiver object "self" is returned implicitly.
In Smalltalk, a method return always returns a value from the currently executing method, even if that return is inside an inner block (which is described below).
Every such message send invokes a function (called "method" in Smalltalk) of the "receiving" object; it is the receiver's class, which determines which method will actually be executed. That is almost the same behavior for all three languages.
The syntax is slightly different:
receiver.foo();
becomes in Smalltalk:
receiver foo.
(without parentheses).
var a, b, c;
...
a = b.abs();
c = a.foo();
b = a.foo().bar().baz();
...
which becomes in Smalltalk:
|a b c|
...
a := b abs.
c := a foo.
b := a foo bar baz.
...
Such messages (function calls) are called "unary messages", and the corresponding methods,
which implement the response to such a call, are called "unary methods".
"Unary" means that no argument is passed with the call.
The name of that function is actually the concatenation of
multiple parts, including the colon(s), to distinguish it from the above unary messages.
In Smalltalk, every function which expects
an argument must have colons in its name (one for each argument).
Thus, the number and position of the arguments is fixed and defined by the function's name.
So, a call to a "fooBar" function with 2 arguments:
var a, b, c;
...
a = b.fooBar(1, 2);
c = a.fooBarBaz(1, 2, a);
...
becomes in Smalltalk:
|a b c|
...
a := b foo:1 bar:2.
c := a foo:1 bar:2 baz:a.
...
There is no direct 1-to-1 mapping of names possible: the colon is not a valid
character in a name in Java/JavaScript.
Notice that it is possible in JavaScript to call a function with a wrong number of
arguments:
var a, b, c;
...
a = b.fooBar(1);
c = b.fooBar(1, 2, 3);
...
In Java, this is caught by the compiler, which will bark at you.
In Smalltalk, you cannot use the same name for 1 or 3 arguments; one of the names must
be "fooBar:", the other something like "foo:bar:anyThingElse:" because there are three arguments.
Sometimes, especially when code is generated automatically from Java/JavaScript-like languages,
methods use a single underscore character to separate arguments, as in:
x.someGeneratedName:arg1 _:arg2 _:arg3.
But still: do not forget that "foo:" and "foo:_:" and "foo:_:_:" are three different
names for three different methods.
Also notice that in Smalltalk, "foo:bar:" is a different message name than "bar:foo:", and that these will invoke different methods. There is no "automatic reordering" of message arguments and no "automatic renaming" of message names.
(x add:1) mul:5.
For this, a special syntax for binary messages like +, -, * etc. is provided, which
allows for those special characters to be written like infix operators.
However, as these are still messages (virtual function calls), the language does
not imply any meaning into those.
Especially, the language does not imply any arithmetic meaning and
the compiler treats all such messages the same.
Therefore, no precedence or grouping as in Java/JavaScript
is associated with those. They are simply evaluated from left to right.
That may make mathematical expressions hard to read, as you have to use parentheses for
grouping (and you often should, even in case the left-to-right order is the desired one).
So, the Java expression:
a + b * 5 + 4
MUST be written as:
a + (b * 5) + 4
in Smalltalk. Otherwise, the expression would simply be evaluated left to right,
giving a wrong answer.
This may be a bit confusing or annoying for beginners. Experienced Smalltalkers use parentheses and don't even think about it. Actually, making the evaluation order explicit using parentheses is considered good style anyway - even in C, Java or JavaScript: you don't have to think about whether the shift operator has precedence over a comparison or not.
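To make the difference concrete, here is a small JavaScript sketch which evaluates a flat list of operands and binary operators strictly from left to right, the way Smalltalk treats binary messages. The helper name evalLeftToRight and the values of a and b are made up for this illustration:

```javascript
// Evaluate a flat list of numbers and binary operators strictly left to
// right, without any operator precedence.
function evalLeftToRight(tokens) {
  const ops = {
    "+": (x, y) => x + y,
    "-": (x, y) => x - y,
    "*": (x, y) => x * y,
  };
  let result = tokens[0];
  for (let i = 1; i < tokens.length; i += 2) {
    result = ops[tokens[i]](result, tokens[i + 1]);
  }
  return result;
}

const a = 1, b = 2;
// Smalltalk reading of "a + b * 5 + 4": ((1 + 2) * 5) + 4
const smalltalkStyle = evalLeftToRight([a, "+", b, "*", 5, "+", 4]);
// Java/JavaScript reading: 1 + (2 * 5) + 4
const javaStyle = a + b * 5 + 4;
```

With these values, the left-to-right reading gives 19, whereas the precedence-based reading gives 15.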
Notice that not only "+", "-", "*" and the other arithmetic operators are allowed as binary message names. Almost any other non-letter can be used, and even multi-character combinations are allowed (the original Smalltalk-80 version did not explicitly define a limit on the number of characters, but actually limited this to a max. of 2. Modern Smalltalk implementations usually allow up to 3).
Now you understand the meaning of the comma-operator ",": it is
a binary message implemented in many collection classes which
creates a concatenation of the receiver and the argument.
So 'hello','world'
creates a new string containing 'helloworld', and
because it groups left to right, you can create longer strings as in
'hello',' ',(OperatingSystem getLoginName).
var t = new Executor();
t.doThis();
t.doThat(someArg);
t.doSomethingElseWith(arg2, arg3);
...
This of course looks similar in Smalltalk:
|t|
t := Executor new.
t doThis.
t doThat:someArg.
t doSomethingElse:arg2 with:arg3.
...
However, in the above, a sequence of messages is sent to the same object,
and the variable t is only needed for this.
In Smalltalk, the semicolon (;) means: "send the following message to the same receiver".
And the above code can also be written as:
(Executor new)
doThis;
doThat:someArg;
doSomethingElse:arg2 with:arg3.
...
I.e. you neither need the extra temporary variable, nor do you have to write it multiple times.
This is purely syntactic sugar, not adding any new semantic feature. But it is useful in some
situations (as seen below, in the exception handler set example).
Such constructs are called "cascaded messages",
and the value of the cascade is the value returned by the
last cascaded message send.
So you have to be a little careful when assigning the value
of a cascade, as in:
e := (Executor new)
doThis;
doThat:someArg;
doSomethingElse:arg2 with:arg3.
...
because afterwards, the value returned from "doSomethingElse:with:" is
assigned to the variable "e". But often we do not want to
depend on what this returns - it could be something
other than the receiver object.
To ensure that the original receiver value gets assigned,
add a "yourself" as the last message of the cascade:
e := (Executor new)
doThis;
doThat:someArg;
doSomethingElse:arg2 with:arg3;
yourself.
...
This little trick works because "yourself" is implemented in
the Object class (i.e. every other object understands it),
and is guaranteed to return the receiver itself. Actually, the "yourself" message is
implemented there for this very reason.
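JavaScript has no cascade syntax, but the semantics can be sketched with a helper function. The names cascade and yourself, as well as the Executor class below, are assumptions for this illustration only:

```javascript
// Send several messages to the same receiver; like a Smalltalk cascade,
// the result is the value returned by the last send.
function cascade(receiver, ...sends) {
  let last;
  for (const send of sends) {
    last = send(receiver);
  }
  return last;
}

// Counterpart of Smalltalk's "yourself": simply answers the receiver.
const yourself = (receiver) => receiver;

class Executor {                       // hypothetical class
  constructor() { this.log = []; }
  doThis() { this.log.push("this"); return 1; }
  doThat(arg) { this.log.push("that:" + arg); return 2; }
}

// Without "yourself", the value of the last send (here: 2) is the result.
const lastValue = cascade(new Executor(), (e) => e.doThis(), (e) => e.doThat("x"));
// With "yourself" as last send, the receiver itself is the result.
const executor = cascade(new Executor(), (e) => e.doThis(), (e) => e.doThat("x"), yourself);
```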
myBlock := [ a foo. b bar ].
which corresponds to the JavaScript code:
myFunc = function () { a.foo(); return b.bar(); };
as you can see, Smalltalk is a bit shorter and less confusing in its syntax.
By the way, there is nothing comparable to this in Java (*), because blocks are not only wrapping
the contained statements, but also allow full access to the visible variables, can be
assigned to variables, be stored in other objects or be returned from a method call
or block evaluation and especially return from their containing method (as we'll see later).
|m1 m2 outerBlock|
m1 := 1233.
outerBlock :=
[
|o1 o2 innerBlock|
o1 := 1.
o2 := m1 + 1.
innerBlock :=
[
|i1 i2|
i1 := o2 + m1 + 4.
[ i1 + 1 ]
].
innerBlock
].
corresponds to:
var m1, m2, outerFunc;
m1 = 1233;
outerFunc = function() {
var o1, o2, innerFunc;
o1 = 1;
o2 = m1 + 1;
innerFunc = function() {
var i1, i2;
i1 = o2 + m1 + 4;
return function() { return i1 + 1; };
};
return innerFunc;
};
(*): newer versions of Java added a lambda feature, which provides a
subset of the block semantics. As you will read below,
blocks can return from their enclosing method, which is not possible in Java.
If a block gets evaluated, its inner statements are evaluated, and the block's return value is the last inner statement's expression-value. So, inside a Smalltalk block, there is no return-from-this-block-statement, only expressions which are evaluated in sequence, for the last one to provide the return value of the block. You can (and often will) put a return inside a block, but it has a completely different behavior and is described below in more detail.
Blocks can have arguments; these are listed at the beginning, each with a leading colon,
before a vertical bar, as in:
|block1 block2 block3|
block1 := [ self helloWorld ].
block2 := [:arg1 | self helloWorld ].
block3 := [:arg1 :arg2 :arg3 | self helloWorld ].
As already mentioned, blocks can be passed to other code as argument.
The collection classes in Smalltalk make heavy use of that,
in that they provide an extensive set of enumeration functions,
which iterate over the collection's elements,
using a block argument which performs the operation.
For example,
here is a general iterator (which is implemented by every collection class),
which enumerates the elements of an array.
The array is written as an array-literal (that is a compile-time generated object):
|myCollection|
myCollection := #(1 2 3 4 5).
...
myCollection do:[:el | Stdout show:el].
...
Ignore the code inside the block for a moment - the "show:" message, if sent to a stream-like
object, will print its argument (el in the above example) on the stream.
Look at how the block is passed as an argument of the "do:" message, which is sent to
the array-instance (myCollection).
The above is also possible in JavaScript,
but is relatively unreadable there, because it requires a full-featured
function definition:
var myCollection;
myCollection = [1, 2, 3, 4, 5];
...
myCollection.forEach( function(el) { Stdout.show(el); } );
...
Notice that the Java community saw the power and usability of such constructs
and added syntactic sugar to newer Java/JavaScript versions.
Now, these also support a more compact "arg => expr" form.
However, in this short form, the lambda body is only a single expression,
without control structures (which of course is a consequence of
having control structures as syntax, whereas in Smalltalk, all control
is via message sends).
If a return statement ("^ expression") is evaluated inside a block,
the containing method is returned from (not the block).
For C-programmers, this looks like a kind of longjmp out of the block's containing
method.
Thus, the following code searches a collection for the first element for which
a filter returns a true value:
myMethod
...
someCollection do:[:el |
(aMatchBlock value:el) ifTrue:[^ el]
].
...
This corresponds to the following pseudo-JavaScript (a "return ... from" statement does not actually exist there):
function myMethod() {
...
someCollection.forEach( function(el) {
if (aMatchBlock(el)) return el from myMethod;
} );
...
or with some syntactic sugar:
function myMethod() {
...
forEach( el in someCollection) {
if (aMatchBlock(el)) return el from myMethod;
}
...
In Java, you have to simulate that behavior using exceptions,
which might look quite complicated.
Passing a returning-block down to some other call corresponds technically to a
non-local return, from however deep down that block was passed.
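The same kind of simulation is needed for JavaScript's forEach. The following is only a sketch of the idea; the names NonLocalReturn and findFirst are made up for this example:

```javascript
// Marker exception which carries the value of a simulated non-local return.
class NonLocalReturn extends Error {
  constructor(value) {
    super("non-local return");
    this.value = value;
  }
}

// Answers the first element accepted by matchFn, or undefined if there is none.
function findFirst(collection, matchFn) {
  try {
    collection.forEach((el) => {
      // A plain `return el` here would only leave this callback,
      // and the enumeration would continue with the next element.
      if (matchFn(el)) throw new NonLocalReturn(el);
    });
  } catch (e) {
    if (e instanceof NonLocalReturn) return e.value;
    throw e; // unrelated errors are passed on
  }
  return undefined;
}

const firstEven = findFirst([1, 3, 4, 6], (n) => n % 2 === 0);
const none = findFirst([1, 3], (n) => n % 2 === 0);
```

Compared to the single "^ el" in the Smalltalk block, the simulation needs an extra class, a try/catch and a check that the caught exception is really the one which was thrown for the return.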
If you think carefully, you'll see that it does not really make sense to return from the block alone in the above enumeration code, as this would simply leave the inner block, and continue enumeration with the next element.
If at all, it sometimes makes sense to break out of the whole enumeration
loop. For this, Smalltalk collection classes offer an additional
enumerator called "doWithExit:". This passes an additional
"exit"-argument to the block:
...
someCollection doWithExit:[:element :exit |
...
someCondition ifTrue:[ exit value ]
...
].
...
which corresponds to:
...
forEach (el in someCollection) {
...
if (someCondition) break;
...
}
...
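For callback-based enumeration in JavaScript, a similar enumerator can be sketched as follows. This is a simplified assumption of how such a function might look, not the Smalltalk implementation: here, calling exit() only sets a flag, so the rest of the current callback still runs before the loop is left.

```javascript
// Sketch of a doWithExit-style enumerator: the callback receives the
// element and an exit function. Calling exit() stops the enumeration
// after the current callback invocation has finished.
function doWithExit(collection, callback) {
  let exited = false;
  const exit = () => { exited = true; };
  for (const el of collection) {
    callback(el, exit);
    if (exited) break;
  }
}

const seen = [];
doWithExit([1, 2, 3, 4], (el, exit) => {
  seen.push(el);
  if (el === 2) exit();
});
// seen now contains 1 and 2; the elements 3 and 4 were never visited
```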
try {
...
some computation
} catch(<someErrorClass> e) {
...
handler action
...
} finally {
...
cleanup
...
}
is written in Smalltalk as:
[
...
some computation
...
] on:<someErrorClass> do:[:e |
...
handler action
...
] ensure:[
...
cleanup
...
]
There are also variations without ensure block ("on:do:")
and a version which corresponds to a "finally",
without exception handler ("ensure:").
In addition, Smalltalk offers a variant of the finally
which is ONLY invoked in the non-normal-return situation (called "ifCurtailed:"),
but not in a regular return. So the programmer can differentiate between
exceptional and normal situations, which may be useful in some situations.
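The difference between the two cleanup variants can be sketched in JavaScript with two helper functions. The function names are borrowed from the Smalltalk messages, but the code is only a rough analogue: it covers the case where the body is left by an exception.

```javascript
// Runs body; cleanup is always run afterwards (like "ensure:").
function ensure(body, cleanup) {
  try {
    return body();
  } finally {
    cleanup();
  }
}

// Runs body; onCurtailed is only run if body did not complete normally
// (a rough analogue of "ifCurtailed:").
function ifCurtailed(body, onCurtailed) {
  let completed = false;
  try {
    const result = body();
    completed = true;
    return result;
  } finally {
    if (!completed) onCurtailed();
  }
}

const events = [];
ensure(() => 1, () => events.push("ensure"));
ifCurtailed(() => 1, () => events.push("curtailed-normal"));
try {
  ifCurtailed(() => { throw new Error("boom"); }, () => events.push("curtailed-error"));
} catch (e) {
  events.push("caught");
}
// events: "ensure", "curtailed-error", "caught"
```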
In both cases, "someErrorClass" is the class of the error to handle, and "e" is the object providing exception information details (the exception instance). Also, in both languages, the exception classes form a hierarchy, and a handler for an exception class E will also catch exceptions for any of its subclasses.
In addition to this hierarchical organisation, Smalltalk also offers handler sets, which catch a bunch of possibly unrelated error types, for example to catch both arithmetic and file-operation errors in a single handler. Another useful feature is ad-hoc exceptions (called "Signal"), which do not need a class, but are created "on-the-fly" and allow for purely private, completely anonymous exceptions, queries and notifications.
This subtle difference has many implications:
[
...
some computation
...
] on:OpenError, ArithmeticError do:[:e |
...
handler action
...
]
and of course, you can provide different handlers for each class individually:
(ExceptionHandlerSet new)
on:ZeroDivide do:[:ex | 'division by zero' printCR. ex proceed];
on:HaltInterrupt do:[:ex | 'halt encountered ' printCR. ex proceed];
on:DomainError do:[:ex | 'domain error ' printCR. ex proceed];
handleDo:[
...
your code here
...
]
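In JavaScript, a catch clause catches everything, so handling a set of unrelated error classes has to be done by checking the class inside the handler. The helper onDo and the two error classes below are hypothetical stand-ins for this sketch; note also that JavaScript has no counterpart to "ex proceed", because its exceptions cannot be resumed.

```javascript
class OpenError extends Error {}        // hypothetical stand-ins for the
class ArithmeticError extends Error {}  // Smalltalk classes used above

// Runs body; if it throws an instance of one of errorClasses, the handler
// is called with the error. All other errors are passed on.
function onDo(body, errorClasses, handler) {
  try {
    return body();
  } catch (e) {
    if (errorClasses.some((cls) => e instanceof cls)) return handler(e);
    throw e;
  }
}

const handled = onDo(
  () => { throw new ArithmeticError("division by zero"); },
  [OpenError, ArithmeticError],
  (e) => "handled: " + e.message
);
```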
There is no such thing as a "native type", and you can add new methods to Integer, Character, String and all the other classes, which are native types in Java.
In Smalltalk, classes can be passed as argument, returned as result, stored in variables, etc. Everything that can be done with regular objects can also be done with classes.
One major effect of this
is that it makes many design patterns which deal with
factories, facades etc. obsolete in Smalltalk.
For example, to instantiate either a FileLogger
or a DummyLogger,
depending on some condition,
you'd simply write:
logger := (debugging ifTrue:[FileLogger] ifFalse:[DummyLogger]) new.
and start logging.
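JavaScript classes are values too, so the logger selection can be sketched there in much the same way. The two logger classes and the lookup table are made up for this illustration:

```javascript
class FileLogger {                      // hypothetical logger classes
  log(msg) { return "file: " + msg; }
}
class DummyLogger {
  log(msg) { return ""; }
}

// A class is an ordinary value: it can be chosen by an expression,
// stored in a variable and instantiated later.
const debugging = true;
const LoggerClass = debugging ? FileLogger : DummyLogger;
const logger = new LoggerClass();

// Classes can also be stored in other objects, e.g. in a lookup table.
const loggersByName = { file: FileLogger, dummy: DummyLogger };
const quietLogger = new loggersByName["dummy"]();
```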
Or you can pass a class as parameter:
makeLogger:whichLoggerClass
^ whichLoggerClass new
Being real objects, classes respond to class methods,
just like ordinary instances respond to instance methods.
Classes also inherit methods from their superclass(es).
The instance creation method "new" is therefore a regular
class method, which can be redefined in a subclass.
And the subclass may do something completely different in
its own new method.
For example, a singleton class may want to redefine new,
to ensure that only one single instance of it is ever instantiated,
by redefining new as:
new
TheOneAndOnlyInstance isNil ifTrue:[
TheOneAndOnlyInstance := super new
].
^ TheOneAndOnlyInstance
Be aware that "super new" still returns an instance of the
receiver's class - in contrast to Java, where "super new" is not
even possible, and if it was, it would return an instance
of the superclass!
Another example is a class which redefines "new" to
count the number of instantiated instances:
new
Count := Count + 1.
^ super new
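While the text above compares with Java, JavaScript's static methods come closer to Smalltalk's class methods: they are inherited, can be redefined, and see the receiving class as "this". A minimal sketch, with made-up class names and a factory method called create (the "new" operator itself cannot be redefined this way):

```javascript
class Base {
  // Class-side factory; `this` is the class which received the call.
  static create() {
    return new this();
  }
}

class Counted extends Base {
  static create() {
    Counted.count += 1;     // count instantiations, then defer to Base
    return super.create();  // still builds an instance of the receiver
  }
}
Counted.count = 0;

class Special extends Counted {}        // inherits the redefined create

const c = Counted.create();
const s = Special.create();
// c is a Counted, s is a Special, and Counted.count is 2
```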
Having redefinable class protocol makes another big part of the design patterns obsolete. After all, some (cynical) tongues say that "most design patterns are workarounds for bad language design".
Class Variables have a global lifetime, but are only visible in the class and its subclasses. A class variable exists exactly once, and all references to it refer to the same binding.
Class Instance Variables are additional private slots in the class object, and inherited by subclasses. However, each subclass gets its own binding slot. Thus, Class Instance Variables are for the class, what "private variables" (Instance Variables) are for regular objects. The code to work on them is shared by a class and its subclasses, but each (sub-)class has its own slot and may have a different value for it.
In the above Count-example, if "Count" was a class variable, it would exist exactly once, and count the number of instantiations of the class and all of its subclasses.
In contrast, if "Count" was a class instance variable, the class and each subclass would have a private counter, which counted the number of instantiations for each class separately.
Class instance variables are very useful to provide per class caches, configurations etc. which have a similar functionality, but require different state as per subclass.
There is no corresponding mechanism in Java, and actually it would be harder to simulate. An implementation might use a HashTable, using a combination of (sub-)class and variable name as index.
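The hash table idea can be sketched in JavaScript as follows. This is only an illustration under simplifying assumptions: there is a single counter, so the table is keyed by the class alone, and all names are made up:

```javascript
let sharedCount = 0;                // like a class variable: one binding for all
const perClassCount = new Map();    // like a class instance variable: one slot per class

class Shape {                       // hypothetical classes
  static create() {
    sharedCount += 1;
    // `this` is the receiving class, so each (sub-)class gets its own entry
    perClassCount.set(this, (perClassCount.get(this) || 0) + 1);
    return new this();
  }
}
class Circle extends Shape {}
class Square extends Shape {}

Shape.create();
Circle.create();
Circle.create();
Square.create();
// sharedCount is 4; the per-class counts are Shape: 1, Circle: 2, Square: 1
```

The counting code is written once in Shape and shared by the subclasses, but each class has its own counter value.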
Thus, they can be seen as static variables, with a restricted visibility among multiple classes which need not be subclasses from a common superclass. (In Java, a static variable can never be seen by an outside class.)
Mixed mode arithmetic is provided, and operations return a result as appropriate for the operands.
Smalltalk automatically cares for out-of-range values, and returns exact results for integer and fractional arithmetic (i.e. when adding two integers, it checks the result and automatically converts it to a LargeInteger, which is Smalltalk's equivalent of BigInteger).
Integer overflows are not possible in Smalltalk.
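For comparison, here is how this looks in JavaScript, where the programmer has to ask for arbitrary-precision integers explicitly. This is a sketch for contrast only; the factorial function is just an example workload:

```javascript
// Plain Numbers are IEEE doubles: beyond 2^53 - 1, integers are no longer exact.
const big = Number.MAX_SAFE_INTEGER;        // 9007199254740991
const inexact = (big + 1) === (big + 2);    // true: precision was silently lost

// BigInt arithmetic stays exact, but must be requested explicitly (the "n" suffix).
const exact = BigInt(big) + 2n;             // 9007199254740993n
const factorial = (n) => (n <= 1n ? 1n : n * factorial(n - 1n));
const f25 = factorial(25n);                 // 15511210043330985984000000n
```

In Smalltalk, the corresponding switch from small to large integers happens automatically, without a different literal syntax or type.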
In Smalltalk, all numeric objects are real objects (with a class, which can be extended to provide additional methods). As a Java programmer, think of all numbers being always boxed. In Smalltalk, there is no such concept as a native type, and every number is a full-blown object which can be put into containers, passed as argument or returned as value without special declarations.
Smalltalk | Java | Notes |
Array | Object[] | |
ByteArray | byte[] | primitive type in Java |
WordArray | unsigned short[] | no unsigned 16-bit integer array in Java; char[] comes closest |
SignedWordArray | short[] | primitive type in Java |
IntegerArray | unsigned int[] | no unsigned 32-bit integer type in Java |
SignedIntegerArray | int[] | primitive type in Java |
FloatArray | float[] | primitive type in Java |
DoubleArray | double[] | primitive type in Java |
HalfFloatArray | - | |
OrderedCollection | ArrayList<Object> | |
Dictionary | Hashtable<Object,Object> | |
Set | HashSet<Object> | |
SortedCollection | ?? | |
String | String | |
A little annoying in Java is the different syntax and protocols of primitive
type arrays vs. object collections:
length vs. length() vs. size()
and "[]" vs ".get()" and ".put()".
In Smalltalk, all getters are named "at:",
all setters are named "at:put:"
and to get the size of a container, always use "size".
With the exception of a few special collections (ByteArray, IntegerArray, FloatArray etc.), all collections can hold any object in Smalltalk. There is no need/support for generics or templates. However, as all collections can be subclassed or wrapped, it is possible to define typed collection classes, which may restrict the set of accepted element types.
The above mentioned exceptions (ByteArray,...)
are space-efficient variations.
For example, ByteArray stores
byte-valued integer objects and needs 1 byte of storage per element.
In contrast, a full blown Array instance would require one pointer
(4 or 8 bytes, depending on the CPU) per element.
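JavaScript makes a similar distinction between general arrays and compact typed arrays, which may help to picture the difference. A small sketch for comparison (it says nothing about how Smalltalk stores these objects internally):

```javascript
const bytes = new Uint8Array(4);        // 1 byte of storage per element
const doubles = new Float64Array(4);    // 8 bytes of storage per element

bytes[0] = 255;
bytes[1] = 256;                         // does not fit into a byte; JavaScript stores it modulo 256

const generic = [255, "any", {}];       // ordinary array: may hold any object

const byteStorage = bytes.byteLength;      // 4
const doubleStorage = doubles.byteLength;  // 32
```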
Other space-efficient containers are "String" (which stores 1-byte characters),
"Unicode16String" (which stores 2-byte characters),
"Unicode32String" (storing 4-byte characters),
"FloatArray" and a bunch of other classes (IntegerArray, WordArray, DoubleArray etc.).
Smalltalk | Java |
ReadStream | ... |
WriteStream | ... |
FileStream | ... |
ExternalStream | ... |
PipeStream | ... |
Socket | ... |
SplittingWriteStream | ... |
FilteringStream | ... |
ActorStream | ... |
Random | ... |
In practice, this has resulted in a lot of duplicate code in typical Java projects. You will often find utility container classes which duplicate existing functionality only because the original class cannot be extended.
For bigger projects, this shows up in the number of classes and methods (and lines of code); Java projects usually require many more lines of code than corresponding Smalltalk projects (we have seen Java projects which take up 5 times as many classes as the corresponding Smalltalk code).
Be prepared for more to come on reflection, the class library, extensions etc.
Copyright © 2002-2020 Claus Gittinger, all rights reserved
<info@exept.de>