To a hammer, everything looks like a nail.
Before describing the language and how you can create your own programs, we should explain a few basics - both to give you some background and to define the technical terms used in the documentation (and literature).
Keep in mind that this text is only a short introduction - we recommend reading a standard textbook on the language for more detailed information (-> 'literature').
In contrast to hybrid systems like C++ or Java,
"everything" means really "everything" in Smalltalk.
This includes integers, characters, arrays, classes
and even a program's stackframes, which hold the local variables
during execution.
In Smalltalk, there are no such things as "builtin" types or classes,
which have to be treated differently, or which do not behave exactly like
other objects with respect to message sending, inheritance or debuggability.
For example, in Smalltalk, classes like Integer, Character, String or even classes
themselves can be given new or modified methods -
even at runtime, by dynamically loading new code.
We will therefore use the term "message" or "message send" for the act of asking for an operation by name, and, as we will see later, the term "method" for the actual code which will eventually perform the operation.
To the outside world, any internals of an object are hidden - all interaction happens only via messages. The set of messages an object understands is called its "message protocol", or "protocol" for short.
For example, numbers understand the +, -, *, etc. messages.
Strings understand the asUppercase, asLowercase, etc. messages.
Therefore, theoretically, an object may add the "+" message to its protocol and perform an operation which has nothing to do with the mathematical concept of adding numbers.
In practice, this is never done in Smalltalk,
since it makes programs less understandable.
For example, the Java operator to concatenate strings is "+",
whereas in Smalltalk it is "," (comma).
This was done on purpose, to make the code easier to understand.
However, it is useful to keep in mind that only the message's receiver is responsible for the outcome, and in theory, any operator or message selector can be redefined by any object. (As we will see, this is also the reason for the uncommon precedence rules in binary operations.)
On the other hand, it makes the system very flexible.
For example, it is very easy to extend the numeric class hierarchy with additional
things like Complex numbers, Matrices, Functional Objects etc.
All that is required for those new objects to be handled correctly is
that they respond to some basic mathematical protocol for arithmetic,
comparison etc.
Existing mathematical code is usually not affected by such extensions,
which makes Smalltalk one of the best environments for code reuse and sharing.
Classes may have zero, one or many instances.
You may wonder how a class without instances could be
useful - this will become clear when inheritance and abstract
classes are described further down in this document.
1, 99 and -8 are instances of the Integer class
1.0 and 3.14159 are instances of the Float class
'hello', 'foo' are instances of the String class
Button class
nil is the one and only instance of the UndefinedObject class
Every class keeps a table (called "MethodDictionary") which
associates the name of the message (the so-called message selector) with a method.
When a message is sent to an object, the class's method table
is searched for a corresponding entry and - if found - the associated
method is invoked (more details below ...).
Since Smalltalk is a pure object oriented language,
this table is also an object and accessible at execution time;
it may even be modified during execution
and allows objects to learn about new messages dynamically.
Of course, the interactive programming environment heavily depends on this;
for example, the browser is a tool which adds new items to this table when
a method's new or changed code is to be installed.
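Because the method table is consulted at execution time, an object can also be asked whether it understands a given message. The following is a minimal sketch, assuming the standard respondsTo: query (which is not mentioned in the text above) and the Transcript showCR: output message used elsewhere in this document; #fooBarBaz is a made-up selector chosen for illustration.

```smalltalk
"ask objects whether their class (or a superclass) provides a method for a selector"
Transcript showCR: (1 respondsTo: #negated) printString.           "true"
Transcript showCR: ('hello' respondsTo: #asUppercase) printString.  "true"
Transcript showCR: (1 respondsTo: #fooBarBaz) printString.         "false - made-up selector"
```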
A class inherits all protocol as defined by its superclass(es) and may optionally redefine individual methods or provide additional protocol.
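Therefore, a message send performs the following actions (***):
determine the class of the receiver;
search that class's method table for the message selector;
if no entry is found there, continue the search in the superclass, and so on up the inheritance chain;
if a method is found, invoke it;
if no class in the chain provides a method, send an error message (#doesNotUnderstand:) to the receiver with the message object as argument.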
Although almost every class inherits (directly or indirectly) from Object, there is no need to.
Actually, it may occasionally make sense for a class
to inherit from no class at all (i.e. to have no superclass).
The effect is that instances of such classes do not inherit ANY protocol
and will therefore trigger an error for all received messages.
All instances of a class provide the same message protocol,
but typically contain different internal state.
It is actually the class, which
provides the definition of the protocol and amount of internal
state of its instances.
For example, the strings 'hi' and 'world' are both instances of the String class
and respond to the same set of messages. But the internal state of the first
string consists of the characters "h" and "i", whereas the second contains
the characters "w", "o", "r", "l", "d".
An object's instance variables are only accessible via protocol,
which is provided by the object - there is no way to access an object's
internals except by sending messages to it.
This is true for every object - even for the strings in the example above.
There is no need for the sender of a message to actually know the class of
the receiver - as long as it responds to the message and performs the
appropriate action.
'at:' message. You could write
an ExternalString class, which fetches characters from a file
and returns them from this message.
The sender of the 'at:' message would not be affected at all by this
(except for a possible performance degradation ;-).
#basicSize, #identityHash etc.).
Thus, when we send a message to some `normal' object, the corresponding class
object provides the behavior - when some message is sent to a class object,
the corresponding metaclass provides the behavior.
Technically, messages to classes are treated exactly the same way as
messages to non-class objects: take the receiver's class, lookup the method in its
method table, execute the method's code.
Since different metaclasses may provide different protocol for their class
instances, it is possible to add or redefine class messages just like any other
message.
As a concrete example, take instance creation which is done in Smalltalk
by sending a "new"-message to a class.
In Smalltalk, there is no such thing as a built-in "new" (or any other built-in)
instance creation message
- the behavior of those instance creation (class) messages is defined exclusively by metaclass protocol.
Therefore, it is possible (and often done) to redefine the "new" method for special handling;
for example singletons (classes which have only a single unique instance), caching and pooling
(the "new" message returns an existing instance from a cache), tracing and many more are easily
implemented by redefining class protocol.
Object-Class,
which provides a rich protocol useful for all kinds of objects
(comparing, dependency mechanism, reflection etc.).
As we will see shortly, Smalltalk programs only consist of messages being sent to objects.
Since even control structures
(i.e. conditional evaluation, loops etc.)
are conceptually implemented as messages,
a common syntax is used in your programs both for
the program's flow control and for manipulating objects.
Once you know how to send messages to an object,
you also know how to write and use fancy control structures.
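To illustrate the point that control structures are ordinary message sends, here is a small sketch. It assumes the standard ifTrue:ifFalse: and to:do: messages and uses blocks (code in square brackets), which have not yet been introduced at this point in the text; the variable name sum is chosen for illustration only.

```smalltalk
| sum |
sum := 0.

"a loop: the to:do: keyword message is sent to the number 1"
1 to: 5 do: [:i | sum := sum + i ].

"a conditional: the ifTrue:ifFalse: keyword message is sent to a Boolean"
(sum > 10)
    ifTrue:  [ Transcript showCR: 'sum is large: ' , sum printString ]
    ifFalse: [ Transcript showCR: 'sum is small: ' , sum printString ].
```

Here the sum is 15, so the first block is evaluated. Both constructs use exactly the same keyword-message syntax as any other message.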
Smalltalk's power (and difficulty to learn) does not lie in the language itself, but instead in the huge protocol provided by the class libraries.
Let's start with the language's building blocks...
:=
"
"some comment"
"this
is
a
multiline comment"
"
another multiline comment
"
"/ this is an end-of-line comment
If the remaining line contains comment characters, these are ignored.
As such, End-of-line comments are especially useful to comment-out code which contains comments.
"<< END
some comment line
more lines
a line with "another comment"
and followed by
"/ an end of line comment
plus more stuff here
END
will all be ignored and treated as a comment, even if those lines contain other comments.
As such, Token comments are highly useful to comment-out code which contains any other
comment.
#new-message sent to a class or the #copy-message sent to an instance.
The following literal constant types are allowed:
Integer constants (possibly negative): 6, -1, 12345678901234567890
with a radix (number base): 8r0777, 16r80000000000, 16rAFFE, -16r1000 and 16r-1000, 16r123456789abcdef0123456789abcdef, 2r0111000
There is no limit on the integer constant's value; eg.
1234567890123456789012345678901234567890
is a valid integer literal
(and NOT truncated, overflowing or leading to an error).
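The following short sketch can be evaluated in a workspace to see radix literals and unlimited-size integers at work. It uses only standard messages (printString, size, factorial, raisedTo:) plus Transcript showCR: for output.

```smalltalk
"radix literals denote ordinary integers"
Transcript showCR: 16rFF printString.        "255"
Transcript showCR: 8r0777 printString.       "511"
Transcript showCR: 2r1010 printString.       "10"

"integers grow as needed; nothing is truncated"
Transcript showCR: (2 raisedTo: 100) printString.
Transcript showCR: 100 factorial printString size printString.   "158 digits"
```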
Fraction constants: 1/3, -1/3
Fractions consist of an integer numerator and an integer denominator. Both are arbitrary integers (i.e. unlimited in size).
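As a small illustration of exact fraction arithmetic, the following sketch can be tried in a workspace. It assumes the standard numerator and denominator accessors, which are not mentioned in the text above.

```smalltalk
"fraction arithmetic is exact; no rounding takes place"
Transcript showCR: ((1/3) + (2/3) = 1) printString.        "true"
Transcript showCR: ((1/3) + (1/6) = (1/2)) printString.    "true"

"fractions are kept in reduced form"
Transcript showCR: (3/4) numerator printString.            "3"
Transcript showCR: (2/4) denominator printString.          "2"
```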
Float constants: 1.234, 1e10, 1.5e15
Float constants with radix (i.e. "16r10.1" or "2r10.1") are allowed, but should not be used in practice (because the 'e'-exponential character is a valid numeric character in hex; therefore, float constants with a radix-base greater than 14 cannot have an exponent).
The name "Float" is a historic leftover - internally the IEEE double precision floating point representation is used (independent of the exponent character).
For compatibility with other Smalltalk systems, the "d"-character is also recognized as an exponential character. I.e. 1d10 has the same value as 1e10.
1.234s4, 10s4
FixedPoint constants are rational numbers which print themselves as a scaled decimal number with the given number of post-decimal-point digits. Thus "1.234s4" prints as "1.2340" and "10s4" as "10.0000". Scaled Decimals are mostly used for monetary values and to format tabular data in a nice way.
Because scaled decimals are not supported by all Smalltalk systems, the compiler can be configured to treat them as errors via the settings dialog. If you want to ensure that your program is portable, disable them.
Boolean constants: true, false
UndefinedObject constant: nil
Character constants from the 8-bit iso8859-1 character set: $c
ST/X also allows unicode character constants with a codepoint above 16rFF,
of up to 30 bit (i.e. up to 16r3FFFFFFF).
Therefore, $≠ is also a valid character constant in ST/X
and represents a character with a codePoint of 16r2260 (8800).
Be aware that not all Smalltalk dialects support unicode. Most noteworthy is VisualAge Smalltalk, which does not. However, most modern Smalltalks do (Squeak, VisualWorks and GNU-Smalltalk). So your program may be less portable if you use such characters. If portability to such old Smalltalk versions is an issue, we recommend at least extracting unicode specific code into easily maintainable extra methods.
String constants: 'foo' or 'a long string constant'
String constants may span multiple lines.
The Smalltalk standard does not define any special escapes
or other mechanisms to represent unprintable characters
(such as <cr>, <tab> or <backspace>) in a string.
This is certainly a major missing feature and
something that ought to be added in a future Smalltalk standard.
ST/X provides two mechanisms, one being compatible with the syntax of other Smalltalks,
the other being a language extension, which will make your code non-portable.
For portability, use the "withoutCEscapes"-message
(i.e. 'foo\nbar\tbaz' withoutCEscapes).
If you do not care for portability, prefix the string constant with a "c"-character,
as in: c'foo\nbar\tbaz'.
Both will unescape in a C-language like fashion, handling "\n" (newline), "\r" (return), "\t" (tab), "\b" (backspace), "\f" (formfeed), "\g" (bell), "\0" (null) and "\xXX" (hex-byte). However, the withoutCEscapes message will unescape at execution time, whereas the c-prefixed string will be unescaped at compilation time. Thus the latter will execute faster.
ST/X also allows unicode string constants where individual characters may have a codepoint of up to 30 bit (i.e. up to 16r3FFFFFFF). However, the above mentioned character portability issues apply.
Symbol constants: #'bar', #'++' or #'foo bar baz'
#foo - see below
Symbols are unique immutable strings - that is, the system arranges that for a given sequence of characters, at most one corresponding symbol object exists. (Lispers call them Atoms)
Symbols can be used much like readonly Strings, with the big advantage that they can be compared using identity compare (== / ~~) whereas Strings usually have to be compared using equality (i.e. contents-) compare operators (= / ~=).
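The difference between identity and equality can be tried out directly. The sketch below assumes the standard asSymbol and copy messages; the copies are taken so that two distinct string objects with the same contents are guaranteed to exist.

```smalltalk
| s1 s2 |
s1 := 'foo' copy.
s2 := 'foo' copy.

"two strings with the same contents are equal, but not identical"
Transcript showCR: (s1 = s2) printString.              "true"
Transcript showCR: (s1 == s2) printString.             "false"

"for a given sequence of characters, there is only one symbol object"
Transcript showCR: (#foo == #foo) printString.         "true"
Transcript showCR: (s1 asSymbol == #foo) printString.  "true"
```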
If the symbol's characters are all alphanumeric or all from the set of binary special
characters (+, -, *, and a few others), the quotes can be omitted and
the short form #bar can be used instead of #'bar'.
Until you've learned the exact details, always place those quotes around, to be sure.
Symbols are limited to the Latin-1 character set, and we do not intend to change this. The reason is that we do not want class names and method names to be written in non-English (it is hard enough if some programmers do not follow that rule and write their stuff in various east-european languages...). Sorry to non-western natives; but as you are currently reading this, you obviously understand English better than German and prefer it. And a Chinese programmer will probably have more trouble reading (say) Hindi than English ;-) So this is one way to enforce at least a western language (the compiler is not smart enough to detect and complain about non-English).
More information on symbols is found in "collection classes".
Array constants: #(1 2 $b 'hello' 3.14159)
The elements of an array constant must be literal constants, and can be any of the literals described in this section.
Elements can themselves be array literals - i.e. it can be a nested array literal, as in:
#(1 #two #(3 4) #( #(5 6) 7) )
For simple symbol constants (identifiers) and nested arrays,
the leading '#' may be omitted
within an array constant if it is not one of 'true', 'false' or 'nil'
(however, we do not recommend doing so).
Therefore, the above array constant can also be written as:
#(1 two (3 4) ( (5 6) 7) )
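To see how a nested array literal is structured, its elements can be fetched with the at: message. This is only a sketch; the variable name a is arbitrary, and last is a standard collection message not mentioned above.

```smalltalk
| a |
a := #(1 #two #(3 4) #( #(5 6) 7) ).

Transcript showCR: a size printString.                   "4 - nested arrays count as one element each"
Transcript showCR: (a at: 2) printString.                "the symbol #two"
Transcript showCR: ((a at: 3) at: 1) printString.        "3"
Transcript showCR: ((a at: 4) at: 1) last printString.   "6"
```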
ByteArray constants: #[0 1 2 3 4]
The elements must be integer constants in the range 0..255. ByteArrays can be seen as a more memory-friendly, compact version of Arrays, and are often used when bulk data (e.g. bitmap images) is processed.
Identifiers must start with a letter or an underscore character.
The remaining characters may be letters, digits or the underline character (*).
Examples:
foo
foo123
foo_123
aVeryLongIdentifier
anIdentifier_with_underline_characters
For portability with some (VMS-)VisualWorks Smalltalk variants, a dollar character ($) can also be allowed inside an identifier as a compiler option (the $ was used in the VMS Smalltalk version of ST/X).
nil
true
and false
self
super
thisContext
here
Since "here" is a Smalltalk/X language extension, its builtin-ness is less strict than that of the other special variables: if a variable named "here" is defined and visible in the current variable scope, here will refer to that variable; otherwise, it refers to the receiver (with different lookup semantics).
1 negative
sends the message "negative" to the number 1, which is the receiver of the message.
Unary messages, like all other messages, return a result,
which is simply another object.
In the above case, the answer from the "negative" message is the boolean false object.
Evaluate this in a workspace (using printIt); try different receivers (especially: try a negative number).
Unary messages parse left to right, so, for example:
1 negative not
first sends the "negative"-message to the number 1.
Then, the "not"-message is sent to the returned value.
The response of this second message is returned as the final value.
If you evaluate this in a workspace,
the returned value will be the boolean true.
Try a few unary messages/expressions in a workspace:
1 negated
-1 negated
false not
false not not
-1 abs
1 abs
10 factorial
10 factorial sqrt
5 sqrt
1 isNumber
$a isNumber
$a isNumber not
1 isCharacter
$a isCharacter
'someString' first
'hello world' size
'hello world' asUppercase
'hello world' copy
'hello world' copy sort
#( 17 99 1 57 13) copy sort
1 class name
1 class name asUppercase
WorkspaceApplication open
Notice that in the above examples, you already encountered polymorphism: both strings and
arrays respond to the sort message and sort their contents in place.
Also notice, that classes also respond to messages, just like any other object.
The last example sends the "open"-message to the WorkspaceApplication class.
5 between:3 and:8
"between:" and "and:" are the keywords,
the numbers 3 and 8 are the arguments and the number 5 is the receiver of the message.
The message's actual selector (i.e. the message name) is formed by the concatenation of all individual
keywords; in the above example, the message selector is "between:and:".
As a beginner, keep in mind that
this is different from both a "between:" and an "and:"-message.
And of course, "between:and:" and "and:between:" are also different messages.
In the browser, the method will be listed under the name: "between:and:".
Keyword messages parse left to right,
but if another keyword follows a keyword message, the expression is parsed as
a single message (taking the concatenation of the keywords as selector).
Thus, the expression:
a max: 5 min: 3
would send a "max:min:"-message to the object referred to by the variable "a".
This is not the same as:
(a max: 5) min: 3
which first sends the "max:"-message to "a",
then sends the "min:"-message to the result.
Try these in a workspace (don't fear the error...)
To avoid ambiguity you must place parentheses around.
Try a few keyword messages/expressions in a workspace (also see what happens if you omit
or change the parentheses):
1 max: 2
1 min: 2
(2 max: 3) between: 1 and: 3
(1 max: 2) raisedTo: (2 min: 3)
'Hello' at: 1
#(100 200 300) at: 2
#(10 20 30 40 50 60) indexOf: 30
#(10 20 30 40 50 60) at:('Hello' indexOf: $e)
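For reference, here is a sketch of what some keyword expressions of this kind answer when evaluated. It uses only messages shown above, plus printString and Transcript showCR: for output.

```smalltalk
"a two-keyword message: the selector is between:and:"
Transcript showCR: (5 between: 3 and: 8) printString.       "true"

"parentheses turn this into two separate messages: max: first, then min:"
Transcript showCR: ((3 max: 5) min: 4) printString.         "4"

"the argument of at: is itself the result of a keyword message"
Transcript showCR: (#(10 20 30 40 50 60) at: ('Hello' indexOf: $e)) printString.   "20"
```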
Unary messages have higher precedence than keyword messages,
thus:
9 max: 16 sqrt
evaluates to 9
(because it is evaluated as: "9 max: (16 sqrt)" which is "9 max: 4".
It is not "(9 max: 16) sqrt", which is "16 sqrt" and would give 4 as answer.)
Binary messages are typically used for arithmetic operations - although, this is not enforced by the system. No semantic meaning is known or implied by the Smalltalk compiler, and binary messages could be defined and used for any class and any operation.
A typical example of a binary message is the one which implements arithmetic addition
for numeric receivers (it is implemented in the Number classes):
1 + 5
This is interpreted as a message sent to the object 1 with the selector '+'
and one argument, the object 5.
In a browser, the message will be listed under the name "+".
Binary messages parse left to right (like unary messages).
Therefore,
2 + 5 * 3
results in 21, not 17
(because of left-to-right evaluation,
first '+' is sent to 2, with 5 as argument.
This first message returns 7.
Then, '*' is sent to 7, with 3 as argument, resulting in 21 being answered.)
To change the execution order or to avoid ambiguity you should place parentheses around:
2 + (5 * 3)
Now, the execution order has changed and the new result will be 17.
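The left-to-right rule for binary messages can be verified with a few lines in a workspace. This sketch only prints the values of the expressions discussed here.

```smalltalk
"binary messages are evaluated strictly left to right"
Transcript showCR: (2 + 5 * 3) printString.      "21, i.e. (2 + 5) * 3"

"parentheses change the order of evaluation"
Transcript showCR: (2 + (5 * 3)) printString.    "17"
```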
Unary messages have higher precedence than binary messages, thus
9 + 16 sqrt
evaluates as "9 + (16 sqrt)", not "(9 + 16) sqrt"
(notice that sqrt returns a float result,
and '+' is sent to the integer 9, with a float 4.0 as argument.
All numeric operations support such "mixed-mode" operations
and return an appropriate result object.)
On the other hand, binary messages have higher precedence than
keyword messages, thus
9 + 16 max: 3 + 4
evaluates as "(9 + 16) max: (3 + 4)" which is "25 max: 7" and answers 25.
It is not the same as "9 + (16 max: 3) + 4" (which results in 29) or
"((9 + 16) max: 3) + 4" (which in this case also results in 29).
Again, we highly recommend the use of parentheses - even when the default evaluation order matches the desired order; it makes your code much more readable, and helps beginners a lot.
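The three precedence levels (unary first, then binary, then keyword) can be summarized in a short sketch. Each line pairs an unparenthesized expression with its fully parenthesized reading; both answer the same value.

```smalltalk
"unary binds tighter than binary"
Transcript showCR: ((9 + 16 sqrt) = (9 + (16 sqrt))) printString.              "true, both are 13.0"

"binary binds tighter than keyword"
Transcript showCR: ((9 + 16 max: 3 + 4) = ((9 + 16) max: (3 + 4))) printString. "true, both are 25"

"unary binds tighter than keyword"
Transcript showCR: ((9 max: 16 sqrt) = (9 max: (16 sqrt))) printString.        "true, both are 9"
```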
To practice, try a few binary messages/expressions in a workspace:
1 + 2
1 + 2 * 3
(1 + 2) * 3
1 + (2 * 3)
-1 * 2 abs
(-1 * 2) abs
5 between:1 + 2 and:64 sqrt
5 between:(1 + 2) and:(64 sqrt)
#(100 200 300) at: (1+1)
The second example above shows why parentheses are so useful:
from reading the code, it is not apparent, if the evaluation
order was intended or is wrong.
You will be happy to see parentheses when you have to debug
or fix a program which contains a lot of numeric computations.
Here are a few more "difficult" examples:
1 negated min: 2 negated
1 + 2 min: 2 + 3 negated
, (comma)
The ","-message is understood by collections,
and mostly used for strings (which are collections of characters).
As a binary message, it expects a single argument
and returns the concatenation of the receiver and argument
(i.e. a collection which contains the receiver's elements
and those of the argument).
"'Hello','World'" returns the new string: 'HelloWorld'.
"#(10 20 30),#(50 60 70)" returns the new array: #(10 20 30 50 60 70).
@
The "@"-message is understood by numbers. As a binary message, it expects
a single argument. It returns a Point-object (coordinate in 2D space) with the receiver
as x, and the argument as y value.
"10 @ 20" returns the same as "(Point new x:10 y:20)".
->
The "->"-message is similar to the above "@" in that it is a shorthand instance creation message.
It is understood by any object and returns an association (a pair) object.
"10 -> 20" returns the same as "(Association new key:10 value:20)".
?
The "?"-message returns the receiver if it is non-nil, and the argument otherwise.
It is used to deal with possibly uninitialized variables
in assignments or as message argument.
"a ? 20" returns the same as "(a notNil ifTrue:[a] ifFalse:[20])".
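The binary messages described above can be tried together in a workspace. This is a sketch only: the x, key and value accessors are standard messages not named in the text, the variable a is deliberately left unassigned (and is therefore nil), and the "?" message is the one described above, so it may not be available in other Smalltalk dialects.

```smalltalk
| a |

"concatenation of strings and of arrays"
Transcript showCR: ('Hello' , 'World').
Transcript showCR: (#(10 20 30) , #(50 60 70)) size printString.   "6"

"shorthand creation of a point and of an association"
Transcript showCR: (10 @ 20) x printString.         "10"
Transcript showCR: (10 -> 20) value printString.    "20"

"a is still nil here, so the argument is answered"
Transcript showCR: (a ? 20) printString.            "20"
```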
In ST/X, the actual set of allowed characters can be queried from the system
by evaluating (and printing) the expression "Scanner binarySelectorCharacters".
If you compare your favorite programming language
against regular English,
you will find Smalltalk to be much more similar to plain English
than most other programming languages.
For example, consider the order to a person called "tom",
to send an email message to a person called "jane":
(assuming that tom, jane, theEmail refer to objects)
English | Smalltalk | Java / C++ |
---|---|---|
tom, send an email to jane. | tom sendEmailTo: jane. | tom.sendEmail(jane); or tom->sendEmail(jane); |
tom, send theEmail to jane. | tom send: theEmail to: jane. | tom.sendEmail(theEmail, jane); or tom->sendEmail(theEmail, jane); |
tom, send theEmail to jane with subject: 'hi'. | tom send: theEmail to: jane withSubject: 'hi'. | tom.sendEmail(theEmail, jane, "hi"); or tom->sendEmail(theEmail, jane, "hi"); |
album play.
album playTrack: 1.
album repeatTracksFrom: 1 to: 10.
and it does exactly what it looks like.
Another plus in Smalltalk is that the meaning of an argument is described by the keyword before it, whereas in Java or C++ you have to look at a function's definition to get information on the order and type of the arguments, unless you use fancy function names like "sendEmail_to_withSubject()", which actually mimic the Smalltalk way.
Smalltalk was originally designed to be easily readable by both programmers AND non-programmers. It is said, jokingly, that this is one reason why some programmers do not like Smalltalk syntax: they fear losing their "guru" aura if others understand their code ;-) .
1 negated
sends the message "negated" to the number 1, which gives
us a -1 (minus one) as result.
1 negated abs
first sends the message "negated" to the number 1, which gives
us an intermediate result of -1 (minus one);
then, the message "abs" is sent to it, giving us
a final result of 1 (positive one).
-1 abs negated
first sends the message "abs" to the number -1 (minus one), which gives
us a 1 (positive one) as intermediate result. Then this object
gets a "negated" message.
1 + 2
sends the message "+" to the number 1, passing it
the number 2 as argument. The returned object is 3.
"+" message.
1 + 2 + 3
first, "+" is sent to the number 1, passing it
the number 2 as argument. Then, another "+" message is sent to
the intermediate result, passing the integer-object 3 as argument.
1 + 2 * 3
-1 abs + 2
first sends "abs" to the number -1 (minus one), then sends "+"
to the result, passing 2 as argument.
1 + -2 abs
first sends "abs" to the number -2, then sends "+"
to the number 1, passing the result of the first message as argument.
-1 abs + -2 abs
first sends "abs" to the number -1 (minus one) and remembers the result.
Then sends "abs" to the number -2 and passes this as argument
of the "+" message to the remembered object.
1 + 2 sqrt
first sends "sqrt" to the number 2, then passes this as argument
of the "+" message to the number 1.
(1 + 2) sqrt
first sends "+" to the number 1, passing 2 as argument.
Then sends "sqrt" to the result.
1 min: 2
sends the "min:" (minimum) message to the number 1, passing 2 as argument.
(1 max: 2) max: 3
first sends the "max:" (maximum) message to the number 1, passing 2 as argument.
Then sends "max:" to the returned value, passing 3 as argument.
(1 + 2 max: 3 + 4) min: 5 + 6
first sends "+" to the number 1 passing 2
as argument and remembers the result.
Then, "+" is sent to the number 3, passing 4 as argument.
Then, "max:" is sent to the remembered first result,
passing the second result as argument. The result is again remembered.
Then, "+" is sent to the number 5, passing 6 as argument.
Finally, the "min:" message is sent to the
remembered result from the first max: message, passing
the result from the "+" message.
1 max: 2 max: 3
sends a "max:max:" message to the number 1, passing the two arguments, 2 and 3.
Because numbers do not understand a "max:max:" message,
this leads to an error (message-not-understood).
This example illustrates why parentheses are highly recommended - especially with concatenated keyword messages.
'hello' at:1
sends the "at:" message to the string constant.
'hello' , ' world'
sends the "," binary message to the first string constant, passing another string as argument.
'hello' , ' ' , 'world'
first sends the "," binary message to the first string constant, passing ' ' as argument.
Then, the result gets another "," message, passing 'world' as argument.
#(10 2 15 99 123) min
sends the "min" unary message to an array object (in this case: a constant array literal).
All collections respond to the "min" message by searching for their smallest
element and returning it.
WorkspaceApplication new open
first sends the "new" unary message to the WorkspaceApplication class object, which returns a new instance of itself.
Then, this new instance gets the "open" message, which asks for a window
to be shown.
-1 negated.
1 + 2.
first sends the "negated" message to -1 (minus one), ignoring the result.
Then, the "+" message is sent to 1 (positive one), passing the number 2 as argument.
Notice that there is actually no need for a period after the last statement
(it is a statement-separator) - it does not hurt, though.
We will encounter more (useful) examples for multiple statements below.
nil
, when created.
Important Note to C, C++ and C# programmers:
Smalltalk variables always hold a reference (pointer) to some object. Every object "knows" its type. It is NOT the pointer which knows the type of the object it points to. In Smalltalk it is totally impossible to treat a pointer as an integer or as a pointer to something else. There is no such thing as a cast in Smalltalk. Therefore we say that Smalltalk is a "dynamically strongly typed language", in contrast to C++, which is a "statically weakly typed language". In Smalltalk, all objects are always and only created conceptually on the dynamic garbage collected heap storage. There is no such thing as "boxing" or "unboxing". Assignments never copy the value, but instead the reference to the object. When arguments are passed in a message, references are passed.
For now, only global variables and local variables are described (because we need them for more interesting examples); the other variable types will be described later.
Beside classes, only a few other objects are bound to globals; the most interesting for now are:
Transcript
show: something
cr
showCR: something - show: followed by cr.
flash
Smalltalk
Stdin, Stdout and Stderr
Logger
Even simple references to the Transcript, UserPreferences or Display screen lead to trouble when multiple threads/sessions/users are to be supported. For this, ST/X provides queries like "Transcript current", "UserPreferences current" or "Screen current", which return thread-local references. So each thread may have its own, private I/O devices and settings.
That said (and kept in mind), being able to access the console via the Transcript
is often very helpful: it allows debugging and informative messages to be sent from the
program.
For example:
Transcript show: 'Hello world'
shows that greeting in the Transcript window,
and
Transcript cr
advances its text cursor to the next line.
There is also a combined message, as in:
Transcript showCR: 'Hello world'
Finally, to wake up a sleepy user, try:
Transcript topView raise.
Transcript showCR: 'Ring Ring - Wakeup!'.
Transcript flash.
A global is created by sending the message at:put: to the global called Smalltalk,
passing the name of the new global as a symbol.
For example:
Smalltalk at:#Foo put: 'Now there is a Foo'
and can then be used:
Smalltalk at:#Foo
or simply:
Foo
If you want Smalltalk to forget about that variable, execute
Smalltalk removeKey:#Foo
(be careful not to remove one of your loved ones by accident).
Having said this, you had better forget about global variables immediately.
Workspace variables are created and destroyed via corresponding menu functions in the workspace window. You can also configure the workspace to auto-define any unknown variable as a workspace variable (in the workspace's "Workspace" - "Settings"-menu). That's the way to go for the remainder of this lecture, because it makes your life so much easier.
Be aware of the fact that workspace variables are invisible to compiled code - i.e. any reference to such a variable from within compiled code will actually refer to a global variable with the same name (which will be seen as nil if no value has ever been assigned to it).
For a C++, Java or C# programmer, class instance variables are hard to understand, unless they see the class objects as real objects with private slots, protocol etc. This is because none of those languages offers a similar construct.
Instance variables are private to some object and their lifetime is the lifetime of the object.
We will come back to instance variables, once we have learned how classes are defined.
A local variable declaration consists of an opening '|' (vertical bar) character,
a list of identifiers and a closing '|'.
It must be located before any statement within a code entity
(a doIt-evaluation, block or method; the latter being described below).
For example:
| foo bar baz |
declares 3 local variables, named 'foo', 'bar' and 'baz'.
A local variable's lifetime is limited to the time the enclosing context is active - typically, a method or a block.
When a piece of code is evaluated in a workspace window, the system generates an anonymous method and calls it for the execution. Therefore, a local variable declaration is also allowed with doIt-evaluations (the variable's lifetime will be the time of the execution).
Assuming that "foo" and "bar" have been declared as
variables before, you can assign a value with:
foo := 1
or:
bar := 'hello world'
Multiple assignments are also possible:
foo := bar := baz := 1
Do not forget the ":" in ":=".
If you write "=" instead, you will get a binary message send expression
which means "is equal to" (i.e. it is a comparison operator). Thus:
foo := baz = 1.
would assign true or false to "foo", depending on whether "baz" is equal to 1 or not.
The same can be written with parentheses:
foo := (baz = 1).
Even if they are not required, it is a bit easier to read.
All variables are initially bound to nil.
This is the same behavior as found in Java or C#,
but opposed to C or C++.
You will never get random or even invalid values in a Smalltalk variable.
Keep in mind that only a reference to an object is stored into the variable,
not the state of the object itself.
This means that multiple variables may refer to the same object.
For example:
|var1 var2|
"create an Array with 5 elements ... and assign it to var1"
var1 := Array new:5.
"and also to var2"
var2 := var1.
"change the 2nd element..."
var1 at:2 put:1.
Transcript show:'var1 is '. Transcript showCR:var1.
Transcript show:'var2 is '. Transcript showCR:var2.
The previous example demonstrates
that both var1 and var2 refer to the same array object,
i.e. that in Smalltalk, a variable actually holds a reference to an object,
and that more than one variable may refer to the same object.
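To contrast a shared reference with an independent object, the following workspace sketch adds a copy to the picture. The variable names are made up for this illustration; "copy" and "==" (identity comparison) are standard Object protocol.

```smalltalk
|original alias duplicate|
original := Array new:3.
alias := original.           "a second reference to the same Array"
duplicate := original copy.  "a new, separate Array with the same elements"
original at:1 put:'changed'.
Transcript showCR:(alias at:1) printString.       "the alias sees the change"
Transcript showCR:(duplicate at:1) printString.   "the copy still holds nil in slot 1"
Transcript showCR:(alias == original) printString.      "identical object"
Transcript showCR:(duplicate == original) printString.  "not identical"
```

Only the assignment creates a second reference; to get a second object, you have to ask for one explicitly.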
Technically speaking: a variable holds a pointer to the object.
This is especially true with multiple assignments;
so:
foo := bar := 'hello'
binds both "foo" and "bar" to the same string object.
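Assigning to a global variable can do serious harm; for example:
Array := nil
would overwrite the global variable which refers to the Array class.
To prevent beginners from
doing harm to the system, ST/X checks for this situation
and gives a warning.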
As a general rule:
do not assign to global variables - it is usually a sign of very very bad design if you have to. As you read above and will see below, there are other variable types which can be used in most situations.
Ask the Float class for the π (pi) constant:
Float pi
Ask the Transcript object to raise its top view:
Transcript topView raise
Ask the Transcript object to flash its view:
Transcript flash
Ask the WorkspaceApplication class to create a new instance and open
a view for it:
WorkspaceApplication open
Declare a local variable, assign a value and display it on the transcript
window:
|foo|
Transcript show:'foo is initially bound to: '.
Transcript showCR:foo.
foo := -1.
Transcript show:'foo is now bound to: '.
Transcript showCR:foo.
foo := foo + 2.
Transcript show:'foo is now bound to: '.
Transcript showCR:foo.
Remember that a variable may refer to any object.
Thus, the following is legal (although not considered good style):
|foo|
foo := -1.
Transcript show:'foo is: '.
Transcript show:foo.
Transcript cr.
Transcript show:'and it is a: '.
Transcript showCR:foo class name.
foo := 'hello'.
Transcript show:'foo is now: '.
Transcript show:foo.
Transcript cr.
Transcript show:'and it is a: '.
Transcript showCR:foo class name.
A rule of wisdom:
do not reuse variables (as in the above case) unless needed for accumulating something.
Having an extra variable in a method does not cost anything (neither time, nor space).
However, it helps a lot in readability.
Sometimes even use a temporary variable just for the name of it, to document what an
intermediate result represents.
| coll |
coll := Set new. "/ create an empty Set-collection
coll add:'one'.
coll add:'two'.
coll add:3.
A cascade expression (semicolon) allows this to be written a little shorter:
it sends another message - possibly with arguments - to the previous receiver.
The following cascade is semantically equivalent to the above
albeit a bit shorter:
| coll |
coll := Set new. "/ create an empty Set-collection
coll add:'one'; add:'two'; add:3.
Be aware that the "add:" method returns its argument
(for historic reasons beyond my understanding).
This means that the following code does NOT do what it looks like:
| coll |
coll := Set new
add:1; add:2. "/ Attention: add returns its argument
Instead of the expected Set, it leaves the integer 2 in the variable named "coll",
because the assigned value is the value of the last "add:" message.
Because this is a recurring pattern, a method named "yourself" has been added to the Object class.
As the name implies, it simply returns the receiver itself.
Use this as the last message of the cascade:
| coll |
coll := Set new
add:1; add:2;
yourself. "/ returns the receiver - i.e. the Set
to prevent the above problem and get the expected value assigned.
You may encounter this kind of code at various places in the system.
A block represents a piece of executable code. Being a "real object", it can be stored in a variable, passed around as argument or returned as value from a method - just like any other object. When required, the block can be evaluated at any later time, which results in the execution of the block's statement(s). The fancy thing is that the block's statements can see and are allowed to access all of the surrounding variables, i.e. those which are visible within the static scope of the block.
| someBlock |
someBlock := [ Transcript flash ].
Later, when the block has to be evaluated (i.e. its statements executed),
send it the "#value" message:
...
someBlock value.
...
Blocks may be defined with 0 (zero) or more argument(s);
|someBlock|
...
someBlock := [:a | Transcript showCR:a ].
...
defines a block which expects (exactly) one argument.
To evaluate such a block, send it the "#value:" message, passing the desired
argument object:
someBlock value:'hello'
(here, a string object is passed as argument).
For multiple arguments, declare each formal argument preceded by a colon.
For evaluation, a message of the form
"#value:...value:" with a corresponding number of arguments must be used.
For example, the block:
|someBlock|
...
someBlock := [:a :b :c |
Transcript show:a.
Transcript show:' '.
Transcript show:b.
Transcript show:' '.
Transcript show:c.
Transcript cr
].
...
can be evaluated with:
someBlock value:1 value:2 value:3
|someBlock|
...
someBlock := [:a :b :c | a + b + c].
...
Transcript showCR:(someBlock value:1 value:2 value:3).
...
When executed, the above will display "6" on the Transcript window, because the value of a block is the value of its last expression.
|someBlock|
...
someBlock := [:a :b :c | Transcript showCR:'hello'. a + b + c].
...
result := someBlock value:1 value:2 value:3.
...
will assign the numeric value 6 to the result variable.
Notice that blocks are closures;
they "close over the variables" of the environment which was active at
the time the closure was created.
Blocks also create such a variable environment each time they are executed.
This means that in the following:
|actions|
actions := (1 to:10) collect:[:factor | [:arg | arg * factor] ].
(actions at:5) value:10.
the "actions at:5" retrieves a block which has captured the value the factor
variable had when that block was created (which was 5) and therefore multiplies the argument by 5.
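The same mechanism also works for variables which are modified after the closure has been created. The following is a minimal sketch, assuming (as described above) that every evaluation of a block creates a fresh variable environment; the names "makeCounter", "counterA" and "counterB" are made up for this illustration.

```smalltalk
|makeCounter counterA counterB|
"each evaluation of makeCounter creates a new 'count' variable
 and returns an inner block which closes over it"
makeCounter := [
    |count|
    count := 0.
    [ count := count + 1 ]
].
counterA := makeCounter value.
counterB := makeCounter value.
counterA value.
counterA value.
counterB value.
"the two counters do not share their state"
Transcript showCR:(counterA value) printString.   "third evaluation of counterA"
Transcript showCR:(counterB value) printString.   "second evaluation of counterB"
```

Under the stated assumption, the two lines on the Transcript differ, because each inner block keeps its own "count" alive for as long as the block itself is referenced.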
Blocks have many nice applications: for example, a GUI-Button's action can be defined using blocks, a timer may be given a block for later execution, a batch processing queue may use a queue of block-actions, a background process may be forked to execute a block and a sorted collection may use a block to specify how elements are to be compared.
However, the most striking application of blocks is in defining control structures (like if, while, repeat, loops etc.), and as "higher order functions" when enumerating or processing collections and the like.
These control structures are implemented by message protocol in the Boolean, Block and the Collection classes.
Conditional execution is provided by the ifTrue: / ifFalse: protocol as implemented by the two boolean objects, which are bound to the globals "true" and "false":
ifTrue: aBlock
ifFalse: aBlock
ifTrue: trueBlock ifFalse: falseBlock
ifFalse: falseBlock ifTrue: trueBlock
So, to compare a variable against a value and send some message to the Transcript
window, you can write:
...
(someVariable > 0) ifTrue:[ Transcript showCR:'yes' ].
...
Of course, you may change the indentation to reflect the program flow;
this is what a C-Hacker (like I used to be) would write:
...
(someVariable > 0) ifTrue:[
    (someVariable < 10) ifTrue:[
        Transcript showCR:'between 1 and 9'
    ] ifFalse:[
        Transcript showCR:'positive'
    ]
] ifFalse:[
    Transcript showCR:'zero or negative'
].
...
and that is how a Lisper (and many Smalltalkers) would write it:
...
(someVariable > 0)
    ifTrue:
        [(someVariable < 10)
            ifTrue:
                [Transcript showCR:'between 1 and 9']
            ifFalse:
                [Transcript showCR:'positive']]
    ifFalse:
        [Transcript showCR:'zero or negative'].
...
Because the above constructs are actually message sends
(NOT statement syntax), they also return a value when invoked.
Thus, some Smalltalkers or Lispers would probably prefer a more functional style,
as in:
...
Transcript showCR:
    ((someVariable > 0)
        ifTrue:
            [(someVariable < 10)
                ifTrue:['between 1 and 9']
                ifFalse:['positive']]
        ifFalse:
            ['zero or negative']).
...
Which one you prefer is mostly a matter of style,
and you should use the one which is more readable
- sometimes, deeply nested expressions can become quite
complicated and hard to read.
As a final trick, noticing the fact that every object responds to the #value message,
and that the #if... messages actually send #value to one of the alternatives and
return that,
you may even encounter the following coding style (notice the non-block args of the inner ifs):
...
Transcript showCR:
    ((someVariable > 0)
        ifTrue:
            [(someVariable < 10)
                ifTrue:'between 1 and 9'
                ifFalse:'positive']
        ifFalse:
            'zero or negative').
...
The above "trick" should (if at all) only be used for constant if-arguments
and only when using the "if" for its value.
With message-send arguments, both alternatives would be evaluated,
which is probably not the desired effect.
Also be aware that some other objects implement value and will not return themselves.
Most noteworthy are instances of Association and the ValueModel hierarchy.
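Loops are provided by the while protocol of the Block class:
whileTrue: loopBlock
whileFalse: loopBlock
whileTrue
whileFalse
|someVar|
someVar := 1.
[someVar < 10] whileTrue:[
Transcript showCR:someVar.
someVar := someVar + 1.
]
(Notice that the receiver of a while message must be a block;
a parenthesized expression such as "(someVar < 10)" would return a boolean, which does
not implement the while messages.)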
condition := [ something evaluating to a Boolean ].
...
condition whileTrue:[
...
]
If while-loops are used that way, the condition is typically passed in as
an argument or configured in some instance variable.
The above while-loops check the condition at the beginning - i.e. if the condition block evaluates to false initially, the loop-block is not executed at all.
The Block class also provides looping protocol for condition checking at the end
(i.e. where the loop-block is executed at least once):
[
...
loop statements
...
] doWhile: [ ...condition... ]
and also:
[
...
loop statements
...
] doUntil: [ ...condition... ]
Of course, an obvious way to write an endless loop is:
[true] whileTrue:[
...
endless loop statements
...
]
However, to document the programmer's intention, it is better to
use one of the explicit endless loop constructs (#loop or #repeat),
as in:
[
...
endless loop statements
...
] loop
or:
[
...
endless loop statements
...
someCondition ifTrue:[ ^ something ].
...
] loop
This one demonstrates that a return statement inside a block
will actually force a return from the enclosing method.
Especially C, C++, C#, Java and JavaScript programmers should raise their eyebrows here.
Finally, take a look at:
[:exit |
...
endless loop statements
...
someCondition ifTrue:[ exit value ]
...
] loopWithExit
This one is interesting, as the exit object passed in as argument
exits the loop when #value is sent to it.
Thus, because ifTrue: sends #value to its argument,
the loop can also be written as:
[:exit |
...
endless loop statements
...
someCondition ifTrue: exit
...
] loopWithExit
n timesRepeat:[
...
repeated statements
...
]
where n stands for an integer value (constant, variable or message expression).
|anArray|
anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
1 to: 6 do: [:idx |
Transcript showCR: (anArray at: idx)
].
or, with an increment,
|anArray|
anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
1 to: 6 by: 2 do: [:idx |
Transcript showCR: (anArray at: idx)
].
However, no real Smalltalk programmer would use "to:do:" to enumerate a collection's elements.
|anArray|
anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
anArray do:[:eachElement |
Transcript showCR: eachElement
].
Notice that this example also demonstrates good vs. bad reusability of the code:
the first version (using to:do:) uses a numeric-index-based address to fetch each element.
This implies that the collection must be some kind of numerically-sequenceable collection.
The second version simply leaves that decision to the collection itself.
It will therefore work with any kind of collection (lists, trees, hashtables, sets, etc.).
Of course, in the above example we hardcoded an array as receiver, which is known to allow access
via a numeric index. However, in practice, the collection is often coming from elsewhere via a
message argument or variable value. In that case, a changing collection representation in other parts of
the program will not affect the enumeration loop.
Open a browser and look at the implementation of
#reverseDo:, #collect:, #detect:, #select:, #findFirst: etc.
Use these messages instead of "do"- or even "while"-loops with
indexing to enumerate elements for element searching or processing.
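As a small illustration of these enumeration messages, the following workspace sketch processes a literal array without any index variable. The variable names are made up for this example; #detect:ifNone: and #inject:into: are not mentioned above, but belong to the standard collection protocol.

```smalltalk
|numbers squares evens firstBig sum|
numbers := #(1 2 3 4 5 6).
"collect: answers a new collection holding the block's results"
squares := numbers collect:[:each | each * each].
"select: answers the elements for which the block answers true"
evens := numbers select:[:each | each even].
"detect:ifNone: answers the first matching element"
firstBig := numbers detect:[:each | each > 4] ifNone:[nil].
"inject:into: accumulates a value over all elements"
sum := numbers inject:0 into:[:accu :each | accu + each].
Transcript showCR:squares printString.
Transcript showCR:evens printString.
Transcript showCR:firstBig printString.
Transcript showCR:sum printString.
```

None of these expressions depends on the receiver being an Array; the same code should work when "numbers" is bound to an OrderedCollection or a Set (where the order of the results is then unspecified).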
[
'nonExistingFile' asFilename contents
] on:Error do:[:exceptionInfo |
Transcript showCR:(exceptionInfo description).
].
The above code should be read as a demonstration example;
it catches any error, not only file-not-found exceptions.
In practice, more specific handlers are usually set up:
[
10 / 'nonExistingFile' asFilename contents size
] on:StreamError do:[:exceptionInfo |
Transcript showCR:(exceptionInfo description).
] on:ArithmeticError do:[:exceptionInfo |
Transcript showCR:('Oops: ',exceptionInfo description).
].
(Hint: create a file named 'nonExistingFile' and try it)
In Smalltalk, a handler may decide to repair things, and
either restart the computation, or proceed:
[
|input divisor|
input := Dialog request:'Enter a divisor (try 0)'.
divisor := input asNumber.
Dialog information:'The result is ',(10 / divisor) asString
] on:ArithmeticError do:[:e |
Dialog information:'Mhmh - I will proceed with 0'.
e proceedWith:0.
].
It may also decide to not handle the error,
and pass it on to either another handler or the default
exception handler (called "rejecting the error"), which typically opens a debugger:
[
|input divisor|
input := Dialog request:'Enter a divisor (try 0)'.
divisor := input asNumber.
Dialog information:'The result is ',(10 / divisor) asString
] on:ArithmeticError do:[:e |
(Dialog confirm:'Proceed with 0 or debug?') ifTrue:[
e proceedWith:0.
].
e reject.
].
You can also combine exceptions into a so-called "handler set"
and handle a bunch of otherwise unrelated exceptions (meaning: "not inheriting from each other")
with a common handler:
[
|input divisor|
input := Dialog request:'Enter a divisor (try 0)'.
divisor := input asNumber.
Dialog information:'The result is ',(10 / divisor) asString
] on:(ZeroDivide, StreamError, ImaginaryResultError) do:[:e |
(Dialog confirm:'Proceed with 0 or debug?') ifTrue:[
e proceedWith:0.
].
e reject.
].
|s|
[
s := 'someFile' asFilename writeStream.
Transcript showCR:'start writing...'.
s nextPutLine:'hello'.
"/ now, an error occurs and a debugger is opened
self error:'please abort (here or in the debugger)'.
"/ so this line is not executed:
Transcript showCR:'not reached'.
] ensure:[
"/ but this is:
Transcript showCR:'cleaning up'.
s close.
'someFile' asFilename remove.
].
There is also a combined handler+ensure method which corresponds to other languages'
try-catch-finally statement:
[
some action
] on:Error do:[
error handler
] ensure:[
cleanup action
]
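Filled in with concrete code, the schema above might look as follows. This is only a sketch: it assumes the on:do:ensure: message form shown above, collects its trace in a local OrderedCollection named "log" (a name made up for this example), and provokes a ZeroDivide on purpose.

```smalltalk
|log|
log := OrderedCollection new.
[
    log add:'start'.
    1 / 0.                          "raises ZeroDivide"
    log add:'not reached'.
] on:ZeroDivide do:[:ex |
    "the handler runs instead of the rest of the protected block"
    log add:'handled: ', ex description.
] ensure:[
    "the cleanup block runs in any case"
    log add:'cleanup'.
].
log do:[:eachLine | Transcript showCR:eachLine].
```

The handler simply falls off its end here, which abandons the protected block; the 'not reached' entry should therefore be missing from the log, while the 'cleanup' entry is present.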
Smalltalk's blocks are perfectly well suited for this style of programming, because they allow for all of the above. And actually, they are used heavily as arguments in the collection class protocol.
The collection classes (Array, Set, Dictionary etc.) provide
messages to enumerate their elements and evaluate a given block for each of them.
The most useful of those enumeration messages is:
do: aOneArgBlock
|anArray|
anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
anArray do:[:eachElement | Transcript showCR:eachElement ].
Of course, you should indent the code to reflect the intended control flow.
With C-style indentation, the code looks like this:
|anArray|
anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
anArray do:[:eachElement |
    Transcript showCR:eachElement
].
|bag mostUsed|
bag := Bag new.
'../../doc/online/english/getstart' asFilename directoryContentsAsFilenames
select:[:eachFile | eachFile isDirectory not]
thenDo:[:eachFile |
eachFile contents do:[:eachLine |
bag addAll: eachLine asCollectionOfWords.
].
].
mostUsed := (bag valuesAndCounts asArray sort:[:a :b | a value > b value ]) first:10.
CodingExamples_GUI::HistogrammView new
extent:500@300;
labels:(mostUsed collect:[:eachPair | eachPair key storeString]);
values:(mostUsed collect:[:eachPair | eachPair value]);
open.
The higher-order functions used are:
select:thenDo:
select:thenDo:
do:
sort:
collect:
collect:
|function measureData|
function :=
[
1000000 timesRepeat:[
'abcdefxghijklxmn' occurrencesOf:$x
]
].
measureData := (1 to:30)
collect:[:n |
Time millisecondsToRun: function.
].
CodingExamples_GUI::HistogrammView new
extent:750@400;
labels:nil;
values:measureData;
open.
Notice again that higher-order functions are used in several places: for the measured function itself,
with the timesRepeat: and with the collect: expressions.
Historically, due to its very readable, English-like syntax, Smalltalk does not have lots of syntactic sugar. Everything was expressed as message-sends to objects. This includes class- and method-definition, variable initialization, looping, exception handling etc.
In contrast, most other programming languages typically provide separate syntactic constructs for each of the above mentioned issues (Lisp being a well-respected exception here). The only syntactic sugar existing in Smalltalk is the additional message syntax for binary selectors (which was added to make mathematical expressions more readable) and the cascade message.
{ expression1 . expression2 . ... expressionN }
to construct a new Array (at runtime) with N elements,
computed by the corresponding expressions.
Please notice the separating periods (to separate the expressions).
{ 'str'. Date today. Time now. 1. #sym }
creates a 5-element array at run time.
Notice that the brace constructor shows the same behavior as a multi-argument "with:" message sent to
the Array class, or (for more than a small number of elements)
as an "Array new:" followed by a bunch of at:put: messages;
in other words: it is equivalent to:
Array with:expression1 with:expression2 ... with:expressionN
but without the restriction on the max. number of arguments.
Thus, the above is equivalent to:
(Array new:5)
at:1 put:'str';
at:2 put:(Date today);
at:3 put:(Time now);
at:4 put:1;
at:5 put:#sym;
yourself
If you use this feature, be aware that "#( )" and "{ }" both return an empty array.
However, the array returned by "#( )" has been created at compilation time, and the
same identical object will be returned whenever the "#( )"-expression is evaluated again.
In contrast, every evaluation of "{ }" will construct and return a new Array at runtime.
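The difference can be made visible with an identity comparison. The following sketch wraps each expression into a block (the names "literalBlock" and "braceBlock" are made up for this example), so that the very same expression is evaluated twice:

```smalltalk
|literalBlock braceBlock|
literalBlock := [ #(1 2 3) ].      "literal array, created at compilation time"
braceBlock := [ { 1. 2. 3 } ].     "brace constructor, creates an Array per evaluation"
"the same expression evaluated twice: identical object or not?"
Transcript showCR:(literalBlock value == literalBlock value) printString.
Transcript showCR:(braceBlock value == braceBlock value) printString.
"the contents are equal in both cases"
Transcript showCR:(literalBlock value = braceBlock value) printString.
```

According to the description above, the first comparison should answer true and the second false. This matters as soon as such an array is modified: a change made to a literal array remains visible the next time the same code is executed.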
Note by the author: I personally have one criticism of the brace constructor:
why should the Array class be so special as to justify a special syntactic sugar construct?
Most collections in real life are variable in size,
so a constructor for OrderedCollection could be justified just as well.
But then, why exclude Set, Dictionary and all other fancy collections?
Why exclude Matrix or Vectors?
In addition, those with a functional background would definitely love to have a simple constructor
for Lisp-like linked lists or cons-objects.
In other words:
the brace constructor seems to be a quick hack for a single programmer's needs (laziness?).
It should have been thought through more thoroughly,
in favor of a more generic solution, before finding its way into thousands of methods.
(An alternative possible syntax could have been: "<ClassName>{ ... }")
Transcript
showCR:'Today is ',Date today asString,' and the time is ',Time now asString.
This concatenates a longer string from the four parts,
of which two are computed dynamically.
This becomes especially ugly if you have to consider national language
variants; and even more so, as not all languages will order
the parts of the sentence the same way.
For example, in German, you may want to write:
Transcript
showCR:'Heute ist der ',Date today asString,' und es ist ',Time now asString,' Uhr'.
For this, ST/X provides a national language translation mechanism,
which is based on a getter named 'resources', which is understood
by all classes and all application instances.
You can give it a string, which it will translate according to the current
language setting. Different sentence ordering is supported
by passing in the English string, with placeholders for the
parts to be filled in:
Transcript showCR:(
self class resources
string:'Today is %1 and the time is %2'
with:(Date today)
with:(Time now))
This is very flexible, in that you can add a resource file
named "de.rs" and add the translation for:
'Today is %1 and the time is %2' 'Heute ist der %1 und es ist %2 Uhr'
or even change the sentence structure in German to:
'Today is %1 and the time is %2' 'Es ist %2 am %1'
However, as the above code looks rather ugly,
ST/X provides syntactic sugar for national language strings;
you can also write:
Transcript showCR: i'Today is {Date today} and the time is {Time now}'
You can also embed newlines and other special characters in a C-language fashion:
Transcript show: i'Today is {Date today} and the time is {Time now}\n'
The "i"-prefix before the string
tells the compiler that this is a string with embedded expressions,
which is to be translated via the resources
(a so-called "internationalized" string or "i-string" for short).
With an "e"-prefix, the expressions are embedded in the same way, but the string is not translated:
Transcript show: e'Today is {Date today} and the time is {Time now}\n'
We have now reached a point where we realize that the key to becoming a Smalltalker lies in the knowledge of the system's class library. Although this is true for all big programming systems, it is even more true for Smalltalk, since even control structures and looping are implemented by message protocol as opposed to being a syntax feature.
No programming is possible if you don't
know the protocol of the classes in the system, or at least part of it.
To give you a starting point, we have compiled a
list of the most useful messages as implemented by
various classes in the
"list of useful selectors"
document.
A rough overview of the most common classes and their typical use is found in the "Basic Classes Overview". Please, read this document now.
Copyright © 1996 Claus Gittinger Development & Consulting
Copyright © eXept Software AG
<cg at exept.de>