Smalltalk/X Language Definition & Differences

Introduction
Source Text Files
Stc File Format
Semantic Details
Extensions to Smalltalk-80 (BlueBook definition)
More Codechecks
Limitations
Known Bugs & Limitations

Introduction

This document describes the source file format expected by the stc compiler, the language differences from Smalltalk-80, and the known bugs & limitations of ST/X.

One of the unique features of Smalltalk/X is its ability to compile Smalltalk code into statically compiled binary code files (shared libraries). They contain fully compiled machine code, and do not require dynamic compilation (from bytecode) at execution time.

This compilation scheme is NOT used while working in the browser. For any code which is added or modified after the initial startup, a traditional bytecode compiler (accelerated by a Just-In-Time compiler) is used.

However, the ultimate goal of your development is usually to deploy either an executable program, or a set of libraries as a stand-alone program. For this, the stc compiler is used.

Files processed by the stc (Smalltalk-to-C) compiler are usually generated by either filing out class code directly from the SystemBrowser, or indirectly, by checking some class into the source code repository (also via the SystemBrowser) and then checking it out into a directory via a "cvs update" or "cvs checkout" command. The latter could even be an automatic process, for example controlled by a Jenkins build system.

Source Text Files

Of course, as these files are regular text files, you can alternatively use any text editing tool to edit and manipulate these files, working in the traditional edit-compile-link mode if you prefer. In this mode, think of the file as describing one class; comparable to programming in C++ or similar languages. However, before doing so, read on and be aware of some pitfalls (especially the chunk-format, and the resulting "!"-doubling).

Stc File Format

Files compiled by stc must be in Smalltalk's fileout format. This means that the file consists of Smalltalk expressions, separated by '!'-characters (the so called "chunk separator" or "bang").

Bangs (i.e. '!'-characters) within the text have to be doubled; this need for doubling also and especially applies to exclamation marks within comments and string literals (*).
Since ST/X replaces doubled '!'-characters by a single '!' when filing in, you will see only single '!'-characters in the browser. You therefore have to be very careful when editing a source file using the File Browser or another editor.
Notice that the SystemBrowser takes care of this doubling when classes are filed out - but the File Browser does not, since it treats Smalltalk source code files just like any other text file.
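
As an illustration of the doubling rule, here is a minimal sketch of how a method containing exclamation marks would look inside a source file. The class name Greeter and the method are hypothetical and only serve as an example.

```smalltalk
!Greeter methodsFor:'greeting'!

hello
    "Answer a greeting. Hypothetical example; note the doubled bang!!"
    ^ 'Hello, world!!'
! !
```

After file-in, the browser shows both the comment and the string literal with a single '!'-character, and the string answered by hello ends in a single exclamation mark.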

Currently, stc can only compile files which contain either one single class definition (with optional private classes), or a "methods-only file", which contains methods, but no class definition.

The source syntax for compiled Smalltalk implements a subset of the messages used to create/manipulate classes and methods. Expressions other than those listed below are not allowed/supported.

The first expression in a "class-definition file" must be a class-definition expression; a "methods-only file" may only consist of method definitions (i.e. "methodsFor"-expressions).

(*) However: stc and the file-in code inside Smalltalk recognize ST/X's embedded primitive C-code, and no doubling is needed inside those. This hack was added to make editing of such code fragments much easier and less error prone.

Class Definition ("class-definition files" only)

The stc compiler accepts the following (and only those) class definition expressions:

Simple Subclassing

    superclass subclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
	     poolDictionaries:'sharedPool1 sharedPool2...'
	     category:'some-category'

to define class as a subclass of superclass.
The subclass will have indexed instance variables if (and only if) the superclass has indexed instance variables.

Instance Variables

The instance variables of class are those of its superclass(es) and additionally 'instVar1', 'instVar2', ...

Class Variables

The class variables of class are those of its superclass(es) and additionally 'classVar1', 'classVar2', ...

Class variables are visible in both class- and instance methods, of the defining class and of every subclass (and subclasses of subclasses). Class variables are shared (unless redefined) - meaning that access is to the same physical memory "slot" from both the defining class and all subclasses. You can think of class variables as globals with limited accessibility: only the defining class and its subclasses 'see' them.

See below for class instance variables, which are class private (i.e. each class provides its own physical "slot").

Notice:

there are some classes (currently UndefinedObject and SmallInteger) which CANNOT be subclassed.

for the curious:
the reason is that instances of these are not real objects, but are marked by a special tag-bit or object-pointer value. Thus these instances do not have a class field in memory. This makes it impossible for the VM (= virtual machine or runtime-system) to know the class of such a sub-instance.

Notice:

there are some classes, to which you CANNOT add instance variables.

for the curious:
these are especially Object, SmallInteger and all classes which are also known by the VM and/or the compiler. The reasons are:

for Object:

there are some classes which inherit from Object and which are not represented by pointers (i.e. UndefinedObject and SmallInteger). Since these cannot have instance variables, none of their superclasses may define any instance variables either. This means that all classes between Object and SmallInteger (i.e. Magnitude, ArithmeticValue, Number and Integer) are also not allowed to have instance variables.
for the built-in classes: (actually, the following is also true for the classes mentioned above)

all classes known by the VM (i.e. Float, SmallInteger, Character, Array, String, Method, Block, Class, Metaclass etc.) must have a layout as compiled into the VM. Since the VM accesses these instance variables (and is not affected by a class change) it would use wrong offsets when accessing an instance of such a changed class. Since instance variables are inherited, this also affects all superclasses of the above listed classes. You will get an error-notification, if you try to change such a class within the browser.

Pool Dictionaries

Versions prior to rel 5.3 do not allow/support poolDictionaries. In those versions, the "poolDictionaries:" argument must be an empty string.
In newer versions, all pool variables of each listed pool are imported and visible both for class- and for instance methods.

Implementation of ClassVariables

Technically, class variables are implemented as globals with a special name constructed as:

ClassName:ClassVarName

however, you should not have to care for or depend on this, except for the fact that class variables are visible when inspecting the Smalltalk dictionary and can be accessed easily from C-functions as globals (named "ClassName_ClassVarName").

Do not depend on any specific implementation of class variables; the current implementation may change without notice. Actually, it is planned to separate class variables from Smalltalk globals in future ST/X versions and use multiple dictionaries within the VM.

Subclasses with Indexed Instance Variables

    superclass variableSubclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
	     poolDictionaries:'sharedPool1...'
	     category:'some-category'

to define class as a subclass of superclass with indexed instance variables even if superclass had no indexed instance variables. An error will be generated, if the superclass is a variableByte- or variableWord class.
Notice that a class may be defined with both named and indexed instance variables.

Subclasses with Byte-Valued Indexed Instance Variables

    superclass variableByteSubclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
	     poolDictionaries:'sharedPool1...'
	     category:'some-category'

to define class as a subclass of superclass with indexed instance variables which are byte-valued (0 .. 255) integers.

An error will be generated, if the superclass is a variable class (i.e. has indexed instances) AND it has NO byte valued elements.
Notice that a class may be defined with both named and indexed instance variables.

Subclasses with Word-Valued Indexed Instance Variables

    superclass variableWordSubclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
	     poolDictionaries:'sharedPool1...'
	     category:'some-category'

to define class as a subclass of superclass with indexed instance variables which are word-valued (0 .. 16rFFFF) integers (i.e. unsigned shorts in c-world).

It is an error if superclass has non-word indexed instance variables.
Notice that a class may be defined with both named and indexed instance variables.

Subclasses with Float- and Double-Valued Indexed Instance Variables

use

     superclass variableFloatSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
	      poolDictionaries:'sharedPool1...'
	      category:'some-category'

or:

     superclass variableDoubleSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
	      poolDictionaries:'sharedPool1...'
	      category:'some-category'

to define class as a subclass of superclass with indexed instance variables which are shortfloat- or doublefloat-valued rational numbers. (i.e. floats and doubles in c-world).

Float- and DoubleArrays were added to support 3D graphic packages (i.e. GL), which use arrays of float internally to represent matrices and vectors. They provide much faster access to their elements than the alternative using byteArrays and floatAt:/doubleAt: access methods.

Also, storage is much more dense than in arrays, since they store the values directly instead of pointers to the float objects.

A 1000-element floatArray will need 1000*4 + OHDR_SIZE = 4012 bytes, while a 1000-float-element array needs 1000*4 + OHDR_SIZE + 1000*(12+8) = 24012 bytes (each float itself requires 8 bytes plus a 12-byte header).
Notice that a class may be defined with both named and indexed instance variables.
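
The size estimate can be reproduced with a few workspace expressions. This is only a sketch: it assumes 4-byte pointers, 4-byte float elements and a 12-byte object header (OHDR_SIZE), as in the figures quoted in the text; the variable names are made up for the example.

```smalltalk
|headerSize n floatArrayBytes arrayOfFloatsBytes|

headerSize := 12.       "assumed OHDR_SIZE in bytes"
n := 1000.              "number of elements"

"FloatArray: one header plus n 4-byte float values stored in place"
floatArrayBytes := (n * 4) + headerSize.

"Array of floats: one header plus n 4-byte pointers,
 plus n separate float objects of (header + 8 bytes) each"
arrayOfFloatsBytes := (n * 4) + headerSize + (n * (headerSize + 8)).

Transcript showCR:floatArrayBytes printString.      "4012"
Transcript showCR:arrayOfFloatsBytes printString.   "24012"
```

Under these assumptions, the array of separately allocated floats needs roughly six times the memory of the float array.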

Subclasses with Long, Signed-Word, Signed-Long Indexed Instance Variables

     superclass variableLongSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
	      poolDictionaries:'sharedPool1...'
	      category:'some-category'

or:

     superclass variableSignedWordSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
	      poolDictionaries:'sharedPool1...'
	      category:'some-category'

or:

     superclass variableSignedLongSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
	      poolDictionaries:'sharedPool1...'
	      category:'some-category'

to define class as a subclass of superclass with long (32 bit integers in the range 0 .. 16rFFFFFFFF), signed short (i.e. -16r8000 .. 16r7FFF) and signed long (-16r80000000 .. 16r7FFFFFFF) indexed instance variables
(i.e. unsigned int, short and int in the c-world).

These types were added for easier bulk data exchange with C language functions. They are not currently used in Smalltalk itself.
Notice that a class may be defined with both named and indexed instance variables.

Be aware that indexable classes with float, double, signedWord, long and signedLong elements may NOT be available in other Smalltalk implementations.
Using them may make your application non portable to other systems.
(however, these can be easily simulated by subclassing ByteArray and redefining the access methods).
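
A minimal sketch of such a simulation is shown below, for unsigned 16-bit elements. The class name PortableWordArray and its category are hypothetical; storing the low byte first is an arbitrary choice made for this example, and no range checking is done beyond what basicAt: and basicAt:put: provide.

```smalltalk
ByteArray variableByteSubclass:#PortableWordArray
    instanceVariableNames:''
    classVariableNames:''
    poolDictionaries:''
    category:'Examples'
!

!PortableWordArray class methodsFor:'instance creation'!

new:numberOfWords
    "hypothetical example: two bytes of storage per 16-bit element"
    ^ self basicNew:numberOfWords * 2
! !

!PortableWordArray methodsFor:'accessing'!

size
    "answer the number of 16-bit elements"
    ^ self basicSize // 2
!

at:index
    "answer the unsigned 16-bit value at index (low byte first)"
    |byteIndex|

    byteIndex := (index * 2) - 1.
    ^ (self basicAt:byteIndex) + ((self basicAt:byteIndex + 1) bitShift:8)
!

at:index put:aWord
    "store the unsigned 16-bit value aWord at index (low byte first)"
    |byteIndex|

    byteIndex := (index * 2) - 1.
    self basicAt:byteIndex put:(aWord bitAnd:16rFF).
    self basicAt:byteIndex + 1 put:((aWord bitShift:-8) bitAnd:16rFF).
    ^ aWord
! !
```

Such a class is slower than a built-in word array, but it only relies on byte-indexed storage, which is available in other Smalltalk implementations as well.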

Class Comment ("class-definition files" only)

A class comment may be defined with an expression of the form:

    ClassName comment:'some string'

An alternative to using a comment is to define class methods under the category "documentation", consisting of comments only.
Empty methods do not use ANY code space in ST/X, and have the positive effect of not eating up data space in the Smalltalk executable (which the comment does).
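
As a sketch, such a comment-only method for the Point3D class used later in this document could look as follows. The selector name documentation is an assumption made for the example; the text above only prescribes the category.

```smalltalk
!Point3D class methodsFor:'documentation'!

documentation
"
    This class defines a point in 3-dimensional space.
    (comment-only method; the selector name is chosen for illustration)
"
! !
```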

Class Instance Variables ("class-definition files" only)

A class may have class instance variables; these MUST be declared before the first class method is defined. The declaration has the form:

    ClassName class instanceVariableNames:'string with varNames'

Do not confuse class variables with class instance variables.

Only one such class-instance-variable definition is allowed per input file.
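
The difference between the two kinds of variables can be sketched with a small class-definition file. The class name Base, the variable names Shared and private, and the accessors are all hypothetical and only serve as an illustration.

```smalltalk
"hypothetical example class; not part of ST/X"

Object subclass:#Base
    instanceVariableNames:''
    classVariableNames:'Shared'
    poolDictionaries:''
    category:'Examples'
!

Base class instanceVariableNames:'private'
!

!Base class methodsFor:'accessing'!

shared
    "class variable: one slot for Base and all of its subclasses"
    ^ Shared
!

shared:anObject
    Shared := anObject
!

private
    "class instance variable: every class has its own slot"
    ^ private
!

private:anObject
    private := anObject
! !
```

Given a subclass Derived of Base (defined in its own file), "Base shared:1" makes "Derived shared" answer 1 as well, whereas "Base private:1" leaves "Derived private" at nil.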

Method Definition

The expressions following the class definition are to be method definitions of the form:

    !ClassName methodsFor:'method-category'!

    aMethod
	...
    !

    aMethod
	...
    !

    ...

    lastMethodInCategory
	...
    ! !

    !ClassName class methodsFor:'category'!

    aClassMethod
	...
    !

    ...

    lastClassMethodInCategory
	...
    ! !

"class-definition files" may only contain method definitions for the class defined in the class definition.

"methods-only files" may contain methods for any class - but no class definitions.

Instance methods and class methods may be in any order.

To allow compilation of classes filed out from ENVY, stc also recognizes the selectors privateMethodsFor: and publicMethodsFor:. In addition, the special selector ignoredMethodsFor: tells stc to ignore all followup methods up to an empty chunk.

Method Syntax

Method and expression syntax is à la Smalltalk-80 (with a few extensions).

There is a limit on the maximum number of arguments a method can be defined with and messages can be sent with (currently 15).
This limit will be removed eventually, allowing an arbitrary number of arguments.

Other limits are:

max. 127 local variables
max. 31 temporaries for intermediate expression results
max. 300 other sends per method
lineNumber information (debugger) is only valid for the first 255 lines of a method.

For very complicated expressions (especially when these are generated automatically), the temporary limit could be reached in theory. In practice, so far no Smalltalk code (available PD programs and users' application code) has ever hit those limits.

Since most terminals cannot display the Smalltalk assignment character '<-' (backarrow as one character with same ascii-code as '_'), the scanner also accepts the character sequence ":=" (colon-equal) to express assignment.
This is compatible with similar extensions found in other Smalltalk implementations. Of course, the '_' is also accepted.

Use ':='
Support for '_' may be removed in later versions. Also, Smalltalk/X, like newer Smalltalk-80 versions, allows underscores in identifiers - no longer treating them as assignment.

Although not defined in the book, Smalltalk-80 expressions seem to require (blank) characters to separate tokens (i.e. "Point origin: point1 corner: point2").

Smalltalk/X does not need these (i.e. "Point origin:point1 corner:point2" is fine)

I do not know at the moment whether this causes any problems when porting Smalltalk/X code to other Smalltalk implementations. (If required, the fileOut-methods may have to be changed to add blanks.)

Assignment and init-Expressions

In contrast to Smalltalk-80's fileIn format (where any expression is allowed), expressions other than above must be of the form:

    Smalltalk at:#name put:constant

(constant may be any integer, float, string, symbol, true, false or nil)

or of the form:

    classname initialize

(classname must be the name of the class defined in this source-file)
These expressions allow globals to be set to a predefined value at startup and/or class initialization. Example:

   ...
 Smalltalk at:#MyVariable put:true !
   ...

Example Class

    Point subclass:#Point3D
	    instanceVariableNames:'z'
	    classVariableNames:''
	    poolDictionaries:''
	    category:'Graphics-Primitives'
    !

    Point3D comment:'
     this class defines a point in 3-dimensional space
    '!

    !Point3D class methodsFor:'instance creation'!

    x:newX y:newY z:newZ
	"answer a new point with coordinates newX, newY and newZ"
	^ ((self basicNew) x:newX y:newY) z:newZ
    ! !

    !Point3D methodsFor:'accessing'!

    z
	"Answer the z coordinate"
	^ z
    !

    z:newZ
	"set the z coordinate"
	z := newZ
    ! !

    !Point3D methodsFor:'printing'!

    printString
	"answer my printString"
	^ super printString , '@' , z printString
    ! !

Semantic Details

The following only lists non obvious semantic details - for a description of the Smalltalk language, please refer to standard literature.

Evaluation Order

Expression arguments and receiver are evaluated left to right, starting with the receiver (with exceptions as described below).

Smalltalk is an eager evaluating language - that is, all arguments are evaluated before the message send - even if not used by the called code.
Lazy evaluation can be simulated partially by using blocks as arguments, or by special code (see the LazyValue class and its documentation).

Side Effects

If any argument of an expression has a side effect on an instance variable, and the expression uses that instance variable, it is NOT DEFINED whether the original or the modified value of that instance variable is used.
For example:

    Object subclass:#SomeClass
	    instanceVariableNames:'i'
	    ...

    i:aNumber
	i := aNumber
    !

    increment
	i := i + 1.
	^ i
    !

    undefinedBehavior
	^ self increment + i
    !

    undefinedBehavior2
	^ i + self increment
    !

    test
	Transcript showCR:'undefinedBehavior returns: '
			  , (SomeClass new i:0) undefinedBehavior printString.
	Transcript showCR:'undefinedBehavior2 returns: '
			  , (SomeClass new i:0) undefinedBehavior2 printString.
    !

in the #undefinedBehavior method, the value used for i in the #+ message may or may not be the incremented value.

Warning:
never depend on the particular behavior of a Smalltalk system or compiler; the semantics here are not defined. Even in ST/X, the behavior may differ between versions, or between the incremental and batch compiler.
(actually, in the current ST/X version, the incremental compiler returns 2 for the first, and 1 for the second method. In contrast, stc compiled code returns 2 for both, because it does not always evaluate arguments left to right - especially for arithmetic operations).

Notice:
this is to be considered a bug, because it conflicts with the evaluation order as defined above (although code relying on such side effects is bad coding style anyway).
In practice, there have been only very minor problems due to this in the past.
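
One way to avoid depending on the argument evaluation order is to perform the side effect in a statement of its own. The following sketch adds two methods to the SomeClass example from above; the selectors definedBehavior and definedBehavior2 and the category name are hypothetical.

```smalltalk
!SomeClass methodsFor:'examples'!

definedBehavior
    "the send with the side effect is evaluated first, in its own statement"
    |incremented|

    incremented := self increment.
    ^ incremented + i
!

definedBehavior2
    "the old value of i is saved before the side effect takes place"
    |before|

    before := i.
    ^ before + self increment
! !
```

With i set to 0, definedBehavior answers 2 and definedBehavior2 answers 1, because statements are always executed in sequence and both operands of #+ are then plain variable accesses or a single send.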

#become:

The behavior of your program is undefined, if instance variables of the receiver are accessed in a method, after a #become: message was sent to the receiver.
If the #become: changed the receiver into some other object with less or no instance variables, even a nonrecoverable fatal error may occur. Otherwise, the access will be to the corresponding instance variable slot as defined by the other class.
For example, the following may lead to unexpected behavior (or even a nonrecoverable fatal error):

    Object subclass:#SomeClass
	    instanceVariableNames:'i'
	    ...

    badMethod1
	i := 0.
	self become:somethingElse.
	^ i
    !

    badMethod2
	self become:somethingElse.
	i := 0.
    !

The use of message sends to access the instance variables removes the above danger:

    Object subclass:#SomeClass
	    instanceVariableNames:'i'
	    ...
    i
	^ i
    !

    i:newValue
	i := newValue
    !

    fixedMethod1
	self i:0.
	self become:somethingElse.
	^ self i
    !

    fixedMethod2
	self become:somethingElse.
	self i:0.
    !

Notice:
ST/X typically falls into a segmentation violation exception, which can be caught by an appropriate exception handler.

Literal Array of a Method

Stc generated code does not (currently) access the literal array; instead, the literal array of a method is created for the debugger (to find senders) only. Modifying the literal array (which is bad coding style anyway) has no effect on machine compiled code.

In contrast, bytecode-interpreted methods use the values found in the literal array. A modified literal array will change the behavior of the method. This modified behavior is not reflected in the method's source code.

And finally, the dynamic compiler (JITTER) generates code which accesses literals inline (i.e. it takes the literal array's contents at compilation time and creates inline constant accesses). Thus, JITTED code behaves like statically compiled code, in that changing the literal array does not affect the execution. However, since the system may choose to flush its dynamic code cache and recompile at any later time, the changed literal array may eventually affect the execution then.

For these reasons, we highly recommend keeping the literal arrays untouched.

(Experts may do so, but have to ensure that the method gets recompiled, by converting a statically compiled method into a dynamic one, and flushing the code cache entry for this method explicitly.)

Builtin Methods

For a number of message sends, both the stc- and the incremental compiler create inline code which performs the function without doing any message send.
Redefinition of any method listed below will have no effect on your program; also, tracing and breakpointing of these methods is not possible (since they are never executed).

In theory, many more methods could be inlined; the current set represents a compromise between performance (inlined code is much faster) and flexibility (inlined methods cannot be redefined/traced).

In general, only methods for which changed semantics would make the system unusable, or would change the semantics completely in a non-ANSI-standard way, are inlined. With the stc compiler, the degree of inlining can be further controlled by command line arguments.

Inlined messages:

any ifTrue:[ifFalse:] [ ... ]
any ifFalse:[ifTrue:] [ ... ]
with bytecode interpretation, the receiver is checked for being either true or false, and an error is raised if not.
STC compiled- and just-in-time generated code simply compares the receiver against true or false, showing undefined behavior if the receiver is not a boolean.
(i.e. "foo ifTrue:" is compiled as "foo == true ifTrue:" and "foo ifFalse:" is compiled as "foo ~~ true ifTrue:")
When debugging programs, you may want to disable just-in-time compilation, to have the system check for non-boolean receivers and detect those error situations.
We are aware of the fact, that this different behavior is bad, and we are still looking for an easy fix (which does not cost performance and does not blow up the generated code too much).
[any] whileTrue: [ ... ]
[any] whileFalse: [ ... ]
as above for the block's value
aSmallInteger timesRepeat:[]
aSmallInteger to: aSmallInteger do:[]
aSmallInteger to: aSmallInteger by: aSmallInteger do:[]
aSmallInteger + aSmallInteger
aSmallInteger - aSmallInteger
aSmallInteger * aSmallInteger
aSmallInteger // aSmallInteger
aSmallInteger bitAnd: aSmallInteger
aSmallInteger bitOr: aSmallInteger
aSmallInteger negated
arguments are checked for being smallIntegers and the expression is evaluated without sending the message.
Depending on the compiler's optimization settings, this may also be done partially for float or mixed float & smallInteger operands.
anArray at: aSmallInteger
anArray at: aSmallInteger put:anObject
the array access is performed inline, if the index is within the bounds and it is likely that the receiver is an array.
aString at: aSmallInteger
aString at: aSmallInteger put:anObject
the string access is performed inline, if the index is within the bounds and it is likely that the receiver is a string.
any class
any isMemberOf:
direct access to the object's (hidden) class slot
any ==
any ~~
an identity compare produces true or false without a message send
any isNil
any notNil
an identity compare against nil is generated
any perform: aMessage
inline as any message, if the argument is a constant symbol
any yourself
no message send is generated - the receiver is directly evaluated
any ? anObject
the receiver is evaluated and compared against nil. If nonNil, results in the receiver - otherwise the argument.
Character space
Character tab
Character value:aSmallInteger
no message - the space-Character constant is directly returned. This is also done for tab, cr and a few other common character constants.
SmallInteger maxVal
no message - the maximum SmallInteger constant is directly returned. This is also done for minVal, maxBits and maxBytes
Smalltalk isSmalltalkX
no message - the compiler knows that it is compiling ST/X code. This is even propagated to any ifXXX expression using this as condition, so that no code is generated for the condition and for the false branch of the ifXXX expression.

In addition, some constructs are partially inlined - special code is generated to avoid a message send in common cases.
Partial inlined messages:

any = anObject
an identity test is performed first; no equality test is performed, if the objects are identical and true is generated by inline code.
any ~= anObject
an identity test is performed first; no equality test is performed, if the objects are identical and false is generated by inline code.
any < anObject
if the arguments are SmallIntegers, the comparison is done inline. Otherwise, a regular message send is generated.
The same is done for the other relational operators.

The above list may be incomplete - depending on the ST/X version, more messages could be inlined in your system.
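
As a small illustration of one of the less familiar inlined messages, the following workspace sketch uses the "?" message from the list above to provide a default for a nil value. The variable name and the strings are made up for the example.

```smalltalk
|name|

name := nil.
Transcript showCR:(name ? 'anonymous').     "receiver is nil: the argument is used"

name := 'fred'.
Transcript showCR:(name ? 'anonymous').     "receiver is non-nil: the receiver is used"
```

Because the compiler generates an inline nil-compare here, redefining #? in a class of your own has no effect on such code.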

Extensions to Smalltalk-80 (Blue Book Version)

Brace Array Constructor

The brace construct "{ expr . ... . expr }" for array instantiation at runtime was added as syntactic sugar to Squeak/Pharo. This is also supported by ST/X.

The construct:

    { expr1 . expr2 ... exprN }

is semantically equivalent to:

    Array with:expr1 with:expr2 ... with:exprN

for an arbitrary number of expressions.

This makes passing of array-arguments or the return of multiple values much easier. Notice that individual expressions are separated by a period (i.e. statement separator).
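
A short workspace sketch of the construct follows; the variable names are made up for the example. Each element expression is evaluated at runtime, which is what distinguishes the brace construct from a literal array.

```smalltalk
|a b result|

a := 3.
b := 4.

"three element expressions, separated by periods"
result := { a + b . a * b . 'sum and product' }.

"the same array, built with explicit with: messages"
Transcript showCR:(result = (Array with:a + b with:a * b with:'sum and product')) printString.
```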

Array Index Expressions

The following constructs are compiled to generate variations of the "_at:..." and "_at:...put:" messages, to access one-dimensional and multi-dimensional arrays:

array-expr [ index-expr ]: generates: array-expr _at:index-expr
array-expr [ index-expr1 ][ index-expr2 ]: generates: array-expr _at:index-expr1 at:index-expr2
array-expr [ index-expr ] := value-expr: generates: array-expr _at:index-expr put:value-expr
array-expr [ index-expr1 ][ index-expr2 ] := value-expr: generates: array-expr _at:index-expr1 at:index-expr2 put:value-expr

For convenience, the "_at:" messages are also understood by sequential collection classes, and will create corresponding one- or two-dimensional arrays. I.e.

Array [ size-expr ]: generates a regular array with size-expr elements.
Array [ dim1-expr ][ dim2-expr ]: generates a two-dimensional array with 1..dim1 rows and 1..dim2 columns.

Compiler Directives

Comments of the form:

    "{ something ... }"

are recognized by the stc-compiler as directives. Since directives are hidden within comments, these will be ignored by other Smalltalk systems; making ST/X sources transferable to other Smalltalks.

Line Number Definition

The directive:

    "{ Line: n }"

tells stc that line-numbering should continue with line n. Line numbers in following warning- and error-messages will be relative to n.
This feature is used internally, with incremental stc-compilation to machine code.
It could also be useful for systems where Smalltalk is passed as an intermediate language to stc (i.e. compiler-compilers or code generators) to base linenumbering on the original file.

Symbol Definition

The directive:

    "{ Symbol: aSymbolString }"

tells stc that a primitive wants to access a symbol. Stc includes a definition for that symbol and generates code to create the symbol at startup time; within the primitive, the symbol can be referred to by a C-conforming name as described in ``How to write inline C code''.

Symbols can also be created using the (slower) _MKSYMBOL() function at runtime. This also allows C-Strings to be converted to symbols.
(example: in the XWorkstation-class where keypress-characters are converted to symbols like #Home, #Down etc.)

This directive is no longer needed and may not be supported in future versions. Use the @symbol-mechanism, since it relieves you of the need to know about name translations.

Type Hints / Declarations

The directive:

    "{ Class: className }"

after an instance-, class-, or local-variable declaration tells stc, that this variable will always be assigned an object of class: className.

Various optimizations in the code are possible if the type of an object is known (especially for simple types such as "SmallInteger", "Character", "Point" or "String").

Currently, everything but SmallInteger, Float and Point definitions in method-local declarations is ignored by the compiler.

Even with these type declarations, the compiler still generates code which checks assignments for correct typing (i.e. an assignment of a float to a SmallInteger-typed variable will generate a runtime error).

With the improvements of the type-tracker and optimizations performed in stc, this feature seems now much less useful in many situations - especially, when considering the limited reusability of the generated code.
(see benchmark results of sieve/sieveWithInteger, atAllPut/atAllPut2 etc. some show very small differences between the untyped and typed versions)

We recommend using type hints only in performance critical code, for fully debugged code.
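
A minimal sketch of a type hint on a method-local variable is shown below. The class Adder and the selector sumOfFirst: are hypothetical; the hint is written directly after the variable name inside the declaration.

```smalltalk
!Adder methodsFor:'examples'!

sumOfFirst:n
    "answer 1 + 2 + ... + n.
     Adder and this selector are hypothetical; the hint after 'sum'
     promises that sum only ever holds SmallIntegers"

    |sum "{ Class: SmallInteger }"|

    sum := 0.
    1 to:n do:[:i |
        sum := sum + i
    ].
    ^ sum
! !
```

Since typed assignments are still checked, a sum which no longer fits into a SmallInteger would be expected to lead to a runtime error in stc-compiled code, instead of silently turning into a LargeInteger. Other Smalltalk systems see only a comment.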

Code Generator Pragmas

The stc compiler's code generation strategy can be controlled on a per-class basis with command line options such as "+optspace", "+optinline" etc.

Sometimes, finer control (i.e. over individual methods) is needed. Comments of the form:

    "{ Pragma: keyword }"

instruct stc to change its code generation strategy for a single method. Keyword must (currently) be one of:

"+optspace"
generate shorter code (which will run slower). Useful for seldom reached code (or seldom encountered error handling code)
"+optspeed" or "+optSpeed"
generate inline code for integer operations and common constructs.
"+optmath" or "+optMath"
additionally generate inline code for floating point operations
"+inlinemath" or "+inlineMath"
additionally generate inline code for floating point trigonometric functions
"+inlinemath2" or "+inlineMath2"
like above, but do not generate range checks on the receiver
may lead to floating point exceptions when executed, if - for example - a negative receiver gets a square-root (#sqrt) message.

These pragmas must be placed right after a method's selector specification.
example:

    "although the whole class is compiled '+optinline',
     the following class-initialization method is compiled for space,
     since it is only called once ..."

    initialize
	"{ Pragma: +optspace }"

	....
	....
    !

Changing compilation to "+optspace" is useful for methods which are seldom called (such as class-initialization methods, which are usually invoked only once during startup) or error reporting methods, which are only invoked for abnormal events.

The effect of pragmas can be turned off with the "-noPragmas" stc command line argument - with this option, optimizations are under control of command line arguments only.

Currently, not all possible trigonometric functions generate inline code with the inlineMath options - there may be more in the future if there is a need.

Namespace Definition

A comment of the form:

    "{ NameSpace: nameSpaceID }"

declares the namespace into which the following class is to be installed. It must precede any class definition message in the source file (i.e. it should be located somewhere at the file's beginning).
Semantic details of namespaces are described below.

The current project's defaultNameSpace is used if no namespace directive is present in a loaded source file.

Package Definition

A comment of the form:

    "{ Package: 'package-identifier' }"

defines a package identifier, which is attached to all methods and classes which are defined in that file.

This is mostly useful, if individual methods for existing (Smalltalk-) classes are to be filed in, and you want those to be easily identified later.
For example, the "tgen" package adds a few methods to the Object and Array classes. In order to identify those later (i.e. find them quickly for removal), the change file contains a line defining a package identifier of "tgen"; therefore, all of the redefined methods get this as their package identifier.
Thus, you can later use the ProjectView's "browse" menu item to open a browser on all those methods.

The current project's defaultPackage identifier is used if no package directive is present in a loaded source file.
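
A minimal sketch of how the two directives might appear together at the top of a source file. The package identifier 'myTools', the namespace MyWidgets, the class Gadget and its category are all invented for illustration. The relative order of the two directives is an assumption; the text above only requires that the namespace directive precedes the class definition:

    "{ Package: 'myTools' }"

    "{ NameSpace: MyWidgets }"

    Object subclass:#Gadget
        instanceVariableNames:'label'
        classVariableNames:''
        poolDictionaries:''
        category:'MyTools-Widgets'
    !

With such a header, the class would be installed in MyWidgets, and it and its methods would carry the package identifier 'myTools'.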

Multiple Namespaces

Especially when filing in third party code, or when working in a big team, you may encounter conflicts between class names. These conflicts are very inconvenient, since (without namespaces) you would have to manually browse those files (before filing in) and change all names - which is especially inconvenient, since the SystemBrowser cannot be used for this.

To allow a reasonable handling of this case, Smalltalk/X provides (starting with rel3.1) multiple namespaces, which effectively allow you to have two or more classes with the same name to reside in one image/executable.

By default, all classes are defined in the Smalltalk namespace.

To stc-compile a class for another namespace, either add a line as: "{ NameSpace: NamespaceIdentifier }" at the beginning of the ST-source file,
or compile it with the stc command line argument: stc ... -nameSpace=NamespaceIdentifier ... file.st
Both are equivalent, and tell stc, that all globals defined in this module are not to be entered into the default namespace 'Smalltalk', but instead into a space called NamespaceIdentifier.
Also, globals used within the compiled class are first searched for in NamespaceIdentifier, THEN in the Smalltalk namespace.
NamespaceIdentifier must be a single identifier starting with an upper-case letter; underscores are allowed, but spaces or non-alphanumeric characters are not. (i.e. since a global variable will be created for it, it must be a valid global variable identifier)
When a source file is filed in with the FileBrowser, the same mechanism is used if a namespace directive is present.
Otherwise, the currently active project determines the defaultNamespace into which new classes are loaded. By default, this is Smalltalk.
A project's defaultNamespace can be changed by selecting the ProjectView's "default namespace" popupMenu item.

Notice that a directive was chosen to define a namespace within the source file. This was done on purpose, to allow classes in a namespace to be filed out and loaded into another system which does not support multiple namespaces (e.g. VisualWorks). Of course, the loaded class will be placed into the global namespace there, but at least it is possible to get it loaded (unless name conflicts arise ;-) and then rename it in the browser.

Example:

You get some code, which defines a class "Button", which should not conflict with the builtin Button class.
To allow both classes to reside in one image, either load it into some (say) "MyWidgets" namespace using the FileBrowser, or stc-compile it with:

    stc -c -NMyWidgets filename.st

The class defined by the module will then NOT conflict with (i.e. overwrite) the existing Button. The loaded class will not even be visible in the Smalltalk dictionary. However, classes within the same namespace may refer to the new class as Button.

Explicit Naming

In rare cases, it may be necessary to access globals from different namespaces within one module. Consider the above case (Button in MyWidgets), and assume you need access to the original Button from within that module.

To access the original Button from within the module, you can either use the explicit:

    Smalltalk at:#Button

or use the (nonstandard) construct:

    Smalltalk::Button

To access the new Button from other modules, use either:

    MyWidgets at:#Button

or the (nonstandard):

    MyWidgets::Button

For compatibility with VA and VW5.x, the dot-notation:

    MyWidgets.Button

is also supported when filing in code.

Notice:
The following "using"-directive is not yet released (in vsn 3.1);
it is currently being evaluated and tested.

If you don't want to change the sourcecode, you can also define the namespaces to use for searching in a line as:

    "{ Using: name1 name2 ... nameN }"

at the beginning of the source file, or with an stc command line argument (if you don't want to modify the file):

    stc ... -Uname1 -Uname2 ... -UnameN filename.st

The names given define the namespaces to search for globals, in the given order. Thus a line:

    "{ Using: MyWidgets }"

will force searching for globals in the MyWidgets namespace first, THEN in the standard Smalltalk namespace; thus the name Button refers to MyWidgets::Button automatically.

Notes & Recommendations

Although the above solves most name conflicts, you should still try to avoid name conflicts if possible - if you ever plan to port to other Smalltalk systems.
Therefore, do not use namespaces if there are chances to get things working without them.
Since there is currently no standard for multiple namespaces, we highly recommend using the explicit construct (i.e. "Smalltalk at:#Button" or "MyWidgets at:#Button"), since this is compatible with other Smalltalk implementations (i.e. it can be simulated using pool dictionaries or by changing some methods).
If you like this feature, tell others about it - maybe ST/X sets a standard here... ;-)

A trick:

It is sometimes required to add additional protocol to existing classes; for example, some application may like to add a #foo method to an existing base system class. If this method is only required within that application, AND the creation of instances of that class is under that application's control, the following trick encapsulates the added method in a nice way:
Redefine the class in your namespace as:

	"{ NameSpace: MyNameSpace }"

	NameOfSystemClass subclass:#NameOfSystemClass
			...
			...
			...
	<added foo method here>

All instance creations of "NameOfSystemClass" (by code within that namespace) will now create instances of the modified subclass - which inherits and therefore mimics the original class's behavior except for the added foo-method.

There is no need to add foo to the main class.

Of course, the above has its limitations, in that subclasses of the original baseclass are not affected by the new foo method - which could also be called a feature, since those classes are completely protected from any changes done in the private version ...

Final note: The name of a namespace should not be the same as that of some other class (because the same mechanism is used for private classes).

Private Classes

Starting with rel2.11, ST/X allows classes to be declared as being private (i.e. owned) by some other class. These private classes are not visible to the outside of the owning class - there may even be a globally known class with the same name.
Private classes help in organizing large projects in that additional information is hidden and name conflicts are avoided.

Certain restrictions apply to private classes:

their package identifier and category are forced to be the same as of the owning class
their sourceCode must reside in the same file as the owning class's; this especially affects the source code management.
they may not define any inline-C primitive variables, functions or includes. However, if these are placed into the owning class, methods with primitive C code are allowed.
they cannot be filed out separately from their owning class.
the name of a private class may not conflict with a class variable's name of the owning class. If there is a conflict, the class variable takes precedence.
extension methods to private classes are not allowed/possible. I.e. you cannot add or replace a private class's method via an extension from another package.

Like regular classes, private classes are created by a class definition expression. Additional variants of the subclass-creation messages are provided for private classes:

    superclass subclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
	     poolDictionaries:'sharedPool1...'
	     privateIn:OwningClass

or:

    superclass variableSubclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
	     poolDictionaries:'sharedPool1...'
	     privateIn:OwningClass

and so on ...
(notice the privateIn: keyword argument, which replaces the category: of a regular class definition)

Within the owning class, any reference to class refers to the private class - even if some other (global) class with that name exists.
A global class with the same name can be referred to as "Smalltalk::class", or - if you prefer portable code - with "Smalltalk at:#class".
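
To make the template concrete, here is a sketch with invented names: LinkedQueue is a hypothetical owning class and Node its hypothetical private class:

    Object subclass:#Node
        instanceVariableNames:'value next'
        classVariableNames:''
        poolDictionaries:''
        privateIn:LinkedQueue

Methods of LinkedQueue could then distinguish the private class from a global class of the same name (if one exists) like this:

    newNode
        "inside LinkedQueue, the plain name refers to the private class"
        ^ Node new
    !

    globalNodeClass
        "portable way to reach a global class called Node"
        ^ Smalltalk at:#Node
    !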

Technically, a special class variable is created, but its visibility is limited to the owning class.
I.e. private classes are hidden from subclasses of the owning class. This visibility is still to be evaluated and could be changed in future versions. If a private class is to be referenced by a subclass, use access methods in the owner's class protocol (which is better style, anyway).

Use the Systembrowser's "new private class" item in its class list menu, to get a template for private class creation.

Private classes and namespaces use the same basic mechanism - a namespace is actually a dummy class, providing a home for its classes. Therefore, you should also avoid any conflicts between namespace names and class names.

Recommendations:
In contrast to the above described namespace mechanism and fileIn format, private classes use a slightly different definition format, which is NOT backward compatible with systems that do not support this feature.
We therefore recommend, to not use private classes for your projects, but instead use namespaces, if you ever plan to port your application to other Smalltalk systems.
Conflicts within a namespace are much easier to avoid than overall conflicts, and the added encapsulation provided by having classes absolutely private is often not needed.

Of course, since ST/X's system classes are probably never of any interest to other system vendors, these can and do make use of private classes ;-)

Limitations:
A class may not be a subclass of one of its private classes (technically, this constellation is possible to create in the browser, but is not possible to fileIn).

Local ('here'-) Sends

In some situations, it is strictly necessary that a send goes to a locally defined method. For example, many private methods are not supposed to be redefined by subclasses. In standard Smalltalk, there is no way for the implementor of a class to make certain that his own methods are called by self-sends, if other programmers use this class as an (abstract) superclass and create subclasses based on it.
To offer some safety in this situation, Smalltalk/X extends the standard Smalltalk language with a so called hereSend.
It is used similarly to a super-send, using the (new) pseudovariable "here" as receiver. The semantics of the hereSend are much like those of a superSend. However, while a superSend starts the method lookup in the superclass of the class which contains the method, a hereSend starts it in the class containing the method.
(A normal self-send starts it in the class of the receiver - independent of where the method is defined.)

Warning: you should keep in mind that using here-sends will limit the reusability of your class, in that it removes the possibility to change the behavior in subclasses by redefinition of methods.
Also, remember that hereSends are a special ST/X feature. Code using them will probably not be portable to other Smalltalk implementations.
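
A small sketch of a hereSend, using a hypothetical Account class with an instance variable "balance" (all names are made up for illustration):

    withdraw:amount
        "the lookup for #checkLimit: starts in Account itself, so this
         runs Account's version even if a subclass redefines #checkLimit:"
        here checkLimit:amount.
        balance := balance - amount
    !

    checkLimit:amount
        amount > balance ifTrue:[ ^ self error:'insufficient funds' ]
    !

Had the first method used "self checkLimit:amount" instead, a redefinition of #checkLimit: in a subclass would have been invoked for instances of that subclass.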

VisualWorks Exception Pragmas

The following VisualWorks pragmas for context marking are used with the exception handling system. Methods marked with such a flag will mark their context accordingly when executed. This allows for quicker handler finding when an exception is eventually raised (the exception handler looks for the flag in the context chain, instead of checking against a set of selectors).
Notice that these only provide syntactic sugar for functionality which is also provided by the available context protocol (which has been available and used before). Thus, it is always possible to add an explicit statement of the form:

    thisContext markForXXX

to a method's code.

The supported pragmas are:

    <exception: #raise>

marks the context of the method as an exception-raising method. This has the same effect as adding the statement "thisContext markForRaise" to the beginning of the method.

    <exception: #handle>

marks the context of the method as an exception-handling method. This has the same effect as adding the statement "thisContext markForHandle" to the beginning of the method.

    <exception: #unwind>

marks the context of the method as an unwind-handling method. This has the same effect as adding the statement "thisContext markForUnwind" to the beginning of the method.
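
As a sketch of the equivalence described above, the following two variants of a hypothetical method (the selector is invented) should mark their context in the same way; the first uses the pragma, the second the explicit context protocol:

    protect:aBlock
        <exception: #handle>

        ^ aBlock value
    !

    protect:aBlock
        thisContext markForHandle.
        ^ aBlock value
    !

Only one of the two would be present in a real class; they are shown side by side for comparison.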

ST/X Context Pragma

The pragma:

    <context: #return>

marks the context of the method as a possible target of a return message - i.e. the context will be created such that it will allow returning from it via the #return message. If this pragma is not used, the compilers may choose to set up the context as not being returnable (which is slightly faster), and you will get an error when attempting to return from it.
Notice that methods which contain a returning block are always set up as returnable, so this is only required for contexts which are subject to some special context manipulation (such as "thisContext sender sender return").

Primitive Definitions

Definitions and declarations common to all primitive code in all methods can be placed into a single global primitive definition section. Typically, C-include or C-define statements and/or type declarations are placed in these.
A primitive definition is defined with:

    !className class primitiveDefinitions!

    %{
	... anything you like ...
    %}
    ! !

The contents of this chunk will be remembered internally and included whenever methods which contain primitive code are to be compiled.
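
For illustration, a filled-in definition chunk might look like the following sketch. The class name MyTimer and the macro are invented; the include file is a standard C header:

    !MyTimer class primitiveDefinitions!

    %{
    #include <time.h>

    #define MY_TICKS_PER_SECOND 1000
    %}
    ! !

Only declarations and macros are placed here; anything that defines a C function or variable would lead to duplicate definitions if it were included for every compiled method.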

Additional C-functions must be declared in a primitiveFunctions chunk, which is not included when individual methods are compiled (otherwise, you would get linkage errors, due to multiple definitions of the same function).
Finally, C-variables are to be declared in a primitiveVariables chunk.

The SystemBrowser's class menu includes items to show those primitive definitions (for example, see the definitions in ExternalStream).

Method Annotations

Methods can get additional attributes via annotations. The syntax is similar to a resource (see below) or pragma definition, with an arbitrary keyword:

    <keyword: arguments...>

The annotation can be extracted from a method via the #hasAnnotations, #annotations and similar messages. Also, searching or other operations (such as marking menu methods) are possible using annotations.

Some classes use this feature to dynamically extract special methods. For example, the SOAP and REST frameworks will automatically generate RPC-call entries for methods marked as such.
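
A minimal sketch of an annotated method and of querying its annotations. The class OrderBook, the selector and the keyword "webService:" are invented, and fetching the method via #compiledMethodAt: is an assumption about the reflection protocol; only #hasAnnotations and #annotations are taken from the text above:

    listOrders
        <webService: 'orders'>

        ^ orders copy
    !

The annotation could then be inspected from a workspace:

    |m|

    m := OrderBook compiledMethodAt:#listOrders.
    m hasAnnotations ifTrue:[
        m annotations printCR
    ].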

Resource Definitions

Methods may be marked as resource-accessing-methods by adding an annotation like:

    <resource: #resource>

or:

    <resource: #resource ( list of additional symbols ) >

Both of the above forms have no semantic meaning - except for the methods being marked specially. This marking allows that those methods are easier (and especially: quicker) to locate, without a need to scan all of the method's source code.
The launcher provides a menu item (in the classes .. menu), to quickly search for specific resource accesses and open a browser on them.

For example, all methods which depend on the keyboard mapping are marked in ST/X as:

    <resource: #keyboard>

If present, a resource definition must be at the very top of a method's code - before any local variable definition or statement (but after the method's argument specification).
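
The placement rule can be sketched as follows. The method is hypothetical (including the #scrollBy: helper and the key symbols); the point is only that the resource definition comes after the selector line and before the locals:

    keyPress:key x:x y:y
        <resource: #keyboard (#CursorUp #CursorDown)>

        |step|

        step := (key == #CursorUp) ifTrue:[-1] ifFalse:[1].
        self scrollBy:step
    !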

Trick:
Resource definitions can also be used to mark methods for yourself, or for your project management;
a definition like:

    <resource: #toBeFixed>

or:

    <resource: #toBeReviewed>

may help you to locate those methods easily later.

Do not use resource definitions for things which are common to ALL of your methods (i.e. never automatically generate resources containing your name, date or other version information).
Such information should be recorded in a method comment (ST/X already provides a mechanism to do this automatically: the HistoryManager, which can be enabled via the launcher's settings menu does exactly this for you).

If many methods are marked with a resource tag, the fast search will degrade into a slow overall search.

Standard Conventions

For the ST/X system, we use the following conventions when marking a method with a resource symbol:

<resource: canvas>
the method returns an interface specification which describes a view's/subview's/dialog's GUI. Typically, this is a method as generated by the GUI builder tool, and supposed to be used by the UIBuilder to generate a view's components.
Notice: All methods as generated by the GUI building tool are automatically marked with this resource flag.
<resource: menu>
the method returns a menu specification which describes a pullDown- or popUpMenu. Typically, this is a method as generated by the Menu builder tool, and supposed to be used by the UIBuilder to generate the menu.
Notice: All methods as generated by the menu building tool are automatically marked with this resource flag.
<resource: tabList>
the method returns a tabList specification which describes a notebook widget's sub-components. Typically, this is a method as generated by the TabList builder tool, and supposed to be used by the UIBuilder to generate a notebook's tab list.
Notice: All methods as generated by the tabList building tool are automatically marked with this resource flag.
<resource: keyboard>
a keyboard handling method. The resource tag includes the keys which are handled by that method (this allows easy searching for methods which react to a particular key).
<resource: style>
a viewStyle handling method. The resource tag includes the values which are extracted from the view styleSheet. This allows easy searching for methods which are affected by particular styleSheet value changes.
<resource: image>
an image returning method. The method returns a bitmap image from an inlined constant bytearray which defines the pixel values. The image as defined by such methods can be edited via the image editor, which is opened by double clicking on the method in the browser.
<resource: programImage>
an image returning method. The method returns a computed bitmap image (i.e. does not contain the pixel data itself as a literal array, but calls other methods). Such images are not editable, as the method does not itself contain the pixel data.
<resource: programMenu>
the method returns a menu. These are oldStyle methods, which were written without use of the menu building tool.
Over time, these will probably vanish and be replaced by more user-friendly methods, which are based upon the new GUI building tools.
<resource: obsolete>
methods which are obsoleted can be marked as such. They will be marked as such in the system browser, and ignored by code completion helpers.
Typically, methods which are going to vanish in future ST/X versions are kept for a migration period, but marked as obsolete during that period (kept for backward compatibility).
<resource: needsFix>
methods which are known to need fixes/enhancements or rewrites, which are known to not affect the system's operation and are low in the bugFix/enhancement item list.
<resource: example>
method is an example. Can be used to search for example code. Also, methods marked as example are ignored in the preRequisite-search, when the ProjectDefinition class calculates the preRequisites for a package.
<resource: ignoreInPreRequisite>
Methods marked as such are ignored in the preRequisite-search, when the ProjectDefinition class calculates the preRequisites for a package.
<resource: skipInDebuggersWalkBack>
When a debugger is entered (for example, when hitting an exception), the debugger tries to make a useful guess on which method should be the initially selected. Usually, helper methods from the exception handling framework are skipped over to present the method in which the error was reported, not the reporting mechanism. For this, framework and helper methods are marked with this annotation.
<context: #return>
Needed for exception framework developers only (*).
Marks contexts from which a return may be forced via stack unwinding (i.e. a method called by this method may do a "thisContext sender ... sender return"). Because the ability to return from a method in this way requires some extra bookkeeping in the runtime system (saving stack and registers), optimized code is generated by the compilers for normal methods which do not contain returning blocks. Thus, these cannot be returned from using the above mechanism. However, methods annotated with "context: #return" will do the required bookkeeping and will therefore be returnable.
<exception: #raise / exception: #handle / exception: #unwind>
Needed for exception framework developers only (*).
Marks contexts involved in the exception raise mechanism. This is an optimization used by the runtime system to walk along the context chain faster, when searching for particular handling and raising contexts.

(*) "normal" programmers do not need to care about those annotations. They are required (and must be carefully placed) in the exception handling framework, though. Check these especially if you are manipulating the exception handling code in the GenericException class.

Other Annotations (used by frameworks)

<Rest: (GET/PUT name: 'rest-name' argument: 'argument-type' return: 'return-type' comment: '...')>
The REST framework will automatically generate service-call entries for methods marked this way. For details, take a look at the HTTPRestService class.
<inspector2Tab>
marks methods which provide a page for the inspector. Individual classes may provide additional (typically domain-specific, and sometimes application-specific) pages to be shown in an inspector view. One such example is the FileName class, which offers additional pages presenting the file's contents as a convenience.
<javanative: ...>
used by the Java-language subsystem to mark methods which implement a native Java method. (if you do not know what that is, it is probably not of interest to you anyway :-)
<postLoad>
marks methods which should be automatically called after a package has been loaded (with "Smalltalk loadPackage:...").
<timeout: seconds>
used by the SUnit testrunner to set a timeout on the execution of a test. If the test takes longer than the given number of seconds, it will be canceled and marked as failing.
<ignore: ruleClass rationale: reasonString author: authorString>
tells lint to not run a particular lint-rule on this method. <ruleClass> is the name of the rule (i.e. RBIfNilIfNotNilReplaceRule), <reasonString> to document the reason for not checking, and <authorString> the one who added the skip annotation.
<modifier: super>
tells lint, that this method - if redefined in a subclass - should be called via a super-send. I.e. that this method's execution is required for proper operation of subclass instances.
<modifier: final>
tells lint, that this method should not be redefined in subclasses. In Smalltalk, final methods are considered a bad idea, and not enforced (as opposed to Java, where this is actually used as a performance optimization). Think twice.
<modifier: override>
tells lint, that this method is supposed to redefine a method in a superclass.
<foreignSelectors: #( list-of-selectors ) >
tells lint, that selectors listed should not be warned about being unimplemented; typically, these messages will be sent via the doesNotUnderstand mechanism to another system (eg. a remote, dotNET or Java program)
<foreignSelector: selector >
tells lint, that the (single) selector should not be warned about being unimplemented; typically, this message will be sent via the doesNotUnderstand mechanism to another system (eg. a remote, dotNET or Java program)
<pragma: +flagName>
tells the compiler to turn on a parser flag.

Method Privacy

Besides the default of being public, methods may be private, protected or invisible in Smalltalk/X.
The four possible visibilities are:

public
the method may be invoked by any other method.
private
the method may only be invoked from methods within the containing class. If invoked from the outside (or from a subclass), a runtime privacy exception is raised.
protected
the method may only be invoked from methods within the containing class, or from subclass methods. If invoked from the outside, a runtime privacy exception is raised.
invisible (ignored)
the method is transparent to both the containing class and to the outside world. If a superclass implements that message, the corresponding superclass method is invoked - otherwise a doesNotUnderstand-exception is raised, if that message is ever received.

Invisible methods are mostly useful to temporarily disable a method, without actually removing it (for example, during testing/debugging).

Sending via #perform: is always possible, since this is equivalent to a self-send (thus, even private and protected methods can be reached via #perform:).
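
As a sketch, assume a hypothetical class Cache with a private method #recompute, and let aCache be one of its instances. From code outside of Cache:

    aCache recompute.             "raises a runtime privacy exception"
    aCache perform:#recompute.    "reaches the private method"

The second line works because, as stated above, the perform is treated like a self-send.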

Late note:
We did not find this feature very useful (although many ex-C programmers asked for it in the first place), and are probably not going to further support it in new browser versions.

Lexical Stuff

Some extensions to Smalltalk as described in the blue-book were made by ParcPlace up to OW4.1. Some of these extensions are also available in ST/X. Additional extensions were made in Squeak (brace construct for Arrays) and ST/X (extended comments). All of those are documented in the following chapter.

Integer Literals with C-like Radix Prefix

Integer literals prefixed with '0x', '0b' and '0o' will be taken as base-16, base-2 and base-8 integer constants.
Example:

    x := 0x1234.
    y := 0b1010111.
    z := 0o1777.
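
For comparison, each of these literals denotes the same value as the corresponding standard Smalltalk radix literal, so the following expressions should all evaluate to true in a system which accepts the prefixes:

    0x1234    = 16r1234.
    0b1010111 = 2r1010111.
    0o1777    = 8r1777.

The radix form (16r, 2r, 8r) remains the portable choice for code which may be moved to other dialects.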

String Literals with C-like Character Escapes

String literals which are prefixed by a "c"-character will be expanded according to the C-language syntax.
The following escape sequences are supported:
  \n - newline (0x0A)
  \r - return (0x0D)
  \t - tab (0x09)
  \b - backspace (0x08)
  \f - formfeed (0x0C)
  \g - bell (0x07)
  \0 - null (0x00)
  \xXX - single byte hex code (0..255)

Thus, to get a newline inside a string, you can either write:
'hello\world' withCRs (old, portable ANSI style)
or:
c'hello\nworld' "(new non-portable ST/X style)"

Another example:

    x := c'hello\n\tworld'

will assign a string with embedded newline and tab characters.

If you need a backslash inside the string, it should be escaped with a backslash (i.e. doubled).
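
A short sketch of the doubling rule and of the hex escape (the variable names and the path are made up):

    |path line|

    path := c'C:\\temp\\log.txt'.      "contains single backslashes"
    line := c'name\tvalue\x21\n'.      "tab, then '!' (hex 21), then newline"

Without the doubling, "\t" in the path would be taken as a tab character.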

Expanded String Expressions

String literals prefixed by an "e"-character are expanded like C-strings (escape sequences included), but in addition they allow embedded Smalltalk expressions (in braces), whose values are spliced in.
Example:

    |a|

    e'the square root of {a} is {a sqrt}' printCR

The compiler parses this syntactic sugar and generates a bindWith: expression. I.e. the above is equivalent to:

    |a|

    ('the square root of %1 is %2' bindWith:a with:a sqrt) printCR

If you need a brace inside the string, it should be escaped with a backslash.
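
A sketch combining an escaped brace pair with an embedded expression (the variable is made up); under the rules above, this is expected to print the braces around x literally and the value of n in place of {n}:

    |n|

    n := 3.
    e'set \{x\} has {n} elements' printCR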

International String Translation

String literals prefixed by an "i"-character expand escape sequences and allow embedded Smalltalk expressions like e-strings; in addition, the string is translated via the resource translation mechanism.
Example:

    |a|

    i'the square root of {a} is {a sqrt}' printCR

The compiler parses this syntactic sugar and generates a string:with: expression sent to the currently visible resource pack.
I.e. the above is equivalent to:

    (thisContext resources string:'the square root of %1 is %2' with:a with:a sqrt) printCR

If instances of the method's class contain an instance variable named "resources", then the following optimized code is generated:

    (resources string:'the square root of %1 is %2' with:a with:a sqrt) printCR

If you need a brace inside the string, it should be escaped with a backslash.

Of course, your resource file should provide a translation for the string (which is likely not the case for the above example).

ByteArray Literals

Literal byteArrays are created by enclosing the elements in #[ .. ].
The elements must be in the range 0 .. 255 (i.e. 16r00 to 16rFF).
Example:

    x := #[ 1 2 3 4 ].

    masks := #[ 2r10000000
		2r01000000
		2r00100000 ]

Note: This used to be a new feature in the early 90's, but is nowadays supported by all Smalltalk dialects.

Regex Strings

String constants preceded by 'r' will generate a regex match pattern.
Example:

    r'[a-z][a-z0-9]*' matches: someString

The compiler parses this syntactic sugar and generates a regex matcher instance for it.

Extended Array Literals

Integer- and FloatArray literals can be created by adding a type prefix after the '#' (hash) character.
The following prefixes are recognized:

u8 - unsigned bytes (same as ByteArray)
u16 - unsigned shorts (instance of WordArray)
u32 - unsigned ints (instance of IntegerArray)
u64 - unsigned long ints (same as LongIntegerArray)
s8 - signed bytes (instance of SignedByteArray)
s16 - signed shorts (instance of SignedWordArray)
s32 - signed ints (instance of SignedIntegerArray)
s64 - signed long ints (same as SignedLongIntegerArray)
f16 - half floats (instance of HalfFloatArray)
f32 - short floats (instance of FloatArray)
f64 - doubles (instance of DoubleArray)
u1 - bits (instance of BitArray)
b - bits (same)
B - booleans (instance of BooleanArray)

Examples:

    a := #u16( 1 2 3 4 ).
    b := #f16( 1.0 2.0 3 4.0 ).
    c := #B( true false true true ).

Underline in Identifiers

The underline character is treated like a letter when encountered in an identifier.
This extension was added to ST-80 with the introduction of rel4.
Notice that the underline character was parsed as an assignment token in older ST-80 versions, which results in "var1_var2" being parsable both as a single identifier and as an assignment statement.
Currently, ST/X parses the above as an identifier iff no space characters are contained in the construct. I.e. "var1_var2" will parse as a single identifier, while "var1 _ var2" parses as an assignment.
This is compatible with most oldStyle code, but may lead to trouble if spaces are missing; for example, the following code fragment (found in the Squeak Smalltalk system) parses incorrectly, if the underline option is not turned off:

    |foo|

    foo_ 10.

The old-style assignment is supported to allow old Smalltalk code to be loaded; however, it is recommended not to use the underline character as an assignment operator, and to convert old code to the new syntax.
Future Smalltalk/X versions may no longer support this backward-compatible construct. For now, you can change the behaviour in the settings dialog.

The degenerate identifier consisting of an underline alone is only allowed within a keyword-message selector; i.e. the following is legal: "self _:1 _:2 _:3", and compiles to a #_:_:_: message send.
For portable code, you should not use this, since not all other Smalltalk implementations allow this.
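
The spacing rule described above can be illustrated with a small workspace sketch. The variable names are made up for illustration, and the snippet assumes the old-style underline assignment is still enabled in the compiler settings:

```smalltalk
|var1 var2 var1_var2|

var2 := 10.
var1_var2 := 20.    "no spaces: a single identifier named var1_var2"
var1 _ var2.        "spaces around the underline: old-style assignment, same as var1 := var2"

var1 printNL.
var1_var2 printNL.
```

New code should simply write var1 := var2, which parses the same way in every dialect.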

Non Alphanumeric Characters in Symbols

Usually symbols are defined as #xxx, where xxx consists of a letter followed by letters or digits.
There are also keyword and binary symbol literals, such as: #at:put:, #at: or #+.

Symbols with other characters can be specified by enclosing them in single quotes, where the first quote must immediately follow the '#'-character.
Example:

    #'a symbol with spaces'   - spaces
    #'123'                    - starts with a digit
    #'hello_world'            - underscore

Symbols with unprintable characters must be created at runtime, by sending #asSymbol to an appropriate string.

Note: Nowadays supported by all Smalltalk dialects.
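
As a rough sketch of the two ways to obtain such symbols: the quoted literal and the string converted at runtime are expected to refer to the same symbol object, and a symbol with an unprintable character (code 7 is an arbitrary choice here) has to be built from a string:

```smalltalk
|s1 s2 s3|

s1 := #'a symbol with spaces'.
s2 := 'a symbol with spaces' asSymbol.
(s1 == s2) printNL.         "symbols are unique, so both should be the identical object"

"a symbol containing an unprintable character is created at runtime"
s3 := (String with:(Character value:7)) asSymbol.
s3 size printNL.
```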

Empty Local Variable Declaration

The list of local variables may be empty, as in:

    myMethod
	| |

	....

The same is true for blocks:

    x := [:a | ]           - as in-the-book
    x := [:a | | | ]       - with empty locals

Notice, that some Smalltalk dialects may not allow this. If you checked the "warn about possible incompatibilities" flag in the compiler settings, you will get a warning.

Empty Methods

A totally empty method is legal; it is equivalent to a simple ^ self.
Thus:

    myMethod1
    !

    myMethod2
	| |
    !

    myMethod3
	| aLocal |
    !

    myMethod4
	"only a comment"
    !

    myMethod5
	^ self
    !

all behave identically (returning self).
Please add a comment stating that the method is empty by intention, and was not simply left unfinished.

Special 'constants' as Array Literals

Smalltalk/X allows "nil", "true" and "false" to be used in literal arrays. Thus it is possible, to declare an array as:

    #('string1' 'string2' nil 1 1.2 false true wow)

Within an array literal, both simple identifiers AND identifiers prefixed by the #-character are accepted and define a symbol within that literal.
However, if a symbol named 'nil', 'true' or 'false' is required as an array element (i.e. the symbol, not the value), it MUST be prefixed with a #-character, as in:

    #(1 2 3 #nil #true #false true)

In the above example, the 5th element will be the symbol true, while the last element will be the object true. (Which - for your confusion - is the object bound to the symbol true :-)
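
The difference between the symbol and the value can be checked in a workspace. This is only a sketch; it assumes isSymbol is available as a type-test message, as in current ST/X images:

```smalltalk
|arr|

arr := #(1 2 3 #nil #true #false true).

(arr at:5) isSymbol printNL.            "5th element: the symbol #true"
((arr at:7) == true) printNL.           "last element: the object true"
((arr at:5) == (arr at:7)) printNL.     "symbol and value are different objects"
```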

'Double' Constants

Although Smalltalk/X does not differentiate between Floats and Doubles as Smalltalk-80 does (i.e. short floats vs. double-floats), float constants with a trailing "d" are accepted. However, these literals will be compiled in any case into an ST/X Float object (which is the equivalent to a Double in ST-80).

This may be changed in an upcoming version. Rationale:
ST/X uses double IEEE numbers in the Float class mostly for compatibility with Digitalk- and Squeak Smalltalks. If you really need single precision float arithmetic, use instances of the ShortFloat class in ST/X.

End-of-line Comments

Smalltalk/X allows special comments, which start with the character sequence:

    "/   (double-quote followed by slash)

and are treated as a comment to the end of the source line. I.e. everything up to the end-of-line is ignored, even if it contains another comment or a comment-closing character. Within string constants, this character sequence is ignored (i.e. not a comment).

Notice that this feature is NOT compatible with other ST versions; code containing these to-end-of-line comments will not compile on other Smalltalks.

However, it simplifies porting of existing code to ST/X, since parts of the code can be easily commented out, by adding "/ to the beginning of each such line.
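
A minimal sketch of a method using end-of-line comments follows. The selector and the instance variables width and height are hypothetical and only serve as an illustration:

```smalltalk
area
    "/ everything after the quote-slash pair is ignored, even "this" or a lone "
    "/ ^ 0                      a line commented out while porting
    ^ width * height            "/ trailing end-of-line comment
```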

Token Delimited Comments

Smalltalk/X allows special comments, which start with an initial delimiter token sequence:

    "<<TOKEN   (double-quote followed by two less-than characters,
		followed by an arbitrary alphanumeric token word)

The token word must be the only word on the comment start line, otherwise the comment is treated like a regular comment. This was done for backward compatibility, to e.g. allow regular comments like "<<--- See here".
After the token start line, all followup lines are treated as comment lines, up to a line starting with the delimiter token.

For example:

    "<<END
    anything here, even other comments
    or other token-delimited comments
    END

Notice that this feature is NOT compatible with any other ST dialect; code containing these comments will likely not compile on other Smalltalks. However, portability chances are better if the terminating line has the form: TOKEN" (i.e. the token is followed by a double quote). Such comments are recognized by other Smalltalks if NO double quote is contained inside.

This greatly simplifies commenting out big chunks of code, which may or may not contain other comments.

We recommend using such comments only temporarily and removing them before code is published or committed to the source repository, as they will definitely lead to problems if the code ever has to be ported to another Smalltalk dialect.
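
As a sketch of the more portable form mentioned above, the terminating line carries the closing double quote and the commented-out code contains no double quote itself. The token DISABLED and the selector oldComputation are arbitrary choices for illustration:

```smalltalk
"<<DISABLED
result := self oldComputation.
result printNL.
DISABLED"
```

To another Smalltalk, this looks like one ordinary comment that happens to start with <<DISABLED and end with DISABLED.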

Redefining Instance Variables

Notice:
The following has been disabled in all current versions

Stc allows subclasses to define instance variables with the same name as one already defined in a superclass. Normally, doing so is not a good idea and is discouraged. However, in certain situations (i.e. only a binary of the subclass is available, or you do not want to or may not change the source), allowing this makes sense.
The flag "-errorInstVarRedef" tells stc to output a warning instead of an error, and continue with the compilation.
A typical use for this flag is when you want to port a class from some other Smalltalk implementation which includes an instance variable conflict, due to a different internal implementation of one of the class's superclasses in the original Smalltalk vs. Smalltalk/X.
With this flag, the new class will access its own instance variable under that name (which was obviously the original intention when the class was written). This flag should be used only when porting (unmodifiable) code to ST/X - new classes should follow the rules.

Lowercase vs. Uppercase

Normally it is required (by convention - not by language syntax) that all globals and class-variable names start with an upper case character, while instance variables and method/block args & vars start with a lower case character. By default, stc will stop compilation with an error if these rules are not followed. The compiler flags "-errorLowerGlobal" and "-errorUpperLocal" turn these into warning messages. (Even those warnings can be turned off.)
These flags should only be used when porting (unmodifiable) code to ST/X - new classes should follow the rules.

The 'here' Pseudovariable

Smalltalk/X supports another type of send beside the normal 'self' and 'super' sends: the 'here'-send.

To keep this extension compatible with existing code, 'here' is only recognized as the pseudovariable if no other variable named 'here' is defined in the compilation scope.
Thus, if any instance, local or argument variable exists with a name of 'here', the compiler will produce code for a normal send - not creating 'here'-sends.

Read the above section on the semantics and use of 'here'-sends.

Extended Binary Operators

Starting with release 4.1.3, binary operators may consist of up to 3 special characters (the Blue Book specified a maximum of 2 characters).
Thus, it is now possible to define messages named: #'<=>', #'==>', #'===' or even #':=:'.

Binary operators may be constructed from 1 to 3 characters from the following character set:

	-  +  *  /  \
	=  <  >  ~
	&  |  @  #
	,  ?  !  %  :

Excluded are, of course, the assignment #':=' and multiple hash characters (for backward compatibility, ## is interpreted as the hash symbol itself).
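
As an illustration, a three-character binary selector could be defined in fileout format as sketched below. Adding it to Number is purely an assumption for the example, and a given image may already provide a method under this name:

```smalltalk
!Number methodsFor:'comparing'!

<=> aNumber
    "hypothetical three-way comparison: answer -1, 0 or 1"

    self < aNumber ifTrue:[^ -1].
    self > aNumber ifTrue:[^ 1].
    ^ 0
! !
```

With such a definition loaded, an expression like (3 <=> 4) is parsed as an ordinary binary message send.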

Unicode String- and Character Literals

Starting with release 5.2, unicode is allowed in string- and character literals. CharacterArrays will now be instances of String, Unicode16String or Unicode32String, depending on the highest codepoint present in the string.
The string classes have been enhanced to both handle Unicode (isNationalLetter, isUppercase, asUppercase etc.) and to perform automatic conversion as required. (For example, when concatenating 8 bit and 16bit strings).
Notice that, although ST/X does handle 32 bit strings, both the X11 and the Windows display interfaces may still be limited to 16 bit strings at the time of this writing. Therefore, we recommend not going beyond a codepoint of 16rFFFF.

The external source code file format is now utf8.

For backward compatibility, ST/X marks utf8-encoded files by writing an encoding pragma:

	"{ Encoding: utf8 }"

near the beginning of generated source files, and detects utf8 encoded files by the presence of the "encoding:" string somewhere near the beginning of the file.
If no such pragma is found in a source file, the file is assumed to be iso8859-1 (i.e. latin1) encoded.

Tools which read or write external files (i.e. the bytecode compiler, the external stc-compiler, Workspace and FileBrowser) look for and care for this pragma.

Please note that this format is backward compatible with other (non-utf8) Smalltalks, and it is still possible to file in ST/X source files into Squeak, VisualWorks etc.
This is actually even possible if non-ASCII characters are present in String literals, as these would appear in the target system as funny strings, which could (in theory) still be utf8-decoded (manually in the browser, at runtime, or automatically during fileIn).
Sorry, but portability is lost if non-ASCII Character literals are present in the filed-out code - these will lead to a syntax error when loaded into a non-utf8 Smalltalk system.
We therefore recommend NOT using non-ASCII character literals; instead, use Strings wherever possible, and use a "(Character value:xxx)" construct (which is evaluated at compile time by the ST/X compilers) when required.
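
A short sketch of the recommended construct: the codepoint 16r20AC (the euro sign) is an arbitrary example, and the source text itself stays pure ASCII:

```smalltalk
|euro|

"written without a non-ASCII character literal in the source"
euro := Character value:16r20AC.
euro asInteger printNL.         "the codepoint, 8364 in decimal"
```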

Notice that the language has only been extended for String- and Character literals; non-ascii letters/digits are still NOT allowed for message selectors, variable- and class names etc.
This was done on purpose - allowing this would probably make the code less readable, and also much less portable. Also, it is a good idea to force all programmers to stick to (at least) the same language in their program code (and comments). We'd even recommend using English (just consider how hard it will be to read and understand a program written in Chinese, Russian or Czech, if you are not a native speaker).

More Codechecks

The stc compiler performs some more checks on your code; currently, this may result in classes being accepted by the incremental compiler but failing to compile with stc, which reports an error.

Additional checks performed are:

local variables which are used but never assigned
Example:

    !Someclass methodsFor:'foo and bar'!

    foo
	|var|

	var ifTrue:[ ... do something ... ].
    !

or:

    bar
	|var1 var2|

	var1 := var2 + 1
    !

If compiled using the incremental bytecode compiler, the above methods will lead to a runtime error (doesNotUnderstand), while stc refuses to compile these right away.

empty local variable declaration in methods or blocks
Example:

    !Someclass methodsFor:'foo and bar'!

    foo
	||

	... do something ...
    !

stc will report a syntax error. In the Blue Book, it is not specified if the above is valid or not. Currently, the incremental bytecode compiler accepts it (to allow easier fileIn of alien code), while stc refuses to compile this.
Future versions will be fixed to show common behavior - either both accepting or refusing this construct.

Limitations

Restricted Subclassing

These classes cannot be subclassed:

UndefinedObject
SmallInteger
True
False

Classes of which subclasses may not add named instance variables:

Float (restriction removed with release 2.10.5)

There are a few other classes whose subclasses may behave strangely. For example, instances of a Symbol subclass may not be seen as true symbols in many places; subclasses of String will return an instance of String when asked to copy, convert etc.

In general, be very careful in subclassing any of:

Float
String
Symbol
Context & BlockContext
Method & Block

These restrictions also apply to the incremental byteCode compiler.

Late note:
Some restrictions and strange behavior were removed with release 2.10.5.3;
now, you can subclass Context, Method, Block and Behavior AND have these objects be treated correctly by the VM's runtime system (i.e. accept and treat them like other codeObjects and classObjects respectively).

Use of Namespaces and Private Classes

The following restrictions apply to namespaces and/or private classes:

the name of a namespace may not be the same as the name of any other class or global variable.
the name of a private class may not be the same as the name of any class variable in its owning class.
although possible, we recommend not using nested private classes
if a class, which is not in the default (i.e. Smalltalk-) namespace, is filedOut and loaded into another Smalltalk system (i.e. VisualWorks), the class will be installed as a regular (i.e. globally visible) class there.
For portable code, namespaces should not be used or used with great care.
private classes CANNOT be filedOut and loaded into another system without manually changing either the generated source file or the other system's scanner/parser.

No Continuations

In ST/X, contexts are not fully usable as continuations; this means for example, that a method's context cannot be restarted or resumed, once the context has returned.
This affects and complicates implementations of backtracking algorithms, coroutines and other fancy control tricks.

It is planned for such features to be at least partially supported in future versions.

Known Bugs & Limitations

The current version of ST/X has some limitations and bugs, of which some are going to be removed with one of the next versions, others will probably remain.
There are workarounds for these limitations.

Block Local Variables

stc cannot always generate inline code for blocks with local variables. It will occasionally generate less performant full block calls. This affects the block arguments of ifTrue:, ifFalse:, whileTrue:, whileFalse:, timesRepeat:, to:do: and to:by:do:.

For to:do: and to:by:do:, this bug will show up only for Integer arguments where stc can deduce Integer types at compile time.

This happens if the stc compiler thinks, that there is a chance for the block to be exposed to the outside world via subblocks or thisContext. Often, stc is too conservative in this analysis.

Workaround: use method variables instead of block locals (no performance is lost, since inlined blocks access method locals as fast as block locals).
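
The workaround can be sketched as follows; the variable names are illustrative only. Both loops compute the same sum of squares, and only the place where the helper variable is declared differs:

```smalltalk
|sum tmp|

"block local: stc may fall back to a full (non-inlined) block here"
sum := 0.
1 to:10 do:[:i |
    |sq|

    sq := i * i.
    sum := sum + sq
].

"workaround: declare the helper as a method local instead"
sum := 0.
1 to:10 do:[:i |
    tmp := i * i.
    sum := sum + tmp
].
sum printNL.        "385"
```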

This has been fixed with release 2.10.4.

Cascades Requiring Temporaries

Cascades which contain a message as the original receiver and thus need a temporary to hold the result of the original send are not implemented, i.e. the following code will not compile with stc:

	(anObject xxx) foo; bar; baz

while

	anObject foo; bar; baz

will be ok.

Workaround: add a temporary and keep the result of the first send there. Do the cascade on this temporary.
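
A small sketch of this rewrite, using OrderedCollection as a stand-in for the receiver expression:

```smalltalk
|coll|

"instead of:  OrderedCollection new add:1; add:2; add:3"
coll := OrderedCollection new.      "keep the result of the first send in a temporary"
coll add:1; add:2; add:3.           "cascade on the temporary"
coll printNL.
```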

This has been fixed with release 2.10.5.

Conflicting Names of Local Variables and Structures/Typedefs

Names of C-Structures, structure fields and typedefs may not conflict with the names of method or block local variables. "stc" will produce wrong code, leading to a syntax error in the C-compilation phase. Example:

    !MyClass class primitiveDefinitions!

    %{
	struct abc {
	    int field1;
	    char field2;
	};
    %}
    ! !

    !MyClass methodsFor:'foo'!

    method
	|local1 field2|
	...

will lead to an error, since the name field2 is used both in a C structure and as a method local. This may also happen with other C names (i.e. typedefs, structure names, enum values etc.). Care should be taken, since these name conflicts may also be due to some #define in an included C header file.

Compiling code with such conflicts will usually lead to errors in the C-compilation phase. Since stc does not parse (and understand) the structure of primitive code, it will not notice this conflict.

Workaround: rename the local variables.

Limited Number of Method & Block Arguments

Currently, there is a limit of 15 arguments to methods. It is NOT possible to evaluate methods with more arguments by using perform:withArguments:.
The number of block arguments is limited to 7.

Workaround: If more argument values have to be passed, the arguments should be put into a collection or another special object, which is then passed as the argument.
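
For blocks, this might look like the following sketch, where eight values are passed to a one-argument block by wrapping them in a literal array:

```smalltalk
|blk|

"one collection argument instead of eight separate block arguments"
blk := [:args | (args at:1) + (args at:8)].
(blk value:#(10 20 30 40 50 60 70 80)) printNL.     "90"
```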

Limited Number of Method & Block Locals

Currently (and maybe forever) there is a maximum of 127 local variables in both methods and blocks. Although this limit is hard to reach for normal code, it may show up when Smalltalk code is created automatically - i.e. by some translators.

A suggested workaround is to create some collection and put local values into that.

Limited Number of Method & Block Temporaries

In the code created by stc, nested expressions evaluate their intermediate results into (anonymous) temporary variables. These are placed into the context (and could, theoretically be inspected).

There is (currently) a limit of 31 temporaries, leading to a maximum expression nesting of 31 (since for every nesting level, one such temporary is needed).

The compiler reuses temporaries as much as possible, so this limit is hardly ever reached - if it is, rewrite the complicated expression, using method locals as explicit temporaries.

Workaround: Simplify the expression(s). Use local variables as explicit temporaries.

Limited Line Number Info

For interpreted bytecode, there is a limit of 255 lines for which line number information can be recorded. Larger methods can be compiled, but no debugging line number information is available for code after the 255th line. (The reason is, of course, that a byte is used for lineNumber information; we do not want to waste more memory and/or use a more complicated variable number encoding scheme.)
When encountered in the debugger, all lines beyond the 255th line are highlighted (since the debugger cannot tell exactly where the program's state of execution is).
This limitation is relaxed to 32767 in stc.

Workaround: There is no workaround - simplify your methods.
In practice, such long methods are very rare - mostly appearing in automatically generated code (which is not subject to debugging anyway ;-).

No Large Integer Constants

This has been completely fixed with release 4.x:
LargeInteger constants with any radix are supported, up to a maximum value of 2^1023-1

This has been partially fixed with release 2.10.6:
LargeInteger constants with radix 2, 8, 10 and 16 are now supported, up to a maximum value of 2^1023-1

Stc cannot currently generate LargeInteger constants. Versions before 2.10.2 did not even detect overflow in integer constants, silently generating wrong code. Stc versions after 2.10.2 will quit compilation with an error.
You have to make sure that your integer constants fit into 31 bits (including the sign bit, this gives 30 bits of absolute value). Thus, the following code will lead to a compilation error:

    |v|

    v := 16r12345678.          "ok, fits into 31 bits"
    v printNL.

    v := 16r87654321.          "not ok, does not fit into 31 bits"
    v printNL.

The built-in incremental compiler DOES handle large integer constants correctly; the above only applies to stc-compilation.

Workaround:

(this is only a temporary workaround; later versions of stc will be able to handle & generate large constants.)

Add a class variable (such as MYCONST) and initialize it in the class's #initialize method from a string.
I.e. instead of:

    ...
    x := 12345678901234567890.
    ...

use:

    ...
    classVariableNames:'MYCONST'
    ...

    initialize
    MYCONST := '12345678901234567890' asInteger.
    ...

    ...
    x := MYCONST.
    ...

No Pool Dictionaries

Versions of ST/X before 5.3 do not support pool dictionaries.

Starting with release 5.3, SharedPools are implemented as classes whose class variables are imported and visible by other classes. The pools are defined as subclass of SharedPool, and the values should be set in the sharedPool's #initialize method.
See OpenGLConstants as an example.
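
A shared pool along these lines might be sketched in fileout format as shown below. All class, variable and category names are hypothetical, and the use of the poolDictionaries: argument to import the pool is an assumption based on the conventional class definition message:

```smalltalk
SharedPool subclass:#MyConstants
    instanceVariableNames:''
    classVariableNames:'MaxRetries DefaultPort'
    poolDictionaries:''
    category:'Examples-Pools'
!

!MyConstants class methodsFor:'initialization'!

initialize
    "set the pool values once, when the class is loaded"

    MaxRetries := 3.
    DefaultPort := 8080
! !

Object subclass:#MyClient
    instanceVariableNames:''
    classVariableNames:''
    poolDictionaries:'MyConstants'
    category:'Examples-Pools'
!

!MyClient methodsFor:'accessing'!

port
    "DefaultPort is visible here because MyConstants is imported as a pool"

    ^ DefaultPort
! !

MyConstants initialize!
```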

As a side effect of the implementation (in the current 5.3 release), any class's set of class variables can be imported by another class as a sharedPool. Do not depend on this, as this feature may be removed without notice in future versions.

Workaround (for pre 5.3 systems):

Use a dictionary stored in a class or global variable.
Access your poolVariables as

    myDict at:name

Initialize the dictionary in the class's initialize method using:

    myDict at:name1 put:value.
	...
    myDict at:nameN put:value.

Empty Chunks

Stc cannot (currently) handle empty chunks. This means that it is not possible to compile a file which contains code such as:

    ...

    "
     commented out method definition
    "
    !

    ...

Instead, you have to include the chunk separator ('!') in the comment:

    ...
    "
     commented out method definition
    !
    "

    ...

This is of course incompatible with the Smalltalk fileOut format definition and will be fixed in later stc versions.

<cg at exept.de>

Doc $Revision: 1.106 $ $Date: 2021/03/13 18:24:51 $

Smalltalk/X Language Definition & Differences

Contents

Class Definition ("class-definition files" only)

Simple Subclassing

Instance Variables

Class Variables

Pool Dictionaries

Implementation of ClassVariables

Subclasses with Indexed Instance Variables

Subclasses with Byte-Valued Indexed Instance Variables

Subclasses with Word-Valued Indexed Instance Variables

Subclasses with Float- and Double-Valued Indexed Instance Variables

Subclasses with Long, Signed-Word, Signed-Long Indexed Instance Variables

Class Comment ("class-definition files" only)

Class Instance Variables ("class-definition files" only)

Method Syntax

Line Number Definition

Symbol Definition

Type Hints / Declarations

Code Generator Pragmas

Namespace Definition

Package Definition

Explicit Naming

Notes & Recommendations

ST/X Context Pragma

Other Annotations (used by frameworks)