
Smalltalk/X Language Definition & Differences



This document describes the source file format as expected by the stc compiler, language differences to Smalltalk-80 and known bugs & limitations of ST/X.

One of the unique features of Smalltalk/X is its ability to compile Smalltalk code into statically compiled binary code files (shared libraries). They contain fully compiled machine code and do not require dynamic compilation (from bytecode) at execution time.

This compilation scheme is NOT used while working in the browser. For any code which is added or modified after the initial startup, a traditional bytecode compiler (accelerated by a Just-In-Time compiler) is used.

However, the ultimate goal of your development is usually to deploy either an executable program, or a set of libraries as a stand-alone program. For this, the stc compiler is used.

Files processed by the stc (Smalltalk-to-C) compiler are usually generated by either filing out class code directly from the SystemBrowser, or indirectly, by checking some class into the source code repository (also via the SystemBrowser) and then checking it out into a directory via a "cvs update" or "cvs checkout" command. The latter could even be an automatic process, for example controlled by a Jenkins build system.

Source Text Files

Of course, as these files are regular text files, you can alternatively use any text editing tool to edit and manipulate these files, working in the traditional edit-compile-link mode if you prefer. In this mode, think of the file as describing one class; comparable to programming in C++ or similar languages. However, before doing so, read on and be aware of some pitfalls (especially the chunk-format, and the resulting "!"-doubling).

Stc File Format

Files compiled by stc must be in Smalltalk's fileout format. This means that the file consists of Smalltalk expressions, separated by '!'-characters (the so-called "chunk separator" or "bang").

Bangs (i.e. '!'-characters) within the text have to be doubled; this applies in particular to exclamation marks within comments and string literals.
Since ST/X replaces doubled '!'-characters by a single '!' when filing in, you will see only single '!'-characters in the browser. You therefore have to be very careful when editing a source file with the File Browser or another editor.
Notice that the SystemBrowser takes care of this doubling when classes are filed out - but the File Browser does not, since it treats Smalltalk source code files just like any other text file.
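
As a minimal sketch of the doubling rule, a hand-edited source file could contain the following chunk (the class name Greeter and the selector greet are made up for illustration). Both the comment and the string literal contain a doubled bang in the file; after filing in, each shows up as a single '!' in the browser:

    !Greeter methodsFor:'greeting'!

    greet
        "say hello - loudly!!"
        Transcript showCR:'Hello world!!'
    ! !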

Currently, stc can only compile files which contain either one single class definition (with optional private classes), or a "methods-only file", which contains methods, but no class definition.

The source syntax for compiled Smalltalk implements a subset of the messages used to create/manipulate classes and methods. Expressions other than those listed below are not supported.

The first expression in a "class-definition file" must be a class-definition expression; a "methods-only file" may only consist of method definitions (i.e. "methodsFor"-expressions).

Class Definition ("class-definition files" only)

The stc compiler accepts the following (and only those) class definition expressions:

Simple Subclassing

    superclass subclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
	     poolDictionaries:'sharedPool1 sharedPool2...'
to define class as a subclass of superclass.
The subclass will have indexed instance variables if (and only if) the superclass has indexed instance variables.

Instance Variables

The instance variables of class are those of its superclass(es) and additionally 'instVar1', 'instVar2', ...

Class Variables

The class variables of class are those of its superclass(es) and additionally 'classVar1', 'classVar2', ...

Class variables are visible in both class- and instance methods, of the defining class and of every subclass (and subclasses of subclasses). Class variables are shared (unless redefined) - meaning that access is to the same physical memory "slot" from both the defining class and all subclasses. You can think of class variables as globals with limited accessibility: only the defining class and its subclasses 'see' them.

See below for class instance variables, which are class private (i.e. each class provides its own physical "slot").

There are some classes (currently UndefinedObject and SmallInteger) which CANNOT be subclassed.

For the curious:
The reason is that instances of these are not real objects, but are marked by a special tag-bit or object-pointer value. Thus these instances do not have a class field in memory. This makes it impossible for the VM (= virtual machine or runtime-system) to know the class of such a sub-instance.

There are some classes to which you CANNOT add instance variables.

For the curious:
These are especially Object, SmallInteger and all classes which are also known by the VM and/or the compiler. The reasons are:

Pool Dictionaries

Versions prior to rel 5.3 do not allow/support poolDictionaries. In those versions, the "poolDictionaries:" argument must be an empty string.
In newer versions, all pool variables of each listed pool are imported and visible both for class- and for instance methods.

Implementation of ClassVariables

Technically, class variables are implemented as globals with a specially constructed name.
However, you should not have to care about or depend on this, except for the fact that class variables are visible when inspecting the Smalltalk dictionary and can be accessed easily from C-functions as globals (named "ClassName_ClassVarName").

Do not depend on any specific implementation of class variables; the current implementation may change without notice. Actually, it is planned to separate classVariables from Smalltalk globals in future ST/X versions and use multiple dictionaries within the VM.

Subclasses with Indexed Instance Variables

    superclass variableSubclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
to define class as a subclass of superclass with indexed instance variables even if superclass had no indexed instance variables. An error will be generated, if the superclass is a variableByte- or variableWord class.
Notice that a class may be defined with both named and indexed instance variables.

Subclasses with Byte-Valued Indexed Instance Variables

    superclass variableByteSubclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
to define class as a subclass of superclass with indexed instance variables which are byte-valued (0 .. 255) integers.

An error will be generated, if the superclass is a variable class (i.e. has indexed instances) AND it has NO byte valued elements.
Notice that a class may be defined with both named and indexed instance variables.

Subclasses with Word-Valued Indexed Instance Variables

    superclass variableWordSubclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
to define class as a subclass of superclass with indexed instance variables which are word-valued (0 .. 16rFFFF) integers (i.e. unsigned shorts in the C world).

It is an error if superclass has non-word indexed instance variables.
Notice that a class may be defined with both named and indexed instance variables.

Subclasses with Float- and Double-Valued Indexed Instance Variables

     superclass variableFloatSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
     superclass variableDoubleSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
to define class as a subclass of superclass with indexed instance variables which are shortfloat- or doublefloat-valued floating-point numbers (i.e. floats and doubles in the C world).

Float- and DoubleArrays were added to support 3D graphic packages (i.e. GL), which use arrays of float internally to represent matrices and vectors. They provide much faster access to their elements than the alternative using byteArrays and floatAt:/doubleAt: access methods.

Also, storage is much more dense than in arrays, since they store the values directly instead of pointers to the float objects.

A 1000-element floatArray needs 1000*4 + OHDR_SIZE = 4012 bytes, while a 1000-float-element array needs 1000*4 + OHDR_SIZE + 1000*(12+8) = 24012 bytes (each float itself requires 8 bytes plus a 12-byte header).
Notice that a class may be defined with both named and indexed instance variables.
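
The storage comparison above can be reproduced with a short workspace expression. This is only a sketch: it assumes a 12-byte object header (OHDR_SIZE) and 4-byte object pointers, as in the figures above, and all variable names are made up for illustration.

    |n ohdrSize floatArrayBytes arrayBytes|

    n := 1000.
    ohdrSize := 12.
    "one 4-byte float per element, stored inline"
    floatArrayBytes := (n * 4) + ohdrSize.
    "one 4-byte pointer per element, plus one boxed float
     (12-byte header + 8 bytes of data) per element"
    arrayBytes := (n * 4) + ohdrSize + (n * (12 + 8)).
    Transcript showCR:floatArrayBytes printString.
    Transcript showCR:arrayBytes printString.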

Subclasses with Long, Signed-Word, Signed-Long Indexed Instance Variables

     superclass variableLongSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
     superclass variableSignedWordSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
     superclass variableSignedLongSubclass:#class
	      instanceVariableNames:'instVar1 instVar2...'
	      classVariableNames:'classVar1 classVar2...'
to define class as a subclass of superclass with long (32 bit integers in the range 0 .. 16rFFFFFFFF), signed short (i.e. -16r8000 .. 16r7FFF) and signed long (-16r80000000 .. 16r7FFFFFFF) indexed instance variables
(i.e. unsigned int, short and int in the C world).

These types were added for easier bulk data exchange with C language functions. They are not currently used in Smalltalk itself.
Notice that a class may be defined with both named and indexed instance variables.

Be aware that indexable classes with float, double, signedWord, long and signedLong elements may NOT be available in other Smalltalk implementations.
Using them may make your application non-portable to other systems
(however, these can be easily simulated by subclassing ByteArray and redefining the access methods).

Class Comment ("class-definition files" only)

A class comment may be defined with an expression of the form:
    ClassName comment:'some string'
An alternative to using a comment is to define class methods under the category "documentation", consisting of comments only.
Empty methods do not use ANY code space in ST/X, and have the positive effect of not eating up data space in the Smalltalk executable (which the comment does).

Class Instance Variables ("class-definition files" only)

A class may have class instance variables; these MUST be declared before the first class method is declared. The declaration has the form:
    ClassName class instanceVariableNames:'string with varNames'
Do not confuse class variables with class instance variables.

Only one such class-instance-variable definition is allowed per input file.
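
The difference between the two kinds of variables can be sketched as follows. The class Tracked, its category and all variable and selector names are hypothetical, and the class definition assumes the usual form with a trailing category: keyword:

    Object subclass:#Tracked
        instanceVariableNames:''
        classVariableNames:'TotalCount'
        poolDictionaries:''
        category:'Examples'
    !

    Tracked class instanceVariableNames:'localCount'
    !

    !Tracked class methodsFor:'counting'!

    noteCreation
        "TotalCount is one slot shared with all subclasses;
         localCount is a separate slot in each class"
        TotalCount isNil ifTrue:[ TotalCount := 0 ].
        localCount isNil ifTrue:[ localCount := 0 ].
        TotalCount := TotalCount + 1.
        localCount := localCount + 1
    ! !

If a subclass of Tracked receives noteCreation, it increments the shared TotalCount but only its own localCount.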

Method Definition

The expressions following the class definition are to be method definitions of the form:
    !ClassName methodsFor:'method-category'!




    ! !
    !ClassName class methodsFor:'category'!



    ! !
"class-definition files" may only contain method definitions for the class defined in the class definition.

"method-only files" may contain methods for any class - but no class definitions.

Instance methods and class methods may be in any order.
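
As an illustration of the chunk layout, the following sketch defines two instance methods and one class method for a hypothetical class Account with an instance variable balance (all names are made up). Each method ends with a single bang; a category is closed by an empty chunk ("! !"):

    !Account methodsFor:'accessing'!

    balance
        "answer the current balance"
        ^ balance
    !

    balance:aNumber
        "set the balance"
        balance := aNumber
    ! !

    !Account class methodsFor:'instance creation'!

    new
        "answer a new account with a zero balance"
        ^ self basicNew balance:0
    ! !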

To allow compilation of classes filed out from ENVY, stc also recognizes the selectors privateMethodsFor: and publicMethodsFor:. In addition, the special selector ignoredMethodsFor: tells stc to ignore all followup methods up to an empty chunk.

Method Syntax

Method and expression syntax is à la Smalltalk-80 (with a few extensions).

There is a limit on the maximum number of arguments a method can be defined with and messages can be sent with (currently 15).
This limit will be removed eventually, allowing an arbitrary number of arguments.

Other limits are:

For very complicated expressions (especially when these are generated automatically), the temporary limit could be reached in theory. In practice, so far no Smalltalk code (available PD programs and users' application code) has ever hit those limits.

Since most terminals cannot display the Smalltalk assignment character '<-' (backarrow as one character with the same ASCII code as '_'), the scanner also accepts the character sequence ":=" (colon-equal) to express assignment.
This is compatible with similar extensions found in other Smalltalk implementations. Of course, the '_' is also accepted.

Use ':='.
Support for '_' may be removed in later versions. Also, Smalltalk/X, like newer Smalltalk-80 versions, allows underscores in identifiers - no longer treating them as assignment.

Although not defined in the book, Smalltalk-80 expressions seem to require (blank) characters to separate tokens (i.e. "Point origin: point1 corner: point2").

Smalltalk/X does not need these (i.e. "Point origin:point1 corner:point2" is fine)

I do not know at the moment whether this causes any problems when porting Smalltalk/X code to other Smalltalk implementations (if required, the fileOut-methods may have to be changed to add blanks).

Assignment and init-Expressions

In contrast to Smalltalk-80's fileIn format (where any expression is allowed), expressions other than those above must be of the form:
    Smalltalk at:#name put:constant
(constant may be any integer, float, string, symbol, true, false or nil)

or of the form:

    classname initialize
(classname must be the name of the class defined in this source-file)
These expressions allow globals to be set to a predefined value at startup and/or class initialization. Example:
 Smalltalk at:#MyVariable put:true !

Example Class

    Point subclass:#Point3D
	instanceVariableNames:'z'
	classVariableNames:''
	poolDictionaries:''
	category:'Examples'
    !

    Point3D comment:'
     this class defines a point in 3-dimensional space
    '!

    !Point3D class methodsFor:'instance creation'!

    x:newX y:newY z:newZ
	"answer a new point with coordinates newX, newY and newZ"
	^ ((self basicNew) x:newX y:newY) z:newZ
    ! !

    !Point3D methodsFor:'accessing'!

    z
	"Answer the z coordinate"
	^ z
    !

    z:newZ
	"set the z coordinate"
	z := newZ
    ! !

    !Point3D methodsFor:'printing'!

    printString
	"answer my printString"
	^ super printString , '@' , z printString
    ! !

Semantic Details

The following only lists non obvious semantic details - for a description of the Smalltalk language, please refer to standard literature.

Evaluation Order

Expression arguments and receiver are evaluated left to right, starting with the receiver (with exceptions as described below).

Smalltalk is an eager evaluating language - that is, all arguments are evaluated before the message send - even if not used by the called code.
Lazy evaluation can be simulated partially by using blocks as arguments, or by special code (see the LazyValue class and its documentation).
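
A small workspace sketch of deferring evaluation with a block, using the standard Dictionary protocol at:ifAbsent: (the variable name and the keys are made up). The block passed as second argument is an object like any other; its statements are only executed if the receiver decides to evaluate it:

    |d|

    d := Dictionary new.
    d at:#a put:1.
    "key is present: the block is passed, but never evaluated"
    d at:#a ifAbsent:[ Transcript showCR:'computing default for a'. 0 ].
    "key is missing: the block is evaluated and its value answered"
    d at:#b ifAbsent:[ Transcript showCR:'computing default for b'. 0 ].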

Side Effects

If any argument of an expression has a side effect on an instance variable, and the expression uses that instance variable, it is NOT DEFINED if the original or modified value of that instance variable is used.
For example:
    Object subclass:#SomeClass
	instanceVariableNames:'i'
	...

    i:aNumber
	i := aNumber

    increment
	i := i + 1.
	^ i

    undefinedBehavior
	^ self increment + i

    undefinedBehavior2
	^ i + self increment

    ...
	Transcript showCR:'undefinedBehavior returns: '
			  , (SomeClass new i:0) undefinedBehavior printString.
	Transcript showCR:'undefinedBehavior2 returns: '
			  , (SomeClass new i:0) undefinedBehavior2 printString.
In the #undefinedBehavior method, the value used for i in the #+ message may or may not be the incremented value.

Never depend on the particular behavior of a Smalltalk implementation or compiler; the semantics here are not defined. Even in ST/X, the behavior may differ between versions, or between the incremental and batch compiler
(actually, in the current ST/X version, the incremental compiler returns 2 for the first, and 1 for the second method. In contrast, stc compiled code returns 2 for both, because it does not always evaluate arguments left to right - especially for arithmetic operations).

This is to be considered a bug, because it conflicts with the evaluation order as defined above (although depending on it is bad coding style...).
In practice, there have been only very minor problems due to this in the past.


The behavior of your program is undefined, if instance variables of the receiver are accessed in a method, after a #become: message was sent to the receiver.
If the #become: changed the receiver into some other object with fewer or no instance variables, even a nonrecoverable fatal error may occur. Otherwise, the access will be to the corresponding instance variable slot as defined by the other class.
For example, the following may lead to unexpected behavior (or even a nonrecoverable fatal error):
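    Object subclass:#SomeClass
	instanceVariableNames:'i'
	...

    readAfterBecome
	"selector name is illustrative only"
	i := 0.
	self become:somethingElse.
	^ i

    writeAfterBecome
	"selector name is illustrative only"
	self become:somethingElse.
	i := 0.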
The use of message sends to access the instance variables removes the above danger:
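    Object subclass:#SomeClass
	instanceVariableNames:'i'
	...

    i
	^ i

    i:newValue
	i := newValue

    readAfterBecome
	"selector name is illustrative only"
	self i:0.
	self become:somethingElse.
	^ self i

    writeAfterBecome
	"selector name is illustrative only"
	self become:somethingElse.
	self i:0.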
ST/X typically falls into a segmentation violation exception, which can be caught by an appropriate exception handler.

Literal Array of a Method

Stc generated code does not (currently) access the literal array; instead, the literal array of a method is created for the debugger (to find senders) only. Modifying the literal array (which is bad coding style anyway) has no effect on machine compiled code.

In contrast, bytecode-interpreted methods use the values found in the literal array. A modified literal array will change the behavior of the method. This modified behavior is not reflected in the method's source code.

And finally, the dynamic compiler (JITTER) generates code which accesses literals inline (i.e. it takes the literal array's contents at compilation time and creates inline constant accesses). Thus, JITTED code behaves like statically compiled code, in that changing the literal array does not affect the execution. However, since the system may choose to flush its dynamic code cache and recompile at any later time, the changed literal array may eventually affect the execution.

For these reasons, we highly recommend keeping the literal arrays untouched.

(Experts may do so, but have to ensure that the method gets recompiled, by converting a statically compiled method into a dynamic one, and flushing the code cache entry for this method explicitly.)

Builtin Methods

For a number of message sends, both the stc- and the incremental compiler create inline code which performs the function without doing any message send.
Redefinition of any method listed below will have no effect on your program; also, tracing and breakpointing of these methods is not possible (since they are never executed).

In theory, many more methods could be inlined; the current set represents a compromise between performance (inlined code is much faster) and flexibility (inlined methods cannot be redefined/traced).

In general, only methods for which a changed semantic would make the system unusable, or would change the semantic completely in a non-ANSI-standard way, are inlined. With the stc compiler, the degree of inlining can be further controlled by command line arguments.

Inlined messages:

In addition, some constructs are partially inlined - special code is generated to avoid a message send in common cases.
Partially inlined messages:

The above list may be incomplete - depending on the ST/X version, more messages could be inlined in your system.

Extensions to Smalltalk-80 (Blue Book Version)

Brace Array Constructor

The brace construct "{ expr . ... . expr }" for array instantiation at runtime was added as syntactic sugar to Squeak/Pharo. This is also supported by ST/X.

The construct:

    { expr1 . expr2 ... exprN }
is semantically equivalent to:
    Array with:expr1 with:expr2 ... with:exprN
for an arbitrary number of expressions.

This makes passing of array-arguments or the return of multiple values much easier. Notice that individual expressions are separated by a period (i.e. statement separator).
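
A brief workspace sketch (variable names are made up): each element is an arbitrary expression that is evaluated at runtime, in contrast to a literal array, whose elements are fixed at compile time.

    |a b|

    "brace constructor: three expressions, evaluated when this line runs"
    a := { 1 + 2 . 'abc' size . Date today }.
    "the equivalent explicit form"
    b := Array with:(1 + 2) with:('abc' size) with:(Date today).
    Transcript showCR:a size printString.
    Transcript showCR:(a first = b first) printString.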

Compiler Directives

Comments of the form:
    "{ something ... }"
are recognized by the stc-compiler as directives. Since directives are hidden within comments, these will be ignored by other Smalltalk systems; making ST/X sources transferable to other Smalltalks.

Line Number Definition

The directive:
    "{ Line: n }"
tells stc that line-numbering should continue with line n. Line numbers in following warning- and error-messages will be relative to n.
This feature is used internally, with incremental stc-compilation to machine code.
It could also be useful for systems where Smalltalk is passed as an intermediate language to stc (i.e. compiler-compilers or code generators) to base line numbering on the original file.

Symbol Definition

The directive:
    "{ Symbol: aSymbolString }"
tells stc that a primitive wants to access a symbol. Stc includes a definition for that symbol and generates code to create the symbol at startup time; within the primitive, the symbol can be referred to by a C-conforming name as described in ``How to write inline C code''.

Symbols can also be created using the (slower) _MKSYMBOL() function at runtime. This also allows C-Strings to be converted to symbols.
(example: in the XWorkstation-class where keypress-characters are converted to symbols like #Home, #Down etc.)

This directive is no longer needed and may not be supported in future versions. Use the @symbol-mechanism, since it relieves you of the need to know about name translations.

Type Hints / Declarations

The directive:
    "{ Class: className }"
after an instance-, class-, or local-variable declaration tells stc, that this variable will always be assigned an object of class: className.

Various optimizations in the code are possible if the type of an object is known (especially for simple types such as "SmallInteger", "Character", "Point" or "String").

Currently, everything except SmallInteger, Float and Point definitions in method-local declarations is ignored by the compiler.

Even with these type declarations, the compiler still generates code which checks assignments for correct typing (i.e. an assignment of a float to a SmallInteger-typed variable will generate a runtime error).
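
A sketch of how such hints could look on method-local variables (the selector sumTo: and the variable names are made up). The directive is placed directly after the declared name, inside the declaration:

    sumTo:n
        "answer the sum of the integers 1 .. n"

        |sum "{ Class: SmallInteger }"
         i   "{ Class: SmallInteger }" |

        sum := 0.
        i := 1.
        [i <= n] whileTrue:[
            sum := sum + i.
            i := i + 1
        ].
        ^ sum

Because the hints are ordinary comments, other Smalltalk systems simply ignore them.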

With the improvements of the type-tracker and optimizations performed in stc, this feature seems now much less useful in many situations - especially, when considering the limited reusability of the generated code.
(see benchmark results of sieve/sieveWithInteger, atAllPut/atAllPut2 etc. some show very small differences between the untyped and typed versions)

We recommend using type hints only in performance critical code, for fully debugged code.

Code Generator Pragmas

The stc compiler's code generation strategy can be controlled on a per-class basis with command line options such as "+optspace", "+optinline" etc.

Sometimes, finer control (i.e. over individual methods) is needed. Comments of the form:

    "{ Pragma: keyword }"
instruct stc to change its code generation strategy for a single method. Keyword must (currently) be one of:

These pragmas must be placed right after a method's selector specification.
    "although the whole class is compiled '+optinline',
     the following class-initialization method is compiled for space,
     since it is only called once ..."

    initialize
	"{ Pragma: +optspace }"

Changing compilation to "+optspace" is useful for methods which are seldom called (such as class-initialization methods, which are usually invoked only once during startup) or error reporting methods, which are only invoked for abnormal events.

The effect of pragmas can be turned off with the "-noPragmas" stc command line argument - with this option, optimizations are under control of command line arguments only.

Currently, not all possible trigonometric functions generate inline code with the inlineMath options - there may be more in the future if there is a need.

Namespace Definition

A comment of the form:
    "{ NameSpace: nameSpaceID }"
declares the namespace into which the following class is to be installed. It must precede any class definition message in the source file (i.e. it should be located somewhere at the file's beginning).
Semantic details of namespaces are described below.

The current project's defaultNameSpace is used if no namespace directive is present in a loaded sourceFile.

Package Definition

A comment of the form:
    "{ Package: 'package-identifier' }"
defines a package identifier, which is attached to all methods and classes which are defined in that file.

This is mostly useful, if individual methods for existing (Smalltalk-) classes are to be filed in, and you want those to be easily identified later.
For example, the "tgen" package adds a few methods to the Object and Array classes. In order to identify those later (i.e. find them quickly for removal), the change file contains a line defining a package identifier of "tgen"; therefore, all of the redefined methods get this as their package identifier.
Thus, you can later use the ProjectView's "browse" menu item to open a browser on all those methods.

The current project's defaultPackage identifier is used if no package directive is present in a loaded sourceFile.
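
Put together, the beginning of a source file using both directives might look like the following sketch. The package identifier, namespace, class name and category are all hypothetical, and the class definition assumes the usual form with a trailing category: keyword:

    "{ Package: 'myproject' }"

    "{ NameSpace: MyWidgets }"

    Object subclass:#Button
        instanceVariableNames:'label'
        classVariableNames:''
        poolDictionaries:''
        category:'Examples'
    !

Both directives appear before the class definition, so they apply to the class and to all methods defined in the file.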

Multiple Namespaces

Especially when filing in third party code or when working in a big team, you may encounter name conflicts between class names. These conflicts are very inconvenient, since (without namespaces) you would have to manually browse those files (before filing in) and change all names - which is especially tedious, since the SystemBrowser cannot be used for this.

To allow a reasonable handling of this case, Smalltalk/X provides (starting with rel3.1) multiple namespaces, which effectively allow you to have two or more classes with the same name to reside in one image/executable.

By default, all classes are defined in the Smalltalk namespace.

Notice that a directive was chosen to define a namespace within the sourceFile. This was done on purpose, to allow classes in a namespace to be filed out and loaded into another system which does not support multiple namespaces (i.e. VisualWorks) - of course, there the loaded class will be placed into the global namespace - but at least it is possible to get it loaded (unless nameConflicts arise ;-) and rename it then in the browser.


You get some code, which defines a class "Button", which should not conflict with the builtin Button class.
To allow both classes to reside in one image, either load it into some (say) "MyWidgets" namespace using the FileBrowser, or stc-compile it with:

    stc -c -NMyWidgets filename.st
The class defined by the module will then NOT conflict with (i.e. overwrite) the existing Button. The loaded class will not even be visible in the Smalltalk dictionary. However, classes within the same namespace may refer to the new class as Button.

Explicit Naming

In rare cases, it may be necessary to access globals from different namespaces within one module. Consider the above case (Button in MyWidgets), and you need access to the original Button from within that module.

To access the original Button from within the module, you can either use the explicit:

    Smalltalk at:#Button
or use the (nonstandard) construct:
    Smalltalk::Button
To access the new Button from other modules, use either:
    MyWidgets at:#Button
or the (nonstandard):
    MyWidgets::Button

For compatibility with VA and VW5.x, the dot-notation is also supported when filing in code.

The following "using"-directive is not yet released (in vsn 3.1)
(it is currently being evaluated and tested).

If you don't want to change the sourcecode, you can also define the namespaces to use for searching in a line as:

    "{ Using: name1 name2 ... nameN }"
at the beginning of the source file, or with an stc command line argument (if you don't want to modify the file):
    stc ... -Uname1 -Uname2 ... -UnameN filename.st
The names given define the namespaces to search for globals, in the given order. Thus a line:
    "{ Using: MyWidgets }"
will force searching for globals in the MyWidgets namespace first, THEN in the standard Smalltalk namespace; thus the name Button refers to MyWidgets::Button automatically.

Notes & Recommendations

A trick:
It is sometimes required to add additional protocol to existing classes; for example, some application may like to add a #foo method to an existing base system class. If this method is only required within that application, AND the creation of instances of that class is under that application's control, the following trick encapsulates the added method in a nice way:
Redefine the class in your namespace as:
	"{ NameSpace: MyNameSpace }"

	NameOfSystemClass subclass:#NameOfSystemClass
	<added foo method here>
All instance creations of "NameOfSystemClass" (by code within that namespace) will now create instances of the modified subclass - which inherits and therefore mimics the original class's behavior except for the added foo-method.

There is no need to add foo to the main class.

Of course, the above has its limitations, in that subclasses of the original baseclass are not affected by the new foo method - which could also be called a feature, since those classes are completely protected from any changes done in the private version ...

Final note:
The name of a namespace should not be the same as that of some other class (because the same mechanism is used for private classes).

Private Classes

Starting with rel2.11, ST/X allows classes to be declared as being private (i.e. owned) by some other class. These private classes are not visible to the outside of the owning class - there may even be a globally known class with the same name.
Private classes help in organizing large projects in that additional information is hidden and name conflicts are avoided.

Certain restrictions apply to private classes:

Like regular classes, private classes are created by a class definition expression. Additional variants of the subclass-creation messages are provided for private classes:

    superclass subclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
	     privateIn:owningClass
    superclass variableSubclass:#class
	     instanceVariableNames:'instVar1 instVar2...'
	     classVariableNames:'classVar1 classVar2...'
	     privateIn:owningClass
and so on ...
(notice the privateIn: keyword argument, which replaces the category: of a regular class definition)

Within the owning class, any reference to class refers to the private class - even if some other (global) class with that name exists.
A global class with the same name can be referred to as "Smalltalk::class", or - if you prefer portable code - with "Smalltalk at:#class".

Technically, a special class variable is created, but its visibility is limited to the owning class.
I.e. private classes are hidden from subclasses of the owning class. This visibility is still to be evaluated and could be changed in future versions. If a private class is to be referenced by a subclass, use access methods in the owners class protocol (which is better style, anyway).

Use the Systembrowser's "new private class" item in its class list menu, to get a template for private class creation.

Private classes and namespaces use the same basic mechanism - a namespace is actually a dummy class, providing a home for its classes. Therefore, you should also avoid any conflicts between namespace names and class names.

In contrast to the above described namespace mechanism and fileIn format, private classes use a slightly different definition format, which is NOT backward compatible with systems that do not support this feature.
We therefore recommend not using private classes in your projects, but using namespaces instead, if you ever plan to port your application to other Smalltalk systems.
Conflicts within a namespace are much easier to avoid than overall conflicts, and the added encapsulation provided by having classes absolutely private is often not needed.

Of course, since ST/X's system classes are probably never of any interest to other system vendors, these can and do make use of private classes ;-)

A class may not be a subclass of one of its private classes (technically, this constellation is possible to create in the browser, but is not possible to fileIn).

Local ('here'-) Sends

In some situations, it is strictly necessary that a send goes to a locally defined method. For example, many private methods are not supposed to be redefined by subclasses. In standard Smalltalk, there is no way for the implementor of a class to make certain that his own methods are called by self-sends, if other programmers use this class as an (abstract) superclass and create subclasses based on it.
To offer some safety in this situation, Smalltalk/X extends the standard Smalltalk language with a so called hereSend.
It is used similarly to a super-send, using the (new) pseudovariable "here" as receiver. The semantics of the hereSend are much like those of a superSend. However, while a superSend starts the method lookup in the superclass of the class which contains the method, a hereSend starts it in the class containing the method.
(A normal self-send starts it in the class of the receiver - independent of where the method is defined.)
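The difference between the two lookups can be sketched as follows. The classes Base and Derived (with Derived being a subclass of Base) and all selectors are hypothetical and assumed to be defined elsewhere; the code is shown in fileout format:

```smalltalk
!Base methodsFor:'demo'!

helper
    ^ 'Base'
!

viaSelf
    "self-send: the lookup starts in the class of the receiver"

    ^ self helper
!

viaHere
    "here-send: the lookup starts in Base, the class containing this method"

    ^ here helper
! !

!Derived methodsFor:'demo'!

helper
    "redefinition in the subclass"

    ^ 'Derived'
! !
```

With these definitions, "Derived new viaSelf" would answer 'Derived', while "Derived new viaHere" would answer 'Base', because the redefinition in Derived is bypassed by the here-send.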

You should keep in mind that using here-sends will limit the reusability of your class, in that it removes the possibility to change the behavior in subclasses by redefinition of methods.

Also, remember that hereSends are a special ST/X feature. Code using them will probably not be portable to other Smalltalk implementations.

VisualWorks Exception Pragmas

The following VisualWorks pragmas for context marking are used with the exception handling system. Methods marked with such a flag will mark their context accordingly when executed. This allows for quicker handler finding when an exception is eventually raised (the exception handler looks for the flag in the context chain, instead of checking against a set of selectors).
Notice that these only provide syntactic sugar for functionality which is also provided by the context protocol (which has been available and used before). Thus, it is always possible to add an explicit statement of the form:
    thisContext markForXXX
to a method's code.

The supported pragmas are:

    <exception: #raise>
marks the context of the method as an exception-raising method. This has the same effect as adding the statement "thisContext markForRaise" to the beginning of the method.

    <exception: #handle>
marks the context of the method as an exception-handling method. This has the same effect as adding the statement "thisContext markForHandle" to the beginning of the method.

    <exception: #unwind>
marks the context of the method as an unwind-handling method. This has the same effect as adding the statement "thisContext markForUnwind" to the beginning of the method.
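As an illustration of the equivalence described above, the following two alternative definitions of the same method would mark their context in the same way. The selector guardedEvaluate: is hypothetical and only serves as an example:

```smalltalk
"variant 1: using the pragma"
guardedEvaluate:aBlock
    <exception: #handle>

    ^ aBlock value

"variant 2: using the explicit context protocol"
guardedEvaluate:aBlock
    thisContext markForHandle.
    ^ aBlock value
```

Only one of the two variants would be installed in a class; they are shown side by side for comparison.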

ST/X Context Pragma

The pragma:
    <context: #return>
marks the context of the method as a possible target of a return message - i.e. the context will be created such that it will allow returning from it via the #return message. If this pragma is not used, the compilers may choose to set up the context as not returnable (which is slightly faster), and you will get an error when attempting to return from it.
Notice that methods which contain a returning block are always set up as returnable, so this is only required for contexts which are subject to some special context manipulation (such as "thisContext sender sender return").

Primitive Definitions

Definitions and declarations common to all primitive code in all methods can be placed into a single global primitive definition section. Typically, C-include or C-define statements and/or type declarations are placed in these.
A primitive definition is defined with:
    !className class primitiveDefinitions!

	... anything you like ...
    ! !
The contents of this chunk will be remembered internally and included whenever methods which contain primitive code are to be compiled.

Additional C-functions must be declared in a primitiveFunctions chunk, which is not included when individual methods are compiled (otherwise, you would get linkage errors, due to multiple definitions of the same function).
Finally, C-variables are to be declared in a primitiveVariables chunk.

The SystemBrowser's class menu includes items to show those primitive definitions (for example, see the definitions in ExternalStream).

Method Annotations

Methods can get additional attributes via annotations. The syntax is similar to a resource (see below) or pragma definition, with an arbitrary keyword:
    <keyword: arguments...>
The annotation can be extracted from a method via the #hasAnnotations, #annotations and similar messages. Also, searching or other operations (such as marking menu methods) are possible using annotations.
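A minimal sketch of how an annotation might be attached and queried is shown below. The class DemoService, the selector ping and the annotation keyword remoteCall: are hypothetical; fetching the method via compiledMethodAt: is an assumption about the reflection protocol, so check the Behavior protocol in your system:

```smalltalk
"a method carrying an annotation (defined in the hypothetical class DemoService)"
ping
    <remoteCall: 'ping'>

    ^ 'pong'

"querying the annotation, e.g. in a workspace"
| m |
m := DemoService compiledMethodAt:#ping.     "assumed accessor for the method object"
m hasAnnotations.                            "expected to answer true for the method above"
m annotations.                               "answers the collection of annotations"
```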

Some classes use this feature to dynamically extract special methods. For example, the SOAP framework will automatically generate SOAP-call entries for methods marked as such.

Resource Definitions

Methods may be marked as resource-accessing-methods by adding an annotation like:
    <resource #resource>
    <resource #resource ( list of additional symbols ) >
Both of the above forms have no semantic meaning - except for the methods being marked specially. This marking allows that those methods are easier (and especially: quicker) to locate, without a need to scan all of the method's source code.
The launcher provides a menu item (in the classes .. menu), to quickly search for specific resource accesses and open a browser on them.

For example, all methods which depend on the keyboard mapping are marked in ST/X as:

    <resource #keyboard>

If present, a resource definition must be at the very top of a method's code - before any local variable definition or statement (but after the method's argument specification).

Resource definitions can also be used to mark methods for yourself, or for your project management;
a definition like:

    <resource #toBeFixed>
    <resource #toBeReviewed>
may help you to locate those methods easily later.

Do not use resource definitions for things which are common to ALL of your methods (i.e. never automatically generate resources containing your name, date or other version information).
Such information should be recorded in a method comment (ST/X already provides a mechanism to do this automatically: the HistoryManager, which can be enabled via the launcher's settings menu does exactly this for you).

If many methods are marked with a resource tag, the fast search will degrade into a slow overall search.

Standard Conventions

For the ST/X system, we use the following conventions when marking a method with a resource symbol:

(*) "Normal" programmers do not need to care about those annotations. They are required (and must be carefully placed) in the exception handling framework, though. Check these especially if you are manipulating the exception handling code in the GenericException class.

Other Annotations (used by frameworks)

Method Privacy

Beside the default of being public, methods may be private, protected or invisible in Smalltalk/X.
The four possible visibilities are public, private, protected and invisible.

Invisible methods are mostly useful to temporarily disable a method, without actually removing it (for example, during testing/debugging).

Sending via #perform: is always possible, since this is equivalent to a self-send (thus, even private and protected methods can be reached via #perform:).

Late note:
We did not find this feature very useful (although many ex-c programmers asked for it in the first place), and are probably not going to further support it in new browser versions.

Lexical Stuff

Some extensions to Smalltalk as described in the blue-book were made by ParcPlace up to OW4.1. Some of these extensions are also available in ST/X. Additional extensions were made in Squeak (brace construct for Arrays) and ST/X (extended comments). All of those are documented in the following chapter.

ByteArray Literals

Literal byteArrays are created by enclosing the elements in #[ .. ]. The elements must be in the range 0 .. 255 (i.e. 16r00 to 16rFF).
    x := #[ 1 2 3 4 ].

    masks := #[ 2r10000000
		2r00100000 ]

Underline in Identifiers

The underline character is treated like a letter when encountered in an identifier.
This extension was added to ST-80 with the introduction of rel4.
Notice that the underline character was parsed as an assignment token in older ST-80 versions, which results in "var1_var2" being parsable both as a single identifier and as an assignment statement.
Currently, ST/X parses the above as an identifier iff no space characters are contained in the construct. I.e. "var1_var2" will parse as a single identifier, while "var1 _ var2" parses as an assignment.
This is compatible with most oldStyle code, but may lead to trouble if spaces are missing; for example, the following code fragment (found in the Squeak Smalltalk system) parses incorrectly, if the underline option is not turned off:

    foo_ 10.

The old-style assignment is supported to allow old Smalltalk code to be loaded; however, it is recommended not to use the underline character as an assignment operator and to convert old code to the new syntax.
Future Smalltalk/X versions may no longer support this (backward compatible construct). For now, you can change the behaviour in the settings dialog.

The degenerated identifier consisting of an underline alone is only allowed within a keyword-message selector; i.e. the following is legal: "self _:1 _:2 _:3", and compiles to a #_:_:_: message send.
For portable code, you should not use this, since not all other Smalltalk implementations allow this.

Non Alphanumeric Characters in Symbols

Usually symbols are defined as #xxx, where xxx consists of a letter followed by letters or digits.
There are also keyword and binary symbol literals, such as: #at:put:, #at: or #+.

Symbols with other characters can be specified by enclosing them in single quotes, where the first quote must immediately follow the '#'-character.

    #'a symbol with spaces'   - spaces
    #'123'                    - starts with a digit
    #'hello_world'            - underscore

Symbols with unprintable characters must be created at runtime, by sending #asSymbol to an appropriate string.
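The following workspace sketch illustrates both forms; the variable names are arbitrary, and the tab character (code 9) merely stands in for any unprintable character:

```smalltalk
| s1 s2 s3 |

s1 := #'hello world'.                 "quoted symbol literal"
s2 := 'hello world' asSymbol.         "the same symbol, created at runtime"
s1 == s2.                             "true - symbols with the same characters are identical"

"a symbol containing an unprintable character must be created at runtime"
s3 := (String with:$a with:(Character value:9) with:$b) asSymbol.
```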

Empty Local Variable Declaration

The list of local variables may be empty, as in:
	| |

the same is true for blocks:
    x := [:a | ]           - as in-the-book
    x := [:a | | | ]       - with empty locals
Notice, that some Smalltalk dialects may not allow this. If you checked the "warn about possible incompatibilities" flag in the compiler settings, you will get a warning.

Empty Methods

A totally empty method is legal; it is equivalent to a simple ^ self. Each of the following method bodies:

	| |

	| aLocal |

	"only a comment"

	^ self
all behave identically (returning self).
Please add a comment stating that the method is empty by intention, and was not simply left unfinished.

Special 'constants' as Array Literals

Smalltalk/X allows "nil", "true" and "false" to be used in literal arrays. Thus it is possible, to declare an array as:
    #('string1' 'string2' nil 1 1.2 false true wow)
Within an array literal, both simple identifiers AND identifiers prefixed by the #-character are accepted and define a symbol within that literal.
However, if a symbol named 'nil', 'true' or 'false' is required as an array element (i.e. not the value), it MUST be preceded by a #-character, as in:
    #(1 2 3 #nil #true #false true)
In the above example, the 5th element will be the symbol true, while the last element will be the object true. (Which, to add to the confusion, is the object bound to the symbol true :-)
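The distinction can be checked in a workspace with a sketch like the following (the literal is the one from the text, extended by the two arbitrary elements foo and #foo):

```smalltalk
| a |

a := #(1 2 3 #nil #true #false true foo #foo).

(a at:4) isSymbol.        "true - the symbol #nil, not the nil object"
(a at:5) == #true.        "true - the symbol, not the boolean"
(a at:7) == true.         "true - the boolean object"
(a at:8) == (a at:9).     "true - foo and #foo denote the same symbol inside a literal array"
```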

'Double' Constants

Although Smalltalk/X does not differentiate between Floats and Doubles as Smalltalk-80 does (i.e. short floats vs. double-floats), float constants with a trailing "d" are accepted. However, these literals will be compiled in any case into an ST/X Float object (which is the equivalent to a Double in ST-80).

This may be changed in an upcoming version. Rationale:
ST/X uses double IEEE numbers in the Float class mostly for compatibility with Digitalk- and Squeak Smalltalks. If you really need single precision float arithmetic, use instances of the ShortFloat class in ST/X.

End-of-line Comments

Smalltalk/X allows special comments, which start with the character sequence:
    "/   (double-quote followed by slash)
and are treated as a comment to the end of the source line. I.e. everything up to the end-of-line is ignored, even if it contains another comment, or comment closing character. Within string constants, this character sequence is ignored (i.e. not a comment).

Notice, that this feature is NOT compatible to other ST versions; code containing these to-end-of-line comments will not compile on other Smalltalks.

However, it simplifies porting of existing code to ST/X, since parts of the code can be easily commented out, by adding "/ to the beginning of each such line.
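A short sketch of the three cases described above (the variable names are arbitrary):

```smalltalk
| x y |

x := 1.         "/ ignored up to the end of the line, even a " character in here
"/ x := 2.      "this statement, including its regular comment, is disabled"
y := 'a "/ inside a string constant does not start a comment'.
```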

Token Delimited Comments

Smalltalk/X allows special comments, which start with an initial delimiter token sequence:
    "<<TOKEN   (double-quote followed by two less characters,
		followed by an arbitrary alphanumeric token word)
The token word must be the only word on the comment start line, otherwise the comment is treated like a regular comment. This was done for backward compatibility, to e.g. allow regular comments like "<<--- See here".
After the token start line, all followup lines are treated as comment lines, up to a line starting with the delimiter token.

For example:

    "<<TOKEN
    anything here, even other comments
    or other token-delimited comments
    TOKEN

Notice that this feature is NOT compatible with any other ST dialect; code containing these comments will likely not compile on other Smalltalks. However, portability chances are better if the terminating line has the form: TOKEN" (i.e. the token is followed by a double quote). Such comments are recognized by other Smalltalks if NO double quote is contained inside.

This much simplifies the commenting out of big chunks of code, which may or may not contain other comments.

We recommend to use such comments only temporarily and remove them before code is published or committed to the source repository, as they are definitely leading to problems when the code has to be ever ported to another Smalltalk dialect.

Redefining Instance Variables

The following has been disabled in all current versions

Stc allows subclasses to define instance variables with the same name as already defined in superclasses. Normally, to do so is not a good idea and discouraged. However, in certain situations (i.e. only a binary of the subclass is available or you do not want to or may not change the source), allowing this makes sense.
The flag "-errorInstVarRedef" tells stc to output a warning instead of an error, and continue with the compilation.
A typical use for this flag is when you want to port a class from some other Smalltalk implementation, which includes an instance variable conflict due to a different internal implementation of one of the class's superclasses in the original Smalltalk vs. Smalltalk/X.
With this flag, this new class will access its own instance variable under that name (which was obviously the original intention when the class was written). This flag should be used only when porting (unmodifiable) code to ST/X - new classes should follow the rules.

Lowercase vs. Uppercase

Normally it is required (by convention - not by language syntax) that all globals and class-variable names start with an upper case character, while instance variables and method/block args & vars start with a lower case character. By default, stc will stop compilation with an error if these rules are not followed. The compiler flags "-errorLowerGlobal" and "-errorUpperLocal" turn these into warning messages. (even those warnings can be turned off.)
These flags should only be used when porting (unmodifiable) code to ST/X - new classes should follow the rules.

The 'here' Pseudovariable

Smalltalk/X supports another type of send beside the normal 'self' and 'super' sends: the 'here'-send.

To make this extension be compatible with existing code, 'here' is only recognized as the pseudoVariable, if no other variable named as here is defined in the compilation scope.
Thus, if any instance-, local or argument variable exists with a name of 'here', the compiler will produce code for a normal send - not creating 'here'-sends.

Read the above section on the semantic and use of 'here'-sends.

Extended Binary Operators

Starting with release 4.1.3, binary operators may consist of up to 3 special characters (the Blue Book specified a maximum of 2 characters).
Thus, it is now possible to define messages named: #'<=>', #'==>', #'===' or even #':=:'.

Binary operators may be constructed from 1 to 3 characters from the following character set:

	-  +  *  /  \
	=  <  >  ~
	&  |  @  #
	,  ?  !  %  :
excluded is, of course, the assignment: #':=', and multiple hash characters (for backward compatibility, ## is interpreted as the hash symbol itself).
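As an illustration, a three-character operator might be defined as follows. This is only a sketch: placing the method in Magnitude and the answered values -1, 0 and 1 are choices made for this example, and it assumes that no method with this selector is already defined there:

```smalltalk
!Magnitude methodsFor:'comparing'!

<=> aMagnitude
    "three-way comparison (hypothetical example);
     answer -1, 0 or 1 if the receiver is less than, equal to or greater than the argument"

    self < aMagnitude ifTrue:[^ -1].
    self = aMagnitude ifTrue:[^ 0].
    ^ 1
! !
```

With this definition, "3 <=> 5" would answer -1 and "5 <=> 5" would answer 0.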

Unicode String- and Character Literals

Starting with release 5.2, unicode is allowed in string- and character literals. CharacterArrays will now be instances of String, Unicode16String or Unicode32String, depending on the highest codepoint present in the string.
The string classes have been enhanced to both handle Unicode (isNationalLetter, isUppercase, asUppercase etc.) and to perform automatic conversion as required. (For example, when concatenating 8 bit and 16bit strings).
Notice that, although ST/X does handle 32 bit strings, both the X11 and the windows display interfaces may be still limited to 16bit strings at the time of this writing. Therefore, we recommend not going beyond a codepoint of 16rFFFF.

The external source code file format is now utf8.

For backward compatibility, ST/X marks utf8-encoded files by writing an encoding pragma:

	"{ Encoding: utf8 }"
near the beginning of generated source files, and detects utf8 encoded files by the presence of the "encoding:" string somewhere near the beginning of the file.
If no such pragma is found in a source file, the file is assumed to be iso8859-1 (i.e. latin1) encoded.

Tools which read or write external files (i.e. the bytecode compiler, the external stc-compiler, Workspace and FileBrowser) look for and care for this pragma.

Please note, that this format is backward compatible to other (non-utf8) Smalltalks, and it is still possible to file-in ST/X source files into Squeak, VisualWorks etc.
This is actually even possible if non-ascii characters are present in String literals, as these would appear in the target system as funny strings, which could (in theory) be still utf8 decoded (manually in the browser, or at runtime or automatically during fileIn).
Sorry, but portability is lost if non-ascii Character literals are present in the filed-out code - these will lead to a syntax error when loaded into a non utf8 Smalltalk system.
We therefore recommend NOT using non-ascii character literals; instead, use Strings wherever possible, and use a "(Character value:xxx)" construct (which is evaluated at compile-time by the ST/X compilers) when required.
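For instance, a euro sign (codepoint 16r20AC) can be used without placing a non-ascii character literal into the source. The following is a sketch with arbitrary variable names; it relies on the automatic conversion between 8 bit and 16 bit strings mentioned above:

```smalltalk
| euro price |

euro := Character value:16r20AC.                 "EURO SIGN, no non-ascii literal needed"
price := '100 ' , (Unicode16String with:euro).   "8 bit and 16 bit strings are concatenated"
```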

Notice that the language has only been extended for String- and Character literals; non-ascii letters/digits are still NOT allowed for message selectors, variable- and class names etc.
This was done on purpose - allowing this would probably make the code less readable, and also much less portable. Also, it is a good idea to force all programmers to stick to (at least) the same language in their program code (and comments). We'd even recommend using English (just consider how hard it will be to read and understand a program written in Chinese, Russian or Czech, if you are not a native speaker).

More Codechecks

The stc compiler performs some more checks on your code; this (currently) may result in classes being accepted by the incremental compiler, but fail to compile with an error being reported by stc.

Additional checks performed are:


Restricted Subclassing

These classes cannot be subclassed:

Classes of which subclasses may not add named instance variables:

There are a few other classes, of which subclasses may behave strangely. For example, instances of a Symbol subclass may not be seen as true symbols in many places; subclasses of String will return an instance of String when asked to copy, convert etc.

In general, be very careful in subclassing any of:

These restrictions also apply to the incremental byteCode compiler.

Late note:
Some restrictions and strange behavior were removed with release;
now, you can subclass Context, Method, Block and Behavior AND have these objects be treated correctly by the VM's runtime system (i.e. accept and treat them like other codeObjects and classObjects respectively).

Use of Namespaces and Private Classes

The following restrictions apply to namespaces and/or private classes:

No Continuations

In ST/X, contexts are not fully usable as continuations; this means for example, that a method's context cannot be restarted or resumed, once the context has returned.
This affects and complicates implementations of backtracking algorithms, coroutines and other fancy control tricks.

It is planned for such features to be at least partially supported in future versions.

Known Bugs & Limitations

The current version of ST/X has some limitations and bugs, of which some are going to be removed with one of the next versions, others will probably remain.
There are workarounds for these limitations.

Block Local Variables

stc cannot always generate inline code for blocks with local variables. It will occasionally generate less performant full block calls. This affects the block arguments of ifTrue:, ifFalse:, whileTrue:, whileFalse:, timesRepeat:, to:do: and to:by:do:.

For to:do: and to:by:do:, this bug will show up only for Integer arguments where stc can deduce Integer types at compile time.

This happens if the stc compiler thinks, that there is a chance for the block to be exposed to the outside world via subblocks or thisContext. Often, stc is too conservative in this analysis.

Workaround: use method variables instead of block locals (there is no performance loss, since inlined blocks access method locals as fast as block locals).
This has been fixed with release 2.10.4.
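For affected stc versions, moving a block local into the method's locals might look like the following sketch (the selector and the variable names are hypothetical):

```smalltalk
sumOfSquaresUpTo:n
    "sq is declared as a method local;
     declaring it inside the block as [:i | |sq| ... ] could prevent inlining"

    | sum sq |

    sum := 0.
    1 to:n do:[:i |
        sq := i * i.
        sum := sum + sq
    ].
    ^ sum
```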

Cascades Requiring Temporaries

Cascades which use the result of a message send as the receiver, and thus need a temporary to hold that result, are not implemented; i.e. the following code will not compile with stc:
	(anObject xxx) foo; bar; baz
while:
	anObject foo; bar; baz
will be ok.

Workaround: add a temporary and keep the result of the first send there. Do the cascade on this temporary.
This has been fixed with release 2.10.5.
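Using the placeholder names from the example above, the rewritten cascade would look roughly like this (tmp is an arbitrary name for the added temporary):

```smalltalk
| tmp |

tmp := anObject xxx.      "keep the result of the first send"
tmp foo; bar; baz.        "cascade on the temporary"
```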

Conflicting Names of Local Variables and Structures/Typedefs

Names of C-Structures, structure fields and typedefs may not conflict with the names of method or block local variables. "stc" will produce wrong code, leading to a syntax error in the C-compilation phase. Example:
    !MyClass class primitiveDefinitions!

	struct abc {
	    int field1;
	    char field2;
	};
    ! !

    !MyClass methodsFor:'foo'!

	|local1 field2|
will lead to an error, since the name field2 is used both in a c-structure and as a method local. This may also happen with other C-names (i.e. typedefs, structure names, enum values etc.) Care should be taken, since these name conflicts may also be due to some #define in an included C header file.

Compiling code with such conflicts will usually lead to errors in the C-compilation phase. Since stc does not parse (and understand) the structure of primitive code, it will not notice this conflict.

Workaround: rename the local variables.

Limited Number of Method & Block Arguments

Currently, there is a limit of 15 arguments to methods. It is NOT possible to evaluate methods with more arguments by using perform:withArguments:.
The number of block arguments is limited to 7.

If more argument values have to be passed, the arguments should be put into a collection, or other special object, which is then passed as argument.

Limited Number of Method & Block Locals

Currently (and maybe forever) there is a maximum of 127 local variables in both methods and blocks. Although this limit is hard to reach for normal code, it may show up when Smalltalk code is created automatically - i.e. by some translators.

A suggested workaround is to create some collection and put local values into that.

Limited Number of Method & Block Temporaries

In the code created by stc, nested expressions evaluate their intermediate results into (anonymous) temporary variables. These are placed into the context (and could, theoretically be inspected).

There is (currently) a limit of 31 temporaries, leading to a maximum expression nesting of 31 (since for every nesting level, one such temporary is needed).

The compiler is reusing temporaries as much as possible, so this limit is hardly ever reached - if it does, rewrite the complicated expression, using method locals as explicit temporaries.

Workaround: simplify the expression(s). Use local variables as explicit temporaries.

Limited Line Number Info

For interpreted bytecode, there is a limit of 255 lines, for which line number information can be recorded. Larger methods can be compiled, but no debugging line number information is available for code after the 255th line. (the reason is of course, that a byte is used for lineNumber information; we do not want to waste more memory and/or use a more complicated variable number encoding scheme)
When encountered in the debugger, all lines beyond the 255th line are highlighted (since the debugger cannot tell exactly where the program's state of execution is).
This limitation is relaxed to 32767 in stc.

There is no workaround - simplify your methods.
In practice, such long methods are very rare - mostly appearing in automatically generated code (which is not subject of debugging anyway ;-).

No Large Integer Constants

This has been completely fixed with release 4.x:
LargeInteger constants with any radix are supported, up to a maximum value of 2^1023-1

This has been partially fixed with release 2.10.6:
LargeInteger constants with radix 2, 8, 10 and 16 are now supported, up to a maximum value of 2^1023-1

Stc cannot currently generate LargeInteger constants. Versions before 2.10.2 did not even detect overflow in integer constants, silently generating wrong code. Stc versions after 2.10.2 will quit compilation with an error.
You have to make sure, that your integer constants fit into 31 bits (including the sign-bit, this gives 30bits of absolute value). Thus, the following code will lead to a compilation error:


    v := 16r12345678.          "ok, fits into 31 bits"
    v printNL.

    v := 16r87654321.          "not ok, does not fit into 31 bits"
    v printNL.
The built-in incremental compiler DOES handle large integer constants correctly; the above only applies to stc-compilation.
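The 31-bit limit can be checked with a little arithmetic: with 30 bits of absolute value, the largest representable constant is 2^30 - 1. The following sketch is meant to be evaluated in a workspace (i.e. with the incremental compiler); the variable name is arbitrary:

```smalltalk
| maxSmall |

maxSmall := (2 raisedTo:30) - 1.      "1073741823"

16r12345678 <= maxSmall.              "true:  305419896 fits"
16r87654321 <= maxSmall.              "false: 2271560481 does not fit"
```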

(this is only a temporary workaround; later versions of stc will be able to handle & generate large constants.)

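Workaround: add a class variable (such as MYCONST) and initialize it from a string in the class's #initialize method.
I.e. instead of:

    x := 12345678901234567890.

initialize the class variable in the #initialize method with:

    MYCONST := '12345678901234567890' asInteger.

and use it wherever the constant is needed:

    x := MYCONST.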

No Pool Dictionaries

Before release 5.3, ST/X did not support pool dictionaries.

Starting with release 5.3, SharedPools are implemented as classes whose class variables are imported and visible by other classes. The pools are defined as subclass of SharedPool, and the values should be set in the sharedPool's #initialize method.
See OpenGLConstants as an example.
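A pool definition along these lines might look like the following sketch. The names DemoConstants, DemoClient, MaxRetries and TimeoutSeconds are hypothetical, and naming the pool in the poolDictionaries: argument of the importing class is an assumption about how the import is declared; compare with the actual definition of OpenGLConstants and its users in your system:

```smalltalk
"the pool: a subclass of SharedPool; its class variables are the pool variables"
SharedPool subclass:#DemoConstants
    instanceVariableNames:''
    classVariableNames:'MaxRetries TimeoutSeconds'
    poolDictionaries:''
    category:'Demo-Pools'
!

!DemoConstants class methodsFor:'initialization'!

initialize
    MaxRetries := 3.
    TimeoutSeconds := 30
! !

"a class importing the pool (assumed to be declared via poolDictionaries:)"
Object subclass:#DemoClient
    instanceVariableNames:''
    classVariableNames:''
    poolDictionaries:'DemoConstants'
    category:'Demo-Pools'
!
```

Methods of DemoClient could then refer to MaxRetries and TimeoutSeconds directly by name.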

As a side effect of the implementation (in the current 5.3 release), any class's set of class variables can be imported by another class as a sharedPool. Do not depend on this, as this feature may be removed without notice in future versions.

Workaround (for pre 5.3 systems):
Use a dictionary stored in a class or global variable.
Access your poolVariables as
    myDict at:name
Initialize the dictionary in the class's initialize method using:
    myDict at:name1 put:value.
    myDict at:nameN put:value.

Empty Chunks

Stc cannot (currently) handle empty chunks. This means, that it is not possible to compile a file which contains code as:

     commented out method definition

instead, you have to include the chunk separator ('!') in the comment:
     commented out method definition

This is of course incompatible with the Smalltalk fileOut format definition and will be fixed in later stc versions.

Copyright © 1995 Claus Gittinger Development & Consulting


Doc $Revision: 1.78 $ $Date: 2017/02/07 17:58:37 $