[prev] [up] [next]

How to Write Primitives and Inline C Code

Contents

Introduction

Standard Smalltalk implementations define a set of primitive operations which are built into a virtual (Smalltalk-) machine; these primitives do the kind of work which cannot be expressed as Smalltalk code (for example integer addition), which interfaces with operating system APIs (such as file reading) or which is reimplemented for performance reasons (for example: copying Arrays).

Smalltalk/X does not support these (*); instead primitive code is entered as C-code into the methods. This allows the programmer to code everything he/she likes as C-code (beside the traditional primitives). Notice that in order to simply call any existing C function (for example, a library function from a dll), no inline C code is strictly needed: you can use the alternative FFI callout mechanism, which is very easy to use by defining a method with a C-call pragma. However, inline C code is both faster and allows for more control over what is done in the method. It also allows for arbitrary code to be written which manipulates Smalltalk or other objects without calling another function.

Primitive C-code is included right into the method code and surrounded by the special character sequences '%{' and '%}' (the quotes are not part of it).
This character sequence has been choosen as it does not conflict with existing Smalltalk programs.

Inline code can be placed wherever a Smalltalk statement may be placed.

Notice, that not all systems support the dynamic compilation to machine code (and especially: its dynamic integration) into the active system;

In either case it is possible to disable primitive compilation via the Launcher's settings dialog. An error raising stub method will then be created whenever primitive code is accepted in the browser. This allows for a classes code to be typed to completion and even to execute methods which do not contain primitive code. Then, save (fileout or checkin) the code, compile using make/stc and try the code in a second terminal window.

Lets start with introductionary examples, to give you some idea how primitive code looks in ST/X. The details will be described below.

Calling a simple C function:

   myPrimitive
   %{
       printf("hello world\m");
   %}
   !
inline C-code within a condition:
   anotherPrimitive:aFlag
       aFlag ifTrue:[
   %{
	   printf("it is true\n");
   %}
       ] ifFalse:[
   %{
	   printf("no, it is not\n");
   %}
       ]
   !
the same code, completely in C:
   anotherPrimitive2:aFlag
   %{
       if (aFlag == true) {
	   printf("it is true\n");
       } else{
	   printf("no, it is not\n");
       }
   %}
   !
accessing a local variable from within C code:
   yetAnotherPrimitive:something
       |aVariable|

       aVariable := something.
   %{
       if (aVariable == true)
	   aVariable = nil;
   %}
   . "<- notice the period - %{ ... %} is syntactically a statement
	 which must be separated by '.'-characters "

       aVariable print
   !
accessing arguments and local variables from within C code:
   add:num1 and:num2
       |sum|

   %{
       /* the code below only handles
	* SmallInteger operands ...
	*/
       if (__isSmallInteger(num1)
	&& __isSmallInteger(num2)) {
	    sum = __MKINT(sum1 + sum2);
       }
   %}.

       ^ sum
   !
accessing a global from within C code:
   globalAccessExample
   %{
       if (@global(Debugging) == true) {
	   printf("some info message\n");
       }
   %}.
   !
instantiating an array of strings:
   makeArrayWithStrings
   %{
       OBJ myArray;

       myArray = __ARRAY_NEW_INT(3);
       // notice the protect/unprotect. These are needed to protect myArray
       // from being garbage collected. it is required around any code which possibly
       // leads to a garbage collect.
       __PROTECT__(myArray);
       s = __mkString("string1");
       __UNPROTECT__(myArray);
       __ArrayVal(myArray)->a_element[0] = s;   // C uses 0-based indexing
       __STORE(myArray, s);     // tell GC that a reference was taken (write barriere)

       __PROTECT__(myArray);
       s = __mkString("string2");
       __UNPROTECT__(myArray);
       __ArrayVal(myArray)->a_element[1] = s;   // C uses 0-based indexing
       __STORE(myArray, s);     // tell GC that a reference was taken (write barriere)

       __PROTECT__(myArray);
       s = __mkString("string3");
       __UNPROTECT__(myArray);
       __ArrayVal(myArray)->a_element[2] = s;   // C uses 0-based indexing
       __STORE(myArray, s);     // tell GC that a reference was taken (write barriere)

       RETURN(myArray);     // not return! RETURN is a macro which handles context references
   %}.
   !
As you see, C code can easily get complicated, error prone and hard to read. Notice that the above is exactly the code which would be emitted by the stc compiler for a simple Array creation "{ 'string1' . 'string2' . 'string3' }". The C code is therefore not faster or shorter. Therefore it only makes sense to write primitive C coe for stuff which cannot be easily expressed in Smalltalk.

Accessing Smalltalk Objects from within Primitives

Since C does not know about Smalltalk objects (**), some conversion is required whenever Smalltalk objects are passed to/from C code. Although it is theoretically possible to access the internals of any Smalltalk object in C code, we recommend that only basic data types (integers, floats, arrays etc.) are ever processed by C code.
In any case, some understanding of the object representation is required, to avoid errors and frustration. Please read and understand the following chapters before trying any C primitives.

Object Representation

Smalltalk objects consist of a header, followed by the named instance variables (if any), followed by the indexed instance variables (if any).
The header includes the size, the class and some additional information required for memory management. The instance variables are actually direct pointers to the objects (except for small integers).
Since the header is always required, the minimum size is:
sizeof(char *) + (2 * sizeof(int32))
bytes (call it overhead).
On 32bit machines, this is 12 bytes.
On 64bit machines (alpha, x86_64), which use 64bit pointers, the headers size is 16 bytes.

Notice, that beside those 12bytes, there is NO additional overhead; if you think this is much, think about malloc(), which also requires at least 4 or 8 bytes of overhead per allocated chunk (depending on the algorithm used), and does NOT provide runtime type information and automatic generational memory reclamation.

For example, the memory used by the "true" singleton object looks like:

	+-------------------------+
	|       class-pointer   --------------> True-class
	+-------------------------+
	|       size  (12)        |
	+-------------------------+
	|       additional info   +
	+-------------------------|
Notice, that the size field includes the header.

The size field is actually only required for objects with indexed instance variables; for other objects, the size could be fetched from its class object (which also knows how big its instances are). However, to tune the memory system, all objects include that size info, even if that is redundant. Future ST/X versions may optionally support objects without a size field.
This would save about 120k-150k bytes of object space in the initial image (there are about 35000-40000 objects initially) and would result in a noticable reduction in memory requirements if many small objects are created. Obviously, the savings are marginally for big objects. However, also a slight performance degration is to be expected (since all accesses to the size will require an additional compare & conditional branch) and a major part of object space is required for indexed objects (strings, byteArrays, methodDictionaries and symbols), which is why there are currently no plans to do this.

However, to avoid disaster if such a change may ever be made, never access the size field (or any other header field) directly.
Instead, use the access macros which are described below. Upcoming systems may be delivered with different access macros.

Lets return to an object's layout; right after the header, any named instance variables are stored.
For example, the object "(100.0 @ 200.0)" is in memory:

	+-------------------------+
	|       class-pointer   --------------> Point-class
	+-------------------------+
	|       size  (20)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|       x instvar       --------------> 100.0 Float object
	+-------------------------+
	|       y instvar       --------------> 200.0 Float object
	+-------------------------+
Since most numbers in the system are SmallIntegers, representing those as above would lead to many small objects.
To avoid this (and also to avoid indirect memory references), a special encoding is used to represent these: a bit (the so called TAG bit) marks object pointers from smallInteger values.
Since all pointers are aligned on word boundaries anyway (they are always even), putting the TAG bit into the low bit does not limit the address space for objects.
However, only 31 (osf1/alpha: 63) bits are available to encode a smallIntegers value.
Since the value is stored ``directly'' in the pointer field, these are often referred to as ``immediate integers''.

Do not depend on the TAG bit being in a particular bit position; on some machines, this bit is stored in the sign bit, to make the smallInteger-check faster
(its done with all CPUs which have a branch on minus instruction and set the condition codes without a test instruction; for example, the 68k or mips CPUs currently have the TAG bit in the sign bit).

To stay unaffected of such details, use the check/conversion macros described below.

Although, on 32 bit machines there are actually 2 bits for use as tag bits (alignment is really at least a 4byte one), currently only one bit is used to tag small integers.
Future versions may encode other types (for example, short floats) as immediate values, to reduce memory requireements even more.
These changes will be transparent to your primitives source code, IF access macros are used everywhere.

To summarize, the object "(100 @ 200)" is in memory:

	+-------------------------+
	|       class-pointer   --------------> Point-class
	+-------------------------+
	|       size  (20)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|TAG|   x instvar 100     |
	+-------------------------+
	|TAG|   y instvar 200     |
	+-------------------------+
BTW: this TAG representation is the reason why "SmallInteger allInstances" does not return any useful result and the SmallInteger class cannot be subclassed (instances do not have a class field where we could record them NOT being smallIntegers).

The __Class() macro described below (and therefore the #class message sent to any object) check for the TAG bit and 'mimics' a class field pointing to SmallInteger.

There is one other special object, which is represented as an immediate object, "nil". It is represented by a NULL pointer (i.e. all zero bits). The choice of using an immediate nil (as opposed to using a singleton), was mainly to make C-code easier to write and because many machines have special instructions to deal with zeros eficiently (i.e. branch-on-zero, clear register or even having a constant zero-register).

For that reason, the UndefinedObject class is also not subclassable and does not have ``instances'' in the strict sense.

Indexed instance variables are stored after the named instance variables (or after the header if there are none).
For example, the array "#('foo' true 30 nil)" looks in memory:

	+-------------------------+
	|       class-pointer   --------------> Array-class
	+-------------------------+
	|       size  (24)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|       instvar (1)     --------------> 'foo' object
	+-------------------------+
	|       instvar (2)     --------------> true object
	+-------------------------+
	|TAG|   instvar (3)  30   |
	+-------------------------+
	|       instvar (4)  NULL |
	+-------------------------+
and an instance of a subclass of Array with one named instvar would look like:
	+-------------------------+
	|       class-pointer     |
	+-------------------------+
	|       size              |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|       named instvar     |
	+-------------------------+
	|       instvar (1)       |
	+-------------------------+
	|       instvar (2)       |
	+-------------------------+
		   ...
	+-------------------------+
	|       instvar (N)       |
	+-------------------------+
ByteArrays, Strings, FloatArrays and DoubleArrays are variable sized too, but do not store pointers to the elements. Instead, the elements are stored as non objects (i.e. untagged immediate values).
Strings and ByteArrays store bytes in the variable part.
For example, the string "'hello'" looks in memory:
	+-------------------------+
	|       class-pointer   --------------> String-class
	+-------------------------+
	|       size  (18)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	| 'h' | 'e' |  'l' |  'l' |
	+-------------------------+
	| 'o' | '\0'| '\0' | '\0' |
	+-------------------------+
Strings are ALWAYS delimited by a zero-byte ('\0') to make interfacing C string functions (which expect this 0-byte) easier. Therefore, conversion from a Smalltalk string to a C string is done by computing the address of the first character, which is simply done by adding the size of the object header (OHDR_SIZE) to the object's address.
You have to be a bit careful when computing a strings size: on the Smalltalk level, a string's size does NOT include the 0-byte, whereas in C you have to take care of it when allocating or accessing strings.
Thus, "'hello' size" returns 5 in Smalltalk, while the object's size value (and therefore the value of the __size macro) includes the 0-byte.
We recommend using the __stringSize() macro which takes care of that and returns the actual size without the 0-byte.

Currently, the 0-byte is only added to single-byte strings. Wide strings (for example: Unicode16 and Unicode32) do not have a 0-byte at the end. This may change in the future, as it complicates the use of wide-char APIs under Windows.

Any size field reflects the number of bytes an object has - however, the memory system will always align memory in full word increments and on a full-word boundary when allocating objects.
(any padding bytes are invisible and inaccessible to Smalltalk code; for C code: the contents is undefined)

As a final example, the doubleArray "#(1.0 2.0 3.0) asDoubleArray" looks in memory:

	+-------------------------+
	|       class-pointer   --------------> FloatArray-class
	+-------------------------+
	|       size  (36)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|     optional padding    |
	+-------------------------|
	|                         |
	|     1.0 as a C double   |
	|                         |
	+-------------------------+
	|                         |
	|     2.0 as a C double   |
	|                         |
	+-------------------------+
	|                         |
	|     3.0 as a C double   |
	|                         |
	+-------------------------+
Now, it should be clear why FloatArray and DoubleArray are more efficient when storing large amounts of numbers: there is no overhead in object headers for all the elements.
Notice the optional padding - on many systems, double numbers are required to be aligned on an 8-byte boundary. This is taken care of in the doubleArray allocation and access code.

Variables

Within a primitive, a Smalltalk variable "xyz" is accessible in C as:

true, false or nil
to access the corresponding Smalltalk object.
'nil' is actually a macro for a constant (usually 0); while 'true' and 'false' are object pointers.
Nil may not be dereferenced (i.e. do not apply the _qXXX macros which are described below).

@global(xyz)
if xyz is a global variable (other than above).
(this is a bad example, globals should start with an upper case letter, as in @global(Xyz) or @global(ByteArray))

@global(class:xyz)
if xyz is a class variable in class.
(this is a bad example, classvars and class names should start with an upper case letter, as in ``@global(MyClass:Xyz)'' or ``@global(Object:AbortSignal)'')
Notice, that the classes name is encoded in the name - this allows for any classvar to be accessed from primitive code - even if that variable is defined in multiple (super-) classes.

__INST(xyz)
if 'xyz' is an instance variable.

__CINST(xyz)
if 'xyz' is a class instance variable.
(do not confuse ``class variables'' with ``class instance variables''.)

xyz
if 'xyz' is a method variable, method argument, block variable or block argument AND 'xyz' is NOT a C keyword.

__xyz
if 'xyz' is a method variable, method argument, block variable or block argument AND 'xyz' is a C keyword.
(i.e. all variables named after C-keywords, such as "int" or "char")

Global Variables & Identifier Conflicts


You may encounter problems when including a header file which defines true/false. Some C frameworks do so (and also for other common names like Rectangle, Point or Context). For this, st/x object definitions are available in two variants: one which defines identifiers which correspond to the Smalltalk names ("true", "false", "Rectangle", etc.) and another which avoids any such conflict with other C definitions by prepending an "__STX_" prefix. I.e. you can refer to the Rectangle class from within your primitive code both as "Rectangle" and "__STX_Rectangle". Thus, if you ever encounter a naming problem when compiling primitive code which includes other (non-st/x) C header files, use the following work-around:
    ...
    #include <stc.h>            // get the st/x definitions

    #undef true                 // get rid of st/x definitions
    #undef false                // which conflict with C definitions
    ...

    #include <your-file.h>      // get C definitions (possibly redefining true, false and others)

    #define true __STX_true
    #define false __STX_false   // get back st/x versions
    ...

Symbols

Symbols are string-like objects which are guaranteed to be identical, if they have equal contents. The central "__MKSYMBOL(char *)"-function (described below) keeps a table of already existing symbols and makes sure that a reference to an existing one is returned, whenever one already exists for the given character sequence.

The downside of MKSYMBOL is that it is called at run time and involves some execution time overhead (not too much, as the table is hashed, and therefore fast; but still there is a function call and hash key computation involved.

Constant symbols, can be created at compilation/system startup time, and require no execution time (actually a single memory fetch, which executes in the nanosecond order).

To specify a compile-time symbol, stc supports selector translation. A construct of the form:

	@symbol(...)
where '...' is a selector in Smalltalk's form, will be replaced by the corresponding c name. (the '@'-syntax has been chosen, since it does not conflict with existing C lexical elements (also, a similar mechanism is found in objective-c compilers). No spaces or separators are allowed around '...'.

Typically, the overhead of MKSYMBOL is in the 20-30ns order, compared to 3-5ns for @symbol.

As an example, to return one of the symbols ["#+" "#foo" "#bar" "#bar:with:with:"] from a C primitive, write:

    myPrimitive
    %{
	...
	switch (something) {
	    case a:
		RETURN ( @symbol(+) );
	    ...
	    case x:
		RETURN ( @symbol(foo) );
	    case y:
		RETURN ( @symbol(bar) );
	    case z:
		RETURN ( @symbol(bar:with:with:) );
	    default:
		RETURN (nil);
	}
    %}

PLEASE: DO use this feature, since it allows future changes made to the translation mechanism without making existing primitive codes source incompatible.

Late note:
Starting with rel 1.4 of ST/X, it is no longer optional but required that you use the above syntax, since the compiler has to generate special declarations in systems which use dynamic or shared libraries, for symbols to be unified. Since the compiler does not parse and analyze primitive code (except for @symbol and @global), it had otherwise no chance of knowing which symbols are required.
Although undetected by stc, it is considered an error if you use symbols directly - so better forget all of the above.

Types

All primitive code includes the header file "stc.h", which contains definitions and typedefs for primitive code (and also for the generated C code).
The most common type to be used by C code is "OBJ", which is defined as a pointer to an object's header. All Smalltalk objects are only to be used as "OBJ"s and extraction macros are to be used to access instance variables. Never cast an OBJ and access fields directly.

Another highly useful definition is "INT", which is defined as an integer type with a size large enough to hold a casted "OBJ". It is also the type used for smallInteger values.
On most systems, this is a 32bit integer; however, on alpha, x86_64 and other 64bit cpus, it is defined as a 64bit integer.

Be very careful to declare integer variables as "INT" in your primitive code. Not "int" or "long" in your - it may otherwise fail to run correctly if your code is to be ported to another machine. Notice, that "long" is NOT always defined to be of pointer size; thanks to wise microsoft decisions, it is especially not true for 64bit windows!

Since C originally had no boolean type, all check macros (such as "__isSmallInteger()") return 1 (one) as true and 0 (zero) as false. Those are marked with a type of "bool" in the section below (although they really return "int")

Macros & Helper Functions

The following macros and functions are provided for C-primitives and allow extraction of values, conversion or creation of Smalltalk objects.
Most of them are defined as macros in the file "include/stc.h". Code generated by stc automatically includes this file.
Notice, that some macros/functions return a 64bit C-integer on alpha CPU systems.
(notice the distinction between "int" and "INT" in the description below.)

Constants & Types

INT
an integer type of appropriate type to hold smallInteger values and tagged pointers.
On 32bit machines, this is defined as "int" or "long" (i.e. a 32bit integer type).
On the alpha (or other 64bit) CPU, this is defined as "long long" (i.e. a 64bit integer type). Never ever use "int", "long" or "int32" or similar machine types inside your primitive code, and do not expect that an "int" or "long" is able to be of the same size as a pointer (instead, always use "INT").

UINT
like INT but unsigned

_MIN_INT
the minimum value a smallInteger can have.
This returns the same value as "SmallInteger minVal".
In general, that is the smallest signed integer which can be represented with one bit less than the wordsize of the CPU.
On 32bit machines, this is defined as "0x40000000".
On a 64bit CPU, this is defined as "0x4000000000000000L"

_MAX_INT
the maximum value a smallInteger can have.
This returns the same value as "SmallInteger maxVal".
In general, that is the largest signed integer which can be represented with one bit less than the wordsize of the CPU.
On 32bit machines, this is defined as "0x3FFFFFFF".
On a 64bit CPU, this is defined as "0x3FFFFFFFFFFFFFFFL"

OHDR_SIZE
the size (in bytes) of an object header.
Useful when objects are allocated via the __NEW* functions below.

Query & Value Extraction Helpers

OBJ __Class(OBJ x)
returns the class of 'x'. Handles immediate objects (i.e. is "safe")
This is typically used in a primitives entry sequence, to check for the arguments being of the expected types.
(i.e. "if (__Class(arg) == @global(SomeClassHere)) ...")

OBJ __qClass(OBJ object_x)
same, but 'object_x' must be non-Nil, non-SmallInteger (i.e. reference to a "real", non-immediate object).
Use only if you are certain about this (it is slightly faster than _Class macro, but leads to a segmentation violation, if applied to non pointer objects, because it unconditionally indirects through the object header's class slot).
If you do not understand the above, use the __Class macro.

int __Size(OBJ x)
the size of the object (including header) in bytes (0 for nil or integers).
Do not confuse this with the value returned by Smalltalk's #size message. This macro returns the memory size in bytes - not the logical size.

Be careful with strings: the 0-byte counts here.
(use __stringSize() to get the net size without header and 0-byte).

Be also careful with arrays: you have to subtract the size of the header and divide by "sizeof(OBJ)" to get the number of elements.
(better use the __arraySize() macro, which does this for you)

int __qSize(OBJ object_x)
like above, but 'object_x' may not be nil or a SmallInteger. Use only if you are certain. (this is slighly faster than the _Size() macro, but leads to a segmentation violation, if applied to non pointer objects)

(x == nil), (x == true), (x == false)
to check for nil, true and false

bool __isObject(OBJ x)
1 if 'x' is not a SmallInteger; 0 (zero) otherwise

bool __isNonNilObject(OBJ x) (macro)
1 if 'x' is neither a SmallInteger nor nil; 0 (zero) otherwise
(i.e. a 'real' object for which the _q-macros are allowed. You should only dereference a given OBJ (-pointer) in C, if it passed the __isNonNilObject test before - otherwise, segmentation violations may occur).

bool __isSmallInteger(OBJ x)
1 if 'x' is a SmallInteger

bool __bothSmallInteger(OBJ x, OBJ y)
1 if both 'x' and 'y' are SmallIntegers
(this saves some typing for the lazy guy, and a branch for the CPU)

bool __isLargeInteger(OBJ x)
1 if 'x' is a LargeInteger

bool __isInteger(OBJ x)
1 if 'x' is either a LargeInteger or a SmallInteger

bool __isArray(OBJ x)
1 if 'x' is an Array
(i.e. elements can be accessed with __ArrayInstPtr(x)->a_element[index0])

bool __isByteArray(OBJ x)
1 if 'x' is a ByteArray (but not an instance of a subclass of it)
(i.e. elements can be accessed with __ByteArrayInstPtr(x)->a_element[index0])
Attention: Please consider using __isByteArrayLike or __isBytes.

bool __isFloat(OBJ x)
1 if 'x' is a Float (double precision float), but not an instance of a subclass.
(i.e. C-double value can be extracted with the __floatVal macro)
Attention: Please consider using __isFloatLike.

bool __isShortFloat(OBJ x)
1 if 'x' is a ShortFloat (single precision float)
(i.e. C-float value can be extracted with the __shortFloatVal macro)

bool __isFraction(OBJ x)
1 if 'x' is a Fraction (but not an instance of a subclass of it)

bool __isString(OBJ x)
1 if 'x' is a String (but not an instance of a subclass of it)
(i.e. a C-unsigned char pointer can be extracted with __stringVal(x))

bool __isUnicode16String(OBJ x)
1 if 'x' is a Unicode16String
(i.e. a C-unsigned int16 pointer can be extracted with __unicode16StringVal(x))

bool __isUnicode32String(OBJ x)
1 if 'x' is a Unicode32String
(i.e. a C-unsigned int32 pointer can be extracted with __unicode32StringVal(x))

bool __isSymbol(OBJ x)
1 if 'x' is a Symbol
Since symbols are derived from string, the __stringVal macro can also be applied to symbols.
Attention: Please consider using __isSymbolLike.

bool __isCharacter(OBJ x)
1 if 'x' is a Character

bool __isPoint(OBJ x)
1 if 'x' is a Point

bool __isBlock(OBJ x)
1 if 'x' is a Block
Attention: Please use __isBlockLike

bool __isExternalBytes(OBJ x)
1 if 'x' is an ExternalBytes instance
Attention: Please use __isExternalBytesLike

bool __isExternalAddress(OBJ x)
1 if 'x' is an ExternalAddress instance

bool __isExternalFunction(OBJ x)
1 if 'x' is an ExternalFunction instance

bool __isByteArrayLike(OBJ x)
1 if 'x' is either a ByteArray or an ImmutableByteArray

bool __isBytes(OBJ x)
1 if 'x' is ByteArray-like (i.e. ByteArray or subclass)

bool __isWords(OBJ x)
1 if 'x' is WordArray-like (i.e. WordArray or TwoByteArray or subclass)

bool __isBlockLike(OBJ x)
1 if 'x' is an instance of Block or of a subclass

bool __isSymbolLike(OBJ x)
1 if 'x' is an instance of Symbol or of a subclass

bool __isMethodLike(OBJ x)
1 if 'x' is an instance of Method or of a subclass

bool __isFloatLike(OBJ x)
1 if 'x' is an instance of Float or of a subclass

bool __isBehaviorLike(OBJ x)
1 if 'x' is an instance of Behavior or of a subclass

bool __isExternalBytesLike(OBJ x)
1 if 'x' is an instance of ExternalBytes or of a subclass thereof

INT __intVal(OBJ x)
the 31 bit (x86_64, alpha: 63 bit) signed integer value; 'x' must be a SmallInteger. No check is done whether the passed argument is really a smallInteger; you will get a wrong numeric value (actually something based on the object's address) if applied to non smallIntegers.
This is the reverse of __MKSMALLINT().
Sorry: the name was badly choosen; it should be called ``__smallIntVal()''.
(but its too late to change this now, as many users have already written C-primitive code using those macros...)

unsigned INT __longIntVal(OBJ x)
returns the 32 bit (x86_64, alpha: 64 bit) unsigned integer value; 'x' must be a SmallInteger or a 4-byte (8-byte) LargeInteger.
This does check for the argument being of correct type, and returns 0 if x is neither a small or largeInteger, or if the largeIntegers value does not fit into 32/64 bit.
Since 0 (zero) is a valid value, you have to check x agains beeing equal to __MKSMALLINT(0) before and care for that case.
This is the reverse of __MKUINT().
Sorry: the name was badly choosen; it should be called ``__uintVal()'', but the old name is kept for backward compatibility.

INT __signedLongIntVal(OBJ x)
returns the 32 bit (x86_64, alpha: 64 bit) signed integer value; 'x' must be a SmallInteger or a 4-byte (8-byte) LargeInteger.
This does check for the argument being of correct type, and returns 0 if x is neither a small or largeInteger, or if the largeIntegers signed value does not fit into 32/64 bits.
Since 0 (zero) is a valid value, you have to check x agains beeing equal to __MKSMALLINT(0) before and care for that case.
This is the reverse of __MKINT().
Sorry: the name was badly choosen; it should be called ``__intVal()''.

int __unsignedLong64IntVal(OBJ x, uint64 *pInt) (function)
returns the 64 bit unsigned integer value; 'x' must be a Small- or LargeInteger with an appropriate value.
If the class and value of the object are correct, the value is stored as a 64bit unsigned integer through the given pointer and the function returns 1.
Otherwise, a 0 is returned.
This is the reverse of __MKUINT64().

int __signedLong64IntVal(OBJ x, int64 *pInt) (function)
returns the 64 bit unsigned integer value; 'x' must be a Small- or LargeInteger with an appropriate value.
If the class and value of the object are correct, the value is stored as a 64bit unsigned integer through the given pointer and the function returns 1.
Otherwise, a 0 is returned.
This is the reverse of __MKINT64() / __MKLARGEINT64().

double __floatVal(OBJ x)
the double value; 'x' must be a Float. Does not check - you must check for 'x' being an instance of Float before invoking this (using "__isFloat()" or "__isFloatLike()").
This is the reverse of __MKFLOAT().

float __shortFloatVal(OBJ x)
the float value; 'x' must be a ShortFloat. Does not check the argument for being an instance of ShortFloat (use "__isShortFloat()").
This is the reverse of __MKSFLOAT().

char * __stringVal(OBJ x)
the character-pointer, pointing to the 0-terminated characters of the String object. 'x' must be an instance of String or a subclass of it. Does not check (use "__isString()" or "__isStringLike()").
This is the reverse of __MKSTRING().

unsigned short * __unicode16StringVal(OBJ x) (macro)
the 2-byte-character-pointer; 'x' must be a Unicode16String. Does not check. Notice that (currently) multibyte string objects are NOT 0-terminated. If required, you have to copy the bytes into a local buffer and 0-terminate them manually, before passing them to C. This may change in a future version.
This is the reverse of __MKUSTRING().

char * __symbolVal(OBJ x)
the character-pointer, pointing to the 0-terminated characters of the Symbol object. 'x' must be a Symbol. Does not check.
This is the reverse of __MKSYMBOL() or @symbol() (for constant symbols).

OBJ __characterVal(OBJ x) (macro)
the asciicode/codePoint as ST-SmallInteger; 'x' must be a Character. Does not check for x to be a valid character object.
This is the reverse of __MKCHARACTER() or __MKUCHARACTER().
Be careful: the result is still an OBJ - not a C-int. You still have to extract the asciiValue/codePoint using the __intVal() macro.
To check an argument for being a character and getting its C equivalent, use:
char ch;

if (__isCharacter(arg)) {
    ch = __intVal( __characterVal(arg) );
    if (ch <= 0xFF) {
	... iso8859-1 ...
    } else {
	... unicode ...
    }
}
Notice that character encoding is always unicode; however, for single byte characters, iso8859 encoding is used which is a subset of unicode.

void * __externalBytesAddress(x)
the address of an externalBytes underlying memory; 'x' must be an instance of ExternalBytes or a subclass thereof. Does not check.
This is the reverse of __MKEXTERNALBYTES().

OBJ __externalBytesSize(x)
returns the number of bytes in an externalBytes memory block, or nil if that size is not known. Notice, that this nil-check is still required, and the intVal must be taken in order to get the size as C-int. Does not check.

void * __externalAddressVal(x)
the C-pointer of an externalAddress; 'x' must be an ExternalAddress instance. Does not check.
This is the reverse of __MKEXTERNALADDRESS().

void * __externalFunctionVal(x)
the C-pointer of an externalFunction; 'x' must be an ExternalFunction instance. Does not check.
This is the reverse of __MKEXTERNALFUNCTION().

OBJ __point_X(OBJ p)
the x-instance as ST-Object; 'x' must be a Point. Does not check.

OBJ __point_Y(OBJ p)
the y-instance as ST-Object; 'x' must be a Point. Does not check.

int __stringSize(OBJ s)
the size of the string-object; 's' must be a String. Does not check.
In contrast to __Size(), this returns the strings net size (i.e. without header and 0-byte).

int __arraySize(OBJ a)
the number of elements in the array-object; 'a' must be an Array. Does not check.

int __byteArraySize(OBJ a)
the number of elements in the byteArray-object; 'a' must be a ByteArray. Does not check.

OBJ __AT_(OBJ receiver, int index) (function) [1]
sends #at: to the receiver. For certain classes (String, Array) no real send is performed; instead the value is extracted directly.

OBJ __AT_PUT_(OBJ receiver, int index, OBJ value) (function) [1]
sends #at:put: to the receiver. For certain classes (String, Array) no real send is performed; instead the value is stored directly.

OBJ __SIZE(OBJ receiver) (function) [1]
sends #size to the receiver. For certain classes (String, Array) no real send is performed; instead the size is determined directly.

Object Creation Helpers

Please be aware that any object creation can lead to a garbage collect operation to happen. Because Smalltalk/X uses a moving garbage collector and direct object pointers, you must be very careful to both make sure that all object references in your C-code are both reachable by (i.e. known to) the collector AND that any changed object addresses are correctly updated in your C-pointers. For that, please read the "garbage collector interface" section below very carefully, when using any of the following object allocation functions. Also, please read and understand the documentation of __PROTECT__ / __UNPROTECT__ below.
bool __ISVALIDINTEGER(INT ival)
checks if 'ival' is in the valid range for a smallInteger. (i.e. in [_MIN_INT .. _MAX_INT])

OBJ __MKSMALLINT(INT ival)
makes a SmallInteger object with value 'ival'. The argument must be within the valid SmallInteger range which is typically 31 bit (alpha: 63 bit).
For security and portability, compare 'ival' against _MIN_INT and _MAX_INT (using __ISVALIDINTEGER) before using this macro.
Attention: will be obsoleted by __mkSmallInteger()

OBJ __MKFLOAT(double dval) (function) [1]
makes a Float object with value 'dval'.
Notice that the Smalltalk class "Float" corresponds to the C-double type, whereas the Smalltalk "ShortFloat" holds a "float" value. (this was done for compatibility with VisualAge- and Dolphin Smalltalk systems)

OBJ __MKSFLOAT(float fval) (function) [1]
makes a ShortFloat object with value 'fval'
Notice that the Smalltalk class "Float" corresponds to the C-double type, whereas the Smalltalk "ShortFloat" holds a "float" value. (this was done for compatibility with VisualAge- and Dolphin Smalltalk systems)

OBJ __MKFRACT_I(int num, int denom) (function) [1]
makes a Fraction object with integral numerator and integral denominator
The arguments must be in the valid smallInteger range (which is not checked in this function).

OBJ __MKEMPTYSTRING(int len) (function) [1]
makes an empty String object for 'len' 1-byte-characters with undefined contents. Actually, the allocation includes space for an extra zero byte. So do not count that in the passed len argument.
The returned object is an instance of String, which can only hold 8-bit characters.

OBJ __MKEMPTYUSTRING(int len) (function) [1]
makes an empty Unicode16String object for 'len' 2-byte-characters with undefined contents.
The returned object is an instance of Unicode16String, which can hold 16-bit characters. The allocation does currently NOT include space for a zero character at the end.
This function was introduced with ST/X release 5.2.1.

OBJ __MKSTRING(char *sval) (function) [1]
makes a String object with value 'sval'.

Warning: 'sval' is not allowed to point into a Smalltalk object, because the string allocation itself could lead to a garbage collect to happen, which could make any passed-in object-pointer useless. To create a new string from a passed-in Smalltalk string, use the following procedure:

  1. protect the st-object from which you want to copy (__PROTECT__)
  2. allocate an empty string (__MKEMPTYSTRING)
  3. unprotect the st-object (__UNPROTECT__)
  4. copy the bytes (bcopy)
of course, the PROTECT/UNPROTECT is only needed if the reference is not via a local/argument of the current context.

OBJ __MKSTRING_L(char *sval, int len) (function) [1]
as above, if you know the length (is somewhat faster, since a call to strlen() is avoided)
Warning: 'sval' is not allowed to point into a Smalltalk object (see above).

OBJ __MKSTRING_ST(OBJ stStringObject) (function) [1]
use this, top copy an existing Smalltalk string.

OBJ __MKSTRING_ST_L(OBJ stStringObject, int len) (function) [1]
like above, if you know the length or only want to copy the first len characters.

OBJ __MKSYMBOL(char *sval, 0) (function) [1]
makes a Symbol object with name 'sval'
For constant symbols, always use the @symbol() macro, which creates the symbol at compilation time and does not take any time at execution time. In contrast, MKSYMBOL creates the symbol at execution time and requires a search in the hashtable of existing symbols.

OBJ __MKCHARACTER(unsigned char c) (macro)
makes a Character object with asciiValue 'c'
'c' must be in the range [0 .. 255].
No actual object creation is done, as all 256 single-byte character objects are created and remembered at startup time.
The character is assumed to have unicode coding (of which iso8859-1 and ascii are single-byte subsets).

OBJ __MKUCHARACTER(unsigned short uc) (function) [1]
makes a Character object with unicode-codePoint 'uc'
'uc' must be in the range [0 .. 0xFFFF].
This may or may not actually create a new object, depending on the passed codePoint: the first 1024 characters are shared and reused (immediate objects), and for those, this function simply returns a pointer to one of those preallocated character objects. For higher codepoints, an object allocation is performed and thus, garbage collection may happen.
The character is assumed to have unicode coding.
This function was introduced with ST/X release 5.2.1.

OBJ __MKLARGEINT(INT lval) (function) [1]
makes a LargeInteger object with value 'lval'
Be careful to only create largeIntegers if 32 bits (alpha: 64 bits) of value are really needed; use __MKSMALLINT() (if you are certain) or __MKINT() (if you don't know). Otherwise, this will return an unnormalized LargeInteger (i.e. a LargeInteger with a value in the SmallInteger range).
Most code in the numeric classes is not prepared to handle unnormalized largeIntegers; therefore strange things may happen.

OBJ __MKULARGEINT(unsigned INT uval) (function) [1]
makes a LargeInteger object with value 'uval'
Same care as above.

OBJ __MKINT(INT val) (function) [1]
makes a Small or LargeInteger object with value 'val'
This checks the value of the passed number and returns a correctly typed object. If in doubt, always use this or __MKUINT below to pass integers from C to Smalltalk.

OBJ __MKUINT(unsigned INT uval) (function) [1]
makes a Small or LargeInteger object with value 'uval'

OBJ __MKINT64(int64 *iP) (function) [1]
makes a LargeInteger object with up to 64 bits of value.
The argument is a pointer to an int64 structure (see stc.h).
This checks the passed arguments value and returns a correctly typed object (i.e. SmallInteger, 4-byte LargeInteger or 8-byte LargeInteger as required).

OBJ __MKUINT64(unsigned int64 *iP) (function) [1]
makes a LargeInteger object with up to 64 bits of value
The argument is a pointer to an unsigned int64 structure (see stc.h).
This checks the passed arguments value and returns a correctly typed object (i.e. SmallInteger, 4-byte LargeInteger or 8-byte LargeInteger as required).

OBJ __MKLARGEINT64(int sign, int low32bits, int hi32Bits) (function) [1]
same as __MKINT64, but expects sign, low and hi bits as separate arguments. (this is sometimes useful, when a datum is either computed or read from an external source)
This checks the passed arguments values and returns a correctly typed object (i.e. SmallInteger, 4-byte LargeInteger or 8-byte LargeInteger as required).

OBJ __MKEXTERNALBYTES(void *ptr) (function) [1]
makes an ExternalBytes object from an arbitrary pointer 'ptr'
This is useful to pass a block of memory which was allocated by C-code up to Smalltalk. It can be accessed there via protocol found in ExternalBytes.
This does not allocate the bytes-block; however, a Smalltalk object which points to the passed memory is allocated.
Notice:
The created ExternalBytes object does not know the size of the memory block, and therefore does no bound checking when individual bytes are accessed. This makes your code less robust, of course.
Therefore, it is recommended to ALWAYS use the function below if the size of the memory block is known.

OBJ __MKEXTERNALBYTES_N(void *ptr, int nBytes) (function) [1]
like above, passing the number of (byte-) elements.
If possible, use this to create externalBytes objects, since knowing the size of the data block enables for bounds checking operations when elements are accessed from Smalltalk code - this makes your code more robust.

OBJ __MKEXTERNALADDRESS(void *ptr) (function) [1]
makes an ExternalAddress object from an arbitrary pointer 'ptr'

OBJ __MKEXTERNALFUNCTION(voidFunc *ptr) (function) [1]
makes an ExternalFunction object from an arbitrary function pointer 'ptr'

OBJ __MKPOINT_INT(INT x, INT y) (function) [1]
makes a Point object given two integer coordinates; x and y must be within the valid smallInteger range.

OBJ __MKPOINT_DOUBLE(double x, double y) (function) [1]
makes a Point object given two double coordinates

OBJ __ARRAY_NEW_INT(int n) (function) [1]
makes an Array with 'n' slots (nilled) elements.

OBJ __ARRAY_NEW_INT_WITHALL(int n, OBJ val) (function) [1]
makes an Array with 'n' slots, all referring to 'val'.
The 'val' argument must be a valid OBJ.

OBJ __ARRAY_WITH1(OBJ element) (function) [1]
makes an Array with 1 element. The argument must be a valid OBJ.

OBJ __ARRAY_WITH2(OBJ e1, e2) (function) [1]
makes an Array with 2 elements. The arguments must be valid OBJs.

OBJ __ARRAY_WITH[3-5](OBJ e1, e2, ...) (function) [1]
same with up to 5 elements.

OBJ __STRING_NEW_INT(int n) (function) [1]
makes an String with 'n' elements; the string is initialized with spaces.

OBJ __BYTEARRAY_UNINITIALIZED_NEW_INT(int n) (function) [1]
makes a ByteArray with 'n' elements; the contents is undefined.

OBJ __BYTEARRAY_NEW_INT(int n) (function) [1]
makes a ByteArray with 'n' elements; the elements are set to zero.

OBJ __BASICNEW(OBJ cls) (function) [1]
low level instance creation. This does NOT actually send #basicNew to the class - only the memory is allocated and the instance variables nilled. (i.e. a classes private basicNew method would not be invoked by this).

OBJ __BASICNEW_INT(OBJ cls, int n) (function) [1]
like above, with indexed instance variables.

OBJ __NEW(OBJ cls) (function) [1]
instance creation. Sends #new to cls.

OBJ __NEW_INT(OBJ cls, int n) (function) [1]
instance creation with indexed instance variables. Sends #new: to cls.

OBJ __stArrayFromCIntArray(int *pInts, int nItems) (function) [1]
makes an Array filled with integer values from a C-array of ints. Individual elements of the returned array may be smallIntegers or largeIntegers, depending on their value.

OBJ __stArrayFromCUIntArray(unsigned int *pInts, int nItems) (function) [1]
makes an Array filled with integer values from a C-array of unsigned ints. Individual elements of the returned array may be smallIntegers or largeIntegers, depending on their value.

OBJ __stArrayFromCShortArray(short *pInts, int nItems) (function) [1]
makes an Array filled with integer values from a C-array of short ints. All elements of the returned array will be smallIntegers.

OBJ __stArrayFromCUShortArray(unsigned short *pInts, int nItems) (function) [1]
makes an Array filled with integer values from a C-array of unsigned short ints. All elements of the returned array will be smallIntegers.

OBJ __stArrayFromCFloatArray(float *pInts, int nItems) (function) [1]
makes an Array filled with floats from a C-array of floats.

OBJ __stArrayFromCDoubleArray(double *pInts, int nItems) (function) [1]
makes an Array filled with floats from a C-array of doubles.

OBJ __stFloatArrayFromCFloatArray(pfloat *pInts, int nItems) (function) [1]
makes a FloatArray filled with values from a C-array of floats.

OBJ __stDoubleArrayFromCDoubleArray(double *pInts, int nItems) (function) [1]
makes an DoubleArray filled with values from a C-array of doubles.

[1] these macros/functions (may) allocate object memory and therefore may trigger a garbage collect - read more on this below.

garbage collector interface

__PROTECT__(var)
tell the VM, that var contains a reference to a valid Smalltalk object which should not be collected. This is required if C-code holds a reference to a Smalltalk object and there is a chance for a garbage collect operation to occur.
This macro must be used in a stack-like fashion together with the UNPROTECT macro. For every PROTECT, there MUST be a corresponding UNPROTECT; otherwise, some VM internal table will overflow. Notice, that this internal table is relatively small (some 30 entries or so), and no mechanism whatsowever exists in the VM to clean up this table. Thus, PROTECT/UNPROTECT are strictly not allowed iff there is any chance of a message send (i.e. a call into smalltalk code), an interrupt or a context switch is to happen inside. This is because then, the table could overflow because a user may end up in a debugger and/or exception handler and the UNPROTECT may never be executed.

Only use PROTECT/UNPROTECT around calls to a memory allocation VM routine; never around message sends or when interrupts are enabled
See more below in ``Allocating object memory in primitives''.

__UNPROTECT__(var)
get any updated pointer back from the VM into var, and release the VM's remembering as installed with PROTECT. This macro MUST be used in a stack-like fashion together with the PROTECT macro. For every PROTECT, there MUST be a corresponding UNPROTECT; otherwise, some VM internal table will overflow.
Also notice, that you have to use the UNPROTECT macro after EVERY object allocation, to refetch any other possibly changed object reference.
See more below in ``Allocating object memory in primitives''.

OBJ __STORE(OBJ dst, OBJ val)
to tell the garbage collector, that a store of val was made into dst; the STORE macro is not needed for local variables (method locals) or if you are certain that the stored object is either nil or a SmallInteger. The macro checks for those, so it does not hurt (much) to place this macro after every non-context store.
Please read further information on why a store macro is required and how it is used.

OBJ __GSTORE(OBJ val)
to tell the garbage collector, that a store of val was made into a global (or classVariable);
This is equivalent to __STORE(Smalltalk, val).

After calling any of the macros/functions marked with a ``[1]'', any unprotected pointer referencing a Smalltalk object or pointing into a Smalltalk object (for example: __stringVal) will be void.

You have to take care of this, either by using the PROTECT/UNPROTECT macros, or by placing the variables into the current context (i.e. effectively keeping those references in a Smalltalk method local instead of a C variable which is unknown to the grabage collector).
All pointers into an object are to be considered invalid after a garbage collect, even if the underlying base pointer was PROTECTED/UNPROTECTED.
Therefore, never create any C pointers into an object; instead, always use indexed accesses, and protect the base object.

Currently (and maybe forever), pointers in ST/X are DIRECT object pointers, meaning that the values of these variables point directly to the underlying object's structure (some of which can be found in the stc.h include file).

Since I cannot guarantee that this statement remains true in the future (indirection makes things like the garbage collector or the become:-operation much much simpler), you should always use access macros such as "__InstPtr(o)->field" instead of "o->field".
These access macros are also defined in stc.h:

OBJ to C-structure casts

__objPtr(o)
same as o, to hide indirection if ever implemented, always use "__objPtr(o)->field" instead of "o->field" !

__InstPtr(o)
to access instvars by index [0 .. instSize-1]

Direct access to instance variables it is NOT recommended, since you have to modify your primitives whenever ther instance variable layout of the accessed object changes.

For primitive code in methods, use the "__INST(name)" macro to access named instance variables.

__PointInstPtr(o)
to access x and y of a point (actually, simply a cast to the points structure)

For example, to access the x component of a point instance, use:

    ...
    if (__isPoint(stVar)) {
	OBJ xComponent;

	xComponent = __PointInstPtr(stVar)->p_x;  /* see "stc.h" for p_x */
	if (__isSmallInteger(xComponent)) {
	    INT xValue;

	    xValue = __intVal(xComponent);
	    ...
	}
    }
    ...
of course, never forget to check the types with the __isXXX macros - just to make certain.

__ArrayInstPtr(o)
to access elements of an array.

For example, to access the 10th element of a Smalltalk array, use:

    ...
    if (__isArray(stVar)) {
	OBJ el_10;

	/* notice: c-indexing starts at 0 */
	el_10 = __ArrayInstPtr(stVar)->a_element[9];
	...
    }
    ...
another example (summing the elements of an array):
    ...
    if (__isArray(stVar)) {
	int nElements;
	INT sum;
	int index;
	OBJ element;

	nElements = __arraySize(stVar);
	sum = 0;
	for (index=0; index < nElements; index++) {
	    element = __ArrayInstPtr(stVar)->a_element[index];
	    if (! __isSmallInteger(element)) {
		RETURN (nil);
	    }
	    sum += __intVal(element);
	}
	RETURN ( __MKSMALLINT(sum));   /* use "__MKINT()" if there is a chance
					* that sum does not fit into 31 bits
					*/
    }
    ...
notice, stc is not too bad: the code generated by compiling the corresponding ST code is not much slower.

and so on ...
There are macros for all objects which are known to the runtime system
i.e.

(any many more) to access other objects internals. However, it is always better (with respect to portability), to send access messages to the object from the ST part of your method, deposit them in some local variable, and access those variables from the C part.

If there will ever be a switch to indirect pointers, only those macros have to be changed instead of all primitive code.

late news:

Originally, all those macros were named _XXX(). However, some compilers got confused by a definition of both a macro named "_XXX" and a global variable with the same name.

In the current ST/X release, this was true for the _isBlock macro, where a corresponding '_isBlock' c-variable exists (for the isBlock symbol).

The macro has been renamed to __isBlock to avoid this problem This renaming will be done for all and every other macro.
It has already been done for most macros, but some remain to use a single underscore. Be prepared for some minor (name-) changes in that area.
For the intermediate time, please excuse the confusion due to some macros starting with two underscores, others with only one.

With release 2.10.5.4, almost all macros have been renamed for a double-underscore name. The old names are available as synonym for easier migration, but will vanish in a future version.

Utility functions

int id = __STX_AddTimeout(func, deltaT, arg)
installs a timeout, which arranges for the C-function func to be called after deltaT milliseconds. arg is passed to the function as argument (use a pointer to a structure, if more than one argument is required).
The function will be called by the scheduler (i.e. at highest priority), but will NOT interrupt the garbage collector (i.e. it can be used for soft realtime only).

The return value is an integer, which identifies the timeout.

The passed timeout-function is called only once - for cyclic calling, another call to __STX_AddTimeout should be made in the called function.

__STX_RemoveTimeout(int id)
removes a timeout as previously installed with __STX_AddTimeout. The argument must be the timeouts id, as previously returned.

__STX_SignalSemaphore(aSemaphore)
signals a semaphore. Before using this function, carefully read the section on how references to Smalltalk objects are kept & remembered in C-code.

Returning a value from a primitive

Since some cleanup is required before a method is left, a simple C-return from a primitive will not be sufficient in most cases (consider for example a pending reference to the current context). To deal with these situations correctly, a "RETURN(value)" macro is provided by the runtime system, which does all the housekeeping. For simple primitives (which do not require a context), the macro will expand to a simple return, so there is no performance penalty in this case.
Never use return - always use the RETURN macro

examples:

    myMethodReturningOne
    %{
	/* return a SmallInteger */
	RETURN ( __MKSMALLINT(1) );
    %}
    !

    anotherMethodReturningOne
	|retVal|

    %{
	/* return a SmallInteger */
	retVal = __MKSMALLINT(1);
    %}.
	^ retVal
    !

    mySpecialTrigMethod:arg
    %{
	if (__isFloat(arg)) {
	    /* compute a Float */

	    double fVal = __floatVal(arg);
	    RETURN ( __MKFLOAT(sin(exp(fVal) * 1.2345)) );
	}
    %}
    .
	self primitiveFailed
    !

    myMethodReturningSymbol
    %{
	/* return a Symbol */
	RETURN ( @symbol(fooBar) );
	}
    %}
    !

    myMethodReturningString
    %{
	/* return a String */
	RETURN ( __MKSTRING("hello world") );
	}
    %}
    !

    anotherMethodReturningString
	|retVal|

    %{
	char buffer[100];

	strcpy(buffer, "hello ");
	strcat(buffer, getLogin());
	/* return a String */
	retVal = __MKSTRING(buffer);
    %}.
	^ retVal
    !

    aMethodReturningArrayWithStrings
    %{
	OBJ arr, s;

	/*
	 * you need a lot of PROTECT/UNPROTECT,
	 * if references are not reachable via the context...
	 */
	arr = __ARRAY_NEW_INT(2);
	__PROTECT__(arr);
	s = __MKSTRING("hello");    // may GC and make reference in arr obsolete
	__UNPROTECT__(arr);         // fetch possibly updated ref
	__ArrayInstPtr(arr)->a_element[0] = s;
	__STORE(arr, s);            // tells memory manager that a reference exists

	__PROTECT__(arr);
	s = __MKSTRING("hello");    // may GC and make reference in arr obsolete
	__UNPROTECT__(arr);         // fetch possibly updated ref
	__ArrayInstPtr(arr)->a_element[1] = s;
	__STORE(arr, s);            // tells memory manager that a reference exists
	RETURN(arr);
    %}
    !

    anotherMethodReturningArrayWithStrings
	|arr s|
    %{
	/*
	 * you dont need PROTECT/UNPROTECT,
	 * if references ARE reachable via the context...
	 */
	arr = __ARRAY_NEW_INT(2);
	s = __MKSTRING("hello");
	__ArrayInstPtr(arr)->a_element[0] = s;
	__STORE(arr, s);

	s = __MKSTRING("hello");
	__ArrayInstPtr(arr)->a_element[1] = s;
	__STORE(arr, s);
    %}.
	^ arr
    !

    myDestructiveUpperCaseToLowerCase:aString
    %{
	char *cp;
	char c;

	/* check if argument is a string */
	if (__isString(aString)) {
	    /*
	     * get the C-character pointer to the characters
	     * this is allowed here, since no danger for a garbage
	     * collect exists here.
	     */
	    cp = __stringVal(aString);

	    /* walk over string till end (0-byte) is reached */
	    while ((c = *cp)) {
		if (isUpper(c)) {
		    *cp = toLower(c);
		}
		cp++;
	    }
	    RETURN ( aString );
	}
    %}
    .
	self primitiveFailed
    !

Local storage in primitives

Primitives involving local storage which hold Smalltalk objects AND call other methods and/or allocate new objects, MUST be written with great care, since the garbage collector may run at any time. The garbage collector will move objects around so that your pointers become invalid. The garbage collector will of course update all reachable pointers, however, to be able to update your pointers, it must know them!

The easiest way of handling this situation is by declaring these locals as method locals (in contrast to c-variables). Method locals are located in the context, which is investigated (and updated) by the garbage collector:

instead of:

      aMethodName
      %{
	  OBJ aLocal;

	   ...
	  do something with aLocal
	   ...
      %}
it is better to write:
      aMethodName
	  |aLocal|
      %{
	    ...
	  do something with aLocal
	    ...
      %}
In the later case, the Smalltalk compiler produces code which protects the local variable from beeing garbage collected (by creating a context, which will be fixed by the garbage collector).

Of course, this protection is only needed if your primitive code calls other methods and/or allocates storage - otherwise there is no danger since the garbage collector will only run when new objects are allocated (which is always possible when calling other methods).
Warning: the previous statement is not true, iff you declare your primitive code as being interruptable (which is described below). If you do this, an interrupt may occur at any time, leading to a thread-switch to some other Smalltalk process, which may of course trigger a garbage collect.

For coding examples, see the primitives in "libbasic/SmallInt.st" or "libbasic/Float.st". Also, even though a bit more complex, looking into "libview/XWorkstation.st" also gives a lot of insight.

Stack management

Each thread inside ST/X has its own stack, which grows automatically as required. The Smalltalk programmer does not have to preallocate the stack, or predeclare the size of the required stack.

For this, the STC compiler generates code which checks for the amount of required stack on method- or block entry, and grows the stack if required.

Of course, called C-functions do not do so, therefore great care must be taken when calling c library functions which use big stack frames. Especially C-functions, which use alloca, or declare big automatic arrays or generate a very deep call-nesting.

ST/X internally keeps 2 classes of stacks:

Except for the startup and initialization, all processing is usually done on a threadStack, which is guaranteed to provide a certain amount of local stack frame memory (typically: 4k).

Primitive code which calls out to a C-function will have this default amount of stack available for its called C-function, which may or may not be enough. In some cases, it may be required to arrange for a bigger thread-stack-frame to be reserved, or to do a temporary switch to the (unlimited) C-stack.

Stack requirements

ST/X usually reserves a stack frame which is big enough for most C-functions (usually the stack has a 4k reserve), but some functions need more (to name some: printf, scanf, popen and some Xlib functions).
Since no documentation exists on C-library stack requirements, you have to guess, try or otherwise find out what the requirements are. If your ST/X crashes after execution of a primitive, stack violations are first class candidates for being responsible.

To tell the stc compiler, that a primitive needs more stack, a stack declaration should be added - this is done by inserting a comment of the form:

    /* STACK:n */
or
    /* UNLIMITEDSTACK */
into the first line of the primitive (i.e. right after the opening "{").

Detecting the first declaration, stc will make certain that n bytes of stack are available for the method containing that primitive. You may have to guess on what a good value for n is. Taking a bigger value may be less performant, but give you more security.

The second declaration will produce code to switch to the unlimited c-stack for the execution of the method containing that primitive. This stack is grown by the operating system and unlimited (not really, but the limit is typically some 8 or 16 megabytes).

There is one big catch in using UNLIMITEDSTACK: - it is not reentrant. Methods running on the unlimited stack may NOT send other Smalltalk messages. The reason is that all messages might eventually lead to a process switch into another thread and the c-stack cannot hold frames for multiple processes in a non first-in/last-out order.

To summarize things, your primitive might now look like:

    myMethod

    %{ /* STACK:3500 */

	char aBigBuffer[3000];
	...
	...
    %}
Do not fear estimating the stack need - if your estimate is low, there is still a 4k save area; while no memory is lost or runtime penalty to be payed if you estimate too high: after all its just stack memory, which is released with the return of the method.

A rule of thumb is the size of local data arrays (i.e. 3000 bytes of aBigBuffer) plus some 500 bytes for the context and other housekeeping locals. (500 is actually too high for most situations; but better estimate too high than too low.)

A new version with more protection against stack violations is being prepared and soon available. This will at least protect agains violations near the top of the stack - however, there is (currently and in the near future) no insurance against violations due to big alloca chunks.

You are always on the bright side of life, if an UNLIMITEDSTACK declaration is added - however, these methods MUST be leaf methods (i.e. they may not send any further messages).

Late note:
The newest ST/X releases can be started with the VM commandLine argument "-MprotectStack". This makes the memory page below a threads stack non-accessible (by using mprotect), and helps to find stack problems.
However, not all operating systems provide this functionality, and also it does not help against alloca operations, which access memory below that memory page.

Ommiting the context setup

For primitive code which does not send Smalltalk messages, it is possible to save the context setup altogether, effectively producing a very simple (and speedy) c function (saving another bunch of nanoseconds of execution time ;-).

This is done by adding a comment of the form "/* NOCONTEXT */" to the first line of the primitive code.

BIG WARNING:
Since those methods do not have a context, there is no way for the garbage collector to update object references. Thus neither self, nor locals, nor arguments are valid after a garbage collect. You must take great care (using PROTECT/UNPROTECT) to not loose any pointers in case of a garbage collect.
NOCONTEXT primitives should only be written by experienced users or for methods which do NOT allocate memory and do NOT send other Smalltalk messages (i.e. which are completely save from ever entering the garbage collector).

For the curious:

the macros __PROTECT__(ptr) and __UNPROTECT__(ptr) can be used to tell the garbage collector about variable values to be updated - see some primitives in libbasic for examples.
Thus, with careful use of these __PROTECT__ macros, it is possible to define NOCONTEXT methods even if there is a possibility of garbage collection.
These macros are used as:
    {
	OBJ myRef;
	...
	__PROTECT__(myRef);
	...
	do something which may lead to a GC
	this invalidates myRef
	...
	__UNPROTECT__(myRef);
	...
	myRef valid again
    }

Register variables / Method locals

The newest sparc version of stc can (and does) now put locals into the registers (and other versions will also support this feature in the future). Since stc does not look-into or parse primitive code, you have to tell when register locals are not wanted (for example when the address of such a variable is taken in the primitive).
This is now done using a /* NOREGISTER */ comment - which forces all method/block locals to be allocated as auto-variables, so that an address can be taken.

Not all C-compilers complain when the address is taken of a register variable; some silently make the variable a non register one.
For portability of your code, please use the NOREGISTER pragma even if your compiler is a tolerant one iff your primitive takes the address of the variable.

Multiple of these kludge comment pragmas may be in one comment as in:

	  %{ /* STACK:4096  NOREGISTER */
... however, stc does not check if your combination makes sense - using the last one if any conflicts arise.

Sending messages from within primitive code

Low performance but space saving code

To easiest way of sending a message to an object from within primitive code is via the SSEND support function:
    OBJ sel, receiver, arg1, ... argi;

    val = __SSEND<i>(receiver, sel, 0, arg1, arg2, ... argi);
where i is is the number of arguments to the method, sel is the selector which is a symbol (see above and below), and arg<i> are the arguments.

This performs a message send which uses a single static inline cache slot for all sends - therefore it does not require any additional memory for caching.

Although, the performance of this is relatively poor (if compared to a fully cached send), this is the most space effective code sequence - use this, if the message send is seldom performed ('SSEND' stands for 'Space-saving Send').

Expect a performance of roughly 500k to 2M sends per second depending on CPU and memory speed (750ns per send on a P5/200) - for highly polymorphic message sends, the performance may degrade to lower numbers.

The stc compiler generates this type of message sends if the "+optSpace" compilation option is enabled.

Medium performance

More performance is gained by using a cached message send:
    OBJ sel, receiver, arg1, ... argi;
    OBJ val;
    static struct inlineCache dummy = _DUMMYILC<i>;

    val = _SEND<i>(receiver, sel, nil, &dummy, arg1, arg2, ... argi)
where i is is the number of arguments to the method, sel is the selector which is a symbol (see above and below), and arg<i> are the arguments.

This results in a medium-speed message send - the methodLookup is cached in the global method cache, but no inline cache is used.

Expect a performance of roughly 700k to 3M sends per second. (550ns per send on a P5/200)

High performance

If your message send is often performed, you may like to use the inline caching facility, which keeps the target of the last send and speeds up future sends - effectively making the next send an indirect function call.
For very highspeed sends, use:

	static struct inlineCache ilc = _ILC<i>;

	val = (*ilc.ilc_func)(receiver, sel COMMA_SENDER, nil, &ilc, args);
here the target will be cached for a quick indirect call.

Expect a performance of roughly 2M to 15M sends per second. (105ns per send on a P5/200)

The above code also shows good performance for polymorphic message sends. However, this requires additional memory for caching.

For highly polymorphic sends (as when enumerating all objects), the performance may degrade to below the speed of the medium-speed send above, since the polymorph cache handling may create more overhead than savings in rare cases.

The stc compiler generates this code sequence, unless the +optSpace option is enabled. The JIT-compiler always generates this code sequence.

Examples:

      myMethod:argument
	  |local|
      %{
	  static struct inlineCache dummy = _DUMMYILC0;

	  local =_SEND0(argument, @symbol(redraw) COMMA_SENDER, nil, &dummy);
      %}
      !
is equivalent to:
      myMethod:argument
	  |local|

	  local := argument redraw
      !
If you don't use the @(symname) construct, you have to make certain, that the selector is known and defined somewhere. In this case, you should include a "{ Symbol: redraw }" compiler directive; see below for details. The @(..) construct will do this automatically for you.

Notice:

Since the SEND-code produced by the compiler is usually as good as handwritten code - you should avoid writing primitives which send messages, if possible.
(actually the compilers add additional hints to the cache management which further speeds up sends to self, constants and classes).

Primitives should be written for things which cannot be written in Smalltalk (for example: interface to databases, interface to graphics etc) or which are very time-consuming and can be considerable tuned in c (for example: copying arrays, String search, Image rotation etc.)

A good strategy is to do all nescessary Smalltalk stuff before entering the primitive code, and not sending anything from inside. (See examples in XWorkstation.st, Array.st and especially the image manipulation methods in DepthXImage.st etc.)

Also, keep in mind, that the interface will be changed if I get an idea of a faster send, and YOU have to update the code in this case; whereas all high level Smalltalk code will not be affected by these changes.

Remembering Smalltalk objects in external C-code

In some situations, it is required to keep references to certain Smalltalk objects within C-memory. For example, if semaphores are to be signalled from C-code.
In order to do this, we have to tell the garbage collector, that there are other (i.e. alien) references to that object and that these should be considered when checking for the reachability of some object. Also, when objects are moved (or changed via #become:), these references must be updated.

All of the above is done, iff the garbage collector knows about external memory locations which hold references to some Smalltalk object.

__ADD_REFCELL( OBJ *cellRef )
announces to the garbage collector, that cell contains a reference to some Smalltalk object, and should be investigated/updated when garbage is collected.
The cell will NOT survive a snapshot-save & restart - i.e. it has to be reinstalled when an image is restarted.

__REMOVE_REFCELL( OBJ *cellRef)
to deinstall a cell. After that, the value in the cell has to be considered to be invalid.
Example:
The following code passes a Smalltalk semaphore to some external C-code, which triggers this semaphore after some time delta.
    static OBJ semaRef = nil;

    void
    doTrigger(dummyArg)
    {
	__STX_SignalSemaphore(semaRef);
	semaRef = nil;
	__REMOVE_REFCELL( &semaRef );
    }

    void
    triggerSemaAfter(theSema, delta)
	OBJ theSema;
	int delta;
    {

	/*
	 * remember the semaphore
	 */
	semaRef = theSema;
	__ADD_REFCELL( &semaRef );

	/*
	 * install a timeout
	 */

	__STX_AddTimeout(doTrigger, delta, 0);
    }
from Smalltalk, the show can be started with:
    startShow
	|mySema|

	mySema := Semaphore new.
    %{
	triggerSemaAfter(mySema, 1000);
    %}.
	mySema wait

Allocating object memory in primitives

In general, direct object allocation (i.e. allocating some bytes of storage and setting the class & instance fields 'manually') should be avoided and banned from all primitives, since direct allocation makes redefinition of the 'new' method invisible to your primitive and also opens the door for many possible errors (setting fields/instvars wrong or forgetting to define or nil-out any field/instvar).

In most cases, it is possible to do the allocation outside of the primitive code as in:

    someMethod
	|localBytes|

	localBytes := ByteArray new:10000.
	%{
	    ...
	    do something with localBytes in the primitive
	    ...
	%}

If you really have to allocate in a primitive, here is how its done using a call to the "__STX___new()" function as:

	newObject = __STX___new(size, SENDER);
(size is the number of bytes and includes any object header and 0-bytes, SENDER is a macro from stc.h. It represents a reference to the current context which the garbage collector needs to trace object references. It MUST be present here.

This returns space for an object with (size - OHDR_SIZE) bytes, where OHDR_SIZE is the space (overhead) required for an object header. The object header includes the size field, the class, and some flags needed by the garbage collector.

Since an object header is always required, the call is better written as:

	newObject = __STX___new(bytesWanted + OHDR_SIZE, SENDER);

The SENDER macro passes the current context as additional argument. Since __STX___new() might trigger a garbage collect, it needs a handle to the context chain to find references and update pointers. Notice, that depending on the type of CPU and operating system, the SENDER macro may be actually empty. On some systems, the current context is kept in a statically reserver machine register or a thread local variable named "__thisContext__". Thus, the "SENDER" macro hides those details.

Notice, that this function may return nil in case the memory manager has problems allocating the memory. This may happen only in one situation: if the object memory is full, and the operating system is not willing to satisfy a request for more memory. (I.e. if the memory requirements hit any virtual memory size limits of the operating system).

Since this does not happen in normal situations, forgetting the non-nil test is a common mistake - which will get unnoticed for quite some time but may later lead to a segmentation violation, when the system is about to run out of space.
ST/X can deal with low memory situations gracefully, as it keeps some emergency memory aside, which is used to handle an out-of-memory exception. A handler for this (Smalltalk-) exception will try to free memory (sending lowSpaceCleanup) and proceed. If you do not check for a nil return, you will break this graceful emergency scheme. Therefore: you MUST check the value returned and handle the nil case (usually by raising an exception).

Initialized vs. Uninitialized Memory

The memory returned by "__STX___new" is not initialized - not even cleared and the class field in the header is not set. To avoid a later crash of your code or in the garbage collector, you MUST set the class-field and correctly set all the instance fields of the new object as soon as possible (concrete: BEFORE the next memory allocation).
Exceptions from this rule are ByteArrays, Strings, Float- and DoubleArray. Here the nilling (but not the setting of the class field) can be omitted, because the contents will not investigated as pointers. This is what the ByteArray's #uninitializedNew: method does.

Also, proper setup of the contents must be done shortly after the object is created: even if you think that no accesses are possible to the uninitialized object, the garbage collector may want to peek into it to find references to other objects.

example: allocate a ByteArray.

    {
	...
	/*
	 * notice, there may be a garbage collect here ...
	 * thus invalidating all local pointers, which are neither
	 * in the context, nor have been PROTECTED
	 *
	 * don't forget the OHDR in the given size ...
	 */
	newObject = __STX___new(100 + OHDR_SIZE, SENDER);
	if (newObject != nil) {
	    /*
	     * MUST set the class
	     */
	    __objPtr(newObject)->o_class = ByteArray;
	    /*
	     * except for ByteArrays and Strings, MUST nil-out instvars
	     * but care for the first OHDR_SIZE header bytes - don't clear
	     * those.
	     */
	    bzero((char *)__objPtr(newObject) + OHDR_SIZE, 100);
	}
	RETURN ( newObject );
    }
example: allocate a Point.
    {
	...
	newPoint = __STX___new(sizeof(OBJ)*2 + OHDR_SIZE, SENDER);
	if (newPoint != nil) {
	    /*
	     * MUST set the class
	     */
	    __objPtr(newPoint)->o_class = Point;
	    /*
	     * must set the fields
	     */
	    __InstPtr(newPoint)->i_instvars[0] = _MKSMALLINT(1);  /* p x:1 */
	    __InstPtr(newPoint)->i_instvars[1] = _MKSMALLINT(0);  /* p y:0 */
	}
	RETURN ( newPoint );
    }
in the above, there was no problem to expect in case of a garbage collect, since the stored values where smallIntegers.

The situation becomes a bit more difficult, if previously allocated objects are to be stored into a newly created object. Since every object allocation may lead to a GC, we have to make certain that these other pointers are not lost.
To help here, use the above mentioned PROTECT/UNPROTECT macros. The PROTECT macro tells the VM, that some value is still referenced from primitive code (possibly not reachable via regular object references) and that the referenced object should be protected in case of a garbage collect.
This protection stays active until the UNPROTECT macro releases the protection.
These macros have to be used in a stack-like fashion; every PROTECT must be followed by a corresponding UNPROTECT. If you forget any UNPROTECT, some VM internal table will sooner or later overflow, and bad things happen to it.
(for the curious:
PROTECT/UNPROTECT are actually stack push/pop operations for a VM protection stack. The pointers found on this stack are taken to be references to living objects by the garbage collector and updated as appropriate. If you forget to UNPROTECT, this stack may overflow. Also, you will get back the updated pointer value via UNPROTECT. Any other object reference (in a C-variable) will be osolete and may point to an invalid location.

The following example shows what has to be done if the stored values are allocated in the primitive.

    primitivePoint
    %{
	OBJ newPoint;
	OBJ newX;
	OBJ newY;

	/*
	 * create a float for some X value:
	 */
	newX = __MKFLOAT(1.2345);

	/*
	 * in case allocation failed
	 */
	if (newX == nil) {
	    RETURN (nil);
	}

	/*
	 * create a float for the y value.
	 * WARNING: newX may be invalid after that, if not protected,
	 * because the allocation of newY may lead into a garbage collect
	 */
	__PROTECT__(newX);
	newY = __MKFLOAT(3.14159);
	__UNPROTECT__(newX);
	/*
	 * in case allocation failed
	 */
	if (newY == nil) {
	    RETURN (nil);
	}

	/*
	 * create a point for the x and y values.
	 * WARNING: newX and newY may be invalid after that, if
	 * we do not protect them, because the allocation of the
	 * new point may lead into a garbage collect ...
	 */
	__PROTECT__(newX);
	__PROTECT__(newY);
	newPoint = __STX___new(sizeof(OBJ)*2 + OHDR_SIZE, SENDER);
	__UNPROTECT__(newY);  // fetch possibly changed reference
	__UNPROTECT__(newX);  // fetch possibly changed reference
	if (newPoint == nil) {
	    RETURN (nil);
	}

	/*
	 * MUST set the class
	 */
	__objPtr(newPoint)->o_class = Point;
	/*
	 * must set the fields
	 */
	__InstPtr(newPoint)->i_instvars[0] = newX;
	__InstPtr(newPoint)->i_instvars[1] = newY;
	/*
	 * see below
	 */
	__STORE(newPoint, newX);
	__STORE(newPoint, newY);

	RETURN ( newPoint );
    %}
The above primitive becomes much cleaner (and less error prone), if you keep these references in local variables of the method's context, instead of using C-language variables.
Be reminded, that these context variables are always handled correctly by the garbage collector, therefore you do not have to take care of invalid pointers here:
    primitivePoint
	|newX newY newPoint ok|

    %{
	ok = false;

	/*
	 * create a float for some X value:
	 */
	newX = __MKFLOAT(1.2345);
	/*
	 * always check for allocation failure ...
	 */
	if (newX != nil) {
	    /*
	     * create a float for the y value.
	     */
	    newY = __MKFLOAT(3.14159);
	    if (newY != nil) {
		/*
		 * create a point for the x and y values.
		 */
		newPoint = __STX___new(sizeof(OBJ)*2 + OHDR_SIZE, SENDER);
		if (newPoint != nil) {
		    /*
		     * MUST set the class
		     */
		    __objPtr(newPoint)->o_class = Point;

		    /*
		     * must set the fields
		     */
		    __InstPtr(newPoint)->i_instvars[0] = newX;
		    __InstPtr(newPoint)->i_instvars[1] = newY;

		    /*
		     * see below
		     */
		    __STORE(newPoint, newX);
		    __STORE(newPoint, newY);

		    /*
		     * good
		     */
		    ok = true;
		}
	    }
	}
    %}.
	ok ifTrue:[
	    ^ newPoint
	].

	"/ mhmh, something went wrong ...

	ObjectMemory allocationFailureSignal raise

The above also shows the preferred method of handling errors; set a flag somewhere in your primitive code, and let Smalltalk code do the error handling (typically raise a signal or perform a primitiveFailure).

For experts:
there are also macros for very hi-speed allocation (without a function call). These macros will directly manipulate the storage managers free-pointers, thus allowing an object to be allocated with just a few machine instructions. These macros are in the order of 10-20 times faster than ordinary malloc/free calls if no garbage collection is involved (*3).
The macros are:

__qNew(newObject, int size)
tries to do a quick (inline) new; if no memory is available, calls the general __STX___new() for a garbage collect.

__canDoQuickNew(int size)
returns 1 if a quick new is possible WITHOUT a garbage collect. (i.e. returns zero if a quick new operation would do a GC)

__qCheckedNew(newObject, int size)
does the quick new; is only allowed after a canDoQuickNew() which returned true.

__qMKFLOAT(newObject, double dval)
like __MKFLOAT, but tries to allocate inline; if no memory is available, the general __MKFLOAT is invoked which performs a garbage collect.
Since use of these macros can also lead to mysterious errors, they should not be used in normal situations. (the saving is NOT spectacular under normal circumstances.)

You should start to write your primitive using #new on the Smalltalk level or (at least) the normal __STX___new() function and later decide if its worth to tune the allocation (I'd say: seldom) .

Notice, that stc does generate high performance code for some classes' new operations. For example, it may decide to generate inline allocation code for a float, point or array creation.
This stc generated code may be faster than what you write - especially, if you are not too familiar with all those internals.

Allocating C memory in primitives

If C memory is allocated (i.e. malloc or calloc are called either directly or indirectly) care must be taken to cleanup this memory if the primitive performs other sends or if the there is a possibility that the method gets interrupted. Otherwise, there is a chance for a memory leak, where this C memory is never freed.
For example, the method:
    dangerousMethod
    %{
	char *cp;

	cp = malloc(100);
	SEND(...);
	...
	free(cp);
    %}
possibly creates this memory leak (and may also crash, since it does not check the return value of malloc for being nil).

Due to the message send in this method (the SEND call), it is possible, that the free() call is never executed. This happens if either a signal raise (with a handler and unwind) or a block return to some upper method, or a process termination occurs in or below the method reached via the send.

Of course, this will not lead to a crash, but instead to more and more memory being allocated over time. This memory will never be freed since the garbage collector does not know anything about this C memory.
These errors are especially hard to find, since Smalltalk does not know about it and they do not show any fatal conditions for quite a while (eventually, allocation failures will occur, though)

There are three possible solutions to fix the above, of which 2 suffer from other problems (they are described for didactic reasons):

Of course, the last example is the clean one; it handles C-memory somewhat automatic, in the Smalltalk way.

We highly recommend this solution, since it also solves the reentrancy problem. (If the memory block is passed to a C-function which remembers that memory reference (e.g. it should not be freed by the smalltalk collector but by explicit freeing), use "ExternalBytes protectedNew:"
Please read more on ExternalByte below.

Allocating C memory from Smalltalk code

The easiest (and most secure) way to allocate C-memory is to use instances of ExternalByte. These Smalltalk objects keep a handle to some memory which is allocated via malloc (and stays at a constant address, therefore).

In addition, you can either manually free the underlying C memory or let it be freed by the garbage collectors finalization procedure.

For example, to allocate a block of 100 bytes, use:

    |bytes|

    ...
    bytes := ExternalBytes new:100.
    ...
from your primitive, this memory can be accessed via:
    ...
    {
	char *cp;

	cp = (char *)(__externalBytesAddress(bytes));
	cp[i] = ...
    }
    ...
on the Smalltalk side, externalBytes provide an accessing protocol much like ByteArrays do; therefore, you can fill this memory from ST with:
    ...
    bytes at:1 put:16r13.
    ...
    foo := bytes at:10.
    ...
(notice the 1-based indexing in Smalltalk).

In the above example, the externalBytes memory is never automatically freed. You have to manually free the memory later with:

    ...
    bytes free.
    ...
Be careful in freeing all memory allocated - otherwise, memory leaks are unavoidable. ExternalBytes has been written by purpose to behave that way, to support storage which can be passed to and kept by C code. Since ST/X cannot know if C code keeps any of those references, no automatic freeing is done.

However, this may not be a good choice in your concrete application; therefore, ExternalBytes offers a mechanism for automatic freeing. This is implemented with the finalization mechanisms described elsewhere. To get some memory which is freed automatically, use:

    ...
    bytes := ExternalBytes unprotectedNew:100.
    ...
You can also create an externalBytes object from your primtive code, and pass it up to Smalltalk:
    ...
    char *cp;
    OBJ bytes;

    cp = (char *) malloc(100);
    bytes = __MKEXTERNALBYTES(cp);
    ...
    RETURN (bytes);
freeing can be done by either by C-code:
    ...
    char *cp;
    OBJ bytes;

    cp = (char *) __externalBytesAddress(bytes);
    free(cp);
    ...
    __ExternalBytesInstPtr(bytes)->e_address = (OBJ)0;
    ...
If the memory is freed by C-code, you have to nil-out the address-pointer in the externalBytes object, to tell the finalization code that the memory is already freed. Otherwise, bad things will happen, as the memory will be freed twice.

It is better to free it from Smalltalk with:

    ...
    bytes free.
    ...
See the examples and documentation in the ExternalBytes class for more info.

Modifying objects from within primitive code (storing)

Special care must be taken, when an object reference is stored somewhere in a primitive.
To give you some background, the garbage collector (GC) must know which objects have references to some special things like contexts. Also cross space references (i.e. old objects referencing new objects or any object referencing a stack-context) must be detected and the GC be given a chance to remember these. This is called a write-barrier or store-check, and the underlying code places those objects into the so called remembered list.

Therefore, every store of a non-SmallInteger, non-Nil object into another object must be checked for these special situations. To do so, after every store, a macro of the form:

	dest->i_inst[3] = value;     /* the store */
	__STORE(dest, value);        /* the writeBarrier-macro */
must be placed. This macro checks for those situations and calls a GC utility-function if required.

Omitting the STORE macro may leave you with a running program for a while or longer but lead to a bad crash at a later time. The crash will also usually hit some other object and it will be very hard to find the actual cause. The kind of crash is hard to predict, as arbitrary objects may become inconsistent or get invalid contents. For example, if a bytearray or string is affected, it may even silently get changed elements without ever getting a warning message from the compiler.

Sometimes it may even go totally unnoticed up to the time when storage allocation patterns or types of stored objects change.

Therefore: Never forget this macro for pointer stores!

It can ONLY be omitted if you are absolutely certain that the stored value is either nil nor a SmallInteger. Also for all non-pointer-arrays (i.e. ByteArray. FloatArray, DoubleArray and String) this store-check is not needed for the elements.

If you get one of the following error messages from the garbage collector:

    GC [error]: tried to mark free inst ... in ...
    GC [error]: tried to mark strange object: ...
    MEM [error]: bad class: ... in object at ...
    MEM [error]: class of ... is nil in scavenge
    MEM [error]: class of inst[...] ... is nil in scavenge
chances are high, that you forgot this STORE macro somewhere and more trouble can be expected.
The ST/X memory system tries to do its best to handle the situation by nilling invalid references to continue its operation - however, this cannot be guaranteed and crashes or endless loops in the memory handling code are likely to occur.
Even if the system seems to work properly: whenever you get messages from the VM, GC or MEM subsystems, carefully check your primitive code for assignments to instvars, classvars, globals, arrays or other objects with a missing STORE chack macro.

From our past experience, almost all (>80%) of all trouble came from user written primitives which had one or more STORE checks missing (the rest being missing PROTECT/UNPROTECT macros).

Late note:
The newest ST/X releases can be started with the VM commandLine argument "-MprotectSurv". This makes unsused portions of the object memory inaccessible (by using mprotect), and may help to find those problems, since missing STORE macros usualy lead to object references not being updated correctly and therefore tends to result in pointers into the alternative semispace of the copying garbage collector.
However, this is not a replacement for a careful code review, as not all operating systems provide this functionality and it also this does not guarantee to detect all such bugs.

Storing into another Object's Instvars (not recommended)

The only store that is actually legal in Smalltalk is a store into instance variables of the current receiver. Any change of another object's slot requires a message send. Of course, in your primitive code, a "bad guy" can store into any object without a message send, and thus circumvent this encapsulation.

Doing so is a very very bad practice, because another object's instance slots can only be accessed by index. Thus it makes your code invalid, if the accessed object's instance layout changes.

If you ever have to return structured data or fill a passed-in object from a primitive, we recommend passing in or returning an array, and moving the slots into the target object via setter calls on the smalltalk level afterwards. Take a look at the event-returning code in XWorkstation to get the idea.

This may not be the "superfast" solution, but it is save and maintainable.

If you really have to care for every nanosecond, pass in the instvar offsets of the target object to make your primitive code independent from the target object's layout (either as argument or via preinitialized class variables).

Be reminded that this is more of a theoretical advice - even the author of the ST/X system never felt a need to do this.

Storing into the receiver

To update an instance variable named 'xyz' in the current receiver, the following code is needed:
	__INST(xyz) = value;
	__STORE(self, value);
As a rule: this should normally NOT be done from primitive code; better separate the code into a primitive part and a Smalltalk part - let stc handle all these internals.
example:
    myMethod
	myInstance := someThing.
    %{
	/* other stuff */
    %}
    .
	myInstance := somethingElse
Final note: a store check is also needed when storing into a method variable of the home context from within a block context. However, writing blocks as inline c-code is very tricky and even more restricted. Therefore, forget about inline C-blocks (I would not write any !) and better write your blocks in Smalltalk.

Storing into globals and class variables

A store check is also needed when updating a value into a global variable, or classVariable. After modifying global XYZ, you should add:
	__STORE(Smalltalk, XYZ);
to your code.

For the curious: these macros are found in "stc.h"; all of them can be used in expressions and evaluate to the stored value.

Mixing Smalltalk and C-code

You can have multiple primitive code statements within one Smalltalk method, and even switch back and forth between C and Smalltalk.
However, since there are certain limitations: no C-variables can be shared between these code pieces and they are not allowed in non-inlined blocks, because the blocks' statements will end up in different C-functions after stc-compilation.
Therefore, we do not recommend this coding style, except for very special needs.

As an artifical example, here is a Smalltalk loop over a collection, collecting its (integer) elements into an externalBytes object: (you may need this, to pass an integer vector to some external C function later)

    externalBytesVectorFor:someCollection
	|sz index element vector|

	sz := someCollection size.
	vector := ExternalBytes new:(sz * ExternalBytes sizeofInt).

	index := 1.
	[index <= sz] whileTrue:[
	    element := someCollection at:index.
    %{
	    INT iVal;
	    int *pI;

	    iVal = __longIntVal(element);
	    pI = (int *)(__ExternalBytesInstPtr(vector)->e_address);
	    pI[_intVal(index) - 1] = iVal;
    %}.
	    index := index + 1
	].
	^ vector
Notice, that intimate knowledge about the generated C code may be required. The above depends on the fact, that while-loops are inlined by stc - thus the C-statements will actually be placed in the method's body. This would be different for non-inlined blocks.

Interrupting primitive code

By default, interrupts (i.e. operating system signals like timers or CTRL-C) are not handled immediately while in a primitive, but instead simply set a flag and are processed later (with the next send or when the method returns).

For most primitives, this behavior is correct and simplifies the writing of primitive code, because you don't have to care for effects due to multiple processes, reentrancy, context switching etc.
Also, typical C-libraries (Xlib, Stdio etc.) are not thread safe or reentrant and therefore, library functions should normally not be interrupted.

However, primitive code in which a blocking I/O or a (long) wait operation occurs, should be interruptable and tell the interrupt system to do immediate interrupt processing and handle the interrupt even while executing primitive code. Otherwise there would be no chance to get out of a blocking read (for example, when reading from a socket/pipe to which noone is writing).

To tell ST/X that interrupts should be handled immediately, two macros are provided in the "stc.h include file. These should be placed around interruptable code, as in:

    ...
    __BEGIN_INTERRUPTABLE__
    n = read(fd, buffer, count);
    __END_INTERRUPTABLE__
    ...
Notice, that without this flag being set, timer interrupts will not be handled while waiting for input to arrive. Instead, the interrupt would be pending and handled as soon as the primitive code returns. Thus no switching to other (Smalltalk-) processes is done without those macros.

Of course, also "CTRL-C" processing is done using signals - therefore even pressing "CTRL-C" has no effect and there is no chance to get into the debugger.
Beside the obvious read() and write() many other C library functions possibly block for longer time - sometimes unexpected as with gethostname().

The handling of interrupts is not done the same on all systems (not even within the Unix world). Especially when system calls are interrupted, things may be very different (BSD vs. SYS5 signals; i.e. restarted call vs. EINTR return).
Your code should take care of this, by checking for an errno of EINTR and retrying the operation as appropriate. Therefore, the above was an oversimplification; real code should look like:

    ...
    __BEGIN_INTERRUPTABLE__
    do {
	n = read(fd, buffer, count);
    } while ((n < 0) && (errno == EINTR));
    __END_INTERRUPTABLE__
    ...
Notice, that either the operating system or the stx runtime system keeps a private copy of the errno variable for each thread - preserving the errno value even if another thread does some system calls in the meanwhile.

As a side effect, enabling immediate interrupt processing enables the VM to reschedule to another process. If your interruptable primitive code runs at a lower priority, higher priority processes can suspend it and become active. This allows long computations (even in c-code) to be performed in the background.

On the other hand, if an interrupt occurs and is handled immediately, you have to be very careful in coding your primitive. Anything can happen there, especially recursive entry into this method, a garbage collect, context unwinds, long returns or even process termination have to be considered.

Be VERY VERY careful when enabling interrupts in a primitive method which calls c-library functions; most system's libraries are NOT prepared for reentrant calls. This may result in memory leaks, locks, flags which are not set correct, stack overflows ...

For this very reason, there are only a few places in the system (blocking select and I/O system calls) where interrupts are currently enabled.

As an example, the following does some long computation in C, which is interruptable. Therefore, it can be performed in the background without disturbing foreground processes:

    longComputation
    %{
	int run, i, j;

	__BEGIN_INTERRUPTABLE__

	for (run=0; run<20; run++) {
	  for (i=0; i<50; i++) {
	    for (j=0; j<1000000; j++) {
	    }
	  }
	  printf("looping\n");
	}

	__END_INTERRUPTABLE__
    %}

    "
     [XXX new longComputation] forkAt:3
    "
Be very very careful, when enabling interrupts, while processing some Smalltalk object, since this may be moved by the garbage collector, or (which is worse) even be changed into a completely different object by a #become: (which another process could theoretically do).
(Another reason for most primitives being executed as non interruptable, atomic operations.)
Example: interruptably processing some string:
    interruptableDestructiveUpperCaseToLowerCase:aString
    %{
	int i, len;
	char c, *cp;

	/* check if argument is a string */
	if (__isString(aString)) {
	    /*
	     * fetch size once
	     */
	    len = __stringSize(aString);
	    /*
	     * walk over string till len characters have been processed
	     */
	    for (i = 0; i < len; i++) {
		/*
		 * allow interrupts only here; let the loop body run uninterrupted
		 */
		__BEGIN_INTERRUPTABLE__
		__END_INTERRUPTABLE__
		/*
		 * check if the string has magically changed its identity or size
		 */
		if (! __isString(aString)
		 || (len != __stringSize(aString)) break;
		/*
		 * refetch the actual pointer - the old one may be obsolete
		 */
		cp = __stringVal(aString)->s_element;
		if (isUpper(c = cp[i])) {
		    cp[i] = toLower(c);
		}
	    }
	    RETURN ( aString );
	}
    %}
    .
	self primitiveFailed
    !
For fine-tuned performance optimized primitives, you can ask the VM if there are chances for any pointers to be invalidated:
The VM keeps counters which are incremented with any pointer changing operation; these are: using this information, a higher performance version of the above can be written as:
    interruptableDestructiveUpperCaseToLowerCase:aString
    %{
	int i, len;
	char c, *cp;
	extern int __pointerGeneration__;
	int currentGeneration = __pointerGeneration__;

	/* check if argument is a string */
	if (__isString(aString)) {
	    /*
	     * fetch size once
	     */
	    len = __stringSize(aString);
	    cp = __stringVal(aString)->s_element[0];
	    /*
	     * walk over string till len characters have been processed
	     */
	    for (i = 0; i < len; i++) {
		__BEGIN_INTERRUPTABLE__
		__END_INTERRUPTABLE__
		/*
		 * check if the string has magically changed its identity or size
		 */
		if (__pointerGeneration__ != currentGeneration) {
		    if (! __isString(aString)
		     || (len != __stringSize(aString)) break;
		    /*
		     * refetch the actual pointer - the old one is obsolete
		     */
		    cp = __stringVal(aString)->s_element[i];
		    currentGeneration = __pointerGeneration__;
		}
		if (isUpper(c = *cp)) {
		    *cp = toLower(c);
		}
		cp++;
	    }
	    RETURN ( aString );
	}
    %}
    .
	self primitiveFailed
    !
Of course, the above was not a real-world example: you would not care for interrupts when procesing small collections. However, if the collection is huge and the processing takes in the order of 100's of milliseconds, the system will be more responsive if you care for interrupts.
Typical situations are vector and/or matrix opertions, FFT computations etc.

Triggering interrups from C code

As mentioned above, interrupts (i.e. signals) are handled by setting a flag and checking this flag at regular times (with the next message send, when the current context returns or at a loops head).
These interrupt flags can be accessed from C code. Therefore, interrupts can be triggered from C code by setting the corresponding flag to nonNil (typically, they are set to true).
Although possible, you should not play around with interrupt flags other than the customInterrupt, which was specially designed for the purpose of triggering interrupts from a C subsystem.
For example, a C signal handler function may trigger this interrupt by calling:
    __stxRegisterCustomInterrupt__();
to set the flag. and
    __stxHandleInterrupt__();
to have the runtime system check for immediate interrupts being enabled and perform the interrupt processing sequence as required.

In the Smalltalk world, this will send a #customInterrupt message to the CustomInterruptHandler.

Using custom interrupts is especially useful for callBacks and other C functions which want to interrupt Smalltalk processing and get immediate response.

Timed callBacks into C code

You can install a timer callBack into an external C function either from the C level with:
    int id = __STX_AddTimeout(func, millis, arg)
or from Smalltalk with:
    id = Processor
	    addTimeoutFunctionCall:anExternalFunction
	    for:nil
	    afterMilliseconds:millis
	    with:arg
Both arrange for some C-function to be called after millis milliseconds, passing one single argument, arg.

The timeout is installed as a oneShot - i.e. to get some cyclic calling, you have to reinstall another timeout from within the called function.

Any pending timeout can be removed with:

    __STX_RemoveTimeout(id)
or from Smalltalk with:
    Processor removeTimeoutWithID:id
passing the id as returned by the previous addTimeout functions.

An example & demonstration can be found in the "doc/coding/cCallBack directory.

Reporting errors from C code

The obvious way to return error information from primitive code is via the return value; either directly return a returncode, or leave the returnCode in an instance variable and return some special value (such as nil) from the primitive method.

The above should work in most situations;
however, there are cases where the error is reported asynchronously, and there is noone to return an errorCode to.

For example, take the error mechanism in the Xlib:
here, in case of an error, a handler (c-)function is called from withing Xlib, which gets additional error information passed as arguments.

In ST/X, these asynchronous errors must be handled as errorInterrupts, since there is no particular method available, to which the handler could return this error information.
To do so, the (C-)handler function should call one of the VM functions

    __errorInterruptWithID__(id)
or:
    __errorInterruptWithIDAndParameter__(id, param)
and pass some unique subsystem identifier as id argument. This identifier is typically a unique number or (better) a symbol.
The second function allows for an additional value (for example, some more error detail, or a connectionID) to be passed to the Smalltalk interrupt handler. (the first entry simply passes nil as parameter) If present, the parameter must be a valid Smalltalk object (typically, a smallInteger).

For example, the XErrorHandler uses the symbol #DisplayError as errorID and passes the display connections ID as (integer-) parameter.

Calling one of the above functions leads to an error interrupt being registered, which eventually invokes the errorInterrupt:with: method (in Object). The actual interrupt processing may be done asynchronously (for example, if the VM is currently in the garbage collector, or a noninterruptable primitive).
This method uses the passed id to find a specific handler and forward the error to that handler; again by sending #errorInterrupt:with: to it.
The specific handler must have been registered previously via

ObjectMemory registerErrorInterruptHandler:aHandler forID:anId
- this is typically done at class initialization time.

If the above sounds confusing, lets see a concrete example:
assume that you want to use some C library, which calls an error function asynchronously, and this error should be translated into a private (Smalltalk-) signal.
First, create the signal in your class's initialize method and provide access to it:

!MyClass class methodsFor:'initialization'!

initialize
    ...
    MySignal := Object errorSignal newSignalMayProceed:true.
    MySignal nameClass:self message:#mySignal.
    ...

...

!MyClass class mathodsFor:'Signal constants'!

mySignal
    ^ MySignal
!
....
then, define the specific errorInterrupt method in your class, which raises the signal:

!MyClass class methodsFor:'error handling'!

errorInterrupt:id with:someParameter
    mySignal raise
!
....
define the handler (c-) function, which reports subsystem errors as an errorInterrupt with an id of #MySubsystem. It should look like:
!MyClass class primitiveFunctions!

subsystemError() {
	__errorInterruptWithIDAndParameter__(@symbol(MySubSystem), nil);
}
Finally, register this errorInterrupt handler to be invoked for all errors reported for #MySubSystem (this can be done in the classes initialize method, or when the first instance of your class is created):
...
ObjectMemory registerErrorInterruptHandler:self forID:#MySubSytem
If you don't need a specific handler, but instead want all subsystem errors be handled by the default errorInterrupt mechanism (which raises an errorSignal, typically entering the Debugger), you can call __errorInterruptWithID__(id) without ever registering a specific handler.
In this case, the passed id will be available as parameter in the exception handler (of errorSignal).

Further examples:
see how DeviceWorkstation and XWorkstation handles errors occuring in the graphic interface.

Problems

conflicting names of local variables and structures/typedefs

Names of C-Structures, structure fields and typedefs may not conflict with the names of method or block local variables. Stc will produce which leads to a syntax error in the C-compilation phase.
Example:
    %{
	struct abc {
	    int field1;
	    char field2;
	};
    %}

    method
	|local1 field2|
	...
will lead to an error, since the name "field2" is used both in a c-structure and as a method local. This may also happen with other C-names (i.e. typedefs, structure names, enum values etc.)

Workaround: rename the local variables.

other name conflicts with structures, macros and typedefs

STC does not always handle instance- and local variable names correctly, which have the same name as existing C-structures or C-typedefs. This also applies to structures defined in stc's standard header file "stc.h". (as in version 2.10.3). So, you have to avoid names such as 'byteArray', 'array' etc.

This will be changed soon - at least for names in "stc.h". Compiling code with such conflicts will usualy lead to errors in the C-compilation phase. Since stc does not parse (and understand) the structure of primitive code, it will not notice this conflict.

local variables whose name starts with an underscore

Although STC does handle local variables whose name starts with an underscore, name conflicts may arise due to the renaming of symbols (as described above). For example, the symbol #x is translated into a C variable _x, which creates a problem when a local method variable named _x exists.

This will be changed to use a different renaming scheme either for local variables or symbols (or both).

For now, be when a locals name starts with an underscore; we suggest you use 'normal' variable names for now.

Solutions for typical problems

The next chapter shows solutions to typical tasks that primitives have to deal with. Have a look at, and understand the examples; some may give you a starting point for writing your own primitives. At least, they should give hints for what needs special considerations.

Most problems involved with primitives deal with memory management; especially, when data structures as returned or filled by some C function have to be converted into Smalltalk objects.

As time goes by, more examples will be added to this section.

Hints & tips for writing primitives

The following is a list of common bugs, and can be used as a check list in case of 'mysterous' behavior. If you get a big bunch of error messages from the C-compiler, start to fix things from top; followup errors may be a consequence of a single missing parentesis, bracket or brace, which destroys the program structure.

Symptoms and Strategies to Find and Fix Bugs

Notes:


Copyright © 1995 Claus Gittinger, all rights reserved

<cg at exept.de>

Doc $Revision: 1.55 $ $Date: 2021/03/13 18:24:51 $