CParser & CTypes

Introduction

The CParser and CType class hierarchies provide a framework for reading C-language header files containing type declarations and #define directives.

While parsing, the CParser generates the corresponding type information into a hierarchy of CType objects, which can then be used to create and manipulate byte-oriented data blocks.

Licensing

The CParser/CType package is not included in the standard distribution; it is delivered as an extra (non-free) add-on package.
Please contact eXept for license information & pricing.

Area of use

The CParser & CType framework is especially useful for interfacing with C-language data or programs, either via shared data files or via some communication mechanism (such as pipes or sockets).

In contrast to ST/X's inline C-code features, CParser and CType are implemented completely in Smalltalk and are therefore easier to use and less error prone.
However, performance may be lower than that of corresponding hardcoded inline C routines, since a lot of meta information is kept in the CType hierarchy.

Overview

Use of the framework consists of three major parts:

Parsing C-Types

As a first step, a C header file (or a string containing the C-language source) must be given to CParser and parsed.
The resulting collection of C-types should be kept by the application (typically in a class variable).
A good place to perform this task is a class's #initialize method.

Example:

    ...
    classVariableNames:'CTypes'
    ...


    initialize
	"parse C-Types from the file cDefs.h,
	 which contains C-Language types and #defines"

	|parser|

	CTypes isNil ifTrue:[
	    parser := CParser new.

	    parser parse:('cDefs.h' asFilename readStream).
	    "/ fetch types ...
	    CTypes := parser types.
	].
	...
Now the class variable CTypes refers to a dictionary of C-types, keyed by type name. Thus, if the C header file contained the definition
    struct myStruct {
	int foo;
	float bar;
    };
a corresponding entry will be found in the dictionary under the key myStruct,
i.e.
    ...
    myStructType := CTypes at:'myStruct'.
    ...
Besides reading types, the CParser also keeps track of #define definitions. Defines can be retrieved from the parser via the #defines message.
Note that defines are typeless, i.e. the CParser treats and returns all defines as strings. However, some protocol exists to extract a #define's integer value (which might be required for bit constants or array dimensions).

#include and #if directives are not handled by the CParser. If the header file depends on them, it must first be run through cpp (the C preprocessor), and the filtered output must be given to the CParser instead.

Meta knowledge of C-Types

CTypes keep all information collected by the CParser; it is therefore possible to query a type for various aspects. Of special interest are: (Note that there are many other query methods; see the CType implementation in the Browser for a complete list.)
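To make the kind of layout information behind such queries concrete, the following is a minimal C sketch that reports the size and member offsets of the struct myStruct example as the C compiler sees them. It illustrates the C side only and is not the CType query protocol; the helper function names are hypothetical, and the actual values depend on the compiler and target machine.

```c
#include <stddef.h>  /* offsetof, size_t */

/* The example type from the header file discussed above. */
struct myStruct {
    int   foo;
    float bar;
};

/* Hypothetical helpers (not part of CParser/CType):
   layout facts of struct myStruct on the compiling machine. */
size_t my_struct_size(void)       { return sizeof(struct myStruct); }
size_t my_struct_foo_offset(void) { return offsetof(struct myStruct, foo); }
size_t my_struct_bar_offset(void) { return offsetof(struct myStruct, bar); }
```

Comparing such values with what the corresponding CType reports is one way to check that both sides agree on the layout, including any padding.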

Allocating C-Data

Given a CType, you can allocate a corresponding CDatum by sending one of the following messages to the CType:

Use onBytes: if the data has either been allocated elsewhere (for example, in a C-primitive function or library routine), or has been read from a file or communication channel, for example when reading data from a database or via a socket.

Use new / new: for all data which is not given to C code directly (i.e. for message/data buffers which are stored in a file or sent to another program via a pipe or socket).

Use gcMalloc / gcMalloc: for data which is passed to either inline C-code or to a C-library function, when it is known that the C-code does not keep a reference to the datum internally. This memory is freed automatically once Smalltalk has no more references to it.

Use malloc / malloc: for data which is passed to either inline C-code or to a C-library function, when the C-code keeps references internally or when it is unknown whether it does.
Be very careful to avoid memory leaks, since the storage must be freed manually (via the #free message) by the programmer.

Manipulating C-Data

CDatum objects respond to the same query protocol as described above, plus some additional protocol: CDatums provide access to indexed elements via #at: / #at:put: and to field members via #memberAt: / #memberAt:put: messages.

If the elementType (for arrays) or fieldType (for struct/union) is a scalar type (i.e. char, int, float or double), the get methods return Smalltalk integers or floats, and the set methods accept Smalltalk numeric objects as value.

For a non-scalar element type, the get methods return another CDatum (i.e. a copy) and the set methods expect a CDatum.
For convenience, some Smalltalk collections are allowed for setting:


In addition, the doesNotUnderstand: method is redefined to allow member access in the typical Smalltalk fashion (i.e. get/set protocol);
that is, field members can also be accessed via:

For example, the data structure from the example above can be allocated and manipulated as:

    ...
    myStructType := CTypes at:'myStruct'.
    myDatum := myStructType new.
    myDatum memberAt:'foo' put:15.
    myDatum memberAt:'bar' put:3.14159.
    ...
or:
    ...
    myDatum foo:15.
    myDatum bar:3.14159.
    ...

Pointers into C-Data

Sometimes it is useful to create pointers into a CDatum, for example to use a common helper method which manipulates a substructure, or to process a substructure without copying the underlying storage.

Remember that the #memberAt: message extracts a field, which results in an expensive copy if the field is a structure, union or array.

You can create a CPointer (which points into another CDatum) with:

ByteOrder issues

Often, when data is passed between machines, the byte order differs between the CPU architectures.
To provide a convenient solution for this problem, CDatums keep the byteOrder of their data and allow it to be queried or changed.
By default, CDatums assume that the byte order is that of the underlying CPU (i.e. LSB-first for Intel/Alpha, MSB-first for HP/SPARC).

At any time, a CDatum's byteOrder can be changed or queried.
Thus, when some data has been retrieved via a socket or pipe, and the data is known to be bigEndian (i.e. MSB-first), simply send the CDatum the message:
    cDatum msb:true
All followup accesses will assume bigEndian data.

You should always set the byteOrder explicitly when communicating with external processes or machines (i.e. do not depend upon the default, because it is not the same on all ST/X implementations).
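The effect of the byte order setting can be illustrated on the C side with a minimal sketch: the same four bytes yield different integers depending on whether they are read MSB-first or LSB-first. The function names below are hypothetical and are not part of the CDatum protocol; the sketch assumes 8-bit bytes and 32-bit integers.

```c
#include <stdint.h>

/* Interpret 4 bytes as an unsigned 32-bit integer, most significant byte first. */
uint32_t load_u32_msb(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Interpret the same 4 bytes least significant byte first. */
uint32_t load_u32_lsb(const unsigned char *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Returns 1 if the machine running this code stores integers MSB-first, else 0. */
int host_is_msb_first(void)
{
    const uint32_t probe = 1;
    return *(const unsigned char *)&probe == 0;
}
```

Because the loaders work byte by byte, they give the same result on any host; only code that reinterprets raw memory as an integer depends on the host's own byte order.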

Examples

The following code fragments can be used to send and receive C-structured data blocks to/from a C-program via a socket. Data is transferred in MSB-first (i.e. network) byte order.

receiver:

    |buffer socket datum foo bar|

    ...
    buffer := ByteArray new:1024.
    ...
    socket readWait.
    socket nextAvailableInto:buffer.
    ...
    datum := myStructType onBytes:buffer.
    datum msb:true.
    ...
    foo := datum foo.
    bar := datum bar.
    ...
sender:
    |buffer socket datum foo bar|

    ...
    buffer := ByteArray new:1024.
    ...
    datum := myStructType onBytes:buffer.
    datum msb:true.
    ...
    datum foo:123.
    datum bar:1.2345.
    ...
    socket nextPutBytes:datum sizeof from:datum data.
    ...
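For the C program at the other end of the socket, a minimal sketch of the matching conversion could look as follows. It packs and unpacks struct myStruct as two 4-byte fields in MSB-first order. The function names and the MY_STRUCT_WIRE_SIZE constant are hypothetical, and the sketch assumes a 32-bit int, a 32-bit IEEE float, and no padding between the two members on the Smalltalk side; none of this is prescribed by the framework.

```c
#include <stdint.h>
#include <string.h>  /* memcpy */

#define MY_STRUCT_WIRE_SIZE 8  /* assumed: 4 bytes foo + 4 bytes bar */

struct myStruct {
    int   foo;
    float bar;
};

/* Store a 32-bit value into 4 bytes, most significant byte first. */
static void put_u32_msb(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Read a 32-bit value from 4 bytes, most significant byte first. */
static uint32_t get_u32_msb(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write s into out (MY_STRUCT_WIRE_SIZE bytes) in MSB-first order. */
void pack_my_struct(const struct myStruct *s, unsigned char *out)
{
    uint32_t bits;

    put_u32_msb(out, (uint32_t)s->foo);
    memcpy(&bits, &s->bar, sizeof bits);   /* raw bit pattern of the float */
    put_u32_msb(out + 4, bits);
}

/* Read s back from in (MY_STRUCT_WIRE_SIZE bytes, MSB-first order). */
void unpack_my_struct(const unsigned char *in, struct myStruct *s)
{
    uint32_t bits;

    s->foo = (int)(int32_t)get_u32_msb(in); /* assumes two's complement int */
    bits = get_u32_msb(in + 4);
    memcpy(&s->bar, &bits, sizeof bits);
}
```

Converting field by field, rather than writing the struct's memory directly, keeps the transferred bytes independent of the C host's own byte order.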
The following code fragment uses a helper method to initialize the fields of a structure; a CPointer is passed to it in order to fill a substructure in place.
The corresponding C-header definitions are:
    #define NUM_CHARS  10

    typedef struct foo {
	int     foo1;
	float   foo2;
    } foo;

    typedef struct bar {
	foo     innerFoo;
	int     bar1;
	char    bar2[NUM_CHARS];
    } bar;
The Smalltalk code is:
initFoo:aFoo
    aFoo foo1:10.
    aFoo foo2:(Float pi).
    ^ self


    ...
    cTypes := cparser types.
    cDefines := cparser defines.
    ...
    fooType := cTypes at:'foo'.
    barType := cTypes at:'bar'.
    ...
    aBar := barType new.
    ...
    self initFoo:(aBar refMemberAt:'innerFoo').
    ...
    NUM_CHARS := Integer fromString:(cDefines at:'NUM_CHARS').
    ...
Note that passing the result of #memberAt: to the initFoo: method would not work in the above example, since that would pass a copy of the inner structure and leave the original (outer) datum unchanged.
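The same distinction exists in C itself, which may make the difference easier to see. The following minimal sketch uses the foo and bar types from the header above (written with explicit typedef names); the two helper functions are hypothetical. Passing the inner structure by value corresponds to handing over a copy, while passing its address corresponds to handing over a pointer into the outer datum.

```c
#define NUM_CHARS 10

typedef struct foo {
    int   foo1;
    float foo2;
} foo;

typedef struct bar {
    foo  innerFoo;
    int  bar1;
    char bar2[NUM_CHARS];
} bar;

/* Receives a copy of the inner structure: the caller's data stays unchanged. */
void init_foo_by_value(foo aFoo)
{
    aFoo.foo1 = 10;
    aFoo.foo2 = 3.14159f;
}

/* Receives a pointer into the outer structure: the caller's data is modified. */
void init_foo_by_pointer(foo *aFoo)
{
    aFoo->foo1 = 10;
    aFoo->foo2 = 3.14159f;
}
```

Calling init_foo_by_value(b.innerFoo) on a bar b leaves b untouched, whereas init_foo_by_pointer(&b.innerFoo) fills the substructure in place.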


Copyright © 1999 eXept Software AG

<info@exept.de>

Doc $Revision: 1.15 $