			First learn computer science and all the theory.
			Next develop a programming style.
			Then forget all that and just hack.

			(George Carrette)

Coding Style used in Smalltalk/X Classes

Introduction

This document describes the coding style and conventions used in Smalltalk/X's class library.
The author is aware that coding style is a very personal matter and should not be enforced by dictators.
However, in this system, when you add code to be published and read by others, you are not alone. Therefore, it is useful to follow some rules, to give other programmers an easier entry into the system. Also, there are tools which extract useful information and can format neat documents if you follow those rules in your classes. Thus, when it is time to deliver documentation on your project, a whole bunch of work is done for free.
Experienced Smalltalk programmers may especially look into this document, because the ST/X coding style is slightly different from what other Smalltalkers consider to be "readable code".

If you have any suggestions or additions on this topic, let me know.

Finally: a lot of unmaintainability comes from programmers who are lazy typists and use short or obfuscated class, method and variable names. Also often missing is a comment on how to get a framework started or used.

Be reminded again that code is usually written only once, but read and possibly modified many, many times over its lifetime. The time you might save by omitting documentation while writing will be spent many times over by others (and possibly yourself) later, when the code has to be deciphered in a year or so...

I admit that some of Smalltalk/X's code does not follow those guidelines: some code is very old and the author(s) of the code have matured as we all do. Bad code is and will be refactored or reformatted, whenever we encounter it.

General: English

All program code must be written in English.

Motivation:
Some time ago, I was shocked to receive a Smalltalk program which was written in another (national) language. All method, class and argument names were completely unreadable to me and everyone else in the team, and I could hardly understand what the program was doing. I still occasionally receive individual methods with local variables and comments in another language, and they are very hard to understand at times.

Therefore this guideline forces everyone to use a language which is understood by everyone. I request programs to be written in English, using English class names, English selectors, English comments and variable names. As every programmer - even from the far east - understands English, but not vice versa, this should make communication easier.

This has nothing to do with 'western cultural imperialism', or an 'egocentric view of the world', or ignoring the culture and language of others - especially minorities, as some (pseudo liberals) might complain.

It is purely practical: it should support the comprehensibility of programs among a worldwide programmer community.

Even the authors of the ST/X system are not native English speakers, so they too have to be careful at times.

By the way: the reason for not supporting non-Latin1 characters in variable names, class names and message selectors is part of that enforcement: by not being able to use other characters, people from eastern Europe, the near or far east are less motivated to use a non-English language in the program.

There is nothing to prevent you from using non-Latin characters in string constants, the UI or any other non-code areas (although it is highly recommended to use plain English in UI strings and provide translation files for other languages).

Class Documentation

In Smalltalk/X, every class contains a method category called "documentation" in the class protocol. You will find at least two methods named "version*" and "documentation" there:

The "version*" Methods

The "version" methods are exclusively maintained by the source code manager. They tell the system where to find a class's source code in the repository. A class may be stored in multiple repositories, so occasionally you may find multiple version methods in a class. You should never touch or remove them. Details are described below.

If you report an error (to eXept), this string can be used to identify the exact version of the class.

The "documentation" Method

The "documentation" method's comment should describe the class, its uses and (if of public interest) its instance and class variables.

In many classes, you will find an additional #examples method. This should contain a comment giving typical uses (often ready to select & doIt).

These methods consist of comments only; they are not meant to be executed. (Actually, if evaluated, they will return the receiver, since empty methods are semantically equivalent to a "^ self" method.)

In contrast to other Smalltalk systems, which keep the class comment in the "comment" instance variable, ST/X uses the method comment. The reason is simply that the class comment string resides in memory, whereas method sources are typically only present in external files. So, this scheme saves a lot of memory if long and descriptive comments are present, and there is no 'performance' or 'memory usage' argument against them - which is exactly what we want.
Another reason is that this makes it easier to put structure into multiple comments; for example, one for internals, another for examples, etc.
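
As a minimal sketch, such a method could look like the following. The class Stack, its instance variable and the wording are made up for illustration; only the comment-only form is taken from the description above:

    documentation
    "
        A Stack holds elements in last-in-first-out order.
        Elements are added with push: and removed with pop.

        Instance variables:
            elements    <OrderedCollection>     the stacked objects;
                                                the top is the last element
    "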

The SystemBrowser automatically shows the documentation text (found either in the "documentation" method or in the class comment) whenever a class is selected. Thus, to be nice to other people browsing through the system, please add a short description of what your class is about in that method.

Also, the document viewer can extract the text of a class's documentation method and present it neatly formatted - you get your documentation almost for free, if you stick to these conventions! This is similar to what JavaDoc does. Try it by switching to any class (say: OrderedCollection), and select the browser's "Class-Documentation-HTML-Documentation" menu item.

Notice that the documentation may even include so-called "executable code examples"; these are code fragments embedded in "[exBegin]" ... "[exEnd]" brackets. The documentation viewer will extract those and add an extra examples section at the end, where you can even click on those examples to execute them directly in the viewer.
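
A sketch of an examples method with one such bracketed fragment might look as follows. The fragment itself is made up for illustration, and the exact placement of the brackets on their own lines is an assumption; only the "[exBegin]" and "[exEnd]" markers are taken from the text above:

    examples
    "
        adding and removing elements:
                                                            [exBegin]
        |coll|

        coll := OrderedCollection new.
        coll add:1; add:2; add:3.
        coll removeFirst.
        Transcript showCR:coll printString.
                                                            [exEnd]
    "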

As already mentioned: do not worry about memory usage when creating documentation methods - simple methods which return self (as empty methods do) all share a common piece of code, so there will NOT be thousands of empty methods filling up your memory. (To be exact: there is some little overhead per method, created by the method object itself - not by the method's code. However, for production code, stc provides a command line argument to skip all methods in the documentation category, to allow building more compact class libraries.)

BTW: from the author's experience, you should not delay documentation too much. Write it down as soon as possible - otherwise you may not find the time to do so later - or you may simply forget to do it. Also, keep in mind that it may take more time to add those comments later, since you may have to think again about what is going on. From our experience, the later the documentation is written in a project, the higher its cost.

Class Version

If you use one of the Smalltalk/X source code managers, every class will contain a "version*" method, which returns the class's version string. The version method is maintained by the source code manager; NEVER change these strings manually (unless you know exactly what you are doing).
The actual format of that string is specific to the underlying source code mechanism (for CVS, it looks similar to: "$Header$"; for SVN, it is: "Id: filename.st ...").
These version strings will be expanded (by the source code management) to the actual version; for example, in the Array class, you will find something like:
    version_CVS
	^ '$Header: /cvs/stx/stx/libbasic/Array.st,v 1.149 2010/09/21 06:57:51 stefan Exp $'

Notice:
Currently only managers for some repository types are provided, and some are experimental or provided as a guideline for implementors. Well supported are CVS, SVN, P4 (Perforce) and HG (Mercurial). You have to write your own manager class (derived from AbstractSourceCodeManager) if another mechanism is to be used.

If such a version method is present, and the source code manager is enabled, access to the repository is possible from the browser, and done automatically to retrieve a class's source (based on the actual version of that class in the system).

When classes are checked in via the browser, version methods are automatically updated or created (if not already present). In normal operation, the handling of those is transparent, and you can safely forget about it ... it's useful to know about it, anyway.

Method Documentation

Every method should contain (at least) two comments: one at the top, describing what the method does, and one at the end, showing sample uses. Example (from Collection's enumeration protocol):
    select:aBlock
	"return a new collection
	 with all elements from the receiver,
	 for which the argument aBlock evaluates to true"

	|newCollection|

	newCollection := self species new.
	self do:[:each |
	    (aBlock value:each) ifTrue:[
		newCollection add:each
	    ].
	].
	^ newCollection

	"
	 #(1 2 3 4) select:[:e | e odd]
	 (1 to:10) select:[:e | e even]
	"

Variable and Method Naming

Of course, you should give your variables and methods descriptive names. You should do so in any programming language. In Smalltalk, a common trick is to encode the expected type of a variable in the name (which you don't have to do in statically typed languages). For example, names like "originPoint", "lineString" or "collectedNames" make it totally clear what the variables/arguments are used for.

By convention, global variables and class variables should start with an upper case character - other variables and selectors start with a lower case character. You may find a small number of exceptions to these rules, where selectors start with either an underscore or an upper case character. Those are typically to avoid conflicts or to provide compatibility with other programming languages.

Global Variables

Think twice before using globals - usually there is no need for them!
Besides increasing code complexity (by introducing side effects), use of globals may lead to conflicts if packages from different programming teams are merged and both use the same global name. Although the browser offers search functions for uses of globals, you have to manually edit (and think about) the code in this case. Avoid this by banning globals from your code.

In many situations, a global can be eliminated by passing additional method arguments (which may even be an advantage later, offering more possibilities for reuse of a method).
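
As a sketch of that idea (the global LogStream, the instance variable items and both selectors are made up for illustration), a method which writes to a global:

    printSummary
        "write a summary line to the global log stream"

        LogStream nextPutAll:'items: '; nextPutAll:items size printString; cr.

could take the stream as an argument instead:

    printSummaryOn:aStream
        "write a summary line to aStream"

        aStream nextPutAll:'items: '; nextPutAll:items size printString; cr.

The second version no longer depends on the global, and can also be used to write into a string stream or a file.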

Use Class- or Pool Variables Instead

Any need for a globally accessible state can easily be replaced by a private class variable instead and access be provided to other parts via getter/setter methods on the class side.

As an alternative to getters/setters, you may also use class variables (which are visible within a class and its subclasses) or SharedPool variables, which are visible when explicitly named (imported) in a class's definition.
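
A minimal sketch of the class variable approach follows. The class PrintService, the variable DefaultPageSize, the category and the fallback value #a4 are all hypothetical. The class declares a class variable:

    Object subclass:#PrintService
        instanceVariableNames:''
        classVariableNames:'DefaultPageSize'
        poolDictionaries:''
        category:'Examples-Style'

and provides access to it via two methods on the class side:

    defaultPageSize
        "return the page size to be used if none is given explicitly"

        ^ DefaultPageSize ifNil:[#a4]

    defaultPageSize:aSymbol
        "define the page size to be used if none is given explicitly"

        DefaultPageSize := aSymbol

Other parts of the system would then ask "PrintService defaultPageSize" instead of referring to a global.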

Code Comments

You don't need too many comments in your methods, if the code is clean and straightforward. Do not add comments just for the sake of commenting. For example, a comment like:
	sum := sum + 1.         "add one to sum"

is stupid and filling your methods with this kind of "information" actually makes your code less readable.
(You may wonder why this is mentioned here: we have seen departments where code "quality" was measured by counting comments, which ended in people producing the above rubbish - only to make the code checkers happy.)

Also, if you think that a variable needs a comment stating its use, think about changing the variable's name! For example, the following code is a (stupid) example for a bad variable name:
	|c "counter for blabla"|

Why not give the variable that name right away? As in:
	|counterForBlaBla|

And, similarly, if a group of statements needs an explanation as in:
	...
	"read the data from file blabla"
	...code to read data...
	...and so on...
	...
I would suggest that you extract those statements into a separate method, name it according to what they do and invoke that method:
	...
	self readDataFromBlaBlaFile.
	...

Voila - the method's name is just as good as the original comment.

The above actually means that as your code becomes more readable and less cryptic, fewer comments are needed.

However, the above does not mean that your code should be completely uncommented. A lot of public domain Smalltalk code is floating around which is very hard to understand because it does not provide a single informative comment. This often makes it very hard to figure out the overall structure of a framework or application from reading the code. You have to run this code (either in your mind, or for real) to see what is going on.

Especially the collaboration and interface between classes and methods is hard to see without a hint from the original coder. In some code we found, it was not even clear how to start the framework, how to correctly set up server processes and where to find configuration parameters.

That said, use the following rules:

Tutorial Code for Beginners

In some methods (especially in some included example code), you may find comments, which seem to violate the above rules, in that they explain obvious things. For example, comments like: " ... now open the view ..."

The reason for those comments is that we expect these code texts to be read mostly by newcomers - so that the code text is also used and read as an introductory text. Therefore, more than usual is sometimes commented there.

Code Indentation

The question of how code should be indented is a very personal one and the discussion around it often emotional, sometimes almost "religious".

For that reason, we will not give any recommendations for your own code here (but read on, please). Instead, the two most commonly used styles are described in short here. Take the one that you (and your friends) find to be the easiest to read.

Readability is usually better if you do not have to scroll when looking at a method's code. Therefore, methods should be short. On the other hand, don't break up a method into many short methods just for this - find a useful compromise. Having too many too small methods also often hinders readability, as you will then have to navigate through all those tiny pieces of code to find your way.

Many other styles are possible. However, whichever you choose, follow these rules:

Code Format in ST/X

All of the above is valid for your own code.

However, eXept and the author have already gone through the above discussion process, and if you donate your code, you are the newcomer to the team. Don't expect everyone else to adapt to your personal preferences, but adjust to theirs.

Initially, we did not ask for a strict coding guideline, but followed the "do not discuss" rule above. However, over time, a lot of code has found its way into ST/X which is hard to read simply for its different indentation and coding style. Also, some code is harder to read for beginners or needs mental analysis to understand. We do not want this situation to become worse in the future, and will reformat any old code as we encounter it and find time.

We want the ST/X code base to conform to a common indentation style for two reasons:

Also, eXept requests that any code which is to go into the main stream of the ST/X code base (i.e. is to be published as part of the ST/X deployed package) conforms to the following coding style, simply because it is usually us who will have to deal with any error reports and questions about the code later.

The rules are defined to make this "pictorial structure matching" easy. They are:

  1. Kernighan - Ritchie (KR) style indentation
    Sorry to all Smalltalkers and Lispers: 98% of the world is indenting their code this way, and ST/X does so as well. It may well be true that one of the reasons for Lisp and Smalltalk not being mainstream is their indentation style, which is hard to read for beginners (it may well be readable for a guru with 10 years of Smalltalk experience, but...). Very short loops (collection collect, select) or if-for-value can go into a single line.

  2. Indentation level is 4
    2 is not enough to make alignment of closing bracket to opening keyword easy; 8 is too much and usually leads to a need to scroll the editor to the right. Be careful, when editing ST-files in an external editor (which is stupid, anyway): the meaning of a Tab character in the file is "move to the next multiple-of-8 column". This "8" is independent of any of your personal preferences for how much the tab-key should indent.
    No special indentations (such as a ^ statement being indented 2 columns to the left).

  3. Blocks which are NOT loops or conditionals should NOT be indented as such
    Although Smalltalk makes no difference between a block argument which is defining a callback block (as used in GUI widgets) and blocks which are loop-blocks (as in the collection protocol) or conditionals, it definitely DOES make a difference to a programmer's mental model of it.

    Therefore, we demand that ONLY loops and conditionals are indented KR-style. Callbacks, async block arguments and "if-for-value" blocks are not. Blocks which are not executed (i.e. assigned to a variable for reuse) should also not be indented as if they were control blocks.

    Thus, a block assigned to a variable should be indented as:
        var := [ blocks code... ]
    or (for long blocks):
        var :=
    	[
    	   ...
    	   blocks code
    	   ...
    	]
    instead of:
        var := [
           ...
           blocks code
           ...
        ]
    Reason: this would require the programmer to look at and read the first line, to see that it is not a loop or conditional (and maybe even have to scroll). Of course this also means, that loops should not be indented like the above.

  4. Local block variables are declared in an extra line.
    If block locals are written after the opening bracket or block args, that line looks like a boolean-or message sent to some variable. You would have to actually "read" the code in order to understand that it's a variable declaration. If it is on an extra line, this begins with a "|" which is immediately identifiable as a declaration. You would not write method locals after the method selector in the first line - would you?

    So block locals should be declared as:
        collection do:[:el |
    	| local1 local2 |
    	...
        ].
    instead of:
        collection do:[:el | | local1 local2 |
    	...
        ].
    Reason: in the latter version, the first line looks much like an expression, and I have to read it (and cannot use my "graphical structure recognizer").

  5. One empty line separates the first statement from the selector, comment and locals. Period.
    The first statement should not directly follow any of the above. In order to guide the reader where to start reading, it must be separated by an empty line. Separate local variable declarations from the comment also by an empty line. For very simple getter/setter or forwarder methods, which have no comment, you may omit that empty line.

    There is no need for an additional empty line between the method's selector and argument declaration and the method comment or locals declaration. Especially, your code should NOT put an empty line BEFORE any local variable declarations AND at the same time OMIT the one before the first statement, as in:
        foo: arg1 bar: arg2
    
    	|local1 local2|
    	statement1.
    	statement2.
    	...
    instead, write:
        foo: arg1 bar: arg2
    	|local1 local2|
    
    	statement1.
    	statement2.
    	...
    The reason is that in the first version above, you have to look at and read the local variable declarations to find the first line of real code, whereas in the latter example, the layout already leads you there. Be reminded that the syntax highlighter usually emphasizes the first line, so your code would be presented as:

        foo: arg1 bar: arg2
    	|local1 local2|
    
    	statement1.
    	statement2.
    	...
    
  6. Method comment
    Every method, except for simple getters/setters MUST have a method comment at the top (between selector and local variable declaration or code). The comment is delimited at the bottom by an empty line from the real start of the code or the declaration.

  7. Sample usage method comment
    If there is a comment which demonstrates the example use (which is a very welcome thing to have), place it at the END, not at the beginning.
    We understand that it is convenient during development of new code to have it at the beginning (where it is easier to select and doIt for a hacker), but it disturbs the flow of reading for others later. Usually, when trying to understand unknown code, it will be read top to bottom, and a usage example at the top may force one to scroll down to reach the real code.

    Also, tooltip messages and code completion hints are extracted from a method's first comment, and thus may get quite ugly and long if the usage example is at the top.

  8. When wrapping long lines, selectors are aligned to the left
    When splitting a long message send among multiple lines, the followup lines are aligned at the left.
    Thus write:
        ...
        Dialog
    	request: 'some string'
    	initialAnswer: 'blablabla'
    	initialSelection: (4 to:6)
    NOT:
        ...
        Dialog
    		 request: 'some string'
    	   initialAnswer: 'blablabla'
    	initialSelection: (4 to:6)

    The reason is that the right-aligning version makes it harder to see which keyword parts belong to the send and where it ends.

  9. Wrapping long lines: followup and/or in conditionals
    When splitting a long conditional into multiple lines, the break should be done before an and/or keyword, and the remaining code be indented one level to the right, with the actual if-keyword indented at the original column, so it is not confused with the code of the if-body (see below).

  10. Avoid complicated boolean expressions
    Complicated combinations of and:/or:/not should be avoided. If possible, use nested ifs or guards. Sorry to mathematicians: most programmers are not, and it is often not obvious what the outcome of a complicated and-not-or-not combination will be.
    Avoid double negation ("notEmpty not" or "foo not ifFalse"); for each of them, there is a corresponding positive condition (i.e. use "isEmpty" or "foo ifTrue" instead).

  11. Class documentation
    Every class should have a documentation method consisting of a comment only. The class documentation should also contain hints as to how the class interacts with other parts of the system, and how the class/framework is used. For frameworks, the project-definition class is a good place to put framework documentation.

  12. Examples
    A convenient place to add examples (especially for complicated frameworks, which need configuration and/or special instantiated objects) is the "example" method on the class side. This may consist of comments only or contain real sample code. Unit tests alone are not a replacement for an example method which shows how a complex framework (like an application or server process) is started, because these unit tests are often badly commented and it is often difficult to decide which test is testing internal mechanisms as opposed to the outside API of a framework.

Examples for "dos" and "don'ts"

if "for value":
Do NOT write:
    foo := bla ifTrue:[
	5
    ] ifFalse:[
	6
    ].

because it hides the "used as value" aspect of the code,
and makes the state-change (i.e. assignment) less visible

Instead, write:
    foo := bla ifTrue:[5] ifFalse:[6].

For long expressions, write:

    foo := bla
	    ifTrue:[...]
	    ifFalse:[...].

or, if very long:

    foo := bla
	    ifTrue:[
		...
	    ]
	    ifFalse:[
		...
	    ].

assigning blocks:
Do not write:
    fooAction := [
	...
	statements
	...
    ].

because it looks like a loop,
and makes the state-change (assignment) less obvious

Instead, write:
    fooAction :=
	[
	    ...
	    statements
	    ...
	].

long conditions:
Do not write:
    (condition1 and:
	[condition2 and:
	    [condition3]])
		ifTrue:[
		    statements
		]
Instead, write:
    (condition1
	and:[ condition2
	and:[ condition3 ]]
    ) ifTrue:[
	statements
    ]

block locals:
Do not write:
    expr do:[:arg| |a b c|
	statements]
Instead, write:
    expr do:[:arg |
	|a b c|

	statements
    ].
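
The rules about double negation and complicated boolean expressions can be illustrated in the same way. The following is only a sketch; the variables collection, name and maxLength are made up for illustration.

double negation:
Do not write:
    collection notEmpty not ifTrue:[
        ^ nil
    ].
Instead, write:
    collection isEmpty ifTrue:[
        ^ nil
    ].

complicated conditions:
Do not write:
    ((name notNil and:[name notEmpty]) not or:[name size > maxLength]) ifTrue:[
        ^ false
    ].
    ^ true
Instead, use guards:
    name isNil ifTrue:[^ false].
    name isEmpty ifTrue:[^ false].
    name size > maxLength ifTrue:[^ false].
    ^ true

Both versions of the last example return the same value for every input, but in the second one each condition can be read and checked on its own.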

Avoid Obfuscated Code

The following is an incomplete list of recommendations:

Things you should not do


Copyright © 1995 Claus Gittinger, all rights reserved

<cg at exept.de>

Doc $Revision: 1.45 $ $Date: 2021/06/11 19:24:02 $