First learn computer science and all the theory.
Next develop a programming style.
Then forget all that and just hack.
(George Carrette)
Coding Style used in Smalltalk/X Classes
Contents
This document describes the coding style and conventions
used in Smalltalk/X's class library.
The author is aware of the fact, that coding style is a very personal
matter and should not be enforced by dictators.
However, in this system, when you add code to be published and read by others,
you are not alone. Therefore, it is useful to follow some rules, to enable other programmers
an easier entry into the system. Also, there exist tools which extract
useful information and can format neat documents if you follow those rules
in your classes. Thus, when it's about time to deliver documentation
on your project, a whole bunch of work is done for free.
Experienced Smalltalk programmers may especially look into this document, because
the ST/X coding style is slightly different from what other Smalltalkers consider to
be "readable code".
If you have any suggestions or additions on this theme, let me know about it.
Finally: a lot of un-maintanability comes from programmers who are lazy typers -
short or obfuscated class-, method- and variable names.
And also: often some comment on how to get a framework started or used.
Be reminded again, that code is usually written only once but read and possibly modified
many many times over its lifetime.
The time you might save by ommiting documentatin while writing will be spent manyfold by others
(and possibly yourself) later, when you have to decipher the code in a year or so...
I admit that some of Smalltalk/X's code does not follow those guidelines:
some code is very old and the author(s) of the code have matured as we all do.
Bad code is and will be refactored or reformatted, whenever we encounter it.
All program code must be written in english.
Motivation:
Some time ago, I have been shocked by getting a Smalltalk program which was
written in another (national) language. All method, class and argument names where
completely unreadable to me and everyone else in the team.
And I could only hardly understand, what the program was doing.
I still occasionally receive individual methods with local variables and comments
in another language, and it is very hard at times to understand it.
Therefore this guideline forces everyone to use a language which is understood by everyone.
I request programs to be written in english, using english class names, english selectors,
english comments and variable names.
As every programmer - even from the far east - understands english,
but not vice versa, this should make communication easier.
This has nothing to do with 'western cultural imperialism',
or an 'egocentric view of the world',
or ignoring the culture and language of others - especially minorities,
as some (pseudo liberals) might complain.
It is purely practical: it should support the comprehensability of programs among
a worldwide programmer community.
Even the authors of the ST/X system are not a native english speakers,
so they too have to be careful at times.
By the way: the reason for not supporting non-Latin1 characters in variable names,
class names and message selectors is part of that enforcing: by not being able to
use other characters, people from east europe, the near or far east are less
motivated to use non-english language in the program.
There is nothing to prevent you from using non-Latin characters in string constants,
the UI or any other non-code areas (although it is highly recommended to use plain english
in UI-strings and provide translation files for other languages).
In Smalltalk/X, every class contains a method category called
"documentation" in the class protocol.
You will find at least two methods named
"version*
" and "documentation
" there:
The "version*" Methods
The "version
" methods are exclusively maintained by the
source code manager. They tell the system where to find a classes source code
in the repository. A class may be stored in multiple repositories,
so occasionally you may find multiple version methods in a class.
You should never ever touch or remove them. Details are described below.
If you report an error (to eXept),
this string can be used to identify the exact version of the class.
The "documentation" Method
The "documentation
" method's comment should describe the class,
its uses and (if of public interest) its instance and class variables.
In many classes, you will find an additional #examples
method.
This should contain a comment giving typical uses
(often ready to select & doIt).
These methods consist of comments only; they are not meant to be executed.
(actually, if evaluated, they will return the receiver;
since empty methods are semantically equivalent to a "^ self
" method).
In contrast to other Smalltalk systems, which keep the class comment in the "comment"
instance variable, ST/X uses the method comment.
The reason is simply that the comment resides in memory,
whereas all method sources are typically only present in external files.
So, this scheme saves a lot of memory, if long and descriptive comments are present,
and there is no 'performance' or 'memory usage' argument against them -
which is exactly what we want.
Another reason is that this makes it easier to put structure into multiple comments;
for example, one for internals, another for examples, etc.
The
SystemBrowser
automatically shows the documentation text
(found either in the "documentation
" method or in the class comment)
whenever a class is selected.
Thus, to be nice to other people browsing through the system,
please add a short description of what your class is about in that method.
Also, the document viewer can extract a classes' documentation method's
text and present it cutely formatted - you get your documentation almost
for free, if you stick to these conventions! This is similar to what JavaDoc does.
Try it by switching to any class (say: OrderedCollection), and select
the browser's "Class-Documentation-HTML-Documentation" menu item.
Notice, that the documentation may even include so called "executable code examples";
these are code fragments embedded in "[exBegin]
" ... "[exEnd]
" brackets.
The documentation viewer will extract those and add an extra examples section at the
end, where you can even click on those examples, to execute it directly in the viewer.
As already mentioned: do not worry about memory usage when creating documentation methods - simple methods
which return self (as empty methods do) all share a common piece of code,
so there will NOT be thousands of empty methods filling
up your memory.
(to be exact: there is some little overhead per method created
by the method object itself - not by the method's code).
However, for production code,
stc provides a command line argument, to skip
all methods in the documentation category;
to allow building more compact class libraries).
BTW: from the author's experience, you should not delay documentation too much.
Write them down as soon as possible - otherwise you may not find the time to
do so later - or you may simply forget to do it.
Also, keep in mind that it may
take more time to add those comments later, since you may have to reflect about
what is going on. From our experience, the later the documentation is written
in a project, the higher is its cost.
If you use one of the Smalltalk/X sourcecode managers,
every class will contain a "version*
" method, which
returns the classes version string. The version method is maintained by the sourcecode manager;
do NEVER manually change these strings (unless you know exactly what you are doing).
The actual format of that string is specific to the underlying source code mechanism
(for CVS, it looks similar to: "$Header
$"
;
for SVN, it is: "Id: filename.st ..."
).
These version strings will be expanded (by the source code management)
to the actual version;
for example, in the Array
class, you will find something like:
version_CVS
^ '$
Header: /cvs/stx/stx/libbasic/Array.st,v 1.149 2010/09/21 06:57:51 stefan Exp $'
Notice:
Currently only managers for some repository types are provided, and some are
experimental or provided as guideline for imlementors. Well supported are CVS,
SVN, P4 (Perforce) and HG (Mercurial).
You have to write your own manager class (derived from AbstractSourceCodeManager
)
if another mechanism is to be used.
If such a version method is present, and the source code manager is enabled,
access to the repository is possible from the browser,
and done automatically to retrieve a classes' source
(based on the actual version of that class in the system).
When classes are checked in via the browser, version methods
are automatically updated or created (if not already present).
In normal operation, the handling of those is transparent,
and you can safely forget about it ... it's useful to know about it, anyway.
Every method should contain (at least) two comments:
- a comment giving a short description
this comment should be the very first comment in the method, and should be placed
between the selector & argument specification and the local variables declaration
or first statement.
It should give a short description of what this method does, and what the
arguments are to be used for.
Please, use this type of comment, since ST/X provides special printout
features, which allow you to create printed documentation automatically,
based on these comments (similar to javadoc).
See the
SystemBrowsers printOut protocol
functions - and the
online class documentation.
- a comment giving example uses of this method
this comment (or multiple comments) should be at the very end of the method,
and give some example(s) of how the method is used.
Also, as required, failure examples should be given.
This comment will allow select & printIt from the browser, without a need to
type in any expressions.
Thus providing simple "HowTo examples" to readers of your code.
Please put this comment at the end, not right after the selector at the top.
The reason is that those examples may hinder easy code reading - especially if they
are long. I admit, that during development, programmers tend to hate the need to scroll
down, but other readers will later appreciate the improved readability.
Example (from Collection's enumeration protocol):
select:aBlock
"return a new collection
with all elements from the receiver,
for which the argument aBlock evaluates to true"
|newCollection|
newCollection := self species new.
self do:[:each |
(aBlock value:each) ifTrue:[
newCollection add:each
].
].
^ newCollection
"
#(1 2 3 4) select:[:e | e odd]
(1 to:10) select:[:e | e even]
"
|
Of course, you should give your variables and methods descriptive names.
You should do so in any programming language.
In Smalltalk, a common trick is to encode the expected type of a variable
in the name
(which you don't have to in static typed languages).
For example, names like "originPoint", "lineString"
or "collectedNames" make it totally clear,
what the variables/arguments are used for.
By convention, global variables and class variables should start with
an upper case character - other variables and selectors start with a lower case
character.
You may find a small number of exceptions to these rules, where selectors
start with either an underscore or an upper case character. Those are typically to
avoid conflicts or to provide compatibility with other programming languages.
Think twice before using globals - usually there is no need for them!
Beside increasing code complexity (by introducing side effects),
use of globals may lead to conflicts if packages from different
programming teams are merged and both use the same global name.
Although the browser offers search functions for uses of globals,
you have to manually edit (and think about) the code in this case.
Avoid this by banning globals from your code.
In many situations, a global can be eliminated by
by passing additional method arguments
(which may even be an advantage later,
offering more possibilities for reuse of a method).
Use Class- or Pool Variables Instead
Any need for a globally accessible state can easily be replaced by a private class variable instead
and access be provided to other parts via getter/setter methods on the class side.
As an alternative to getter/setters, you may also use Class variables (which are visible within
a class and subclasses) or SharedPool variables, which are visible when explicitely named (imported) in
a classes definition.
You don't need too many comments in your methods,
if the code is clean and straight forward.
Do not add comments just for the comment.
For example, a comment like:
sum := sum + 1. "add one to sum"
|
is stupid and filling your methods with this kind of "information"
actually makes your code less readable.
(You may wonder why this is mentioned here;
we have seen departments where code "quality" was measured by counting comments,
which ended in people doing above rubbish - only to make the codecheckers happy.)
Also, if you think that a variable needs a comment stating its use,
think about changing the variable's name!
For example, the following code is a (stupid) example for a bad variable name:
why not give the variable that name right away? As in:
And, similar, if a group of statements need an explanation as in:
...
"read the data from file blabla"
...code to read data...
...and so on...
...
|
I would suggest that you extract those statements into a separate method,
name then according to what they do and invoke that method:
...
self readDataFromBlaBlaFile.
...
|
Voila - the method's name is just as good as the original comment.
The above actually means, that as your code becomes more & more readable and less-cryptic,
less comments are needed.
However, the above does not mean that your code should be completely uncommented.
A lot of public domain Smalltalk code is floating around, which is very hard to understand
in not providing a single informative comment.
This often makes it very hard to figure out the overall structure of a framework
or application from reading the code. You have to run this code (either in your mind, or for real) to
see what is going on.
Especially the collaboration and interface between classes and methods is hard
to see without a hint from the original coder. In some code we found, it was not even clear, how
to start the framework, how to correctly setup server processes and where to find configuration parameters.
That said, use the following rules:
-
Give a hint on how to startup and shutdown an application/server in a class comment or documentation method.
And/or add example methods, which setup the objects for proper use and startup the application or
provide code examples on how to use the framework.
-
Provide a set of unit-tests, to describe the public protocol.
-
If a group of objects collaborate, write a class comment which gives a rough overview.
- If you use special tricks or uncommon constructs,
you should add a comment describing what is going on -
for yourself and for others.
- still, every method should contain a comment describing what it does,
and possibly a short doIt example to try it (as described above).
In some methods (especially in some included example code),
you may find comments, which seem to violate the above rules,
in that they explain obvious things. For example, comments like:
" ... now open the view ..."
The reason for those comments is that we expect these code-texts to be read mostly by newcomers -
so that the code text is also used and read as an introductionary text.
Therefore, more than usual is sometimes commented there.
The question of how code should be indented is a very personal one
and the discussion around it often emotional, sometimes almost "religious".
For that reason, we will not give any recommendations for your own code here (but read on, please).
Instead, the two most commonly used styles are described in short here.
Take the one that you (and your friends) find to be the easiest to read.
- Lisp-style indent
In this, all closing parenthesises and brackets are lined up at the end,
for example:
foo
"this method performs some fooBar.
Sometimes even baz is done"
doingBar
ifTrue:[
self fooBar.
[doingBaz]
whileTrue:[
self baz].
self moreFooBar.
1 to:10 do:[:index |
1 to:10 do:[:index2 |
self doMore]]]
ifFalse:[
...
and so on]
This style of indentation is seen often in ST-80 code,
and some ST-80 formatters automatically produce output in this kind.
Some variations are possible, for example, you can put the ifTrue:
and whileTrue:
right behind the receiver (as below).
- C style indent (used by the author)
This style is roughly based on the C-indentation style used by
Kernigham & Ritchie.
for example:
foo
"this method performs some fooBar.
Sometimes even baz is done"
doingBar ifTrue:[
self fooBar.
[doingBaz] whileTrue:[
self baz
].
self moreFooBar.
1 to:10 do:[:index |
1 to:10 do:[:index2 |
self doMore
]
]
] ifFalse:[
...
and so on
]
Also, variations are possible; for example, the opening brackets of blocks
can be put onto a separate line, as in:
foo
"this method performs some fooBar.
Sometimes even baz is done"
doingBar ifTrue:
[
self fooBar
[doingBaz] whileTrue:
[
self baz
].
self moreFooBar.
1 to:10 do:
[:index |
1 to:10 do:
[:index2 |
self doMore
]
]
]
ifFalse:
[
...
and so on
]
However, this seems to spread the code for even small methods quite a lot.
To many programmers, this makes the readability worse.
A simple rule: try to make your methods fit onto a page or screenful
(of course: not by putting all into oneliners ;-)).
Readability is usually better if you do not have to scroll when looking at a method's code.
Therefore, methods should be short.
On the other hand, don't break up a method into many short methods just
for this - find a useful compromise.
Having too many too small methods also often hinders readability, as you will
then have to navigate through all those tiny pieces of code to find your way.
Many other styles are possible. However, whichever you choose, follow these rules:
- use the style consistent
try to not mix styles.
Although, there are some situations, when you have to break out
of your indentation (for example, if lines become too long) you should
stick to a general style.
Also, for very simple if- or while-blocks, you may decide to put the block
right behind the selector - within the same line.
- use indentation
Never write your code in a Fortran or Basic-style, without any
indentation. It makes your code almost impossible to read.
Don't think that you will never again look at some method and therefore don't
need to indent and/or comment it - experience shows: you or someone else will.
Also, some people think that it saves development time
when writing code without indentation and formatting initially,
and only format later, when the method is finished.
That is not the case, as from our experience,
you will actually slow down development time when the code is unreadable.
Cases are very very rare, that code runs without any debugging,
and you will need more time to read the code in the debugger then.
- the style should show the control flow
If you use a style which still requires comments like "end of while"
or "end if this or that if", rethink about it. After all, indentation should
express exactly that; without the need for further explanations.
- don't argue about styles
I hate people arguing about indentation styles.
Let others use whatever they like - just as you want them to let you do your
work as you like.
However, when working in a group, agree on a common style.
Discuss it in the group before you start writing code. If you join an existing
group, adapt to their style.
All of the above is valid for your own code.
However,
eXept and the author already went throught the above discussion process,
and if you donate your code, you are the newcomer to the team.
Don't expect everyone else to adopt to your personal preferences,
but adjust to theirs.
Initially, we did not ask for a strict coding guideline, but followed the "do not discuss" rule above.
However, over time, a lot of code has found its way into ST/X which is hard to read simply
for its different indentation and coding style.
Also, some code is harder to read for beginners
or needs mental analysis to understand.
We do not want this situation to become worse in the future,
and will reformat any old code as we encounter it and find time.
We want the ST/X code base to conform to a common indentation style for two
reasons:
- newcomers should not be confused by various different styles
- to support graphical structure recognition
It is a matter of fact, that the human brain is able to perform quite a bit of preprocessing on
the unconcious level, based on the pure layout and geometric arrangement of code.
For example, loop structures should be visible as such, without even a need to read the actual
message names - simply by how the code is arranged.
Newcomers and infrequent programmers may not experience this, but to a long-time hacker,
used to his style of indentation, a lot of pattern matching happens on this level.
Also, eXept requests that any code which is to go into the main stream of the ST/X code
base (i.e. is to be published as part of the ST/X deployed package) conforms to the following coding style,
simply because it is usually us who will have to deal with any error reports
and questions about the code later.
The rules are defined to make this "pictoral structure matching" easy. They are:
- Kernigham - Ritchie (KR) style indentation
Sorry to all Smalltalkers and Lispers: 98% of the world is indenting their code this way, and ST/X
does so as well. It may well be true, that some of the reasons for Lisp and Smalltalk not being mainstream was due to their
indentation style, which is hard to read for beginners (it may well be for a guru with 10 years of Smalltalk experience, but...).
Very short loops (collection collect, select) or if-for-value can go into a single line.
- Indentation level is 4
2 is not enough to make alignment of closing bracket to opening keyword easy;
8 is too much and usually leads to a need to scroll the editor to the right.
Be careful, when editing ST-files in an external editor (which is stupid, anyway): the meaning of a Tab character
in the file is "move to the next multiple-of-8 column". This "8" is independent of any of your personal preferences
for how much the tab-key should indent.
No special indentations (such as a ^ statement being indented 2 colums to the left).
- Blocks which are NOT loops or conditionals should NOT be indented as such
Although Smalltalk makes no difference between a block argument which is defining a callback block
(as used in GUI widgets) and blocks which are loop-blocks (as in the collection protocol) or conditionals,
it definitely DOES make a difference to a programmer's mental model of it.
Therefore, we demand that ONLY loops and conditionals are indent KR-style.
Callbacks, async block arguments and "if-for-value" blocks are not.
Blocks which are not executed (i.e. assigned to a variable
for reuse) should also not be indented as if they were control blocks.
Thus, a block assigned to a variable should be indented as:
var := [ blocks code... ]
|
or (for long blocks):
var :=
[
...
blocks code
...
]
|
instead of:
var := [
...
blocks code
...
]
|
Reason: this would require the programmer to look at and read the first line,
to see that it is not a loop or conditional (and maybe even have to scroll).
Of course this also means, that loops should not be indented like the above.
- Local block variables are declared in an extra line.
If block locals are written after the opening bracket or block args, that line looks like a boolean-or message
sent to some variable.
You would have to actually "read" the code, in order to understand, that it's a variable declaration.
If it is on an extra line, this begins with a "|" which is immediately identifyable as a declaration.
You would not write method locals after the method selector in the first line - would you?
So block locals should be declared as:
collection do:[:el |
| local1 local2 |
...
].
|
instead of:
collection do:[:el | | local1 local2 |
...
].
|
Reason: because the first looks much like an expression, and I have to read it
(and cannot using my "graphical structure recognizer")
- One empty line separates the first statement from the selector, comment and locals. Period.
The first statement should not directly follow any of the above. In order to guide the reader where to
start reading, it must be separated by an empty line.
Separate local variable declarations from the comment also by an empty line.
For very simple getter/setter or forwarder methods, which have no comment,
you may ommit that empty line.
There is no need for an additional empty line between the method's selector and argument declaration
and the method comment or locals declaration.
Especially, your code should NOT put an empty line BEFORE any local variable
declarations AND at the same time OMIT the one before the first statement, as in:
foo: arg1 bar: arg2
|local1 local2|
statement1.
statement2.
...
|
instead, write:
foo: arg1 bar: arg2
|local1 local2|
statement1.
statement2.
...
|
the reason is that in the first version above, you have to look at and read the local variable
declarations to find the first line of real code. Whereas in the later example,
the layout already leads you there. Be reminded that the syntax highlighter
usually emphasizes the first line, so your code would be presented as:
foo: arg1 bar: arg2
|local1 local2|
statement1.
statement2.
...
- Method comment
Every method, except for simple getters/setters MUST have a method comment at the top
(between selector and local variable declaration or code). The comment is delimited at the bottom by an empty line
from the real start of the code or the declaration.
- Sample usage method comment
If there is a comment which demonstrates the example use (which is a very welcome thing to have),
place it at the END, not at the beginning.
We understand, that it is convenient during development of new code to have it at the beginning (where it is
easier to select and doIt for a hacker), it disturbs the flow of reading for others later.
Usually, when trying to understand unknown code, it will be read top to bottom and the usage examples may
force one to scroll down.
Also, tooltip messages and code completion hints are extracted from a method's
first comment, and thus may get quite ugly and long if the usage example is at the top.
- When wrapping long lines, selectors are aligned to the left
When splitting a long message send among multiple lines,
the followup lines are aligned at the left.
Thus write:
...
Dialog
request: 'some string'
initialAnswer: 'blablabla'
initialSelection: (4 to:6)
|
NOT:
...
Dialog
request: 'some string'
initialAnswer: 'blablabla'
initialSelection: (4 to:6)
|
The reason is that the right-aligning version makes it harder to see which keyword
parts belong to the send and where it ends.
- Wrapping long lines: followup and/or in conditionals
When splitting a long conditional into multiple lines, the break should be done
before an and/or keyword, and the remaining code be indented one level to the right,
with the actual if-keyword indented at the original column,
so it is nor confused with the code of the if-body (see below).
- Avoid complicated boolean expressions
Complicated combinations of and:/or:/not should be avoided.
If possible, use nested ifs or guards. Sorry to mathematicans: most programmers are not,
and it is often not obvious what the outcome of a complicated and-not-or-not combination will be.
Avoid double negation ("notEmpty not
" or "foo not ifFalse
");
for each of them, there is a corresponding positive condition
(i.e. use "isEmpty
" instead or "foo ifTrue
")
- Class documentation
Every class should have a documentation method consiting of a comment-only method.
The class documentation should also contain hints as how the class interacts with other parts of the system,
and how the class/framework is used. For frameworks, the project-definition class is a good place to put
framework documentation.
- Examples
A convenient place to add examples (especially for complicated frameworks, which need configuration and/or
special instantiated objects) is the "example" method on the class side.
This may consist of comments only or contain real sample code. Unit tests alone are not a relacement for an
example method which shows how a complex framework (like an application or server process) is started, because these
unit tests are often badly commented and it is often difficult to decide which test is testing internal mechanisms as
opposed to the outside api of a framework.
if "for value": |
Do NOT write:
foo := bla ifTrue:[
5
] ifFalse:[
6
].
because it hides the "used as value" aspect of the code,
and makes the state-change (i.e. assignment) less visible
|
Instead, write:
foo := bla ifTrue:[5] ifFalse:[6].
For long expressions, write:
foo := bla
ifTrue:[...]
ifFalse:[...].
or, if very long:
foo := bla
ifTrue:[
...
]
ifFalse:[
...
].
|
assigning blocks: |
Do not write:
fooAction := [
...
statements
...
].
because it looks like a loop,
and makes the state-change (assignment) less obvious |
Instead, write:
fooAction :=
[
...
statements
...
].
|
long conditions: |
Do not write:
(condition1 and:
[condition2 and:
[condition3]])
ifTrue:[
statements
]
|
Instead, write:
(condition1
and:[ condition2
and:[ condition3 ]]
) ifTrue:[
statements
]
|
block locals |
Do not write:
expr do:[:arg| |a b c|
statements]
|
Instead, write:
expr do:[:arg |
|a b c|
statements
].
|
The following is an incomplete list of recommendations:
- ommitting a "^ self" to save some typing
Often, some alternative must return from a method after some other method is invoked.
A typical example are guards, as in:
...
foo ifTrue:[
^ self doSomethingForFoo.
].
bar ifTrue:[
^ self doSomethingForBar.
].
...etc...
...
|
the problem with the above code is that from reading, you do
not know whether the return value from "doSomethingForFoo" is really
wanted here, or if it's simply a lazy typer, and the real return value
is "self". So the reader has to navigate to implementors of "doSomethingForFoo"
and look for the return code. Even worse, if he finds different return values
in the different messages (such one returning self, another returning a boolean),
he is really in trouble to determine the correct expectations when he wants to subclass
or modify one of the doSomething methods.
Except for the cases when the return value really counts,
you should always write:
...
foo ifTrue:[
self doSomethingForFoo.
^ self
].
bar ifTrue:[
self doSomethingForBar.
^ self
].
...etc...
...
|
It has absolutely no implications on the execution speed or code size,
but makes it clear, that you are not interested in the answer from
"doSomethingForFoo".
- reusing variables
Do not reuse a local variable - a good style is to assign only once (functional style).
Give your variables useful names. The code is written only once, but usually read
many times by yourself and others later.
- use "contains:", "includes:" etc. instead of "do:"-loops whenever possible
These methods not only save typing - they are also documenting what is done.
Do not write code like:
myMethod
someCollection
do:[:eachElement |
someCondition on eachElement
ifTrue:[^ true]
].
^ false.
|
instead, write it as:
myMethod
^ someCollection
contains:[:anElement | anElement someCondition].
|
Just read it aloud and you know why.
- use "contains:", "includes:" etc. instead of "detect:ifNone:" when checking for the presence of an element
The same argument as above applies here; often, "detect:ifNone:
"
is used to check for the presence of an element - not to retrieve that element.
This should be avoided;
instead of:
(someCollection
detect:[:el | someCondition on el]
ifNone:nil) notNil
ifTrue:[
...
].
|
write it as:
(someCollection contains:[:el | someCondition on el])
ifTrue:[
...
]
|
Again, if in doubt, read the code loud for yourself.
As an extreme example, look at:
(coll detect: [:b| b == fooBlock] ifNone: [nil]) isNil
ifFalse:[ ... ]
|
which is a combination of "double-negation" confusion AND "detect:-to-test-for-inclusion" confusion.
Every reader of that code (even an experienced smalltalker) has to read and think about
this carefully, only to figure out, that it's nothing more than:
(coll includesIdentical:fooBlock)
ifTrue:[ ... ]
|
By the way: you can search for such code fragments using the SystemBrowser's "code-search" facility.
The above is found with the pattern:
(`@e1 detect: `@b1 ifNone: [nil]) isNil ifFalse: `@b2
|
Here is another example, which uses two nested loops:
| list rslt |
list := someList asOrderedCollection.
list do: [:p|
(p blocks detect: [:b| b == someObjectToSearchFor] ifNone: [nil]) isNil
ifFalse: [foundElement := p]
].
foundElement isNil ifTrue:[ ^ false ].
^ true
|
let us rewrite this in smaller steps to a more readable version;
the inner loop's statement is the one we saw in the previous example, so we can rewrite it to:
(p blocks contains: [:b| b == someObjectToSearchFor])
ifTrue: [foundElement := p]
|
which is the same as:
(p blocks includesIdentical:someObjectToSearchFor)
ifTrue: [foundElement := p]
|
the outer is simply another search, and the "do" can be rewritten to:
foundElement := list
detect:[:p | (p blocks includesIdentical:someObjectToSearchFor) ]
ifNone:[nil].
|
as the foundElement is not needed, the overall code can also be:
| list |
list := someList asOrderedCollection.
^ list
contains:[:p |
(p blocks includesIdentical:someObjectToSearchFor) ]
|
the last question we should ask is "why do we need a copy of the original collection as OrderedCollection for the search ?".
So we can remove this as well, get rid of the temporary variable, and get the final code:
^ someList
contains:[:p |
(p blocks includesIdentical:someObjectToSearchFor) ]
|
which is both faster and requires less memory (due to the removed list-copy).
If the original collection is one of the hash-based collections (i.e. a Set),
the includes-operation is probably even *much* faster ( O(1) instead of O(n) ).
And it certainly is much more readable!
- avoid using symbols as enum-values
Very often, symbols are used as enumerated values,
as in:
...
state == #foo ifTrue:[
...doSomethingForFoo...
].
...
state == #bar ifTrue:[
...doSomethingForBar...
].
...
|
Beside the fact that this code is not very object oriented and
should probably be done somehow using a message send,
this code is very sensible to typing errors, as neither the
compiler, nor the runtime system has a chance to check any such mistakes.
For example, if you mistype #bar as #Bar at some place, the corresponding if-statements
will never be executed and that kind of bug is hard to find.
There are multiple ways to make the above more secure:
- Use class variables instead of symbols
Some of the danger can be taken out of the code, by removing the possibility for
typing errors.
Add a few class variables, for example named "FooState" and "BarState",
and initialize them in the classes' initialize method as:
initialize
FooState := #FooState.
BarState := #BarState.
|
then, use those variables in the case code:
...
state == FooState ifTrue:[
...doSomethingForFoo...
].
...
state == BarState ifTrue:[
...doSomethingForBar...
].
...
|
Codewise, this does not make much of a difference. Execution speed is also not affected.
However, you will be notified when you mistype one of the state names (by the compiler),
and get an "unknown identifier" error.
- Use pool variables
This is pretty much the same solution as above, and should be used
if the state values are needed across multiple non-inheriting classes.
An alternative is to use access methods (on the class side) to
fetch the states from the other class. Old smalltalkers seem to prefer that over
class/pool variables - mostly because the early browsers only provided a convenient
"senders" search. With modern browsers, you can easily find references to class- and
pool variables, so that argument is no longer valid.
- Use state-objects
The above being first a step towards maintainability,
a much better solution, which also makes later extensions much simpler is
to provide separate state objects (state classes) and move functionality into
them.
This will eventually remove all of these switches and is generally called "object oriented".
So as a first step, create some StateClass, and two subclasses named FooState and
BarState (these can and often should be private classes).
If there is any need to check for the state explicitely,
use testing protocol:
...
state isFooState ifTrue:[
...doSomethingForFoo...
].
...
state isBarState ifTrue:[
...doSomethingForBar...
].
...
|
However, once this is done,
the obvious next step is to move the above actions into each state object class,
by definining doX there,
and get rid of the if-condtional, as in:
if required, pass the receiver, as in:
...
state doXFor:self.
...
|
the BIG advantage of this becomes apparent, when a new state is added.
BTW: this is a well-known pattern in OO-programming.
Notice, that the Smalltalk language offers the added convenience that the class itself
can be used as a state object (which means that you don't have to care for any singleton, instance creation,
identity or other problems).
In most other so called "OO"-languages, this is not possible; either because there is no such thing as
a class available at runtime, or because static (class-) methods are not inherited the way that
instance methods are.
Also notice, that in Smalltalk/X you can define the states as
private classes, to keep the number of visible public classes smaller.
- Message chains which follow the object structure
Do not hardwire knowledge about object relations into you program;
a typical example (which is very bad) is the following extract from
a piece of code found in the manchester archive:
...
aReadStream ioConnection input readWait.
^ aReadStream atEnd not
...
|
obviously, the readStream object is some kind of Socket-accessing
thingy, and the programmer wants to wait for something to arrive.
The bad thing with the above is that it hardwires the knowledge of
the (internal) layout of that socket-accessing thingy into the program.
(I.e. if the underlying stream implementation is ever changed to not use
an ioConnection-slot,
or that ioConnection-object no longer has an input-slot,
or you want to port to another Smalltalk,
your code is doomed.
Instead of the above, place the forwarding code into the message receiver,
so that the readStream instance gets a readWait method, which knows internally
that there is an ioConnection (if there is).
The same goes for the ioConnection object, which should forward a readWait message
to its input.
Thus the abve should be written as:
...
aReadStream readWait.
^ aReadStream atEnd not
...
|
with a forearder in the readStream:
ReadStream >> readWait
ioConnection readWait.
|
and the ioConnection:
IOConnection >> readWait
input readWait.
|
If required, add those forwarders as extensions.
Another example is:
...
widget wrapper wrapper wrapper controller enable.
...
|
here, the widget's internal hierarchy is reflected in the access-path,
and the code will have a very hard time if the wrapper hierarchy ever changes
- either in the framework classes or by the application programmer.
These code fragments were specially written for a ParcPlace Smalltalk system,
so they sure make problems when ported to ST/X, Squeak or another Smalltalk dialect.
However, even ParcPlace itself will not be able to change any internals (optimize or
get rid of the wrappers),
without affecting this program.
Thus, there is a chance that the above code will not work without change in the next
ParcPlace version. And don't even think of porting it without change to another Smalltalk dialect.
For the socket wait example above,
a better solution was to provide a #readWait method in the class of the socket-accessing
object, and delegate things there.
For the wrapper example,
to provide an enable-method in the widget, which delegates it to some controller.
This hides the object structure and allows both ParcPlace to change things,
and allows ST/X's Socket class to behave like other SocketAccessors.
It also makes it trivial to introduce a facade class, which forwards messages
as required.
Of course, the above examples do not only apply to system classes -
the same is very often found in user code - especially in GUI code.
Copyright © 1995 Claus Gittinger, all rights reserved
<cg at exept.de>
Doc $Revision: 1.45 $ $Date: 2021/06/11 19:24:02 $