For example, you can search for assignments of the value 1234 to a variable named "myVar" with the following pattern:
myVar := 1234
This will find the assignments even if the code is formatted with whitespace or comments in between, or both.
It will also ignore the above character sequence inside comments or string literals
(i.e. it performs a real "code search", as opposed to a "string search"). The above pattern will even find
assignments of a hex literal number (i.e. code like "myVar := 16r4D2").
Likewise, you can search for all creations of three-element Arrays, using a constant array size,
with the pattern:
Array new:3
or, you can search for a particular message being sent to the Transcript with:
Transcript showCR:'oops'
Try using the codeSearcher on some of your own code
(and place comments or whitespace into your code).
Compare the results against the results of simple text search operations.
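To try this comparison outside the search dialog, it can be sketched in a workspace. This is only a sketch under assumptions: it presumes that the Refactoring Browser classes are loaded under the names RBParser and ParseTreeSearcher and that they understand parseExpression:, matches:do: and executeTree: (none of these names appear in the text above). The sample sources are made up.

```smalltalk
"Workspace sketch; RBParser and ParseTreeSearcher are assumed class names."
| sources codeHits textHits |

sources := OrderedCollection new.
sources
    add:'myVar := 1234';
    add:'myVar   :=   "a comment"   1234';
    add:'myVar := 16r4D2';
    add:'Transcript showCR:''myVar := 1234'''.

"code search: parse each source and match the pattern against its parse tree"
codeHits := OrderedCollection new.
sources do:[:src |
    | searcher |
    searcher := ParseTreeSearcher new.
    searcher
        matches:'myVar := 1234'
        do:[:node :answer | codeHits add:src. node].
    searcher executeTree:(RBParser parseExpression:src)
].

"text search: plain substring comparison"
textHits := sources select:[:src | src includesString:'myVar := 1234'].

Transcript showCR:'code search: ', codeHits size printString.
Transcript showCR:'text search: ', textHits size printString.
```

The text search reports the first and the last source, although the last one only contains the character sequence inside a string literal. Going by the description above, the code search should instead report the first three sources.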
To allow for those to be specified in the search-code pattern, the search-syntax supports pattern variables (also called "meta variables").
Each metaPattern-variable must begin with a ` (backquote) character.
This character was chosen because it does not occur in normal Smalltalk code.
It is a bit hard to type on non-US keyboards, though.
Immediately following the ` character, other characters can be entered to specify
what type of node this metaPattern-variable should match.
After all the special characters have been entered, you must then enter a valid variable name.
(the matched code is actually bound to this name inside the searcher and can be
later used to match other parts against the same pattern,
or by the code-rewriter to replace it by some other code.)
The special characters are listed in the following table:
"self"
)
and selectors (if followed by a colon).
`foo
matches any variable.
`foo := `foo + 1.
will match any increment operation of a variable such as "a := a + 1",
but will NOT match something like "a := b + 1". The latter is found by:
`var1 := `var2 + 1.
`v
will match any (both locals and global) variables, whereas:
`V
will match only globals and classvars.
#
(Literal)
For example:
`#lit
matches any literal, such as: #(...), #[...], #foo, '...', 1, nil, true and false etc.
Notice that "#lit" (i.e. without the backquote) is not a meta pattern, and will match the exact symbol "#lit" only. See the section on "Pattern Blocks" below for how to search for literals of a particular kind (e.g. String or Symbol) or of a particular value.
@
(List - or "any number of")
When applied to a keyword in a message, it will match a list of keyword messages (i.e., any message send)
When applied with a statement character (see below), it will match a list of statements
For example:
| `@Temps |
matches a (possibly empty) list of temps
`@.Statements
matches a (possibly empty) list of statements
`@object
matches any message node, literal node or block node
foo `@message: `@args
matches any message sent to foo.
Notice that this can also be used for a partial selector.
Therefore:
`@receiver `@keyword: `@arg1
matches any message.
In contrast,
`@receiver `keyword1: `@arg1 `keyword2: `@arg2
matches any 2-argument message,
and
`@receiver at: `@arg1 `@keyword: `@arg2
matches any message with at least 2 arguments, which starts with at:.
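As an illustration, here is how the keyword patterns above would be expected to behave on a few message sends. The sample expressions and their variable names are made up; the expectations simply follow the descriptions given above.

```smalltalk
"pattern:  `@receiver `keyword1: `@arg1 `keyword2: `@arg2"
dict at:key put:value.                  "matches: two arguments"
coll inject:0 into:[:a :b | a + b].     "matches: two arguments"
dict at:key.                            "does not match: one argument"

"pattern:  `@receiver at: `@arg1 `@keyword: `@arg2"
dict at:key put:value.                  "matches: selector starts with at:"
dict at:key ifAbsent:[nil].             "matches: selector starts with at:"
coll inject:0 into:[:a :b | a + b].     "does not match: first keyword is not at:"
```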
.
(Statement)
For example:
`.Statement
matches a single statement.
And:
`.@Statements
matches a (possibly empty) list of statements.
`
(backquote) (Recurse Into)
`@object foo
matches a foo message sent to any expression on the outer level.
However, the code "self foo foo" would match only once (the outer message send expression).
In contrast,
``@object foo
also matches foo sent to any object,
plus for each match found, it will look for more matches in the ``@object part.
Thus, this will match twice for the "self foo foo" example.
{ .. }
(Pattern Block)
RBProgramNode
) is passed as argument to the block.
isVariable, name (if it is a variable), isLiteral, value (if it is a literal), isBlock, isLiteralString, isLiteralInteger, isLiteralFloat, isLiteralArray
For example:
`{:node | node isVariable and: ['RB*' match: node name]}
matches any variable whose name starts with 'RB'.
This allows for almost unlimited flexibility in the match:
`{:node | node isVariable and: ['co*' match: node name]} add: `@arg
matches any add: message sent to any variable whose name starts with 'co'.
To match empty array constants use:
`{:node | node isLiteralArray and:[ node value size == 0 ]}
To match empty string constants use:
`{:node | node isLiteralString and:[ node value = '' ]}
To match a non-block expression:
`{:node | node isBlock not }
or to match a block with 1 argument:
`{:node | node isBlock and:[node numArgs == 1] }
However, this can also be accomplished with:
[:`arg | `expr ]
The block may also be specified as a 2-argument block.
In this case, the matching dictionary is passed as second argument, allowing the block to
refer to previous match results.
`someVariable
at: `#someLiteral
put: `{:node :matchDict |
node isLiteral
and:[ node value isString
and:[ node value = matchDict at:#someLiteral ]] }
'...'
(Pattern String Literal)
`'.*'
matches any string literal, whereas
`'[aA]..'
matches any string literal of size 3, which begins with an "a".
String patterns are useful to delimit the search result; for example,
the following searches for a string concatenation (","-message) to any string which starts
with a space followed by the word 'and':
`@e , `' and.*'
If preceded by the Recurse Into meta character ( ` ),
the search recurses into Array literals,
i.e. Array literals containing string elements are also searched for a matching string.
For example:
`'Di.*'
matches any string literal which starts with the characters 'Di'.
It would not match an array-literal containing such a string element, though.
In contrast,
``'Di.*'
also matches string elements of array literals.
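The following illustrates what these string patterns would be expected to match. The sample literals are made up, and the sketch assumes that the pattern is a case-sensitive regular expression which must cover the whole string, as the size-3 example above suggests.

```smalltalk
"pattern:  `'[aA]..'"
'abc'  'Axy'                    "match: three characters, starting with a or A"
'abcd'  'xyz'                   "no match: too long, or wrong first character"

"pattern:  `'Di.*'"
'Dictionary'  'Display'         "match"
#('Dictionary' 'Set')           "no match: the string is an array element"

"pattern:  ``'Di.*'"
#('Dictionary' 'Set')           "the element 'Dictionary' is found as well"
```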
`sel
^ `sel
In the search dialog box, do not forget to set the "Method"-CheckBox;
otherwise, the search will be for a matching expression, which will probably not be found.
Another example is the following pattern, which searches for all 2-argument methods
which return their second argument:
`selPart1: `arg1 `selPart2: `arg2
    ^ `arg2
Similarly, the following pattern searches for methods which simply return the result of another
self-send (typically, these are aliasing-methods for compatibility with other Smalltalk libraries):
`op: `arg
    ^ self `op2: `arg
And finally, the following pattern searches for methods which simply
return the result of a super-send (such methods can actually be removed):
`op: `arg
    ^ super `op: `arg
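For illustration, here are two hypothetical methods that the self-send and super-send patterns would be expected to find. The selectors and argument names are made up for this example.

```smalltalk
"found by:  `op: `arg   ^ self `op2: `arg"
addElement: anObject
    ^ self add: anObject

"found by:  `op: `arg   ^ super `op: `arg"
add: anObject
    ^ super add: anObject
```

In the second method, `op is bound to add: in the method header and must therefore match the same selector in the super-send.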
For example, to search for senders of "do:" to a constructed interval,
use a pattern like:
(`@e1 to: `@e2) do: `@block
The following examples demonstrate more specific searches.
`foo - matches any variable or pseudo-variable.
`#foo - matches any literal (incl. nil, true and false).
`{:node | node isVariable and: [node name includesString:'Array' caseSensitive:false]}
In a similar fashion, search for name prefixes, suffixes, etc., e.g.:
`{:node | node isVariable and: [node name endsWith:'Array' caseSensitive:true]}
16, the code search will also detect constants like "16r10" or "2r10000", but not "160".
All of this is much harder (if not impossible) with a simple text search.
`{:node | node isLiteral
and:[ node value isNumber
and:[ node value between:1 and:5 ]] }
Array new:3
Search for Array instance creations with a constant size larger than 100:
Array new:
`{:node | node isLiteral
and:[ node value isNumber
and:[ node value > 100 ]] }
Search for Array instance creations with any constant size:
Array new: `#n
Search for Array instance creations where a variable specifies the size:
Array new: `v
Search for all Array instance creations (any expression as size):
Array new: `@e
`@e add: `{:n | n isLiteral and:[ n value isNil] }
of course, this particular search can be written simpler as:
`@e add: nil
However, as already noted above, the pattern block allows for very fine tuned searches (particular integer argument range,
string patterns in arguments, etc).
"breakPoint:"-message sends with a non-symbol argument:
`@e breakPoint:
`{:n |
n isLiteral not
or:[n value isSymbol not ] }
breakPoint: message with a string argument:
`@e breakPoint:`{:n |
n isLiteral
and:[n value isSymbol not
and:[n value isString]]}
Notice the need for the extra symbol check, because in ST/X isString returns true for symbols.
"at:put:" being sent to the Smalltalk-global, with a variable's value as argument, use:
Smalltalk at:`key put:`val
the above does not match for literal values or expression values as argument(s).
Smalltalk at:`@key put:`@val
to even look into the argument and look for sends there too,
use:
Smalltalk at:``@key put:``@val
or if you want all sends to ANY global, replace the name by a global-var match pattern (upper case):
`V at:`@key put:`@val
nameOfVariable selector: `@expr
nameOfVariable keyw1: `@expr1 keyw2: `@expr2 ...
or to search for any message:
nameOfVariable `@msg: ``@args
or, to search for any unary message, use:
nameOfVariable `msg
Use `v as receiver to search for messages to any variable,
and `V (uppercase) to search for messages to any global (typically classes),
eg.
`v size
`V new: `@expr1
``@rec `@msg: ``@args
or, to search for any unary message, use:
``@rec `msg
Notice the extra backquotes, which are required to recurse into already matched
expressions (otherwise, "rec foo foo" would only be matched once, for the outer foo-message).
This pattern can be used to find repeated sends of the same message, as in:
`e sort: [:`a :`b | `a `sel < `b `sel ]
which will match typical sort operations.
To search for messages with a particular argument count and argument patterns,
but with arbitrary selector, use "`m" (without the @ which indicates repetition).
For example:
`@e `m:(`@e2 at:1) `m2:(`@e2 at:2)
will match any 2-arg message, which passes the first two elements of a collection
as arguments.
And:
`@e `m:`e2 ifAbsent:`e2
matches any "xxx:ifAbsent:" message.
super `@msg `@args
to search for unary messages, use
super `msg
``@rec on: ``@arg1 do: ``@arg2
and:
``@rec handle: ``@arg1 do: ``@arg2
or, for a particular exception class:
StreamError handle: ``@arg1 do: ``@arg2
Error handle: [ :``@args | ] do: ``@blk
and:
``@blk on: Error do: [ :``@args | ]
`@e
ifTrue: `{:node | node isBlock not }
ifFalse: `{:node | node isBlock not }
``@object not ifTrue: ``@block
and:
``@object not ifFalse: ``@block
are obviously easier written by negating the if-test message.
The following code-pieces check if some element is in a collection:
( ``@expr1 detect:[:`v | ``@expr2 ] ifNone:nil ) notNil
and:
( ``@expr1 detect:[:`v | ``@expr2 ] ifNone:[] ) notNil
and should be written as:
( ``@expr1 contains:[:`v | ``@expr2 ] )
which is much easier to read and understand without having to decrypt what the original
programmer's intentions were.
More unclean uses of the collection protocol are:
`@coll do:[:`el|
    `@condition ifTrue:[
        ^ true
    ]
]
which can be replaced by:
( `@coll contains:[:`el| `@condition ] ) ifTrue:[ ^ true ]
and this pattern searches for "beginners code", which can be replaced by
a simpler and more descriptive "detect:ifNone:"-message:
`@coll do:[:`el|
    `@condition ifTrue:[
        ^ `el
    ]
]
More examples are:
(`@e1 contains:[:`v | `@e2 not])
and:
`@e1 do:[:`v| `@e2 ifFalse:[^ false] ].
which might both be replaced by a #conform:-message.
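The effect of these replacements can be tried in a workspace. The sketch below uses a made-up collection and made-up conditions. The first part spells out the explicit loop (with a flag variable instead of the early return, since a workspace has no enclosing method to return from); the remaining parts use the messages named above.

```smalltalk
| coll anyLarge anyLarge2 firstLarge allPositive |

coll := #(3 8 15 4).

"explicit loop, in the style found by the do:/ifTrue: pattern"
anyLarge := false.
coll do:[:el | el > 10 ifTrue:[anyLarge := true]].

"the same question asked with contains:"
anyLarge2 := coll contains:[:el | el > 10].

"first element satisfying the condition, or nil if there is none"
firstLarge := coll detect:[:el | el > 10] ifNone:[nil].

"do all elements satisfy the condition?"
allPositive := coll conform:[:el | el > 0].

Transcript showCR:anyLarge2 printString.
Transcript showCR:firstLarge printString.
Transcript showCR:allPositive printString.
```

Both anyLarge and anyLarge2 end up true here, but the contains: version states the intent in a single expression.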
Here are a few more patterns to search for:
`@e1 reject:[:`v1 | `@e2 not]
`@e1 select:[:`v1 | `@e2 not]
`.duplicate
matches any statement.
`.duplicate.
`.duplicate
should match two identical consecutive statements that are the whole body
of a sequence node.
However as soon as we get beyond a within-statement expression,
we are matching sequence nodes.
Matching two statements within a sequence node therefore requires
`.@beforeStatements. "<- notice the period at the end here"
`.duplicate.
`.duplicate.
`.@afterStatements
Because the . makes the tool build a sequence node,
you must provide the "zero or more statements before and after"-code,
unlike the expression case, where it could match an expression within a longer expression.
| `@temps |
`.@beforeStatements. "<- notice the period at the end here"
`.duplicate.
`.duplicate.
`.@afterStatements
which will match two duplicate statements within any sequence of
statements.
It can be tricky to match sequence nodes. Even Don (one of the original authors of the refactory code) admitted that he usually took two or three goes to get his expression right.
One problem with the above is that the whole match is presented as the search result, although you are usually only interested in the duplicate statement(s).
For example, to match a statement sequence of an instance creation followed by a message sent to the just created object requires more than just the two statement matches.
The following pattern searches for the creation of an OrderedCollection
followed by an "add:":
| `@temps |
`.@beforeStatements.
`v := OrderedCollection new.
`v add: `@expr.
`.@afterStatements
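As an illustration, a method body like the following would be expected to match, assuming that the second line of the pattern is meant as an assignment (:=) to the variable bound to `v. The variable name and the added element are made up.

```smalltalk
| names |
names := OrderedCollection new.
names add: 'first'.
^ names
```

Here the temporaries pattern would bind to names, the list of statements before the creation is empty, `v is bound to names in both statements, `@expr is bound to the literal 'first', and the return statement is what follows the matched pair.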
Copyright © eXept Software AG, all rights reserved