Class: FuzzyMatcher
Object
|
+--FuzzyMatcher
- Package: stx:libbasic2
- Category: Collections-Text-Support
- Version: rev: 1.10 date: 2017/08/02 14:03:33
- user: cg
- file: FuzzyMatcher.st directory: libbasic2
- module: stx stc-classLibrary: libbasic2
FuzzyMatcher implements an approximate string matching algorithm that determines whether a string includes a given pattern.
For example, the string 'axby' matches both the pattern 'ab' and the pattern 'ay', but not 'ba'.
I.e. it matches if the searched string contains a sequence of characters, possibly interspersed with other characters,
which matches the given search pattern or part of it.
The algorithm is based on lib_fts[1], and includes an optional scoring algorithm
that can be used to sort all the matches based on their similarity to the pattern.
This style of matching is used (among other places) in the Sublime Text editor.
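The matching rule described above can be tried in a workspace. This is a minimal sketch that uses only the selectors listed in this document (pattern: and matches:); the results noted in the comments are the ones stated in the description, not measured output.

```smalltalk
"each expression creates a matcher for a pattern and tests it against 'axby'"
(FuzzyMatcher pattern:'ab') matches:'axby'.    "per the description: a match"
(FuzzyMatcher pattern:'ay') matches:'axby'.    "per the description: a match"
(FuzzyMatcher pattern:'ba') matches:'axby'.    "per the description: no match (wrong order)"
```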
[caveat:]
although this works well for class searches,
it is surprising that 'dabc' scores lower against 'abc' than 'adbc' does
('dabc' contains the longer uninterrupted common subsequence...)
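To inspect the two scores mentioned in the caveat, a workspace sketch along these lines could be used. It relies on matchScoreOrNil: as documented under "api - comparing" and on Transcript for output; no particular score values are implied.

```smalltalk
"print the score of each candidate against the pattern 'abc'"
Transcript showCR:'dabc -> ', ((FuzzyMatcher pattern:'abc') matchScoreOrNil:'dabc') printString.
Transcript showCR:'adbc -> ', ((FuzzyMatcher pattern:'abc') matchScoreOrNil:'adbc') printString.
```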
[ttps]
instance creation
-
new
-
return an initialized instance
-
pattern: aString
-
(self pattern:'mrp') matches:'ButtonMorph'
(self pattern:'mrp') matches:'ButtonMorh'
utilities api
-
allMatching: aPattern in: aCollection
-
Assumes that aCollection is a collection of Strings;
returns all those which match aPattern.
usage example(s):
self
allMatching:'clu'
in:(Smalltalk allClasses collect:#name)
|
-
allMatching: aPattern in: aCollection by: aBlockReturningString
-
Selects the matching elements from aCollection.
aBlockReturningString is applied to each element to get its string representation
(can be used e.g. to select classes by name)
usage example(s):
self
allMatching:'clu'
in:(Smalltalk allClasses)
by:[:cls | cls name]
|
usage example(s):
self
allMatching:'clu'
in:(Smalltalk allClasses)
by:#name
|
-
allSortedByScoreMatching: aPattern in: aCollection
-
Assumes that the collection is a collection of Strings;
returns matching strings sorted by score (level of similarity)
usage example(s):
self
allSortedByScoreMatching:'clu'
in:(Smalltalk allClasses collect:#name)
|
usage example(s):
self
allSortedByScoreMatching:'nary'
in:(Smalltalk allClasses collect:#name)
|
-
allSortedByScoreMatching: aPattern in: aCollection by: aBlockReturningString
-
Selects the matching elements from aCollection.
aBlockReturningString is applied to each element to get its string representation.
Returns them sorted by score (i.e. similarity).
(can be used e.g. to sort classes)
usage example(s):
self
allSortedByScoreMatching:''
in:(Smalltalk allClasses)
by:[:cls | cls name]
|
usage example(s):
self
allSortedByScoreMatching:'nary'
in:(Smalltalk allClasses)
by:[:cls | cls name]
|
usage example(s):
self
allSortedByScoreMatching:'nary'
in:(Smalltalk allClasses)
by:#name
|
-
allWithScoresSortedByScoreMatching: aPattern in: aCollection by: aBlockReturningString
-
Selects the matching elements from aCollection.
aBlockReturningString is applied to each element to get its string representation.
Returns them sorted by score (i.e. similarity), each associated with its score.
(can be used e.g. to sort classes)
usage example(s):
self
allWithScoresSortedByScoreMatching:''
in:(Smalltalk allClasses)
by:[:cls | cls name]
|
usage example(s):
self
allWithScoresSortedByScoreMatching:'OC'
in:(Smalltalk allClasses)
by:[:cls | cls name]
|
usage example(s):
self
allWithScoresSortedByScoreMatching:'nary'
in:(Smalltalk allClasses)
by:[:cls | cls name]
|
usage example(s):
self
allWithScoresSortedByScoreMatching:'nary'
in:(Smalltalk allClasses)
by:#name
|
accessing
-
indexes
-
only valid inside the match callback block (the block passed to match:ifScored:)
-
pattern
-
-
pattern: aString
-
Modified (format): / 14-07-2017 / 12:59:15 / cg
api - comparing
-
match: aString ifScored: aBlock
-
If there is a match, evaluate aBlock, passing the score value
-
matchScoreOrNil: aString
-
return the score if there is a match; nil otherwise.
-
matches: aString
-
return true if there is a match; false otherwise.
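The three comparing selectors can be combined as in the following workspace sketch. It assumes the block given to match:ifScored: takes the score as its single argument, as the description suggests; the variable names are illustrative only.

```smalltalk
| matcher score |

matcher := FuzzyMatcher pattern:'mrp'.

"the block is evaluated only if 'ButtonMorph' matches"
matcher
    match:'ButtonMorph'
    ifScored:[:aScore | Transcript showCR:'score: ', aScore printString].

"matchScoreOrNil: answers nil instead of a score when there is no match"
score := (FuzzyMatcher pattern:'mrp') matchScoreOrNil:'Button'.
score isNil ifTrue:[ Transcript showCR:'no match' ].

"matches: answers only a Boolean"
((FuzzyMatcher pattern:'mrp') matches:'ButtonMorph')
    ifTrue:[ Transcript showCR:'matched' ].
```

match:ifScored: avoids a separate nil check when the score is only needed for matching strings.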
initialization
-
initialize
-
Modified (format): / 14-07-2017 / 13:23:26 / cg
private
-
firstScore: aString at: anIndex
-
-
indexScore
-
Modified (format): / 14-07-2017 / 13:24:07 / cg
-
isSeparator: aCharacter
-
-
score: aString at: stringIndex patternAt: patternIndex
-
scoring-bonus
-
adjacencyBonus
-
-
adjacencyIncrease
-
-
adjacentCaseEqualBonus
-
-
caseEqualBonus
-
-
firstLetterBonus
-
-
separatorBonus
-
scoring-penalty
-
leadingLetterPenalty
-
-
maxLeadingLetterPenalty
-
-
unmatchedLetterPenalty
-
|