Class: KoelnerPhoneticCodeStringComparator (private in PhoneticStringUtilities)
This class is only visible from within PhoneticStringUtilities.
Object
|
+--PhoneticStringUtilities::PhoneticStringComparator
   |
   +--PhoneticStringUtilities::SingleResultPhoneticStringComparator
      |
      +--PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator
- Package: stx:libbasic2
- Category: Collections-Text-Support
- Owner: PhoneticStringUtilities
The 'Kölner Phonetik' (Cologne phonetics) code is for the German language
what the Soundex code is for English:
it returns similar codes for similar-sounding words
(but is specifically aware of the pronunciation of German and eastern languages).
There are some other differences to Soundex, though:
its length is not limited to 4, but depends on the length of the original string;
it does not start with the first character of the input, but returns a purely numeric string.
The algorithm was described by Postel in 1969.
See http://de.wikipedia.org/wiki/K%C3%B6lner_Phonetik
self new phoneticStringsFor:'Müller-Lüdenscheidt' -> #('65752682')
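The following is a minimal sketch of how the code might be used to check whether two spellings sound alike, and to observe that the code length follows the input rather than being fixed at 4. It uses only the encode: message documented below; the variable name encoder is chosen for illustration.

```smalltalk
| encoder |
encoder := PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new.

"both spellings are documented below to encode to '17863',
 so comparing their codes treats them as sounding alike"
Transcript showCR:((encoder encode:'Breschnew') = (encoder encode:'Pressneff')) printString.

"unlike soundex, the code is not cut off at 4 characters"
Transcript showCR:(encoder encode:'Müller-Lüdenscheidt') size printString.
```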
api
-
encode: aString
-
Return a Kölner phonetic code.
The Kölner phonetic code is for the German language what the Soundex code is for English;
it returns similar strings for similar-sounding words.
There are some differences to Soundex, though:
its length is not limited to 4, but depends on the length of the original string;
it does not start with the first character of the input.
The algorithm was described by Postel in 1969.
Usage example(s):
#(
'Müller'
'Miller'
'Mueller'
'Mühler'
'Mühlherr'
'Mülherr'
'Myler'
'Millar'
'Myller'
'Müllar'
'Müler'
'Muehler'
'Mülller'
'Müllerr'
'Muehlherr'
'Muellar'
'Mueler'
'Mülleer'
'Mueller'
'Nüller'
'Nyller'
'Niler'
'Czerny'
'Tscherny'
'Czernie'
'Tschernie'
'Schernie'
'Scherny'
'Scherno'
'Czerne'
'Zerny'
'Tzernie'
'Breschnew'
'Breschnew'
'Breschneff'
'Breschnjeff'
'Braeschneff'
'Braessneff'
'Pressneff'
'Presznäph'
'Präschnäf'
'Breschnjeff'
'Breschnijeff'
'Breschnieff'
) do:[:w |
Transcript show:w; show:'->'; showCR:(PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:w)
].
|
Usage example(s):
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschnew' -> '17863'
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschneff' -> '17863'
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Braeschneff' -> '17863'
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Braessneff' -> '17863'
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Pressneff' -> '17863'
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Presznäph' -> '17863'
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Präschnäf' -> '17863'
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschnjeff' -> '17863'
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschnijeff' -> '17863'
PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschnieff' -> '17863'
|
Usage example(s):
self basicNew encode:'müller' -> '657'
self basicNew encode:'möller' -> '657'
self basicNew encode:'miller' -> '657'
self basicNew encode:'muller' -> '657'
self basicNew encode:'muler' -> '657'
self basicNew encode:'schmidt' -> '862'
self basicNew encode:'schneider' -> '8627'
self basicNew encode:'fischer' -> '387'
self basicNew encode:'weber' -> '317'
self basicNew encode:'meyer' -> '67'
self basicNew encode:'wagner' -> '3467'
self basicNew encode:'schulz' -> '858'
self basicNew encode:'becker' -> '147'
self basicNew encode:'hoffmann' -> '036'
self basicNew encode:'schäfer' -> '837'
|
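Because similar-sounding names share a code, the code can serve as a grouping key. The sketch below collects a few of the names from the examples above into a Dictionary keyed by their code. The variable names encoder and groups are illustrative only, and which names end up in which group depends on what encode: actually returns.

```smalltalk
| encoder groups |
encoder := PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new.
groups := Dictionary new.

"collect each name under its phonetic code"
#('Müller' 'Miller' 'Mueller' 'Nüller' 'Czerny' 'Tscherny') do:[:name |
    (groups at:(encoder encode:name) ifAbsentPut:[OrderedCollection new]) add:name
].

"print one line per code, followed by the names that share it"
groups keysAndValuesDo:[:code :names |
    Transcript show:code; show:' -> '; showCR:names printString
].
```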
private
-
convertFirst: chars
-
#(
-
convertRest: chars
-
This used to be match-pattern code.
Words that sound similar (in German pronunciation) will deliver a similar code:
#(
'Grossbottwar'
'Großbottwar'
'Großbottwar-Winzerhausen'
'Großbottwar-Winserhausen'
'Gr0ßbottwar-Winserhausen'
'GrOßb0ttwar-Winserhausen'
'Müller'
'Miller'
'Mueller'
'Mühler'
'Mühlherr'
'Mülherr'
'Myler'
'Millar'
'Myller'
'Müllar'
'Müler'
'Muehler'
'Mülller'
'Müllerr'
'Muehlherr'
'Muellar'
'Mueler'
'Mülleer'
'Mueller'
'Nüller'
'Nyller'
'Niler'
'Czerny'
'Tscherny'
'Czernie'
'Tschernie'
'Schernie'
'Scherny'
'Scherno'
'Czerne'
'Zerny'
'Tzernie'
'Breschnew'
'Breschnew'
'Breschneff'
'Breschnjeff'
'Braeschneff'
'Braessneff'
'Pressneff'
'Presznäph'
'Präschnäf'
'Breschnjeff'
'Breschnijeff'
'Breschnieff'
'Bräschnieff'
'Braschnieff'
'Broschnieff'
) do:[:w |
Transcript show:w; show:'->'; showCR:(PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:w)
].
|