eXept Software AG Logo

Smalltalk/X Webserver

Documentation of class 'PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator':

Home

Documentation
www.exept.de
Everywhere
for:
[back]

Class: KoelnerPhoneticCodeStringComparator (private in PhoneticStringUtilities

This class is only visible from within PhoneticStringUtilities.

Inheritance:

   Object
   |
   +--PhoneticStringUtilities::PhoneticStringComparator
      |
      +--PhoneticStringUtilities::SingleResultPhoneticStringComparator
         |
         +--PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator

Package:
stx:libbasic2
Category:
Collections-Text-Support
Owner:
PhoneticStringUtilities

Description:


The 'Kölner Phonetik' (cologne phonetic) code is for the german language 
what the soundex code is for english:
   it returns similar strings for similar sounding words 
(but is specifically aware of the pronunciation of German and eastern languages) . 

There are some other differences to soundex, though: 
   its length is not limited to 4, but depends on the length of the original string;
   it does not start with the first character of the input, but returns a pure numeric string.

This algorithm was described by Postel 1969,
See  http://de.wikipedia.org/wiki/K%C3%B6lner_Phonetik

    self new phoneticStringsFor:'Müller-Lüdenscheidt' -> #('65752682')


Instance protocol:

api
o  encode: aString
return a koelner phonetic code.
The koelnerPhonetic code is for the german language what the soundex code is for english;
it returns simular strings for similar sounding words.
There are some differences to soundex, though:
its length is not limited to 4, but depends on the length of the original string;
it does not start with the first character of the input.
This algorithm is described by Postel 1969

usage example(s):

     #(
        'Müller'
        'Miller'
        'Mueller'
        'Mühler'
        'Mühlherr'
        'Mülherr'
        'Myler'
        'Millar'
        'Myller'
        'Müllar'
        'Müler'
        'Muehler'
        'Mülller'
        'Müllerr'
        'Muehlherr'
        'Muellar'
        'Mueler'
        'Mülleer'
        'Mueller'
        'Nüller'
        'Nyller'
        'Niler'
        'Czerny'
        'Tscherny'
        'Czernie'
        'Tschernie'
        'Schernie'
        'Scherny'
        'Scherno'
        'Czerne'
        'Zerny'
        'Tzernie'
        'Breschnew'
        'Breschnew'
        'Breschneff'
        'Breschnjeff'
        'Braeschneff'
        'Braessneff' 
        'Pressneff' 
        'Presznäph'
        'Präschnäf' 
        'Breschnjeff' 
        'Breschnijeff' 
        'Breschnieff' 
     ) do:[:w |
         Transcript show:w; show:'->'; showCR:(PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:w)
     ].

usage example(s):

     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschnew' -> '17863'
     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschneff' -> '17863'
     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Braeschneff' -> '17863'
     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Braessneff' -> '17863'
     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Pressneff' -> '17863'
     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Presznäph' -> '17863'
     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Präschnäf' -> '17863'
     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschnjeff' -> '17863'
     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschnijeff' -> '17863'
     PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:'Breschnieff' -> '17863'

usage example(s):

     self basicNew encode:'müller'      -> '657'   
     self basicNew encode:'möller'      -> '657'
     self basicNew encode:'miller'      -> '657'     
     self basicNew encode:'muller'      -> '657'
     self basicNew encode:'muler'       -> '657'
     self basicNew encode:'schmidt'     -> '862'   
     self basicNew encode:'schneider'   -> '8627' 
     self basicNew encode:'fischer'     -> '387' 
     self basicNew encode:'weber'       -> '317' 
     self basicNew encode:'meyer'       -> '67' 
     self basicNew encode:'wagner'      -> '3467' 
     self basicNew encode:'schulz'      -> '858'
     self basicNew encode:'becker'      -> '147'
     self basicNew encode:'hoffmann'    -> '036'
     self basicNew encode:'schäfer'     -> '837' 

private
o  convertFirst: chars
#(

o  convertRest: chars
used to be matchpattern code,


Examples:


words sounding similar (german pronunciation) will deliver a similar code: #( 'Müller' 'Miller' 'Mueller' 'Mühler' 'Mühlherr' 'Mülherr' 'Myler' 'Millar' 'Myller' 'Müllar' 'Müler' 'Muehler' 'Mülller' 'Müllerr' 'Muehlherr' 'Muellar' 'Mueler' 'Mülleer' 'Mueller' 'Nüller' 'Nyller' 'Niler' 'Czerny' 'Tscherny' 'Czernie' 'Tschernie' 'Schernie' 'Scherny' 'Scherno' 'Czerne' 'Zerny' 'Tzernie' 'Breschnew' 'Breschnew' 'Breschneff' 'Breschnjeff' 'Braeschneff' 'Braessneff' 'Pressneff' 'Presznäph' 'Präschnäf' 'Breschnjeff' 'Breschnijeff' 'Breschnieff' 'Bräschnieff' 'Braschnieff' 'Broschnieff' ) do:[:w | Transcript show:w; show:'->'; showCR:(PhoneticStringUtilities::KoelnerPhoneticCodeStringComparator new encode:w) ].

ST/X 7.2.0.0; WebServer 1.670 at bd0aa1f87cdd.unknown:8081; Fri, 26 Apr 2024 05:49:15 GMT