Documentation of class 'PhoneticStringUtilities::SpanishPhoneticCodeStringComparator':

Class: SpanishPhoneticCodeStringComparator (private in PhoneticStringUtilities)

This class is only visible from within PhoneticStringUtilities.

Inheritance:

   Object
   |
   +--PhoneticStringUtilities::PhoneticStringComparator
      |
      +--PhoneticStringUtilities::SingleResultPhoneticStringComparator
         |
         +--PhoneticStringUtilities::SpanishPhoneticCodeStringComparator

Package:
stx:libbasic2
Category:
Collections-Text-Support
Owner:
PhoneticStringUtilities

Description:


The 'Spanish Phonetic' code is for the Spanish language
what the Soundex code is for English:
   it returns similar codes for similar-sounding words
(but is specifically aware of Spanish pronunciation).

It differs from Soundex in several ways:
   its length is not limited to 4, but depends on the length of the original string;
   it does not start with the first character of the input,
   but returns a purely numeric string;
   it uses different character groups.

This algorithm was described by María del Pilar Angeles, Adrian Espino-Gamez,
and Jonathan Gil-Moncada in
   'Comparison of a Modified Spanish Phonetic,
    Soundex, and Phonex coding functions during data matching process'
See  https://www.researchgate.net/publication/285589803_Comparison_of_a_Modified_Spanish_Phonetic_Soundex_and_Phonex_coding_functions_during_data_matching_process


Instance protocol:

api
o  encode: aString
Return a Spanish phonetic code.
The Spanish phonetic code is for the Spanish language what the Soundex code is for English:
it returns similar strings for similar-sounding words.
It differs from Soundex in several ways:
its length is not limited to 4, but depends on the length of the original string;
it does not start with the first character of the input;
it uses different character groups.
This algorithm is described by María del Pilar Angeles, Adrian Espino-Gamez,
and Jonathan Gil-Moncada.

usage example(s):

     self new encode:'Jose'

private
o  convertFirst: chars
#(

o  convertRest: chars
used to be matchpattern code,


Examples:


Words that sound similar (in Spanish pronunciation) yield similar codes:

    #( 'María' 'Pilar' 'Angeles' 'Adrian' 'Gamez' ) do:[:w |
        Transcript show:w; show:'->';
            showCR:(PhoneticStringUtilities::SpanishPhoneticCodeStringComparator new encode:w)
    ].
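
The codes can also be compared to decide whether two spellings sound alike. The following is a minimal sketch, not part of the class: it uses only the documented encode: message, and soundsAlike is a hypothetical helper block introduced here for illustration. The word pairs are arbitrary inputs. Whether a given pair matches depends on the character groups of the algorithm, which this page does not list, so no expected output is shown.

```smalltalk
| comparator soundsAlike |

"Assumption: the private class is reachable through its owner's namespace,
 as in the example above."
comparator := PhoneticStringUtilities::SpanishPhoneticCodeStringComparator new.

"soundsAlike is a hypothetical helper block, not part of the class:
 two words are treated as matching when their codes are equal."
soundsAlike := [:a :b | (comparator encode:a) = (comparator encode:b)].

"Print true or false for each pair of words."
#( #('Baca' 'Vaca') #('Gomez' 'Gomes') #('Jose' 'Pilar') ) do:[:pair |
    Transcript
        show:pair first; show:' / '; show:pair last; show:' -> ';
        showCR:(soundsAlike value:pair first value:pair last) printString
].
```

Comparing for exact equality is the strictest choice. Because the code length follows the length of the input, a looser comparison (for example on a common prefix of the two codes) may suit some matching tasks better.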

ST/X 7.2.0.0; WebServer 1.670 at bd0aa1f87cdd.unknown:8081; Sun, 27 Nov 2022 08:15:40 GMT