Documentation of class 'PhoneticStringUtilities::MetaphoneStringComparator':

Class: MetaphoneStringComparator (private in PhoneticStringUtilities)

This class is only visible from within PhoneticStringUtilities.

Inheritance:

   Object
   |
   +--PhoneticStringUtilities::PhoneticStringComparator
      |
      +--PhoneticStringUtilities::SingleResultPhoneticStringComparator
         |
         +--PhoneticStringUtilities::MetaphoneStringComparator

Package:
stx:libbasic2
Category:
Collections-Text-Support
Owner:
PhoneticStringUtilities

Description:


Ongoing work - do not use at the moment

Encodes a string into a Metaphone value.

Initial Java implementation by William B. Brogden, December 1997.
Permission given by wbrogden for code to be used anywhere.

 "Hanging on the Metaphone" by Lawrence Philips, in Computer Language, Dec. 1990, p. 39.
 Note that this does not match the algorithm that ships with PHP, or the algorithm found in the Perl implementations:
 https://metacpan.org/source/MSCHWERN/Text-Metaphone-1.96//Metaphone.pm6

  They have had undocumented changes from the originally published algorithm.
  For more information, see https://issues.apache.org/jira/browse/CODEC-57

  Metaphone uses the following rules:

 Doubled letters except 'c' -> drop 2nd letter.
 Vowels are only kept when they are the first letter.
 B -> B unless at the end of a word after 'm' as in 'dumb'
 C -> X (sh) if in -cia- or -ch-;
      S if in -ci-, -ce- or -cy-;
      K otherwise, including -sch-
 D -> J if in -dge-, -dgy- or -dgi-; T otherwise
 F -> F
 G -> silent if in -gh- and not at end or before a vowel,
      or if in -gn- or -gned- (also see -dge- etc. above);
      J if before i, e or y, unless in a double gg;
      K otherwise
 H -> silent if after vowel and no vowel follows; H otherwise
 J -> J
 K -> silent if after 'c'; K otherwise
 L -> L
 M -> M
 N -> N
 P -> F if before 'h'; P otherwise
 Q -> K
 R -> R
 S -> X (sh) if before 'h' or in -sio- or -sia-; S otherwise
 T -> X (sh) if in -tia- or -tio-;
      0 (th) if before 'h';
      silent if in -tch-;
      T otherwise
 V -> F
 W -> silent if not followed by a vowel;
      W if followed by a vowel
 X -> KS
 Y -> silent if not followed by a vowel;
      Y if followed by a vowel
 Z -> S
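
 As a rough illustration of two of the simpler entries in the table above, the workspace snippet below drops doubled letters (other than 'c') and looks up the letters whose code never depends on context. It is a sketch, not the code of this class: the names dropDoubles and simpleCodes are hypothetical, and vowels and all context-dependent rules are left out.

```smalltalk
"Sketch only: dropDoubles and simpleCodes are hypothetical names,
 not part of MetaphoneStringComparator."
| dropDoubles simpleCodes |

"Doubled letters except 'c' -> drop 2nd letter."
dropDoubles := [:word |
    | out prev |
    out := WriteStream on: String new.
    prev := nil.
    word asUppercase do: [:ch |
        (ch = prev and: [ch ~= $C]) ifFalse: [out nextPut: ch].
        prev := ch].
    out contents].

"Letters that map to the same code in every context."
simpleCodes := Dictionary new.
simpleCodes
    at: $F put: 'F'; at: $J put: 'J'; at: $L put: 'L';
    at: $M put: 'M'; at: $N put: 'N'; at: $Q put: 'K';
    at: $R put: 'R'; at: $V put: 'F'; at: $X put: 'KS';
    at: $Z put: 'S'.

Transcript showCR: (dropDoubles value: 'MILLER').   "MILER"
Transcript showCR: (dropDoubles value: 'accent').   "ACCENT"
Transcript showCR: (simpleCodes at: $X).            "KS"
```

 A full encoder would still need to look at the neighbouring letters for B, C, D, G, H, K, P, S, T, W and Y, which is why those letters are not in the lookup table.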

 Initial Letter Exceptions

 Initial kn-, gn-, pn-, ae- or wr- -> drop first letter
 Initial x- -> change to 's'
 Initial wh- -> change to 'w'
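
 The initial-letter exceptions can be sketched as a small preprocessing step that runs before the per-letter rules. The block below is an illustration under assumptions, not the class's own code: normalizeInitial is a hypothetical name, the input is lowercased for simplicity, and startsWith: is assumed to be available as in Smalltalk/X (other dialects may call it beginsWith:).

```smalltalk
"Sketch only: normalizeInitial is a hypothetical helper,
 not a method of MetaphoneStringComparator."
| normalizeInitial |

normalizeInitial := [:word |
    | w dropPrefix |
    w := word asLowercase.
    "kn-, gn-, pn-, ae-, wr- -> drop first letter"
    dropPrefix := #('kn' 'gn' 'pn' 'ae' 'wr')
                      detect: [:prefix | w startsWith: prefix]
                      ifNone: [nil].
    dropPrefix notNil
        ifTrue: [w copyFrom: 2 to: w size]
        ifFalse: [
            (w startsWith: 'wh')
                "wh- -> w"
                ifTrue: ['w' , (w copyFrom: 3 to: w size)]
                ifFalse: [
                    (w startsWith: 'x')
                        "x- -> s"
                        ifTrue: ['s' , (w copyFrom: 2 to: w size)]
                        ifFalse: [w]]]].

Transcript showCR: (normalizeInitial value: 'knight').   "night"
Transcript showCR: (normalizeInitial value: 'xavier').   "savier"
Transcript showCR: (normalizeInitial value: 'white').    "wite"
Transcript showCR: (normalizeInitial value: 'miller').   "miller"
```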


  self new encode:'a'
  self new encode:'dumb'
  self new encode:'MILLER'
  self new encode:'schmidt'
  self new encode:'schneider'
  self new encode:'FISCHER'
  self new encode:'HEDGY'
  self new encode:'weber'
  self new encode:'wagner'
  self new encode:'van gogh'
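
  The expressions above use self, so they are meant to be evaluated in the context of the class itself. From a workspace, a call might look like the sketch below. It assumes that the private class can be reached through its owner with the :: syntax used in this page's title, and it prints results with printString because the return type of encode: is not stated here. Since the class is marked as ongoing work, no expected results are shown.

```smalltalk
"Sketch only: assumes the private class is reachable as
 PhoneticStringUtilities::MetaphoneStringComparator from a workspace."
| comparator |

comparator := PhoneticStringUtilities::MetaphoneStringComparator new.
#('dumb' 'MILLER' 'schmidt' 'van gogh') do: [:name |
    Transcript showCR: name , ' -> ' , (comparator encode: name) printString
].
```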


Instance protocol:

api
o  encode: txt
self new encode:'a'
self new encode:'MILLER'
self new encode:'schmidt'
self new encode:'schneider'
self new encode:'FISCHER'
self new encode:'HEDGY'
self new encode:'weber'
self new encode:'wagner'
self new encode:'van gogh'
self new encode:'dumb'



ST/X 7.2.0.0; WebServer 1.670 at bd0aa1f87cdd.unknown:8081; Fri, 19 Apr 2024 03:15:35 GMT