Daitch-Mokotoff Soundex
Daitch-Mokotoff Soundex (D-M Soundex) is a phonetic algorithm invented in 1985 by genealogist Gary Mokotoff and later improved by Randy Daitch, both of the Jewish Genealogical Society. It is a refinement of the Russell and American Soundex algorithms, designed to match Slavic and Yiddish surnames that are pronounced similarly but spelled differently.
Daitch-Mokotoff Soundex is sometimes referred to as "Jewish Soundex" and "Eastern European Soundex", although the authors discourage use of these nicknames for the algorithm.
Improvements
Improvements over the older Soundex algorithms include:
- Coded names are six digits long, resulting in greater search precision (traditional Soundex uses four characters)
- Coded names can be stored as numeric values, which can save space in some applications (regular Soundex encodes values as alphanumeric text)
- Several rules in the algorithm encode multi-character n-grams as single digits (American and Russell Soundex do not handle multi-character n-grams)
- Multiple possible encodings can be returned for a single name (traditional Soundex returns only one encoding, even if the spelling of a name could potentially have multiple pronunciations)
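Two of these points, numeric storage and multiple encodings per name, can be illustrated with a minimal Python sketch. It does not compute D-M codes; the codes are copied from the Examples table in this article, and the names `DM_CODES`, `to_numeric`, `to_text`, and `names_match` are hypothetical helpers introduced only for illustration. Treating two names as a match when they share at least one code is an assumption about how the multiple encodings would typically be used.

```python
# Codes copied from this article's Examples table (not computed here).
DM_CODES = {
    "Peters": ["739400", "734000"],
    "Peterson": ["739460", "734600"],
    "Moskowitz": ["645740"],
    "Moskovitz": ["645740"],
    "Auerbach": ["097500", "097400"],
    "Uhrbach": ["097500", "097400"],
}


def to_numeric(code: str) -> int:
    """Store a six-digit code as an integer (leading zeros are dropped)."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"expected six digits, got {code!r}")
    return int(code)


def to_text(value: int) -> str:
    """Restore the six-digit text form, re-adding any leading zeros."""
    return f"{value:06d}"


def names_match(a: str, b: str) -> bool:
    """Assumed matching rule: names match if they share at least one code."""
    return not set(DM_CODES[a]).isdisjoint(DM_CODES[b])


print(to_numeric("097500"))                   # stored as an integer
print(to_text(to_numeric("097500")))          # padded back to six digits
print(names_match("Moskowitz", "Moskovitz"))  # share the code 645740
```

Because a code such as 097500 begins with a zero, any numeric storage has to pad the value back to six digits before it is displayed or compared as text.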
Examples
Some examples:
Surname | American Soundex | D-M Soundex |
Peters | P362 | 739400, 734000 |
Peterson | P362 | 739460, 734600 |
Moskowitz | M232 | 645740 |
Moskovitz | M213 | 645740 |
Auerbach | A612 | 097500, 097400 |
Uhrbach | U612 | 097500, 097400 |
Jackson | J250 | 154600, 454600, 145460, 445460 |
Jackson-Jackson | J252 | 154664, 454664, 145466, 445466, 154646, 454646, 145464, 445464 |
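For comparison, the American Soundex column can be reproduced with a short Python sketch. This is the older four-character algorithm, not D-M Soundex, whose rule table is not given in this article. The sketch assumes the common variant in which H and W do not separate consonants with the same code and in which non-letters such as hyphens are ignored; the function name `american_soundex` is illustrative.

```python
# Letter-to-digit table for American Soundex; vowels, H, W, and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def american_soundex(name: str) -> str:
    """Return the four-character American Soundex code for a name."""
    # Keep only the letters A-Z (assumption: hyphens and spaces are ignored).
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]                # the first letter is kept as-is
    prev = CODES.get(letters[0], "")   # its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code                    # a vowel resets prev, allowing a repeat
    return result.ljust(4, "0")        # pad short codes with zeros


for surname in ("Peters", "Moskowitz", "Moskovitz", "Auerbach", "Uhrbach"):
    print(surname, american_soundex(surname))
```

Moskowitz and Moskovitz receive different American Soundex codes because W carries no code while V is coded as 1, whereas the table lists a single shared D-M code for both spellings.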
External links
- Mokotoff, Gary. "Soundexing and Genealogy." Describes the history and the motivations behind D-M Soundex.
- JewishGen. "Soundex Coding." Describes both Russell and D-M Soundex.
- Project Dedupe http://dedupe.sourceforge.net
- Coles, Michael. "SQL 2000 DBA Toolkit, Part 3: Phonetic Matching." SQL Server-based implementation of the D-M Soundex algorithm, with source code.