This is also a general algorithms question, so please don't stop reading when you see Solr mentioned (you can skip the first three lines if you like).
In Solr's spell check component, I set extendedResults to get the frequency of each suggested correction and then select the suggestion with the highest frequency. I understand that the spell check algorithm is based on edit distance. For example:
Query to Solr: Marien
Spell check suggestions returned: Marine (Freq: 120), Market (Freq: 900) and others. My dictionary here is built from the indexed words.
I therefore chose Market because it has the higher frequency, but that is wrong: my intent was Marine. Both suggestions have an edit distance of 2 from the query.
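The tie can be reproduced with a minimal sketch, assuming plain Levenshtein distance (insert, delete, substitute, with no transposition operation) and case-insensitive comparison. The helper names `levenshtein` and `pick_suggestion` are hypothetical and only illustrate the selection rule described above; this is not Solr's actual implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance: insert, delete, substitute (no transpositions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pick_suggestion(query: str, suggestions: dict) -> str:
    """Hypothetical selection rule: smallest distance first, then highest frequency."""
    return min(suggestions,
               key=lambda w: (levenshtein(query, w), -suggestions[w]))


# Frequencies taken from the example above; lowercasing is an assumption.
suggestions = {"marine": 120, "market": 900}
query = "marien"

for word, freq in suggestions.items():
    print(word, levenshtein(query, word), freq)

print(pick_suggestion(query, suggestions))
```

Because both candidates are at the same distance from the query, the frequency alone decides, and the more frequent word wins even though it is not the intended one.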
How can I improve the selection so that it picks Marine instead of Market, using something beyond edit distance and frequency?
Do I have to incorporate some "soundex" algorithms too?
I am looking for simple stuff which I can quickly implement.
I even tried Peter Norvig's spell corrector algorithm (which is great), but I ran into the same problem.