[EPIC] Support "second-try" transliteration or wrong-keyboard searches (aka N.O.R.M.)
Open, High, Public

Description

Normalizing Orthographic Re-Mapper (aka N.O.R.M.)

Build out the necessary infrastructure to support various kinds of text-mapping "second-try" searches, including "DWIM"-style wrong-keyboard searches (e.g., accidentally typing Russian/Cyrillic text with a US/Latin keyboard layout active) and transliterated searches (e.g., typing Georgian or Hindi in Latin script).

A good place to start is replicating the autocomplete enhancement provided by the Russian and Hebrew DWIM gadget, and then extending it either breadth-first (Georgian and Hindi transliteration in autocomplete) or depth-first (into full-text results).

wrong keyboard tickets:

transliteration tickets:

Note: Naming is hard. DWIM ("do what I mean") is/was an on-wiki gadget that supported wrong-keyboard searches on Russian and Hebrew wikis. However, it sounds a little too much like DYM ("did you mean"), our query reformulation suggestion feature. We've used second-chance and second-try in the past to refer to a number of related approaches that are a superset of what is under consideration here. Hence "N.O.R.M.", the Normalizing Orthographic Re-Mapper, which would be a shared infrastructure that would allow us to convert both Fhbcnjntkm to Аристотель ("Aristotle") on Russian wikis and devanagari ka itihas to देवनागरी का इतिहास ("history of Devanagari") on Hindi wikis in a variety of useful ways.
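
To make the wrong-keyboard case concrete, here is a minimal sketch of a key-position remap from a US QWERTY layout to the standard Russian ЙЦУКЕН layout. It is only an illustration under stated assumptions, not the gadget's or N.O.R.M.'s actual implementation: the names `QWERTY_TO_JCUKEN` and `remap_wrong_keyboard` are hypothetical, and only the main letter and punctuation keys are covered.

```python
# Hypothetical sketch: map each character typed on a US QWERTY layout to the
# character at the same key position on the standard Russian ЙЦУКЕН layout.
# The strings below list keys row by row; both must stay the same length.
QWERTY_LOWER = "qwertyuiop[]asdfghjkl;'zxcvbnm,.`"
JCUKEN_LOWER = "йцукенгшщзхъфывапролджэячсмитьбюё"
QWERTY_UPPER = 'QWERTYUIOP{}ASDFGHJKL:"ZXCVBNM<>~'
JCUKEN_UPPER = JCUKEN_LOWER.upper()

# str.maketrans builds a per-character translation table.
QWERTY_TO_JCUKEN = str.maketrans(
    QWERTY_LOWER + QWERTY_UPPER,
    JCUKEN_LOWER + JCUKEN_UPPER,
)


def remap_wrong_keyboard(query: str) -> str:
    """Return the query as if it had been typed with the Russian layout active.

    Characters without a mapping (spaces, digits, and so on) pass through unchanged.
    """
    return query.translate(QWERTY_TO_JCUKEN)


print(remap_wrong_keyboard("Fhbcnjntkm"))   # Аристотель
print(remap_wrong_keyboard(",fhfr j,fvf"))  # барак обама
```

Because punctuation such as `,` and `.` maps to letters, a remap like this would mangle queries that were typed correctly, which is one reason to treat it as a second try rather than a first-pass rewrite.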

Previous on-wiki write-ups:

Event Timeline

TJones renamed this task from [EPIC] Create infrastructure to support "second-try" transliteration or wrong-keyboard searches (aka N.O.R.M.) to [EPIC] Support "second-try" transliteration or wrong-keyboard searches (aka N.O.R.M.). Sep 19 2024, 3:49 PM
TJones triaged this task as High priority.

Is there any plan to make it so that search pages like https://ru.wikipedia.org/wiki/Special:Search/,fhfr_j,fvf («Барак Обама», Barack Obama) display relevant results as well?

No firm plans yet, but we are looking into using the same techniques to make "Did you mean" suggestions. In this case, because there are zero results for the search, it would also roll over to the suggestion, and you'd see something like "Showing results for Барак Обама. No results found for ,fhfr j,fvf." (but in Russian, of course).
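
The zero-results rollover described above could be sketched roughly as follows. This is an assumption-laden illustration, not the existing "Did you mean" code: `search_with_second_try` is a hypothetical name, and the `search` and `remap` callables stand in for whatever search backend and text mapping would actually be used.

```python
from typing import Callable, List, Optional, Tuple


def search_with_second_try(
    query: str,
    search: Callable[[str], List[str]],
    remap: Callable[[str], str],
) -> Tuple[List[str], Optional[str]]:
    """Run the query as typed; only if it finds nothing, retry with the remapped query.

    Returns (results, rewritten_query). rewritten_query is None unless the
    second try was used, so the caller can decide whether to display a
    "Showing results for ..." notice.
    """
    results = search(query)
    if results:
        return results, None  # The original query worked; do not second-guess it.

    candidate = remap(query)
    if candidate == query:
        return results, None  # Nothing to remap, so there is no second try.

    retry_results = search(candidate)
    if retry_results:
        return retry_results, candidate
    return results, None  # The second try also failed; report zero results as usual.
```

Keeping the original query first means a correctly typed query is never rewritten, and the remapped query is only shown to the user when it actually produces results.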

We started with the autocomplete suggestions because they are more forgiving. If the transformed query is off by one letter, for example, it might still match a good title. We're working on autocomplete suggestions in different languages for now—Russian and Hebrew DWIM are live, Georgian transliteration is live, Hindi transliteration is being researched—but we do eventually want to explore expanding some or all of these second-try methods to "Did you mean" if it makes sense.
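
As a rough illustration of why autocomplete is more forgiving, the sketch below matches a slightly wrong transformed query against a small list of titles. It uses Python's standard `difflib` purely as a stand-in for fuzzy title matching; the real autocomplete backend works differently, and the title list, the 0.8 cutoff, and the name `suggest_titles` are assumptions made for the example.

```python
import difflib
from typing import List

# Hypothetical stand-in for an index of page titles.
TITLES = ["Аристотель", "Аристофан", "Арифметика"]


def suggest_titles(transformed_query: str, titles: List[str], cutoff: float = 0.8) -> List[str]:
    """Return titles similar enough to the transformed query, best match first.

    difflib scores similarity between 0 and 1; titles scoring below the
    cutoff are dropped.
    """
    return difflib.get_close_matches(transformed_query, titles, n=3, cutoff=cutoff)


# "Аристатель" is one letter off from "Аристотель" but still finds the title.
print(suggest_titles("Аристатель", TITLES))
```

A full-text search on the same off-by-one query would have no such safety net, which is why errors in the mapping matter more there.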