Wiktionary:Ottoman Turkish entry guidelines
This is a Wiktionary policy, guideline or common practices page. Specifically it is a policy think tank, working to develop a formal policy.
Language
Ottoman Turkish is the variety of the Turkish language as spoken or written around the Ottoman Empire from the 15th century until its dissolution. The cut-off with modern Turkish is conveniently marked by the 1929 Turkish alphabet reform. The change came later among expatriates and in the French-controlled areas, but there too it is marked by the script. Whether the Turkish of the occasional Latin-script publications in the twenty years before the reform should count as Turkish, or be added as quotes under Ottoman Turkish entries (at Arabic-script page titles), may remain open for now.
Ottoman Turkish is treated as a language distinct from Turkish, rather than having its spellings added as alternative spellings of Turkish entries as is done for Azerbaijani, because Ottoman linguistics is a distinct field of study. Azerbaijani in Arabic script lives on in linguistic unity with Azerbaijani in Latin script, whereas Turkish had a break.
Alphabet
Ottoman Turkish entries are lemmatised in the Ottoman Turkish variant of the Perso-Arabic script, the predominant script of the empire. However, since there was no notable printing in the Arabic-writing world until the end of the 18th century,[1][2] the Armenian alphabet was heavily used for Turkish in print centuries earlier. Entries in the Armenian alphabet should be handled merely as alternative forms.
Arabic script encoding
The following points should be noted about the encoding of entries in the Arabic script:
- ه ARABIC LETTER HEH (U+0647) should be used. Whenever it does not connect with the following letter, ZERO WIDTH NON-JOINER (U+200C) should be employed, not ە ARABIC LETTER AE (U+06D5).
- ی ARABIC LETTER FARSI YEH (U+06CC) should be used, not ي ARABIC LETTER YEH (U+064A) or ى ARABIC LETTER ALEF MAKSURA (U+0649).
- ك ARABIC LETTER KAF (U+0643) should be used, not ک ARABIC LETTER KEHEH (U+06A9), because the dominant practice in writing and printing Ottoman Turkish resembled the former shape. This differs from the practice for Azerbaijani. However, Old Anatolian Turkish, the immediate ancestor of both Azerbaijani and Ottoman Turkish, likewise uses ARABIC LETTER KAF (U+0643).
- There should be neither soft nor hard redirects accounting for alternative encodings of ه, ی, or ك, since the software already automatically redirects a user typing in a Unicode variant to an existing page. Likewise, when an Ottoman text is typed out as a quote, this encoding should be adhered to.
- The use of گ ARABIC LETTER GAF (U+06AF) and ڭ ARABIC LETTER NG (U+06AD) should be reserved to the |head= parameter of the headword template whenever appropriate, and to quotes if the quoted passage does contain such a distinction. If گ and ڭ are not distinguished in quoted texts, then the distinction should not be introduced by the editor. Page titles should use ك ARABIC LETTER KAF (U+0643) exclusively.
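The page-title rules above can be sketched as a small normalisation routine. This is only an illustration, not a tool used by Wiktionary: the function name `normalise_title` is hypothetical, and the test for "does not connect with the following letter" is approximated by checking whether the next character is a letter (Unicode category `Lo`), which is an assumption.

```python
import unicodedata

HEH = "\u0647"        # ARABIC LETTER HEH
ZWNJ = "\u200c"       # ZERO WIDTH NON-JOINER
KAF = "\u0643"        # ARABIC LETTER KAF
FARSI_YEH = "\u06cc"  # ARABIC LETTER FARSI YEH
AE = "\u06d5"         # ARABIC LETTER AE (not to be used)

# One-to-one substitutions for page titles.
TITLE_MAP = {
    "\u064a": FARSI_YEH,  # ARABIC LETTER YEH
    "\u0649": FARSI_YEH,  # ARABIC LETTER ALEF MAKSURA
    "\u06a9": KAF,        # ARABIC LETTER KEHEH
    "\u06af": KAF,        # ARABIC LETTER GAF (titles use KAF only)
    "\u06ad": KAF,        # ARABIC LETTER NG (titles use KAF only)
}


def normalise_title(text):
    """Return text re-encoded with the code points preferred for page titles."""
    out = []
    for i, ch in enumerate(text):
        if ch == AE:
            out.append(HEH)
            nxt = text[i + 1] if i + 1 < len(text) else ""
            # Assumption: a following letter means the HEH must not join,
            # so a ZWNJ is inserted; at the end of a word none is needed.
            if nxt and unicodedata.category(nxt) == "Lo":
                out.append(ZWNJ)
        else:
            out.append(TITLE_MAP.get(ch, ch))
    return "".join(out)
```

The mapping of گ and ڭ to ك applies to page titles only; in the |head= parameter and in quotes that make the distinction, those letters are kept.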
Lexicographic notation
| Source | o | ö | u | ü |
|---|---|---|---|---|
| Sami (1899) | وٓ | ۊ | و | ۏ |
| Seydi (1912) | ۆ | وࣸ | ۉ | وࣷ |
| Bahaeddin (1927) | و | ۊ | وٓ | ۏ |
Some monolingual dictionaries use specific notation to differentiate the many vocalic values of و, most of which are already represented by Unicode.
- ۏ ARABIC LETTER WAW WITH DOT ABOVE (U+06CF)
- ۊ ARABIC LETTER WAW WITH TWO DOTS ABOVE (U+06CA)
- وٓ ARABIC LETTER WAW (U+0648) + ARABIC MADDAH ABOVE (U+0653)
- ۆ ARABIC LETTER OE (U+06C6)
- ۉ ARABIC LETTER KIRGHIZ YU (U+06C9)
- an acute-like diacritic and a grave-like diacritic.
Since these last two diacritics are absent from Unicode, we recommend substituting وࣷ ARABIC LEFT ARROWHEAD ABOVE (U+08F7) for the acute-like diacritic and وࣸ ARABIC RIGHT ARROWHEAD ABOVE (U+08F8) for the grave-like diacritic.
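Because the same sign can stand for different vowels depending on the dictionary, the table can be read as a per-source lookup. The following is a minimal sketch of that idea; the names `NOTATION` and `vowel_for` are hypothetical, and the code points are transcribed from the table and list above, with the arrowhead substitutions standing in for the two diacritics missing from Unicode.

```python
WAW = "\u0648"  # ARABIC LETTER WAW

# Sign -> vowel, per dictionary, following the table above.
NOTATION = {
    "Sami (1899)": {
        WAW + "\u0653": "o",  # WAW + MADDAH ABOVE
        "\u06ca": "ö",        # WAW WITH TWO DOTS ABOVE
        WAW: "u",
        "\u06cf": "ü",        # WAW WITH DOT ABOVE
    },
    "Seydi (1912)": {
        "\u06c6": "o",        # OE
        WAW + "\u08f8": "ö",  # WAW + RIGHT ARROWHEAD ABOVE (grave-like)
        "\u06c9": "u",        # KIRGHIZ YU
        WAW + "\u08f7": "ü",  # WAW + LEFT ARROWHEAD ABOVE (acute-like)
    },
    "Bahaeddin (1927)": {
        WAW: "o",
        "\u06ca": "ö",
        WAW + "\u0653": "u",
        "\u06cf": "ü",
    },
}


def vowel_for(source, sign):
    """Return the vowel a sign stands for in the given source, or None."""
    return NOTATION[source].get(sign)
```

For instance, `vowel_for("Sami (1899)", WAW + "\u0653")` and `vowel_for("Bahaeddin (1927)", WAW + "\u0653")` give different vowels, which is why the source has to be known before the notation is interpreted.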
Romanisation
Our romanisation system is heavily based on the modern Turkish orthography. Note, however, some differences:
- Circumflexes should not be used simply to indicate the Arabic script spelling, as many scholarly works do; this is not needed here since the Arabic form stands right beside the romanisation. Similarly, they should not be employed to mark vowel length, nor on the final nisba î. They are however expected on top of a u following k g l pronounced as /c ɟ l/.
- ك, whenever it implies the pronunciation /ŋ/, should be romanised as ⟨ñ⟩ LATIN SMALL LETTER N WITH TILDE (U+00F1), unlike modern Turkish n.
- Devoicing, assimilation and word-final degemination should not be transcribed, e.g. بیچاقجی (bıçakcı) yet mod. bıçakçı, ولد (veled) yet mod. velet, شرانپول (şaranpol) yet şarampol, حل (hall) yet hal.
- Spaces of the original script should be preserved, e.g. فیل دیشی (fil dişi), yet mod. fildişi, etc.
- The glottal stop /ʔ/, originating from Arabic hamza and ʿayn, should be transcribed as ⟨ʼ⟩ MODIFIER LETTER APOSTROPHE (U+02BC), so اعتماد (iʼtimad) yet mod. itimat, فعل (fiʼl) yet mod. fiil.
- Capitalisation should not be employed.
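Some of the points above are mechanical enough to check automatically. The sketch below is only an illustration under stated assumptions: the function `lint_romanisation` and the set of look-alike apostrophes are hypothetical, the input is assumed to use precomposed characters, and only capitalisation, the glottal-stop character, the final î and the placement of û are checked.

```python
import re

# Characters easily typed by mistake instead of U+02BC (assumed list).
LOOKALIKES = {"\u0027", "\u2019", "\u02bf", "\u02be", "\u0060"}


def lint_romanisation(text):
    """Return a list of problems found in a romanisation string."""
    problems = []
    if any(ch.isupper() for ch in text):
        problems.append("capital letter")
    if any(ch in LOOKALIKES for ch in text):
        problems.append("apostrophe is not U+02BC")
    # î at the end of a word (final nisba î)
    if re.search(r"\u00ee\b", text):
        problems.append("circumflex on final î")
    # û is expected only after k, g or l
    if re.search(r"(?<![kgl])\u00fb", text):
        problems.append("û not after k, g or l")
    return problems
```

A string such as `"i\u02bctimad"` or `"fil dişi"` passes with an empty list, while the same word typed with an ASCII apostrophe is reported.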
The pronunciation section should be employed to give information that the romanisation cannot give, such as the distinction between /h/ and /x/, /ɛ/ and /e/, etc.
See also
References
- [1] Ian Dooley (2016), “Cotsen's Covert Collections: The First Illustrated Book Printed in Turkey”, in blogs.princeton.edu, archived from the original on 28 July 2021
- [2] Ekrem Buğra Ekinci (2015), “Myths and reality about the printing press in the Ottoman Empire”, in www.dailysabah.com, archived from the original on 4 June 2023