Transliteration for Indian languages From Wikipedia, the free encyclopedia
WX notation is a transliteration scheme for representing Indian languages in ASCII. This scheme originated at IIT Kanpur for computational processing of Indian languages, and is widely used among the natural language processing (NLP) community in India. The notation (though not identified by name) is used, for example, in a textbook on NLP from IIT Kanpur.[1]
The salient features of this transliteration scheme are:
Every consonant and every vowel has a single mapping into Roman. Hence it is a prefix code,[2] which is advantageous from a computational point of view.
Typically, lowercase letters are used for unaspirated consonants and short vowels, while uppercase letters are used for aspirated consonants and long vowels.
The retroflex voiceless and voiced consonants are mapped to 't', 'T', 'd' and 'D', while the dentals are mapped to 'w', 'W', 'x' and 'X'. Hence the name of the scheme, "WX", which refers to this idiosyncratic mapping.
Ubuntu Linux provides keyboard support for WX notation.
In the following tables, idiosyncratic assignments of letters to phonemes with which they are not normally associated are boldfaced. Besides 'w', 'W', 'x', and 'X', these are 'f', 'F', 'q', 'Q', 'L', and 'R'.
अ | आ | इ | ई | उ | ऊ | ए | ऐ | ओ | औ |
a | A | i | I | u | U | e | E | o | O |
ऋ | ॠ | ऌ |
q | Q | L |
ं | ः |
M | H |
The Anunasika is represented by 'z'. For example, अँ = az. In Sanskrit, the Avagraha is represented by 'Z'. For example, वमतोऽन्तः = vamawoZnwaH. This may cause confusion as 'Z' is also used for another purpose in the case of other Indic languages (see below, last paragraph).
क् | ख् | ग् | घ् | ङ् | Velar |
k | K | g | G | f | |
च् | छ् | ज् | झ् | ञ् | Palatal |
c | C | j | J | F | |
ट् | ठ् | ड् | ढ् | ण् | Retroflex |
t | T | d | D | N | |
त् | थ् | द् | ध् | न् | Dental |
w | W | x | X | n | |
प् | फ् | ब् | भ् | म् | Labial |
p | P | b | B | m | |
य् | र् | ल् | व् | Semi-vowel |
y | r | l | v | |
श् | ष् | स् | ह् | Fricative |
S | R | s | h | |
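The tables above translate fairly directly into lookup dictionaries. The following is a minimal Python sketch, not a reference implementation, that converts Devanagari text into WX using only the mappings listed in this section. The function name devanagari_to_wx and the dictionary names are hypothetical. The handling of the inherent vowel (emit 'a' after a consonant unless a vowel sign or virama follows) is an assumption inferred from the examples above, such as vamawoZnwaH.

```python
# Sketch only: covers just the characters tabulated in this section.

# Independent vowels and their WX letters.
VOWELS = dict(zip("अआइईउऊएऐओऔऋॠऌ", "aAiIuUeEoOqQL"))

# Dependent vowel signs (matras), written as escapes to avoid
# ambiguity with combining characters in source code.
MATRAS = {
    "\u093e": "A", "\u093f": "i", "\u0940": "I", "\u0941": "u",
    "\u0942": "U", "\u0943": "q", "\u0944": "Q", "\u0962": "L",
    "\u0947": "e", "\u0948": "E", "\u094b": "o", "\u094c": "O",
}

# Consonants in the order of the tables: velar, palatal, retroflex,
# dental, labial, semi-vowel, fricative.
CONSONANTS = dict(zip(
    "कखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसह",
    "kKgGfcCjJFtTdDNwWxXnpPbBmyrlvSRsh",
))

# Anusvara, visarga, anunasika (candrabindu) and avagraha.
SIGNS = {"\u0902": "M", "\u0903": "H", "\u0901": "z", "\u093d": "Z"}

VIRAMA = "\u094d"  # suppresses the inherent vowel 'a'


def devanagari_to_wx(text):
    """Convert Devanagari text to WX (assumed inherent-vowel handling)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in MATRAS:
                # An explicit vowel sign replaces the inherent 'a'.
                out.append(MATRAS[nxt])
                i += 1
            elif nxt == VIRAMA:
                # Bare consonant: no vowel is emitted.
                i += 1
            else:
                out.append("a")
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        elif ch in SIGNS:
            out.append(SIGNS[ch])
        else:
            # Anything outside the tables is passed through unchanged.
            out.append(ch)
        i += 1
    return "".join(out)
```

Applied to the avagraha example given earlier, devanagari_to_wx("वमतोऽन्तः") yields vamawoZnwaH. Because every table entry maps to exactly one Roman letter, no lookahead is needed on the WX side, which is the practical benefit of the prefix-code property.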
This scheme was further extended to represent all the Indian scripts derived from Brahmi. To account for characters of other Indian languages that are missing in Devanagari, three operators are used: 'Y' to get the next ISCII character, 'V' to get the previous ISCII character, and 'Z' to add the nuqta. For example, 'l' represents ल (U+0932) of Devanagari, and 'lY' represents ळ (U+0933) in Marathi. 'e' represents ए (U+090F) of Devanagari or ఏ (U+0C0F) of Telugu, and 'eV' represents ऎ (U+090E) of Devanagari or ఎ (U+0C0E) of Telugu. Similarly, 'ka' represents क of Devanagari, and 'kZa' represents क़.
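The three operators can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than a definitive implementation: the function name apply_operator is hypothetical, and neighbouring Unicode code points are used as a stand-in for "next" and "previous" ISCII characters. That substitution reproduces the examples given above, but it is not guaranteed to match ISCII ordering for every character.

```python
NUQTA = "\u093c"  # Devanagari combining nuqta sign


def apply_operator(base_char, op):
    """Apply a WX extension operator to an already-decoded base character.

    Assumption: Unicode code point order approximates ISCII order here.
    """
    if op == "Y":
        # Next character, e.g. U+0932 -> U+0933.
        return chr(ord(base_char) + 1)
    if op == "V":
        # Previous character, e.g. U+090F -> U+090E.
        return chr(ord(base_char) - 1)
    if op == "Z":
        # Append the nuqta to the base character.
        return base_char + NUQTA
    raise ValueError(f"unknown operator: {op!r}")


# The examples from the text, written as code points.
marathi_lla = apply_operator("\u0932", "Y")      # 'lY'
devanagari_short_e = apply_operator("\u090f", "V")  # 'eV' in Devanagari
telugu_short_e = apply_operator("\u0c0f", "V")      # 'eV' in Telugu
qa = apply_operator("\u0915", "Z")               # 'kZ'
```

In this sketch the 'Z' operator produces the base letter followed by the combining nuqta (U+0915 U+093C for 'kZ'), which is canonically equivalent to the precomposed letter क़ (U+0958).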