
Java API - Simple Encoding


Simple encoding is an easy-to-read encoding scheme that shows the name of each letter in Arabic text, together with the names of any attached diacritics. The toSimpleEncoding() method converts any ArabicCharacter to simple encoding. The output is the name of the character, followed by the names of any attached diacritics, separated by a + (plus) sign. An example of a single character converted to simple encoding is shown below:

Ya + Shadda + Fatha
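
The format of a single encoded character can be illustrated with a small standalone sketch. It does not call the library: the class SimpleCharacterSketch and its encodeCharacter helper are hypothetical names introduced here only to show how a letter name and its diacritic names are joined, and the real toSimpleEncoding() may be implemented differently.

```java
import java.util.List;

// Illustration only: a hypothetical helper that reproduces the output format
// described above. It is not part of the documented Java API.
public class SimpleCharacterSketch {

    // Joins the letter name and each diacritic name with " + ".
    public static String encodeCharacter(String letter, List<String> diacritics) {
        StringBuilder text = new StringBuilder(letter);
        for (String diacritic : diacritics) {
            text.append(" + ").append(diacritic);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Prints: Ya + Shadda + Fatha
        System.out.println(encodeCharacter("Ya", List.of("Shadda", "Fatha")));
    }
}
```

A letter with no diacritics is written as its name alone, with no plus sign.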

The ArabicText class also supports the toSimpleEncoding() method. In this case, a sequence of characters is converted to simple encoding. Characters are separated by a | sign (vertical bar), and diacritics within a character are separated by a + (plus) sign. The simple encoding below represents a single token with 5 characters:

Ya + Fatha | HHa + Sukun | Ya + Fatha | AlifMaksura | AlifKhanjareeya

Note that in the case of alif khanjarīya, only the diacritic name is displayed, for readability. The character itself is actually encoded as an Alif character type together with an AlifKhanjareeya diacritic type.
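
The token-level format, including the alif khanjarīya display rule, can be sketched in the same standalone way. Everything here is hypothetical: SimpleTokenSketch, its nested Ch class, and both helper methods are illustration-only names, not library types. The sketch also assumes the shortened form applies only when AlifKhanjareeya is the sole diacritic on an Alif, which the text above does not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical types that mimic the described output format.
// This is not the library's implementation of toSimpleEncoding().
public class SimpleTokenSketch {

    // Hypothetical stand-in for a character: a letter name plus diacritic names.
    public static final class Ch {
        final String letter;
        final List<String> diacritics;

        public Ch(String letter, String... diacritics) {
            this.letter = letter;
            this.diacritics = List.of(diacritics);
        }
    }

    public static String encodeCharacter(Ch c) {
        // Assumption: an Alif whose only diacritic is AlifKhanjareeya is shown
        // by the diacritic name alone, as in the example token.
        if (c.letter.equals("Alif") && c.diacritics.equals(List.of("AlifKhanjareeya"))) {
            return "AlifKhanjareeya";
        }
        List<String> parts = new ArrayList<>();
        parts.add(c.letter);
        parts.addAll(c.diacritics);
        return String.join(" + ", parts);
    }

    // Characters are joined with " | ".
    public static String encodeToken(List<Ch> characters) {
        List<String> encoded = new ArrayList<>();
        for (Ch c : characters) {
            encoded.add(encodeCharacter(c));
        }
        return String.join(" | ", encoded);
    }

    public static void main(String[] args) {
        List<Ch> token = List.of(
                new Ch("Ya", "Fatha"),
                new Ch("HHa", "Sukun"),
                new Ch("Ya", "Fatha"),
                new Ch("AlifMaksura"),
                new Ch("Alif", "AlifKhanjareeya"));
        // Prints: Ya + Fatha | HHa + Sukun | Ya + Fatha | AlifMaksura | AlifKhanjareeya
        System.out.println(encodeToken(token));
    }
}
```

Because the last character is stored as Alif plus a diacritic, code that reads simple encoding back into characters would need to treat a bare AlifKhanjareeya as that pair rather than as a letter name.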

Language Research Group
University of Leeds