Simple encoding is an easy-to-read encoding scheme that shows the name of each letter in Arabic text, together with the names of any attached diacritics. The toSimpleEncoding() method converts any ArabicCharacter to simple encoding. The output is the name of the character, followed by the names of any attached diacritics, separated by a + (plus) sign. An example of a single character converted to simple encoding is shown below:
Ya + Shadda + Fatha
The ArabicText class also supports the toSimpleEncoding() method. In this case, a sequence of characters is converted to simple encoding. Characters are separated by a | (vertical bar) sign, and the diacritics within a character are separated by a + (plus) sign. The simple encoding below represents a single token with 5 characters:
Ya + Fatha | HHa + Sukun | Ya + Fatha | AlifMaksura | AlifKhanjareeya
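The two separator levels can be illustrated with a minimal sketch. This is not the library's implementation: the class SimpleEncodingSketch and its methods encodeCharacter and encodeText are hypothetical names introduced here, and letters and diacritics are passed as plain strings rather than as ArabicCharacter or ArabicText objects. The sketch rebuilds the first four characters of the token above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the simple encoding layout.
// It does not use or reproduce the library's ArabicCharacter/ArabicText classes.
public class SimpleEncodingSketch {

    // Character level: the letter name, then each diacritic name, joined by " + ".
    public static String encodeCharacter(String letterName, String... diacritics) {
        List<String> parts = new ArrayList<>();
        parts.add(letterName);
        parts.addAll(Arrays.asList(diacritics));
        return String.join(" + ", parts);
    }

    // Text level: already-encoded characters joined by " | ".
    public static String encodeText(String... encodedCharacters) {
        return String.join(" | ", encodedCharacters);
    }

    public static void main(String[] args) {
        String token = encodeText(
                encodeCharacter("Ya", "Fatha"),
                encodeCharacter("HHa", "Sukun"),
                encodeCharacter("Ya", "Fatha"),
                encodeCharacter("AlifMaksura"));
        // Prints: Ya + Fatha | HHa + Sukun | Ya + Fatha | AlifMaksura
        System.out.println(token);
    }
}
```

A letter without diacritics, such as AlifMaksura here, is written as its name alone, with no trailing + sign.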
Note that in the case of alif khanjarīya, only the diacritic name is displayed, for readability. The character itself is actually encoded as an Alif character type together with an AlifKhanjareeya diacritic type.
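The display rule for alif khanjarīya might be sketched as follows. The class AlifKhanjareeyaSketch and its encodeCharacter method are hypothetical and work on plain strings, not on the library's types. The sketch also assumes that the letter name is dropped only when the letter is Alif, and that any other diacritics on the same character are still listed; the text above does not specify either point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the alif khanjareeya display rule.
// Not the library's implementation; names and behaviour are assumptions.
public class AlifKhanjareeyaSketch {

    public static String encodeCharacter(String letterName, String... diacritics) {
        List<String> diacriticList = Arrays.asList(diacritics);

        // Assumption: the rule applies only to an Alif carrying AlifKhanjareeya.
        boolean hideLetterName = "Alif".equals(letterName)
                && diacriticList.contains("AlifKhanjareeya");

        List<String> parts = new ArrayList<>();
        if (!hideLetterName) {
            parts.add(letterName);
        }
        parts.addAll(diacriticList);
        return String.join(" + ", parts);
    }

    public static void main(String[] args) {
        // Stored as Alif + AlifKhanjareeya, displayed as the diacritic name only.
        System.out.println(encodeCharacter("Alif", "AlifKhanjareeya")); // AlifKhanjareeya
        // Other characters keep the usual "name + diacritics" form.
        System.out.println(encodeCharacter("Ya", "Shadda", "Fatha")); // Ya + Shadda + Fatha
    }
}
```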