
Java API - Buckwalter Transliteration


Buckwalter transliteration uses ASCII characters to represent Arabic orthography. Because each ASCII symbol corresponds to exactly one Unicode character, the encoding scheme is reversible. JQuranTree uses a superset of Buckwalter transliteration to enable reversible transliteration of Tanzil XML.

Extended Buckwalter Transliteration

The original encoding scheme includes four non-Arabic characters which are not found in the Quranic text: P (peh), J (tcheh), V (veh) and G (gaf). The combination character alif + maddah (|) is also not used in Tanzil XML. The JQuranTree Buckwalter encoder does not implement these characters.

Likewise, 14 Quranic symbols do not feature in the original scheme. In the extended scheme used by JQuranTree, these have been assigned to ASCII punctuation marks. This is not ambiguous, since modern punctuation does not occur in the Quran:

- Maddah (^)
- HamzaAbove (#)
- SmallHighSeen (:)
- SmallHighRoundedZero (@)
- SmallHighUprightRectangularZero (")
- SmallHighMeemIsolatedForm ([)
- SmallLowSeen (;)
- SmallWaw (,)
- SmallYa (.)
- SmallHighNoon (!)
- EmptyCentreLowStop (-)
- EmptyCentreHighStop (+)
- RoundedHighStopWithFilledCentre (%)
- SmallLowMeem (])
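
The 14 assignments above can be written down as a small lookup table. The following is a minimal sketch, not JQuranTree code: the class name `ExtendedSymbols` and its two helper methods are hypothetical, and the code points are taken from Fig 1. It checks that every extended symbol is an ASCII punctuation mark and that no two symbols share a code point.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the JQuranTree API.
final class ExtendedSymbols {

    // ASCII mark -> Unicode code point, in the order of the list above.
    static final Map<Character, Integer> MARKS = new LinkedHashMap<>();

    static {
        MARKS.put('^', 0x0653); // Maddah
        MARKS.put('#', 0x0654); // HamzaAbove
        MARKS.put(':', 0x06DC); // SmallHighSeen
        MARKS.put('@', 0x06DF); // SmallHighRoundedZero
        MARKS.put('"', 0x06E0); // SmallHighUprightRectangularZero
        MARKS.put('[', 0x06E2); // SmallHighMeemIsolatedForm
        MARKS.put(';', 0x06E3); // SmallLowSeen
        MARKS.put(',', 0x06E5); // SmallWaw
        MARKS.put('.', 0x06E6); // SmallYa
        MARKS.put('!', 0x06E8); // SmallHighNoon
        MARKS.put('-', 0x06EA); // EmptyCentreLowStop
        MARKS.put('+', 0x06EB); // EmptyCentreHighStop
        MARKS.put('%', 0x06EC); // RoundedHighStopWithFilledCentre
        MARKS.put(']', 0x06ED); // SmallLowMeem
    }

    // True if every key is a printable ASCII character that is neither a letter nor a digit.
    static boolean allAsciiPunctuation() {
        for (char c : MARKS.keySet()) {
            if (c <= ' ' || c > '~' || Character.isLetterOrDigit(c)) {
                return false;
            }
        }
        return true;
    }

    // True if no two marks map to the same code point, so the table can be inverted.
    static boolean isInvertible() {
        return new HashSet<>(MARKS.values()).size() == MARKS.size();
    }
}
```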

The extended Buckwalter transliteration scheme is shown in Fig 1 below. Sections highlighted in yellow indicate those parts of the scheme which have been extended over the original:

UNICODE (Decimal, Hex) | BUCKWALTER (ASCII, Orthography)
Decimal Hex ASCII Orthography
1569 U+0621 ' Hamza
1571 U+0623 > Alif + HamzaAbove
1572 U+0624 & Waw + HamzaAbove
1573 U+0625 < Alif + HamzaBelow
1574 U+0626 } Ya + HamzaAbove
1575 U+0627 A Alif
1576 U+0628 b Ba
1577 U+0629 p TaMarbuta
1578 U+062A t Ta
1579 U+062B v Tha
1580 U+062C j Jeem
1581 U+062D H HHa
1582 U+062E x Kha
1583 U+062F d Dal
1584 U+0630 * Thal
1585 U+0631 r Ra
1586 U+0632 z Zain
1587 U+0633 s Seen
1588 U+0634 $ Sheen
1589 U+0635 S Sad
1590 U+0636 D DDad
1591 U+0637 T TTa
1592 U+0638 Z DTha
1593 U+0639 E Ain
1594 U+063A g Ghain
1600 U+0640 _ Tatweel
1601 U+0641 f Fa
1602 U+0642 q Qaf
1603 U+0643 k Kaf
1604 U+0644 l Lam
1605 U+0645 m Meem
1606 U+0646 n Noon
1607 U+0647 h Ha
1608 U+0648 w Waw
1609 U+0649 Y AlifMaksura
1610 U+064A y Ya
1611 U+064B F Fathatan
1612 U+064C N Dammatan
1613 U+064D K Kasratan
1614 U+064E a Fatha
1615 U+064F u Damma
1616 U+0650 i Kasra
1617 U+0651 ~ Shadda
1618 U+0652 o Sukun
1619 U+0653 ^ Maddah
1620 U+0654 # HamzaAbove
1648 U+0670 ` AlifKhanjareeya
1649 U+0671 { Alif + HamzatWasl
1756 U+06DC : SmallHighSeen
1759 U+06DF @ SmallHighRoundedZero
1760 U+06E0 " SmallHighUprightRectangularZero
1762 U+06E2 [ SmallHighMeemIsolatedForm
1763 U+06E3 ; SmallLowSeen
1765 U+06E5 , SmallWaw
1766 U+06E6 . SmallYa
1768 U+06E8 ! SmallHighNoon
1770 U+06EA - EmptyCentreLowStop
1771 U+06EB + EmptyCentreHighStop
1772 U+06EC % RoundedHighStopWithFilledCentre
1773 U+06ED ] SmallLowMeem

Fig 1. Extended Buckwalter transliteration.
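
To illustrate how the table in Fig 1 makes the transliteration reversible, here is a minimal sketch of an encoder and decoder built directly from its rows. It is not the JQuranTree implementation: the class `BuckwalterSketch` and its methods `encode` and `decode` are hypothetical names, and passing spaces through unchanged is an assumption made so that multi-word strings can be handled. Any other character missing from the table, such as P or |, is rejected.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch for illustration; not the JQuranTree encoder.
final class BuckwalterSketch {

    // Unicode code points from Fig 1, in table order.
    private static final int[] CODE_POINTS = {
        0x0621, 0x0623, 0x0624, 0x0625, 0x0626, 0x0627, 0x0628, 0x0629,
        0x062A, 0x062B, 0x062C, 0x062D, 0x062E, 0x062F, 0x0630, 0x0631,
        0x0632, 0x0633, 0x0634, 0x0635, 0x0636, 0x0637, 0x0638, 0x0639,
        0x063A, 0x0640, 0x0641, 0x0642, 0x0643, 0x0644, 0x0645, 0x0646,
        0x0647, 0x0648, 0x0649, 0x064A, 0x064B, 0x064C, 0x064D, 0x064E,
        0x064F, 0x0650, 0x0651, 0x0652, 0x0653, 0x0654, 0x0670, 0x0671,
        0x06DC, 0x06DF, 0x06E0, 0x06E2, 0x06E3, 0x06E5, 0x06E6, 0x06E8,
        0x06EA, 0x06EB, 0x06EC, 0x06ED
    };

    // ASCII symbols from Fig 1, in the same order, eight per chunk.
    private static final String ASCII =
        "'>&<}Abp" + "tvjHxd*r" + "zs$SDTZE" + "g_fqklmn"
        + "hwYyFNKa" + "ui~o^#`{" + ":@\"[;,.!" + "-+%]";

    private static final Map<Character, Character> TO_ASCII = new HashMap<>();
    private static final Map<Character, Character> TO_ARABIC = new HashMap<>();

    static {
        if (ASCII.length() != CODE_POINTS.length) {
            throw new IllegalStateException("table columns differ in length");
        }
        for (int i = 0; i < CODE_POINTS.length; i++) {
            char arabic = (char) CODE_POINTS[i]; // all entries are in the BMP
            char ascii = ASCII.charAt(i);
            // A repeated key in either direction would break reversibility.
            if (TO_ASCII.put(arabic, ascii) != null || TO_ARABIC.put(ascii, arabic) != null) {
                throw new IllegalStateException("mapping is not one-to-one");
            }
        }
    }

    // Arabic text -> extended Buckwalter.
    static String encode(String arabic) {
        return translate(arabic, TO_ASCII);
    }

    // Extended Buckwalter -> Arabic text.
    static String decode(String buckwalter) {
        return translate(buckwalter, TO_ARABIC);
    }

    private static String translate(String input, Map<Character, Character> table) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == ' ') {
                out.append(c); // assumption: spaces separate words and are kept as-is
                continue;
            }
            Character mapped = table.get(c);
            if (mapped == null) {
                throw new IllegalArgumentException(
                    "No mapping for U+" + String.format("%04X", (int) c));
            }
            out.append(mapped);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Ba, Kasra, Seen, Sukun, Meem, Kasra
        String arabic = "\u0628\u0650\u0633\u0652\u0645\u0650";
        System.out.println(encode(arabic)); // prints bisomi
        System.out.println(decode(encode(arabic)).equals(arabic)); // prints true
    }
}
```

Because the static initializer rejects any repeated key in either direction, decoding an encoded string always returns the original text for characters covered by the table.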


Language Research Group
University of Leeds