java.lang.Object org.jqurantree.arabic.encoding.ArabicEncoderBase
public abstract class ArabicEncoderBase
ArabicEncoderBase is an abstract base class providing a common implementation for ArabicText encoders. The class supports the ArabicText.toString(EncodingType) method by implementing table-driven encoding. An EncodingTableBase instance is used to look up the mapping for each character in the source text.
The following encoding algorithm is reversible, so round-trip testing is possible. For each ArabicCharacter:
Step 1. If the letter or Quranic symbol has a diacritic that forms a well-known combination, then map the pair onto a single output character. If Hamza above was the diacritic used, then remove it from the list of diacritics to consider. The 6 well-known combinations are (the first entry covers three letters):
- Alif/Waw/Ya + Hamza above
- Alif + Hamza below
- Alif + Hamzat wasl
- Alif + Khanjareeya (superscript Alif)
Step 2. If Step 1 did not apply, then use the EncodingTableBase instance to determine the output character to use for the letter or Quranic symbol, without its diacritics.
Step 3. Use the encoding table to form output characters out of any remaining diacritics, in the following order:
- Hamza above
- Shadda
- Fathatan
- Dammatan
- Kasratan
- Fatha
- Damma
- Kasra
- Sukun
- Maddah
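The three steps above can be sketched in Java as follows. This is a minimal illustration, not the library's implementation: the class EncodingSketch, its lookup maps, and the output characters are all hypothetical stand-ins for an EncodingTableBase instance, and Step 1 models only the Hamza above combinations.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the three-step encoding; not part of the library.
final class EncodingSketch {

    // Declared in the Step 3 output order; an EnumSet iterates in declaration order.
    public enum Diacritic {
        HAMZA_ABOVE, SHADDA, FATHATAN, DAMMATAN, KASRATAN,
        FATHA, DAMMA, KASRA, SUKUN, MADDAH
    }

    // Placeholder tables standing in for an EncodingTableBase instance (assumed values).
    static final Map<String, String> LETTERS =
            Map.of("Alif", "A", "Waw", "w", "Ya", "y", "Ba", "b");
    static final Map<String, String> WITH_HAMZA_ABOVE =
            Map.of("Alif", ">", "Waw", "&", "Ya", "}");
    static final Map<Diacritic, String> MARKS = new EnumMap<>(Diacritic.class);
    static {
        MARKS.put(Diacritic.HAMZA_ABOVE, "#");
        MARKS.put(Diacritic.SHADDA, "~");
        MARKS.put(Diacritic.FATHATAN, "F");
        MARKS.put(Diacritic.DAMMATAN, "N");
        MARKS.put(Diacritic.KASRATAN, "K");
        MARKS.put(Diacritic.FATHA, "a");
        MARKS.put(Diacritic.DAMMA, "u");
        MARKS.put(Diacritic.KASRA, "i");
        MARKS.put(Diacritic.SUKUN, "o");
        MARKS.put(Diacritic.MADDAH, "^");
    }

    public static String encodeCharacter(String letter, Set<Diacritic> diacritics) {
        if (!LETTERS.containsKey(letter)) {
            throw new IllegalArgumentException("No mapping for letter: " + letter);
        }
        StringBuilder text = new StringBuilder();
        EnumSet<Diacritic> remaining = EnumSet.noneOf(Diacritic.class);
        remaining.addAll(diacritics);

        String combined = WITH_HAMZA_ABOVE.get(letter);
        if (combined != null && remaining.remove(Diacritic.HAMZA_ABOVE)) {
            // Step 1: letter + Hamza above collapses to one output character,
            // and Hamza above is dropped from the diacritics still to write.
            text.append(combined);
        } else {
            // Step 2: plain table lookup for the letter without its diacritics.
            text.append(LETTERS.get(letter));
        }
        // Step 3: remaining diacritics, in the fixed order given by the enum.
        for (Diacritic d : remaining) {
            text.append(MARKS.get(d));
        }
        return text.toString();
    }
}
```

With these placeholder tables, `encodeCharacter("Alif", EnumSet.of(Diacritic.HAMZA_ABOVE, Diacritic.FATHA))` returns `">a"`, while Ba with Fatha and Shadda yields `"b~a"` because Shadda precedes Fatha in the Step 3 order. Fixing the output order is what allows a decoder to reverse the mapping unambiguously.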
Field Summary | |
---|---|
protected java.lang.StringBuilder | text - A string buffer used to hold the encoder's plain text output. |
Constructor Summary | |
---|---|
protected | ArabicEncoderBase() - Creates a new encoder. |
protected | ArabicEncoderBase(EncodingTableBase encodingTable) - Creates a new encoder using the specified encoding table. |
Method Summary | |
---|---|
java.lang.String | encode(byte[] buffer, int offset, int characterCount, EncodingOptions options) - Encodes the internal ByteFormat into plain text according to the encoding scheme. |
protected void | encodeCharacter(byte[] buffer, int offset) - Encodes a single ArabicCharacter in the internal ByteFormat. |
protected void | writeCharacterSeperator() - Overridden by derived encoders to write a separator between each ArabicCharacter. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected final java.lang.StringBuilder text
A string buffer used to hold the encoder's plain text output.
Constructor Detail |
---|
protected ArabicEncoderBase()
Creates a new encoder.
protected ArabicEncoderBase(EncodingTableBase encodingTable)
Creates a new encoder using the specified encoding table.
Parameters:
encodingTable - the encoding table to use when performing table-driven encoding.
Method Detail |
---|
public java.lang.String encode(byte[] buffer, int offset, int characterCount, EncodingOptions options)
Description copied from interface: ArabicEncoder
Encodes the internal ByteFormat into plain text according to the encoding scheme.
Specified by:
encode in interface ArabicEncoder
Parameters:
buffer - the byte[] array to encode in the internal ByteFormat
offset - the starting offset in the buffer
characterCount - the number of characters to encode. Each character is represented by 3 bytes in the buffer.
Returns:
the encoded text as a string
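Because each character occupies 3 bytes, the position of every character in the buffer follows from simple arithmetic. The sketch below shows one way to compute and bounds-check those positions. It assumes that offset is measured in bytes, which the description above does not state explicitly; the class ByteFormatWalk and its method are hypothetical and not part of the library.

```java
// Hypothetical helper; illustrates the 3-bytes-per-character layout only.
final class ByteFormatWalk {

    // Stated in the parameter description: 3 bytes per character.
    static final int BYTES_PER_CHARACTER = 3;

    // Returns the byte index at which each of the characterCount characters starts.
    // Assumes offset is a byte offset (an assumption, not confirmed by the source).
    public static int[] characterOffsets(byte[] buffer, int offset, int characterCount) {
        if (offset < 0 || characterCount < 0) {
            throw new IndexOutOfBoundsException("offset and characterCount must be non-negative");
        }
        // Use long arithmetic so the bounds check itself cannot overflow.
        long end = (long) offset + (long) characterCount * BYTES_PER_CHARACTER;
        if (end > buffer.length) {
            throw new IndexOutOfBoundsException("range ends at " + end
                    + " but the buffer holds " + buffer.length + " bytes");
        }
        int[] offsets = new int[characterCount];
        for (int i = 0; i < characterCount; i++) {
            offsets[i] = offset + i * BYTES_PER_CHARACTER;
        }
        return offsets;
    }
}
```

For a 12-byte buffer, `characterOffsets(buffer, 3, 2)` yields the start positions 3 and 6, and asking for four characters from offset 3 is rejected because the range would end at byte 15.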
protected void writeCharacterSeperator()
Overridden by derived encoders to write a separator between each ArabicCharacter.
protected void encodeCharacter(byte[] buffer, int offset)
Encodes a single ArabicCharacter in the internal ByteFormat.
Parameters:
buffer - the byte[] buffer holding the character
offset - the offset of the character within the buffer
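The three methods documented above suggest a template-method arrangement, in which encode walks the buffer and delegates to the two protected hooks. The following self-contained sketch shows how a derived encoder might supply a separator. The classes SketchEncoderBase and SpacedSketchEncoder are hypothetical stand-ins, the options parameter is omitted, the default separator is assumed to write nothing, and the stand-in encodeCharacter simply echoes the first byte of each character rather than performing table-driven encoding.

```java
// Hypothetical stand-in for the base class; not the library's implementation.
abstract class SketchEncoderBase {

    // Mirrors the documented protected field holding the plain text output.
    protected final StringBuilder text = new StringBuilder();

    public String encode(byte[] buffer, int offset, int characterCount) {
        text.setLength(0); // start from an empty output buffer
        for (int i = 0; i < characterCount; i++) {
            if (i > 0) {
                writeCharacterSeperator(); // hook: only between characters
            }
            encodeCharacter(buffer, offset + i * 3); // 3 bytes per character
        }
        return text.toString();
    }

    // Stand-in only: echoes the first byte of the character as a char.
    protected void encodeCharacter(byte[] buffer, int offset) {
        text.append((char) buffer[offset]);
    }

    // Assumed default: no separator unless a derived encoder overrides this.
    protected void writeCharacterSeperator() {
    }
}

// A derived encoder that separates characters with a single space.
final class SpacedSketchEncoder extends SketchEncoderBase {
    @Override
    protected void writeCharacterSeperator() {
        text.append(' ');
    }
}
```

Encoding the six bytes `{'A', 0, 0, 'b', 0, 0}` as two characters with SpacedSketchEncoder returns `"A b"`. Keeping the separator in its own hook lets a derived encoder change the output layout without touching the per-character logic.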