Java API - Analysis Table Example

This example demonstrates how to use an analysis table. The program finds the 5 longest verses in the Quranic text, measuring length by the number of tokens in each verse. The Java program below searches the orthography model, using an analysis table to collect the results.

Step 1 creates a new, empty table by specifying a list of column names. In step 2, one row is added to the table for each verse. In step 3, the table is sorted by the TokenCount column in descending order, and the top 5 rows are displayed. A frequency table is then constructed by grouping the original table by the TokenCount column and sorting the groups by the Count column in descending order. The resulting frequency table shows the most common verse lengths, measured in tokens, in the Quranic text.

Java Example

public class AnalysisTableExample {

    public static void main(String[] args) {

        // -----------------
        // Tabulate and sort
        // -----------------

        // Step 1. Create a new analysis table.
        AnalysisTable table = new AnalysisTable(
            "ChapterNumber", "VerseNumber", "TokenCount");

        // Step 2. Tabulate the number of tokens in each verse.
        for (Verse verse : Document.getVerses()) {
            table.add(
                verse.getChapterNumber(),
                verse.getVerseNumber(),
                verse.getTokenCount());
        }

        // Step 3. Sort the table, then display the first 5 rows.
        table.sort("TokenCount", SortOrder.Descending);
        System.out.println(table.toString(5));

        // -------------
        // Group results
        // -------------

        // Group the token count table by number of tokens.
        AnalysisTable groupTable = table.group("TokenCount");

        // Sort the group table by the Count column, in descending order.
        groupTable.sort("Count", SortOrder.Descending);

        // Display the first 5 rows of the group table.
        System.out.println(groupTable.toString(5));
    }
}

Program Output

ChapterNumber VerseNumber TokenCount
------------- ----------- ----------
2             282         128
4             12          88
24            31          78
73            20          78
24            61          76
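
In the table above, two verses tie at 78 tokens. This page does not say how AnalysisTable.sort orders tied rows, so the following is only a minimal sketch in plain Java, independent of the API, of a descending sort with an explicit tie-break on chapter and verse number. The class name TopVersesSketch, the method topN, and the int[] row layout are hypothetical; the five rows are copied from the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not part of the Java API described on this page.
public class TopVersesSketch {

    // Each row is {chapterNumber, verseNumber, tokenCount}.
    // Returns the n rows with the highest token count. Ties are broken
    // by chapter number, then verse number, both ascending.
    static List<int[]> topN(int[][] rows, int n) {
        List<int[]> sorted = new ArrayList<>(Arrays.asList(rows));
        sorted.sort(Comparator
            .comparingInt((int[] row) -> row[2]).reversed()
            .thenComparingInt(row -> row[0])
            .thenComparingInt(row -> row[1]));
        return sorted.subList(0, Math.min(n, sorted.size()));
    }

    public static void main(String[] args) {
        // The five rows from the program output, in scrambled order.
        int[][] rows = {
            {24, 61, 76},
            {73, 20, 78},
            {2, 282, 128},
            {24, 31, 78},
            {4, 12, 88}
        };
        for (int[] row : topN(rows, 5)) {
            System.out.println(row[0] + ":" + row[1] + " " + row[2]);
        }
    }
}
```

With the tie-break in place, the order of the two 78-token verses no longer depends on the order in which rows were added.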

TokenCount Count
---------- -----
4          530
5          419
3          402
6          358
11         345
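
To make the grouping step concrete, here is a minimal sketch in plain Java of what a frequency table computes: for each distinct token count, the number of verses that have it, ordered by that count in descending order. It does not use the AnalysisTable API and is not a description of how table.group is implemented. The class name FrequencyTableSketch, the method groupByTokenCount, and the sample token counts are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not part of the Java API described on this page.
public class FrequencyTableSketch {

    // Counts how many verses share each token count, then orders the
    // (tokenCount, count) pairs by count, largest first. Pairs with equal
    // counts stay in ascending tokenCount order because List.sort is stable.
    static List<Map.Entry<Integer, Long>> groupByTokenCount(int[] tokenCounts) {
        Map<Integer, Long> counts = new TreeMap<>();
        for (int tokenCount : tokenCounts) {
            counts.merge(tokenCount, 1L, Long::sum);
        }
        List<Map.Entry<Integer, Long>> rows = new ArrayList<>(counts.entrySet());
        rows.sort(Map.Entry.<Integer, Long>comparingByValue().reversed());
        return rows;
    }

    public static void main(String[] args) {
        // Made-up token counts for seven verses, not taken from the corpus.
        int[] tokenCounts = {4, 5, 4, 3, 4, 5, 11};
        System.out.println("TokenCount Count");
        for (Map.Entry<Integer, Long> row : groupByTokenCount(tokenCounts)) {
            System.out.println(row.getKey() + " " + row.getValue());
        }
    }
}
```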

Discussion

A discussion of this example can be found at the page on the analysis table.

Language Research Group
University of Leeds