ALWFA 1.0: Internet Edition
[May 2, 2008] The name ALFAS has been changed to ALWFA (Arabic Letter & Word Frequency Analyzer). [April 26, 2008] This is the first post of this article. Please help make it better with your feedback. This date-stamped line will remain here until the article matures.
ALWFA: Arabic Letter & Word Frequency Analyzer
ALWFA lets you paste any Arabic text into a text area and computes its letter frequency distribution. At this time, ALWFA counts only Arabic letters and skips everything else: English letters, numbers, punctuation characters, and diacritics (tashkeel or harakaat). Further, it accepts input only from your computer's paste sequence (typically Control+V) or from a drag-and-drop operation. You can resize ALWFA by centering the GUI vertically in your browser using the scrollbar and then resizing the browser. ALWFA comes in three flavors: i) ALWFA 1.0, this Internet Edition; ii) ALWFA 2.0, which computes word frequency in addition to letter frequency; and iii) ALWFA 3.0. Visit ALFA for information on the history of the Arabic alphabet and interesting statistics on its letters.
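As a rough illustration of the counting rule described above, the following Python sketch tallies only Arabic letters and skips everything else. It is not ALWFA's actual code: the function names and the choice of letter range (U+0621 to U+063A and U+0641 to U+064A, which leaves out the tatweel and the tashkeel marks) are assumptions made for this example, and each hamza-carrying form is counted as a letter of its own.

```python
from collections import Counter


def is_arabic_letter(ch: str) -> bool:
    """Assumed rule: basic Arabic letters only (no tatweel, digits, or tashkeel)."""
    return "\u0621" <= ch <= "\u063A" or "\u0641" <= ch <= "\u064A"


def letter_frequency(text: str) -> Counter:
    """Count each Arabic letter in text, ignoring every other character."""
    return Counter(ch for ch in text if is_arabic_letter(ch))


if __name__ == "__main__":
    # Mixed input: Arabic with tashkeel, plus Latin letters, digits, punctuation.
    sample = "\u0628\u0650\u0633\u0652\u0645\u0650 \u0627\u0644\u0644\u0651\u0647\u0650, abc 123"
    freq = letter_frequency(sample)
    total = sum(freq.values())
    for letter, count in freq.most_common():
        print(f"{letter}\t{count}\t{count / total:.1%}")
```

Filtering before counting keeps the tally independent of whatever else is pasted along with the Arabic text.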
While you are at it, test the accuracy of ALWFA as follows. Take some words with tashkeel, for example this expression: غُفْرَانَكَ رَبَّنَا (if the font is too small for you in Firefox, press Control and "+" together for a bigger font, and Control and "-" for a smaller one). How many letters do you see? Of course, you would count without the tashkeel. Now paste the same two words into an MS Office Word document and run the word statistics on them ... get it? Many more than you expected, right? Conclusion? Mine is better ;-). Visit ALFA to find out exactly how ALWFA consumes and processes text.
The GUI is self-explanatory and you should be able to jump right into pasting and analyzing text. Just in case, however, a description of how to use it appears below ALWFA.
<If you do not see an applet running below this line, it is most probably because you are using IE. Please browse using Firefox or Safari.>
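To see why the two counts differ, one can compare the number of characters in the expression with the number left once the tashkeel is dropped. The sketch below is only an illustration under assumptions: it detects tashkeel through Python's `unicodedata.combining`, which is nonzero for combining marks, and it does not describe how ALWFA or MS Word count internally. The function name is hypothetical.

```python
import unicodedata


def count_with_and_without_tashkeel(text: str) -> tuple:
    """Return (non-space characters, non-space characters that are not combining marks)."""
    chars = [ch for ch in text if not ch.isspace()]
    letters = [ch for ch in chars if unicodedata.combining(ch) == 0]
    return len(chars), len(letters)


# The expression from the text, spelled with escapes so every mark is visible.
expression = (
    "\u063A\u064F\u0641\u0652\u0631\u064E\u0627\u0646\u064E\u0643\u064E"  # first word
    " "
    "\u0631\u064E\u0628\u0651\u064E\u0646\u064E\u0627"  # second word
)

if __name__ == "__main__":
    print(count_with_and_without_tashkeel(expression))  # (19, 10)
```

A character count that includes the marks reports 19 for this spelling, while only 10 of those characters are letters.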
Test Drive
To get a feel for running ALWFA, copy or drag the text below (the last three verses of Surat Al-Baqara (The Cow)) into the beige text area of ALWFA and calculate its frequency. Notice that large input may take several seconds to show up in the text area.
284. لِّلَّهِ ما فِي السَّمَاواتِ وَمَا فِي الأَرْضِ وَإِن تُبْدُواْ مَا فِي أَنفُسِكُمْ أَوْ تُخْفُوهُ يُحَاسِبْكُم بِهِ اللّهُ فَيَغْفِرُ لِمَن يَشَاء وَيُعَذِّبُ مَن يَشَاء وَاللّهُ عَلَى كُلِّ شَيْءٍ قَدِيرٌ
285. آمَنَ الرَّسُولُ بِمَا أُنزِلَ إِلَيْهِ مِن رَّبِّهِ وَالْمُؤْمِنُونَ كُلٌّ آمَنَ بِاللّهِ وَمَلآئِكَتِهِ وَكُتُبِهِ وَرُسُلِهِ لاَ نُفَرِّقُ بَيْنَ أَحَدٍ مِّن رُّسُلِهِ وَقَالُواْ سَمِعْنَا وَأَطَعْنَا غُفْرَانَكَ رَبَّنَا وَإِلَيْكَ الْمَصِيرُ
286. لاَ يُكَلِّفُ اللّهُ نَفْسًا إِلاَّ وُسْعَهَا لَهَا مَا كَسَبَتْ وَعَلَيْهَا مَا اكْتَسَبَتْ رَبَّنَا لاَ تُؤَاخِذْنَا إِن نَّسِينَا أَوْ أَخْطَأْنَا رَبَّنَا وَلاَ تَحْمِلْ عَلَيْنَا إِصْرًا كَمَا حَمَلْتَهُ عَلَى الَّذِينَ مِن قَبْلِنَا رَبَّنَا وَلاَ تُحَمِّلْنَا مَا لاَ طَاقَةَ لَنَا بِهِ وَاعْفُ عَنَّا وَاغْفِرْ لَنَا وَارْحَمْنَآ أَنتَ مَوْلاَنَا فَانصُرْنَا عَلَى الْقَوْمِ الْكَافِرِينَ
ALWFA GUI Dissection
The GUI of ALWFA is composed of three parts; they are described next from top to bottom, right to left:
- Top part: control panel: composed of five items:
- Button Reset: resets the memory and prepares the applet for your next text input.
- Button Calculate Frequency: calculates the frequency of Arabic letters based on the input in the text area shown in beige. The output is shown in two modes: textually in the table, and graphically in the bottom panel.
- Button Sort: sorts the frequency bars shown in the bottom panel. Consecutive presses toggle between two sort modes: i) according to frequency, and ii) according to the letter's Unicode value.
- Label (Words, Letters): shows the count of letters and words in the input text.
- Combo Box Letter/Word Frequency: chooses between displaying frequency analysis for letters or for words. This version analyzes only letter frequency.
- Middle part: Input/Tabular output panel: composed of two items:
- Text Area: accepts text by means of a keyboard paste (Control+V), or by means of drag-and-drop (e.g., dragging some highlighted text from a Word document and dropping it into the text area).
- Table Letter/Word Frequency: lists the letters and their frequencies in sorted order. More functions are available in the Home Edition and the Professional Edition. Note that you can select any combination of cells in this table and paste them into MS Excel, for example. To do that, follow these instructions: i) open an MS Excel file first; ii) select some cells by dragging over them, or click anywhere in the table and press Control+A to select the whole table, then press Control+C to copy the selection; iii) move to the open Excel document, click on some cell, and issue a Control+V command.
- Bottom part: Visual Frequency Analysis Display. This component shows an animated display of the frequency bars, for a little entertainment :-)
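The two sort modes of the Sort button and the copy-to-spreadsheet step can be mimicked outside the applet. The following is a minimal sketch, not ALWFA's implementation: the function names and the sample counts are hypothetical, and it assumes that the frequency sort runs from highest to lowest. Tab-separated rows are used because spreadsheet programs such as MS Excel split pasted text into cells at tabs.

```python
def sort_by_frequency(freq: dict) -> list:
    """Highest count first; ties are broken by code point."""
    return sorted(freq.items(), key=lambda item: (-item[1], item[0]))


def sort_by_unicode(freq: dict) -> list:
    """Order letters by their Unicode code point."""
    return sorted(freq.items(), key=lambda item: item[0])


def to_tsv(rows: list) -> str:
    """One 'letter<TAB>count' line per row, ready to paste into a spreadsheet."""
    return "\n".join(f"{letter}\t{count}" for letter, count in rows)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    counts = {"\u0646": 5, "\u0627": 9, "\u0644": 5}
    print(to_tsv(sort_by_frequency(counts)))
    print(to_tsv(sort_by_unicode(counts)))
```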
You can always reach me by email at mmadi@intellaren.com.