SIRAT an Arabic Text Editor Makes Real-Time Indexing and Based on the Extraction of Keywords

Citation:

Dilekh T, Benharzallah S. SIRAT an Arabic Text Editor Makes Real-Time Indexing and Based on the Extraction of Keywords. ICCSA’2021 : the 2nd International Conference on Computer Science’s Complex Systems and their Applications, May 25–26, [Internet]. 2021.

Abstract:

The indexing stage of the information retrieval process is of great importance because it directly affects recall and precision. Despite the many indexing studies conducted over the last few decades, to our knowledge no study has investigated whether real-time indexing based on keyword extraction improves recall and precision. Moreover, relatively few studies of Arabic text indexing are available, despite the considerable effort devoted to meeting the needs of the growing number of Arabic-speaking internet users. This paper proposes a method for Arabic text indexing based on keyword extraction. The proposed method consists of two stages. The first stage performs real-time indexing. The second stage extracts keywords and updates the initial index to take the output of the keyword extraction process into account. We illustrate the application and performance of this indexing method using an Arabic text editor (SIRAT) designed and developed for this purpose. We also describe the process of building a new form of Arabic corpus suited to the necessary experiments. Our findings show that SIRAT successfully identifies the keywords most relevant to the document. The main contribution of this experiment is to demonstrate the effectiveness of this method compared to other methods. In addition, the paper proposes a solution to issues and deficiencies that Arabic language processing suffers from, especially regarding corpus building and the evaluation of keyword extraction systems.
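
The two stages described in the abstract can be pictured with a minimal Python sketch. It is not SIRAT's implementation: the class name `RealTimeIndex`, the regex tokenizer, the frequency-based keyword ranking, and the `boost` factor are all assumptions made for illustration, since the abstract does not specify how keywords are extracted or how the index is updated.

```python
import re
from collections import Counter, defaultdict

# Assumed tokenizer: runs of Arabic letters (U+0621..U+064A). A real system
# would likely add normalization, stop-word removal, and stemming.
TOKEN = re.compile(r"[\u0621-\u064A]+")


class RealTimeIndex:
    """Hypothetical two-stage index: live updates, then keyword refinement."""

    def __init__(self):
        self.postings = defaultdict(Counter)  # term -> {doc_id: weight}
        self.docs = {}                        # doc_id -> raw term counts

    def update(self, doc_id, text):
        """Stage 1: re-index the document whenever the editor buffer changes."""
        new_counts = Counter(TOKEN.findall(text))
        old_counts = self.docs.get(doc_id, Counter())
        # Drop postings for terms that no longer occur in the document.
        for term in old_counts.keys() - new_counts.keys():
            del self.postings[term][doc_id]
            if not self.postings[term]:
                del self.postings[term]
        for term, count in new_counts.items():
            self.postings[term][doc_id] = count
        self.docs[doc_id] = new_counts

    def extract_keywords(self, doc_id, k=3):
        """Assumed extractor: the k most frequent terms of the document."""
        return [term for term, _ in self.docs[doc_id].most_common(k)]

    def refine(self, doc_id, k=3, boost=2.0):
        """Stage 2: reweight the initial index using the extracted keywords."""
        keywords = self.extract_keywords(doc_id, k)
        for term, count in self.docs[doc_id].items():
            factor = boost if term in keywords else 1.0
            self.postings[term][doc_id] = count * factor
        return keywords


index = RealTimeIndex()
index.update("d1", "كتاب كتاب قلم")   # user is still typing
index.update("d1", "كتاب كتاب كتاب")  # buffer changed; index follows
print(index.refine("d1", k=1))
```

Recomputing the weights from the raw counts in `refine` keeps the second stage idempotent, so it can be rerun after every edit without compounding the boost.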
