Digital Islamic Humanities

The Study on Quranic surahs' Structured-ness and their Order Organization Using NLP Techniques

Document Type: Original Article

Authors
Assistant Professor, Computer Science Department, Faculty of Engineering, Shahed University, Tehran, Iran
Abstract
In recent years, the structure of the Quranic surahs has attracted considerable attention from researchers of Quranic structure. One theory in this area is thematic unity, which holds that each surah of the Quran is built around a single, unified topic. The theory of "commandments and elaboration," one of the most important branches of thematic unity, holds that God presents the subject of each surah in its opening section, elaborates on it in various ways in the following sections, and draws a conclusion in the closing section. In this paper, we examine these two theories using natural language processing techniques. We compute the similarity of Quranic roots with three methods: TF-IDF, Word2Vec word embeddings, and the co-occurrence of roots within verses. We then compute the degree of conceptual similarity of the surahs and compare it with a random baseline. The results show that the surahs of the Quran exhibit semantic coherence, in the sense that the concepts and topics within each surah revolve around a specific subject. Furthermore, examining the similarity between the first and last sections of each surah, and between the first and subsequent sections, indicates that the theory of "commandments and elaboration" holds for many surahs. Finally, an analysis of the similarity between surahs with respect to both their sequential order in the Quran and their chronological order of revelation shows that surahs closer together in the arrangement are also more similar to each other. The Quran as a whole therefore appears to have a structured arrangement, with related surahs grouped into clusters.
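The comparison against a random baseline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Jaccard overlap used as the co-occurrence measure, the function names (`verse_index`, `root_similarity`, `coherence`, `random_baseline`), the number of trials, and the toy corpus of synthetic root labels are all assumptions made for illustration. Coherence is taken here as the mean pairwise similarity of the roots in a surah, and the baseline as the same score averaged over pseudo-surahs built from randomly drawn verses.

```python
import random
from itertools import combinations

def verse_index(corpus):
    """Map each root to the set of (surah, verse_no) ids in which it occurs."""
    index = {}
    for surah, verses in corpus.items():
        for i, verse in enumerate(verses):
            for root in verse:
                index.setdefault(root, set()).add((surah, i))
    return index

def root_similarity(a, b, index):
    """Jaccard overlap of the two roots' verse sets (an assumed co-occurrence measure)."""
    va, vb = index.get(a, set()), index.get(b, set())
    union = va | vb
    return len(va & vb) / len(union) if union else 0.0

def coherence(verses, index):
    """Mean pairwise similarity over the distinct roots of a group of verses."""
    roots = sorted({r for verse in verses for r in verse})
    pairs = list(combinations(roots, 2))
    if not pairs:
        return 0.0
    return sum(root_similarity(a, b, index) for a, b in pairs) / len(pairs)

def random_baseline(corpus, n_verses, index, trials=200, seed=0):
    """Average coherence of pseudo-surahs built from randomly drawn verses."""
    rng = random.Random(seed)
    pool = [v for verses in corpus.values() for v in verses]
    scores = [coherence(rng.sample(pool, n_verses), index) for _ in range(trials)]
    return sum(scores) / len(scores)

# Synthetic toy corpus (not real Quranic data): each "surah" is a list of
# verses, and each verse is a list of placeholder root labels.
corpus = {
    "A": [["r1", "r2"], ["r1", "r2", "r3"], ["r2", "r3"]],
    "B": [["r4", "r5"], ["r4", "r5", "r6"], ["r5", "r6"]],
}
index = verse_index(corpus)
for name, verses in corpus.items():
    real = coherence(verses, index)
    base = random_baseline(corpus, len(verses), index)
    print(name, round(real, 3), round(base, 3))
```

A surah whose score is clearly above the baseline would count as coherent under this reading. The same comparison could be run with TF-IDF or Word2Vec similarities by swapping out `root_similarity`.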