- Overview
- Notable Features
- Installation and Setup for Windows users
- Installation and Setup for Mac/Linux Users
The Arabic Syntactic Analyzer (ARSA) is an open-source Natural Language Processing (NLP) tool for analyzing syntactic features in written Arabic texts. It is written in Python and uses the camel_parser library to identify and measure 13 distinct syntactic indices: 9 syntactic complexity indices and 4 syntactic fluency indices.
The ARSA tool can be applied to study the following topics:
- Writing assessment: evaluating syntactic features in Arabic compositions
- Text readability: investigating the linguistic accessibility of Arabic texts
- Second language acquisition: analyzing syntactic development in Arabic learners' writing
Notable Features
- Automatic Analysis: computes all 13 syntactic indices automatically
- Batch Processing: capable of analyzing multiple text files simultaneously
- User-Friendly Interface: implemented as an interactive command-line interface (CLI) for ease of use
- Dual Functionality: operates as both a local application and a cloud-based tool via Google Colab
Installation and Setup for Windows users
- Download ARSA_notebook.ipynb from this repository
- Open Google Colab
- Upload the notebook:
  - Go to 'File' -> 'Upload notebook'
  - Select the downloaded 'ARSA_notebook.ipynb'
- Follow the step-by-step instructions within the notebook
Note
You can also run the notebook on Mac and Linux devices.
Installation and Setup for Mac/Linux Users
- Clone this repository:
git clone https://github.com/AlaaAlzahrani/ARSA.git
- Install the required packages:
cd ARSA/camel_parser
pip install -r requirements.txt
python download_models.py
camel_data -i morphology-db-msa-s31
camel_data -i disambig-bert-unfactored-msa
cd ..
pip install -r ARSA_requirements.txt
pip install --upgrade huggingface_hub
pip install camel-tools
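Once the commands above finish, a quick import check can confirm that the pip-installed packages are visible to the Python interpreter you plan to use. The sketch below is not part of ARSA: `missing_packages` is a hypothetical helper, and the list covers only the two packages installed by name above (`camel_tools` is the import name of `camel-tools`). It does not verify the downloaded models or the contents of the requirements files.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be located for import."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: only the packages installed by name in the steps above are checked.
    required = ["camel_tools", "huggingface_hub"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All checked packages are importable.")
```

If a package is reported as missing, the most likely cause is that pip installed it into a different environment than the one running the check.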
- Analyze your texts using the ARSA tool
Run the following commands:
cd path/to/ARSA/directory # change the working directory to the ARSA repository folder
python get_analysis.py
The script first prompts you for the input folder:
Please select the text file(s) folder: <write-the-input-folder-name-here>
It then prompts you for the output folder:
Please select the output folder: <write-the-output-folder-name-here>
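Before pointing the tool at an input folder, it can help to check that the files are readable. This README does not state which file types or encodings ARSA accepts, so the sketch below rests on two assumptions: the corpus consists of plain `.txt` files, and they are encoded as UTF-8. `check_corpus` is a hypothetical helper, not an ARSA function.

```python
from pathlib import Path


def check_corpus(folder):
    """Sort the .txt files in `folder` into usable files and problem files.

    Returns (ok, problems): `ok` is a list of file names, `problems` is a
    list of (file name, reason) tuples.
    """
    ok, problems = [], []
    for path in sorted(Path(folder).glob("*.txt")):
        try:
            # Assumption: texts are stored as UTF-8.
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            problems.append((path.name, "not valid UTF-8"))
            continue
        if text.strip():
            ok.append(path.name)
        else:
            problems.append((path.name, "empty file"))
    return ok, problems


# Example usage (folder name taken from the README's example):
# ok, problems = check_corpus("example/corpus")
# print(len(ok), "usable files;", problems)
```

Files flagged as not valid UTF-8 can usually be re-saved from a text editor with the encoding set explicitly.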
- Example
cd D:/my_projects/ARSA
python get_analysis.py
Please select the text file(s) folder: example/corpus
Please select the output folder: example/results
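For repeated runs, it may be convenient to answer the two prompts from a script rather than typing them each time. The sketch below assumes that get_analysis.py reads its answers from standard input (for example via `input()`), which this README does not confirm. `run_with_answers` is a hypothetical helper, not part of ARSA.

```python
import subprocess
import sys


def run_with_answers(command, answers, cwd=None):
    """Run `command` and feed each item of `answers` to it as one line of stdin."""
    stdin_text = "\n".join(answers) + "\n"
    return subprocess.run(
        command,
        input=stdin_text,      # sent to the child process's standard input
        text=True,             # treat stdin/stdout as str rather than bytes
        capture_output=True,   # collect stdout and stderr for inspection
        cwd=cwd,
    )


# Example usage, with the folder names from the example above:
# result = run_with_answers(
#     [sys.executable, "get_analysis.py"],
#     ["example/corpus", "example/results"],
#     cwd="path/to/ARSA/directory",
# )
# print(result.stdout)
```

If the script reads its answers some other way (for instance through a graphical folder picker), this approach will not work and the prompts must be answered interactively.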
Note
This local installation method is currently unsupported on Windows because some dependencies of the camel_parser library are incompatible with it. Windows users can run the Google Colab notebook instead.
This work is licensed under the MIT License.
If you use this tool, please cite the following papers to support the authors and encourage the development of open-source Arabic language processing tools:
@inproceedings{Elshabrawy:2023:camelparser,
title = "{CamelParser2.0: A State-of-the-Art Dependency Parser for Arabic}",
author = {Elshabrawy, Ahmed and AbuOdeh, Muhammed and Inoue, Go and Habash, Nizar},
booktitle = {Proceedings of The First Arabic Natural Language Processing Conference (ArabicNLP 2023)},
year = "2023"
}
@misc{Alzahrani:2024:ARSA,
title = "{Arabic Syntactic Analyzer (ARSA): An Automated Tool for the Analysis of Arabic Written Texts}",
author = {Alzahrani, Alaa and Alfaify, Adel},
year = "2024",
note = {Preprint}
}