Source code for the paper A Memory-Augmented Neural Model for Automated Grading in L@S 2017. Note that recent check-in updates the python from python 2.5 to python 3.7.
The dataset comes from Kaggle ASAP competition. You can download the data from the link below.
https://www.kaggle.com/c/asap-aes/data
Glove embeddings are used in this work. Specifically, 42B 300d is used to get the best results. You can download the embeddings from the link below.
https://nlp.stanford.edu/projects/glove/
git clone https://github.com/siyuanzhao/automated-essay-grading.git
-
Download training data file 'training_set_rel3.tsv' from Kaggle and put it under the root folder of this repo.
-
Download 'glove.42B.300d.zip' from https://nlp.stanford.edu/projects/glove/ and unzip all files into 'glove/' folder.
- Tensorflow 1.10
- scikit-learn 0.19
- six 1.10.0
- python 3.7
# Train the model on an essay set <essay_set_id>
python cv_train.py --essay_set_id <eassy_set_id>
There are serval flags within cv_train.py. Below is an example of training the model on essay set 1 with specific learning rate, and epochs.
python cv_train.py --essay_set_id 1 --learning_rate 0.005 --epochs 200
Check all avaiable flags with the following command.
python cv_train.py -h
Note: The model is trained on the training data with 5-fold cross validation. By default, the output layer of the model is a classification layer. There is another model whose output layer is a regression layer in memn2n_kv_regression.py. To train the model with the regression output layer, set flag is_regression to True. For example,
python cv_train.py --essay_set_id 1 --learning_rate 0.005 --epochs 200 --is_regression True