This repo contains the code for the paper "Improving Specificity in Review Response Generation with Data-Driven Data Filtering".
This work investigates the effect of filtering generic responses from the training data and shows that training on smaller, refined datasets improves the specificity of the generated responses.
# either create from the yml provided
conda env create -f environment.yml
# or setup manually with requirements.txt
conda create -n hospo_respo_bart python=3.8
conda activate hospo_respo_bart
pip install -r requirements.txt
The script commands specified make the following assumptions about the directory structure and contents. Note: to reproduce, some paths may need to be adapted!
- models and data are accessible from the top level. This can easily be achieved by creating sym-links, e.g.
ln -s /path/to/storage/data_dir data
ln -s /path/to/storage/model_dir models
- bart-base model has been downloaded from huggingface and is stored in
models/huggingface/bart-base
- if fasttext classifiers are trained for evaluation purposes, these should be stored in
models/classifiers
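The layout assumptions above can be checked before launching any script. The following is a minimal sketch and not part of the repo: the helper name `missing_paths` is hypothetical, and the path list simply mirrors the locations named above.

```python
from pathlib import Path

# Locations named in the assumptions above; models/classifiers is only
# needed if fasttext classifiers are used for evaluation.
EXPECTED_PATHS = [
    "data",
    "models",
    "models/huggingface/bart-base",
    "models/classifiers",
]


def missing_paths(root=".", expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under root (hypothetical helper)."""
    root = Path(root)
    # Path.exists() follows symlinks, so sym-linked data/ and models/ count as present.
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_paths():
        print(f"missing: {path}")
```

Running it from the top level of the repo prints one line per missing location and nothing if the layout matches.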
We provide a script to collect all 500,000 review-response pairs from TripAdvisor used for our study. To reproduce the dataset splits, use
python scripts/data_prep/scrape_review_response_pairs.py dataset/train.trip_url data/train.csv
python scripts/data_prep/scrape_review_response_pairs.py dataset/test.trip_url data/test.csv
python scripts/data_prep/scrape_review_response_pairs.py dataset/valid.trip_url data/valid.csv
For mobile app data, access to the original dataset needs to be requested from the original authors. See https://ieeexplore.ieee.org/document/8952476.
The majority of the scripts are Jupyter Notebooks, which are useful for streamlining the data analysis.
The three scoring methods described in the paper are defined in their respective IPython Notebooks:
scripts/data_prep/score_lex_freq.ipynb # word-level
scripts/data_prep/score_sent_avg.ipynb # sentence-level
scripts/data_prep/score_lm_ppl.ipynb # document-level
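For orientation, scoring a document with a language model typically rests on perplexity. The sketch below shows only the standard perplexity formula applied to per-token log-probabilities. It is an assumption that the notebook follows this general form; the function name and input format are hypothetical, and the notebook itself defines the actual language model and preprocessing.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of one document from natural-log token probabilities.

    PPL = exp(-(1/N) * sum(log p(token_i)))
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_neg_logprob = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_neg_logprob)
```

Under this formula, a low perplexity indicates text that the language model finds predictable.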
Once scored, filtering can be done using the notebook scripts/data_prep/filter_data.ipynb. This allows for inspecting and filtering data according to score thresholds.
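Filtering by score thresholds can be sketched as follows. This is an illustration only and not the notebook's code: the record layout, the `score` key, and the function name `filter_by_score` are assumptions, and the scores and bounds in the usage example are made up.

```python
def filter_by_score(records, lower, upper, key="score"):
    """Keep records whose score lies in the closed interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return [r for r in records if lower <= r[key] <= upper]


# Toy records with made-up scores.
examples = [
    {"response": "Thank you for your review.", "score": 0.95},
    {"response": "Glad you enjoyed the rooftop pool.", "score": 0.40},
    {"response": "We will fix the shower in room 12.", "score": 0.10},
]
kept = filter_by_score(examples, lower=0.0, upper=0.5)
print(len(kept))  # 2
```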
We fine-tuned our models with PyTorch Lightning.
The scripts for model training and inference are in scripts/modelling/.
Note: these are adapted from skeleton code used in LongMBART (A. Rios, UZH).
scripts/modelling/train.py
- accepts path to pretrained model dir (from 🤗 Transformers)
- training, validation, test data
- runs fine-tuning
scripts/modelling/inference.py
- decodes test set
Scripts for evaluating model outputs are in scripts/evaluation/. In the paper, we report a collection of automatic metrics to measure the impact of our data filtering on model outputs.
These metrics are computed with the script evaluate_line_aligned.py:
scripts/evaluation/evaluate_line_aligned.py
- accepts line-aligned model outputs, source text and reference text files
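Line alignment means that line i of each file refers to the same example. A minimal sketch of loading such files with a consistency check follows. The function name `read_line_aligned` is hypothetical, and this is not the repo's evaluation code.

```python
def read_line_aligned(hyp_path, src_path, ref_path):
    """Read three line-aligned text files and return (hypothesis, source, reference) triples."""
    columns = []
    for path in (hyp_path, src_path, ref_path):
        with open(path, encoding="utf-8") as f:
            columns.append([line.rstrip("\n") for line in f])
    # All files must have the same number of lines to be line-aligned.
    lengths = {len(col) for col in columns}
    if len(lengths) != 1:
        raise ValueError(f"files are not line-aligned: line counts {sorted(lengths)}")
    return list(zip(*columns))
```

Failing early on mismatched line counts avoids silently scoring a model output against the wrong source or reference.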
All commands used to perform our experiments are provided in the bash scripts located in scripts/.
For more details on the commands used (e.g. in case of reproduction), see also scripts/experiment_commands.md.
To fine-tune a model with default settings, use run_finetuning.sh, e.g.
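# Arguments, in order (comments cannot follow a line-continuation backslash):
#   4                                      GPU device ID
#   filt_freq_distro                       save directory name (gets appended to out_dir for full path)
#   models/                                out_dir
#   models/huggingface/bart-base/          model_dir containing original pre-trained model downloaded from HuggingFace
#   data/hotel/filt_freq_distro_0.0_0.883  data_dir containing source and target line-aligned files
bash run_finetuning.sh \
    4 \
    filt_freq_distro \
    models/ \
    models/huggingface/bart-base/ \
    data/hotel/filt_freq_distro_0.0_0.883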
To run inference with a fine-tuned model, use one of the functions in run_inference.sh, specifying the GPU device ID, e.g.
bash run_inference.sh inference_filt_freq_distro 4
To evaluate the model generations, use one of the functions in run_eval.sh, e.g.
bash run_eval.sh eval_hospo_respo_filtering
If you find any of these scripts useful for your own research, please cite Improving Specificity in Review Response Generation with Data-Driven Data Filtering (Kew & Volk, ECNLP 2022).
@inproceedings{kew-volk-2022-improving,
title = "Improving Specificity in Review Response Generation with Data-Driven Data Filtering",
author = "Kew, Tannon and
Volk, Martin",
booktitle = "Proceedings of The Fifth Workshop on e-Commerce and NLP (ECNLP 5)",
month = may,
year = "2022",
address = "Dublin, Ireland",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.ecnlp-1.15",
pages = "121--133",
}