Authors: Peter Macinec, Timotej Zatko
To run this project, make sure you have Docker installed. Then follow these steps:
- Change into the project root directory.
- Set up the repository using the command:
sh ./scripts/setup.sh
- Download the data (you need to have the Kaggle API installed). Don't forget to accept the competition rules.
sh ./scripts/download_data.sh
- Build the Docker image:
sh ./scripts/build.sh
- Run the Docker container using the command:
sh ./scripts/run.sh
- Run initial dataset processing:
- Enter the Docker container:
docker exec -it transactions_fraud_detection_con bash
- Run the data preprocessing script:
python src/preprocessing/initial_preprocessing.py
Note: You can also run this script outside the container, but make sure you have Python with the pandas library installed.
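Before working through the steps above, it can help to confirm that the required command-line tools are available. The following is a minimal sketch and not part of the repository: the helper name `missing_tools` is hypothetical, and it assumes the Docker and Kaggle API clients are invoked as `docker` and `kaggle`.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names for the Docker CLI and the Kaggle API client.
REQUIRED_TOOLS = ("docker", "kaggle")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out when checking the helper itself. The sketch does not verify Kaggle credentials or that the competition rules were accepted.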
To solve our problem of transaction fraud detection, we use the data from the IEEE-CIS Fraud Detection competition at Kaggle. The large-scale dataset contains real-world e-commerce transactions with a variety of features. The features are of different types and describe transactions, products, or other useful information (such as the categorical DeviceType feature). Many of the features are not described properly, so the explanation of results will be less clear.
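To illustrate the mix of feature types, here is a minimal pandas sketch. It is not the project's preprocessing code: the helper names `join_transactions` and `summarize_feature_types` are hypothetical, and the rows in the demo are made up for illustration rather than taken from the dataset. The column names (`TransactionID`, `isFraud`, `TransactionAmt`, `ProductCD`, `DeviceType`) follow the competition's transaction and identity tables. Where the downloaded CSV files end up depends on `scripts/download_data.sh`, so no path is assumed here.

```python
import pandas as pd


def join_transactions(transactions: pd.DataFrame, identity: pd.DataFrame) -> pd.DataFrame:
    """Left-join identity features onto transactions by TransactionID.

    Transactions without identity information keep missing values
    in the identity columns.
    """
    return transactions.merge(identity, on="TransactionID", how="left")


def summarize_feature_types(df: pd.DataFrame) -> dict:
    """Split column names into numeric and non-numeric (categorical-like)."""
    numeric = df.select_dtypes(include="number").columns.tolist()
    categorical = df.select_dtypes(exclude="number").columns.tolist()
    return {"numeric": numeric, "categorical": categorical}


if __name__ == "__main__":
    # Illustrative rows only; real data would be loaded with pd.read_csv
    # from wherever the download script stores the competition files.
    transactions = pd.DataFrame(
        {
            "TransactionID": [1, 2, 3],
            "isFraud": [0, 1, 0],
            "TransactionAmt": [10.0, 25.5, 7.0],
            "ProductCD": ["W", "C", "W"],
        }
    )
    identity = pd.DataFrame(
        {"TransactionID": [1, 3], "DeviceType": ["mobile", "desktop"]}
    )

    merged = join_transactions(transactions, identity)
    print(summarize_feature_types(merged))
```

A left join is used so that transactions without a matching identity row are kept. Whether the project joins the tables this way is an assumption.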
The paper's LaTeX source files are available in the paper directory. The readable PDF version is at paper/main.pdf.
A presentation describing the whole project and its results is available at presentation.pdf.