Project for the NLP course at the University of Tartu.
The goal of this project is to classify fake job posts in a highly imbalanced dataset.
For this project we explored the data, cleaned it, and classified the fake job posts using the pretrained NLP model BERT. Based on the results of the data exploration, we removed the fields with more than 60% missing values. We also cleaned out the non-ASCII characters.
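The two cleaning steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names, the sample rows, and the field names (`title`, `salary_range`, `description`) are hypothetical, and it assumes that a missing value is represented as `None` and that "cleaning" non-ASCII characters means dropping them.

```python
def drop_sparse_fields(rows, max_missing=0.6):
    """Drop every field whose share of missing (None) values exceeds max_missing."""
    if not rows:
        return []
    keep = []
    for field in rows[0]:
        missing = sum(1 for row in rows if row.get(field) is None)
        if missing / len(rows) <= max_missing:
            keep.append(field)
    return [{field: row.get(field) for field in keep} for row in rows]


def strip_non_ascii(text):
    """Remove characters outside the ASCII range; leave None untouched."""
    if text is None:
        return None
    return text.encode("ascii", errors="ignore").decode("ascii")


def clean(rows):
    """Apply both steps: drop sparse fields, then strip non-ASCII characters."""
    kept = drop_sparse_fields(rows)
    return [{k: strip_non_ascii(v) for k, v in row.items()} for row in kept]


# Hypothetical sample rows, for illustration only.
rows = [
    {"title": "Data Engineer", "salary_range": None,
     "description": "Build pipelines \u2013 remote"},
    {"title": "Caf\u00e9 Manager", "salary_range": None, "description": None},
    {"title": "Analyst", "salary_range": "40000-50000",
     "description": "Work from home"},
]

# salary_range is missing in 2 of 3 rows (about 67%), so it is dropped.
cleaned = clean(rows)
```

With real data the same idea is usually expressed with a dataframe library; the 60% threshold is the only value here taken from the description above.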
In the end, with our approach to data cleaning, feature engineering, and modifying the machine learning model, we were able to get decent results.
For a more detailed explanation, see the report.