Created a machine learning model to predict whether a person's salary exceeds 50K/yr. The dataset contains the age, gender, occupation, education, and workclass of over 30,000 randomly generated employees.
The initial stages of the project involved extensive data assessment and cleaning, exploratory data analysis (EDA), and drawing conclusions from the data.
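A minimal sketch of what such a cleaning and EDA pass might look like with pandas. The toy records, the column names, the `"?"` placeholder for missing values, and the `clean` helper are all assumptions made for illustration; they are not taken from the project's actual notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records; the real dataset has over 30,000 rows.
# Column names and the "?" missing-value marker are assumptions.
raw = pd.DataFrame({
    "age": [39, 50, 38, 53, 28, 37],
    "gender": [" Male", "Male ", "Male", "Female", "Female", "Female"],
    "education": ["Bachelors", "Bachelors", "HS-grad", "?", "Bachelors", "Masters"],
    "workclass": ["State-gov", "Private", "Private", "Private", "?", "Private"],
    "income": ["<=50K", ">50K", "<=50K", "<=50K", ">50K", ">50K"],
})

TEXT_COLUMNS = ["gender", "education", "workclass", "income"]


def clean(df):
    """Strip whitespace, drop rows with missing values, add a binary target."""
    out = df.copy()
    for col in TEXT_COLUMNS:
        out[col] = out[col].str.strip()
    # Treat "?" as missing and drop any row that contains it.
    out = out.replace("?", np.nan).dropna().reset_index(drop=True)
    # 1 if income is above 50K/yr, else 0.
    out["high_income"] = (out["income"] == ">50K").astype(int)
    return out


cleaned = clean(raw)

# A simple EDA question: what share of each education group earns >50K?
share_by_education = cleaned.groupby("education")["high_income"].mean()
print(cleaned.shape)
print(share_by_education)
```

Dropping incomplete rows is only one option; imputing a most-frequent category would keep more data at the cost of some noise.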
Trained an AdaBoost model to predict income with an accuracy score of 85.44%.
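The training step could be sketched roughly as follows with scikit-learn. This is an illustration under stated assumptions, not the project's actual code: the data is synthetic and generated inline, the column names, the labelling rule, and the hyperparameters (`n_estimators=100`, an 80/20 split) are hypothetical, and the accuracy it prints is unrelated to the 85.44% reported for the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the cleaned dataset (assumption, not the real data).
rng = np.random.default_rng(0)
n = 2000
X = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "gender": rng.choice(["Male", "Female"], size=n),
    "education": rng.choice(["HS-grad", "Bachelors", "Masters"], size=n),
    "workclass": rng.choice(["Private", "State-gov", "Self-emp"], size=n),
})

# Hypothetical labelling rule: older and more educated people are more
# likely to earn >50K, with Gaussian noise added.
edu_bonus = X["education"].map({"HS-grad": -1.0, "Bachelors": 0.5, "Masters": 1.5})
score = 0.05 * (X["age"] - 40) + edu_bonus + rng.normal(0, 1, size=n)
y = (score > 0).astype(int)

# One-hot encode the categorical columns and pass "age" through unchanged.
categorical = ["gender", "education", "workclass"]
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
    remainder="passthrough",
)

model = Pipeline([
    ("preprocess", preprocess),
    ("adaboost", AdaBoostClassifier(n_estimators=100, random_state=0)),
])

# Hold out 20% of the rows, stratified on the target, for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy on synthetic data: {accuracy:.4f}")
```

Wrapping the encoder and the classifier in a single `Pipeline` keeps the one-hot encoding fitted on the training split only, which avoids leaking information from the held-out rows.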
- Python libraries such as pandas, NumPy, Matplotlib, and Seaborn for EDA
- scikit-learn and its submodules for training the model
- Jupyter Notebook