Here I played with the dataset by P. Cortez, A. Cerdeira et al., which contains data on the chemical composition of the Portuguese red wine 'Vinho Verde'. My final goal was to build a model that predicts the quality of a wine from its chemical composition and classifies it as good or bad.
The project is simple. However, I did a little research on the wine composition features to gain some useful insights and build some new features.
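As a rough illustration of the good/bad labeling and of building a new feature, here is a minimal sketch. The cutoff of 7 and the "bound sulfur dioxide" feature are assumptions made for this example only, not necessarily what the notebook uses; the column names follow the published dataset.

```python
# Hypothetical cutoff: where "good" starts is an assumption for this sketch.
GOOD_THRESHOLD = 7


def add_features(row, threshold=GOOD_THRESHOLD):
    """Return a copy of one wine record with a derived feature and a binary label."""
    out = dict(row)
    # Illustrative derived feature: the part of total SO2 that is not free.
    out["bound sulfur dioxide"] = (
        row["total sulfur dioxide"] - row["free sulfur dioxide"]
    )
    # Binary target: 1 = good, 0 = bad.
    out["is_good"] = int(row["quality"] >= threshold)
    return out


# Toy record with made-up values, only to show the shape of the data.
sample = {"free sulfur dioxide": 11.0, "total sulfur dioxide": 34.0, "quality": 5}
labeled = add_features(sample)
print(labeled["bound sulfur dioxide"], labeled["is_good"])  # prints: 23.0 0
```

Keeping the cutoff as a parameter makes it easy to see how the class balance shifts when the definition of "good" changes.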
I used the library to draw the charts and plots in the EDA part. Those plots look astonishing in the Kaggle notebook environment but don't render on GitHub, so if you're interested, check my Kaggle notebook first.
The models work well, but not perfectly: the small size of the dataset was the biggest issue. Note that the models are slightly overtrained.
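One common way to see overtraining is to compare accuracy on the training data with accuracy on held-out data. The sketch below assumes binary good/bad labels; the helper names and the toy label lists are hypothetical and are not results from this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def generalization_gap(train_true, train_pred, test_true, test_pred):
    """Train accuracy minus test accuracy; a large positive gap hints at overtraining."""
    return accuracy(train_true, train_pred) - accuracy(test_true, test_pred)


# Toy labels (made up) just to exercise the helpers.
train_true = [1, 0, 1, 1, 0, 0, 1, 0]
train_pred = [1, 0, 1, 1, 0, 0, 1, 1]  # one mistake out of eight
test_true = [1, 0, 0, 1]
test_pred = [1, 1, 0, 0]  # two mistakes out of four

gap = generalization_gap(train_true, train_pred, test_true, test_pred)
print(gap)  # prints: 0.375
```

With a small dataset the test accuracy is itself noisy, so a gap measured on a single split is only a rough signal.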