Credit Risk Analysis with Machine Learning
The purpose of this analysis is to build several machine learning models and algorithms to predict credit risk for loan applications. Once complete, the analysis should make approving or denying loan applications more efficient and accurate, and help lower default rates. I will use Python and the Scikit-learn library with several machine learning models, comparing them to determine how well each one classifies and predicts the data.
In this project, I am utilizing the following models and algorithms to find the best prediction model for credit risk analysis:
- Oversampling models: the RandomOverSampler and SMOTE algorithms.
- Undersampling model: the ClusterCentroids algorithm.
- Combination model: the SMOTEENN algorithm, which combines the SMOTE and Edited Nearest Neighbors (ENN) algorithms.
- Ensemble models for comparison: BalancedRandomForestClassifier and EasyEnsembleClassifier.
After performing exploratory data analysis on the dataset with Pandas and NumPy, I use the imbalanced-learn and scikit-learn libraries to evaluate three machine learning models that rely on resampling, to determine which is better at predicting credit risk. I start the analysis with the RandomOverSampler and SMOTE oversampling algorithms, and then use the undersampling ClusterCentroids algorithm. With each algorithm, I resample the dataset, view the count of the target classes, train a logistic regression classifier, and finally compare the models to determine the best fit for this analysis.
Note: A random state of 1 is used for each sampling algorithm to ensure consistency between tests.
In this section, the following metrics will be provided in order to discover which algorithm results in the better performance: the naive random oversampling algorithm or the SMOTE algorithm.
- Calculate the balanced accuracy score from `sklearn.metrics`.
- Calculate the confusion matrix from `sklearn.metrics`.
- Generate a classification report using `classification_report_imbalanced` from imbalanced-learn.
In this section, the following metrics will be provided in order to discover which algorithm results in the better performance: ClusterCentroids undersampling or SMOTEENN.
- Calculate the balanced accuracy score from `sklearn.metrics`.
- Calculate the confusion matrix from `sklearn.metrics`.
- Generate a classification report using `classification_report_imbalanced` from imbalanced-learn.
In this section, the following metrics will be provided in order to discover which algorithm results in the better performance: the Balanced Random Forest Classifier or the Easy Ensemble AdaBoost Classifier.
- Calculate the balanced accuracy score from `sklearn.metrics`.
- Calculate the confusion matrix from `sklearn.metrics`.
- Generate a classification report using `classification_report_imbalanced` from imbalanced-learn.
Before moving forward with a summary report, I would like to point out a few reminders regarding the following metrics:
- Classifying a single point can result in a true positive (truth = 1, guess = 1), a true negative (truth = 0, guess = 0), a false positive (truth = 0, guess = 1), or a false negative (truth = 1, guess = 0).
- Accuracy measures how many classifications your algorithm got correct out of every classification it made.
- Recall measures the percentage of the relevant items your classifier was able to successfully find.
- Precision measures the percentage of items your classifier found that were relevant.
- Precision and recall typically trade off against each other: as one goes up, the other tends to go down.
- F1 score is a combination of precision and recall.
- F1 score will be low if either precision or recall is low.
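To make these definitions concrete, here is a small worked example (the label vectors are made up for illustration) that computes each metric directly from the four confusion-matrix counts:

```python
# Worked example: derive accuracy, precision, recall, and F1 from the
# confusion-matrix counts, using small hand-made label vectors.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # true negatives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

accuracy = (tp + tn) / len(y_true)                   # correct / total
precision = tp / (tp + fp)                           # found items that were relevant
recall = tp / (tp + fn)                              # relevant items that were found
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"accuracy={accuracy} precision={precision} recall={recall} f1={f1}")
```

Because F1 is a harmonic mean, it stays close to the smaller of precision and recall, which is why a single weak metric drags the F1 score down.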
I have created six different models and algorithms to discover the optimum way to predict credit risk. Comparing the balanced accuracy scores for each model is a quick and efficient way to decide which model performs best on this data set. We can observe that the Easy Ensemble AdaBoost Classifier performed significantly better than the other models, with a balanced accuracy score of around 92%. However, despite the high accuracy score, the F1 score is relatively low at 0.16. Also, the low precision would cause low-risk loans to be falsely labeled as high risk. Making wrong decisions on loan applications would decrease the revenue and trustworthiness of the bank.
Therefore, I would reject using these algorithms as the decision mechanism for predicting credit risk on loan applications. My recommendation is to build a larger-scale dataset with more features, and to select those features more carefully, to improve the precision and F1 scores and thereby the confidence in our machine learning models.
- Data Source: LoanStats_2019Q1.csv
- Software/Languages: Jupyter Notebook- Google Colab, Python
- Libraries: Scikit-learn, imbalanced-learn, Pandas, NumPy