rjnp2/loan_prediction

loan_prediction

Welcome to this project on the Loan Prediction practice problem.

Objective:

  1. Introduction to the problem
  2. Exploratory Data Analysis (EDA) and PreProcessing
  3. Model building and Feature engineering

Dream Housing Finance company deals in all kinds of home loans and has a presence across urban, semi-urban, and rural areas. A customer first applies for a home loan, and the company then validates the customer's eligibility. The company wants to automate the loan eligibility process in real time, based on the details customers provide when filling in the online application form. These details include Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History, and others. To automate this process, the company has posed the problem of identifying the customer segments that are eligible for a loan, so that it can specifically target those customers.

Loan prediction is a very common real-life problem that every retail bank faces at least once. Done correctly, it can save the bank many man-hours. It is a classification problem: we have to predict whether a loan will be approved or not. In a classification problem, we predict a discrete value (here, approved or not approved) from a given set of independent variables.
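To make the idea of predicting a discrete value concrete, here is a minimal sketch. It is not the model built in this project: the applicant records, the field names (`income`, `loan_amount`, `credit_history`), and the hand-written rule are all assumptions made for illustration. The only point is that a classifier maps each applicant to one of a fixed set of labels.

```python
# Hypothetical applicant records. Field names and values are illustrative
# assumptions, not rows from the actual dataset.
applicants = [
    {"income": 5000, "loan_amount": 120, "credit_history": 1},
    {"income": 2500, "loan_amount": 200, "credit_history": 0},
]


def predict_approval(applicant):
    """Toy rule-based classifier: returns one of two discrete labels.

    Assumed rule: approve ("Y") when the applicant has a clean credit
    history, otherwise reject ("N"). A trained model would learn its
    decision rule from data instead.
    """
    return "Y" if applicant["credit_history"] == 1 else "N"


# One discrete label per applicant.
predictions = [predict_approval(a) for a in applicants]
print(predictions)
```

A rule like this can also serve as a simple baseline against which a trained model is compared.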

Hypothesis generation

This is a very important stage in any data science or machine learning pipeline. It involves understanding the problem in detail by brainstorming as many factors as possible that could affect the outcome. It is done by studying the problem statement thoroughly, before looking at the data.

Below are some of the factors that I think can affect loan approval (the dependent variable in this loan prediction problem):

Salary: Applicants with a high income should have a better chance of loan approval.

Previous history: Applicants who have repaid their previous debts should have a higher chance of loan approval.

Loan amount: Loan approval should also depend on the loan amount. The smaller the loan amount, the higher the chance of approval.

Loan term: Loans for a shorter period and a smaller amount should have a higher chance of approval.

EMI: The lower the monthly amount to be paid to repay the loan, the higher the chance of loan approval.

These are some of the factors that I think can affect the target variable; you can come up with many more.
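The EMI hypothesis above can be turned into a candidate feature. The sketch below is one possible way to do that, under stated assumptions: it ignores interest and approximates the monthly installment as the loan amount divided by the loan term in months, and the function names and the installment-to-income ratio are hypothetical helpers, not part of this project's code.

```python
def monthly_installment(loan_amount, loan_term_months):
    """Interest-free proxy for EMI: principal divided by number of months.

    Assumption: interest is ignored, so this understates a real EMI.
    """
    if loan_term_months <= 0:
        raise ValueError("loan_term_months must be positive")
    return loan_amount / loan_term_months


def installment_to_income_ratio(loan_amount, loan_term_months, monthly_income):
    """Share of monthly income consumed by the proxy installment.

    A lower ratio corresponds to the hypothesis that a smaller monthly
    payment should make approval more likely.
    """
    if monthly_income <= 0:
        raise ValueError("monthly_income must be positive")
    return monthly_installment(loan_amount, loan_term_months) / monthly_income


# Illustrative values only: a loan of 120000 over 360 months for an
# applicant earning 5000 per month.
emi = monthly_installment(120000, 360)
ratio = installment_to_income_ratio(120000, 360, 5000)
print(emi, ratio)
```

Whether such a feature actually helps should be checked against the data during EDA and model building.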

Specifications

Python = 3.7

pandas = 0.20.3

seaborn = 1.0.0

sklearn = 0.19.1