Team Members: Jarad Angel, Meredith Wang
Date: Aug - Sep 2022
Whether out at a restaurant or buying tickets to a concert, modern life counts on the convenience of a credit card to make daily purchases. It saves us from carrying large amounts of cash and lets us make a full purchase up front and pay for it over time. How do card issuers know we’ll pay back what we charge? That’s a complex problem with many existing solutions and even more potential improvements, which this competition explores.
Credit default prediction is central to managing risk in a consumer lending business. It allows lenders to optimize lending decisions, which leads to a better customer experience and sound business economics. Models already exist to help manage this risk, but it is possible to build models that outperform those currently in use.
▪️ Apply our machine learning skills to predict credit default.
▪️ Leverage an industrial-scale dataset to build a machine learning model that challenges the current model in production.
▪️ May 25, 2022 - Start Date.
▪️ August 17, 2022 - Entry Deadline. You must accept the competition rules before this date in order to compete.
▪️ August 17, 2022 - Team Merger Deadline. This is the last day participants may join or merge teams.
▪️ August 24, 2022 - Final Submission Deadline.
Training, validation, and testing datasets include time-series behavioral data and anonymized customer profile information.
The objective of this competition is to predict the probability that a customer will not pay back their credit card balance in the future, based on their monthly customer profile. The binary target variable is calculated by observing an 18-month performance window after the latest credit card statement: if the customer does not pay the amount due within 120 days of their latest statement date, it is considered a default event.
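Because the target is defined once per customer while the behavioral data is a time series of monthly statements, one common first step is to collapse each customer's statements into a single feature row. The following is only a minimal, dependency-free sketch of that idea, not the pipeline used in this repo. The key names `customer_ID` and `statement_date`, the feature `P_2`, and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def aggregate_statements(rows, id_key="customer_ID", date_key="statement_date"):
    """Collapse per-statement rows into one feature dict per customer.

    For every feature, keep the value from the latest statement and the
    mean over all statements. Key names are hypothetical defaults.
    """
    by_customer = defaultdict(list)
    for row in rows:
        by_customer[row[id_key]].append(row)

    features = {}
    for customer, statements in by_customer.items():
        # ISO-formatted date strings sort chronologically.
        statements.sort(key=lambda r: r[date_key])
        latest = statements[-1]
        out = {}
        for name in latest:
            if name in (id_key, date_key):
                continue
            values = [s[name] for s in statements if s.get(name) is not None]
            out[f"{name}_last"] = latest[name]
            out[f"{name}_mean"] = mean(values) if values else None
        features[customer] = out
    return features


# Toy data, invented for illustration only.
toy_rows = [
    {"customer_ID": "a", "statement_date": "2022-01-31", "P_2": 0.5},
    {"customer_ID": "a", "statement_date": "2022-02-28", "P_2": 0.7},
    {"customer_ID": "b", "statement_date": "2022-01-31", "P_2": 0.2},
]
features = aggregate_statements(toy_rows)
```

On a dataset of real size, the same "last value plus mean per customer" aggregation would more likely be done with a dataframe library; the plain-Python version is used here only to keep the sketch self-contained.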
The dataset contains aggregated profile features for each customer at each statement date. Features are anonymized and normalized, and fall into the following general categories:
- D_* = Delinquency variables
- S_* = Spend variables
- P_* = Payment variables
- B_* = Balance variables
- R_* = Risk variables
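Since the prefix of each anonymized feature name encodes its category, columns can be grouped by prefix for per-category exploration or cleaning. The snippet below is a small sketch of that mapping. The example column names (`D_39`, `S_3`, and so on) are hypothetical, and anything without a known prefix is placed in an "Other" bucket.

```python
from collections import defaultdict

# Prefix-to-category mapping taken from the list above.
CATEGORY_BY_PREFIX = {
    "D": "Delinquency",
    "S": "Spend",
    "P": "Payment",
    "B": "Balance",
    "R": "Risk",
}


def group_by_category(columns):
    """Group column names by the category implied by their prefix."""
    groups = defaultdict(list)
    for col in columns:
        prefix, sep, _ = col.partition("_")
        # Only treat the text before the first underscore as a prefix.
        category = CATEGORY_BY_PREFIX.get(prefix) if sep else None
        groups[category or "Other"].append(col)
    return dict(groups)


# Hypothetical column names, for illustration only.
example_columns = ["customer_ID", "D_39", "S_3", "P_2", "B_1", "R_1", "B_2"]
groups = group_by_category(example_columns)
```

With the example list, `groups["Balance"]` holds `B_1` and `B_2`, while the identifier column falls into `"Other"`.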
acqure.py
Data Cleaning
- Clone the repo