Unsupervised machine learning model to predict fraudulent credit card transactions on a highly imbalanced dataset.

harikishorep122/Credit-card-fraud-detection

Credit-card-fraud-detection

Tools used

numpy and pandas for importing the dataset, matplotlib for data visualization, seaborn for plotting the heat map, and sklearn for the ML algorithms.

Visualising the data.

The dataset contains about 284,807 credit card transactions. It is highly imbalanced: fraudulent transactions make up only around 0.001 of the total.

Separated the dataset into x and y. x contains around 30 parameters, which are the result of PCA dimensionality reduction. y holds the class of each data point: 0 means a normal transaction, 1 means a fraudulent one.
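As a rough illustration of the split described above, the sketch below builds a small synthetic table and separates it into features and labels. The column names (`V1`..`V3`, `Class`) and the synthetic data are assumptions made for this example; the real dataset would be loaded from its own file instead.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset: 3 feature columns plus a label.
# The column names (V1..V3, "Class") are assumptions for illustration only.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["V1", "V2", "V3"])
data["Class"] = (rng.random(1000) < 0.01).astype(int)  # roughly 1% marked as fraud

# x holds the parameters, y holds the class labels (0 = normal, 1 = fraudulent).
x = data.drop(columns=["Class"])
y = data["Class"]

# Fraction of fraudulent rows, a quick measure of the class imbalance.
fraud_ratio = y.mean()
print(x.shape, y.shape, round(fraud_ratio, 4))
```

Since the labels are 0 and 1, `y.mean()` gives the fraction of fraudulent rows directly.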

Data preprocessing

Plotted the correlation matrix to check for any relationships between the parameters.
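A minimal sketch of the correlation step, assuming the data sits in a pandas DataFrame. The three columns here are synthetic and purely illustrative: `V2` is built to depend on `V1`, while `V3` is independent noise.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
base = rng.normal(size=500)

# Hypothetical columns: V2 is strongly tied to V1, V3 is unrelated noise.
data = pd.DataFrame({
    "V1": base,
    "V2": 2.0 * base + rng.normal(scale=0.1, size=500),
    "V3": rng.normal(size=500),
})

# Pearson correlation between every pair of columns.
corr = data.corr()
print(corr.round(2))
```

The resulting matrix can be drawn as a heat map by passing it to `seaborn.heatmap(corr)`. Entries near 1 or -1 indicate strongly related parameters, and entries near 0 indicate little linear relationship.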

Performed a basic downsampling to reduce the class imbalance.
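One common way to downsample is to keep every fraudulent row and only a random fraction of the normal rows. The sketch below shows that idea on synthetic data. The 10% fraction, the `Class` column name, and the data itself are assumptions for illustration, not necessarily what this project used.

```python
import numpy as np
import pandas as pd

# Synthetic data: 10,000 rows, of which the first 20 are marked as fraud.
rng = np.random.default_rng(0)
data = pd.DataFrame({"V1": rng.normal(size=10_000)})
data["Class"] = 0
data.loc[:19, "Class"] = 1  # label-based slice, so rows 0..19 inclusive

fraud = data[data["Class"] == 1]
normal = data[data["Class"] == 0]

# Keep all fraud rows and a random 10% of the normal rows (arbitrary fraction).
normal_sampled = normal.sample(frac=0.1, random_state=1)

# Concatenate and shuffle so the classes are not grouped together.
downsampled = pd.concat([fraud, normal_sampled]).sample(frac=1, random_state=1)

ratio_before = data["Class"].mean()
ratio_after = downsampled["Class"].mean()
print(len(downsampled), ratio_before, round(ratio_after, 4))
```

Downsampling the majority class discards data, so the kept fraction is a trade-off between balance and how much normal behaviour the model still sees.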

Algorithms used

Isolation forest & Local outlier factor.
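Both algorithms are available in scikit-learn as `IsolationForest` and `LocalOutlierFactor`. The sketch below fits each on synthetic two-dimensional data. The data, the `contamination` value, and `n_neighbors=20` are assumptions for illustration rather than the settings used in this project.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# Synthetic data: 500 normal points near the origin, 10 points far away.
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
outliers = rng.uniform(low=6.0, high=8.0, size=(10, 2))
X = np.vstack([normal, outliers])

# Expected share of outliers; here taken from the known synthetic ratio.
contamination = 10 / 510

# Both models are fit without labels and return 1 for inliers, -1 for outliers.
iso = IsolationForest(contamination=contamination, random_state=0)
iso_pred = iso.fit_predict(X)

lof = LocalOutlierFactor(n_neighbors=20, contamination=contamination)
lof_pred = lof.fit_predict(X)

# Map to the dataset's convention: 0 = normal, 1 = fraudulent.
iso_labels = (iso_pred == -1).astype(int)
lof_labels = (lof_pred == -1).astype(int)
print(iso_labels.sum(), lof_labels.sum())
```

Neither model sees the class labels during fitting, which is what makes the approach unsupervised. The labels are only needed afterwards, to evaluate the predictions.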

Results

The second algorithm listed above (Local outlier factor) worked better than the first (Isolation forest). A support vector machine wasn't used because it would take longer to train.

Achieved a precision of 0.78 and a recall of 0.67 on the minority (fraudulent) class.
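For reference, precision and recall on the fraudulent class can be computed with scikit-learn as sketched below. The labels here are made up for the example and are not the project's results.

```python
from sklearn.metrics import precision_score, recall_score

# Made-up labels: 1 = fraudulent, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# 3 true positives, 2 false positives, 1 false negative.
# precision = TP / (TP + FP) = 3 / 5
# recall    = TP / (TP + FN) = 3 / 4
precision = precision_score(y_true, y_pred, pos_label=1)
recall = recall_score(y_true, y_pred, pos_label=1)
print(precision, recall)
```

On a highly imbalanced dataset these two numbers are more informative than accuracy, since a model that labels every transaction as normal would still be almost always correct.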
