Customer-Segmentation

Business Value

Customer segmentation is the process of grouping customers by shared characteristics, typically derived from their interactions with the business. In most cases these interactions are captured as purchase behavior and patterns. The resulting groups are useful for targeting marketing campaigns, identifying potentially profitable customers and developing customer loyalty.

Problem Statement

To identify clusters of customers based on their purchase behaviour, taking into account the recency, frequency and monetary value of their transactions.

Data

Each row of data represents a transaction and each column contains a transaction's attributes.

InvoiceNo : A unique identifier for the invoice. An invoice number shared across rows means that those transactions were performed in a single invoice (multiple purchases).

StockCode : Identifier for items contained in an invoice.

Description : Textual description of each stock item.

Quantity : The quantity of the item purchased.

InvoiceDate : Date of purchase.

UnitPrice : Price per unit of the item.

CustomerID : Identifier for customer making the purchase.

Country : Country of customer.

Approach

Loading Dependencies
Loading Data
Data Exploration
Data Processing
Focussing on One Market (UK in this case)
Building Recency Feature
Calculating Frequency and Monetary Values
Customer Segmentation (K-means algorithm, silhouette score metric)
Visualize Customer Segments
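
The feature-building steps in the list above can be sketched roughly as follows. This is a dependency-free illustration, not the notebook's actual code: the sample rows are made up, the country label "United Kingdom", the function name build_rfm, and the choices of "days since the last purchase, measured from the latest date in the data" for recency and "number of distinct invoices" for frequency are all assumptions.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows using the column names from the Data section.
transactions = [
    {"InvoiceNo": "536365", "Quantity": 6, "UnitPrice": 2.5,
     "InvoiceDate": datetime(2011, 12, 1), "CustomerID": 17850, "Country": "United Kingdom"},
    {"InvoiceNo": "536365", "Quantity": 2, "UnitPrice": 5.0,
     "InvoiceDate": datetime(2011, 12, 1), "CustomerID": 17850, "Country": "United Kingdom"},
    {"InvoiceNo": "536400", "Quantity": 1, "UnitPrice": 20.0,
     "InvoiceDate": datetime(2011, 12, 5), "CustomerID": 17850, "Country": "United Kingdom"},
    {"InvoiceNo": "536367", "Quantity": 3, "UnitPrice": 4.0,
     "InvoiceDate": datetime(2011, 12, 9), "CustomerID": 13047, "Country": "United Kingdom"},
    {"InvoiceNo": "536370", "Quantity": 10, "UnitPrice": 1.0,
     "InvoiceDate": datetime(2011, 12, 10), "CustomerID": 12583, "Country": "France"},
]


def build_rfm(rows, country="United Kingdom"):
    """Return {CustomerID: (recency_days, frequency, monetary)} for one market."""
    # Keep a single market and drop rows without a customer identifier.
    rows = [r for r in rows if r["Country"] == country and r["CustomerID"] is not None]
    # Assumption: recency is measured from the latest purchase date in the data.
    reference_date = max(r["InvoiceDate"] for r in rows)

    last_purchase = {}
    invoices = defaultdict(set)
    spend = defaultdict(float)
    for r in rows:
        cid = r["CustomerID"]
        last_purchase[cid] = max(last_purchase.get(cid, r["InvoiceDate"]), r["InvoiceDate"])
        invoices[cid].add(r["InvoiceNo"])            # frequency counts distinct invoices
        spend[cid] += r["Quantity"] * r["UnitPrice"]  # monetary sums line amounts

    return {
        cid: ((reference_date - last_purchase[cid]).days, len(invoices[cid]), round(spend[cid], 2))
        for cid in last_purchase
    }


rfm = build_rfm(transactions)
for customer_id, (recency, frequency, monetary) in rfm.items():
    print(customer_id, recency, frequency, monetary)
```

Counting distinct invoices rather than rows keeps a multi-item invoice from inflating a customer's frequency.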

We use the silhouette score to find the optimal number of clusters during clustering.
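
To make the selection step concrete, here is a small self-contained sketch of choosing the number of clusters by silhouette score. It is an illustration under assumptions, not the notebook's code: the points are synthetic, the helper names kmeans and silhouette_score are defined locally, and the k-means below uses a simple deterministic farthest-point initialisation instead of whatever a library implementation would do. In practice the RFM features would usually be scaled before clustering, which this sketch skips.

```python
import math


def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points; singleton clusters score 0."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_label, other in clusters.items() if other_label != label)
        total += (b - a) / max(a, b)
    return total / len(points)


# Synthetic 2-D points forming three well-separated groups (illustrative only).
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (1, 10), (1, 11), (2, 10), (2, 11)]

scores = {k: silhouette_score(points, kmeans(points, k)) for k in (3, 4, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

A score close to 1 means points sit much nearer to their own cluster than to the next closest one, so the candidate with the highest mean score is taken as the best fit.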

Data Exploration

Sales by Country

Top 15 customers contributing to 10.5% of total sales

Sales Recency

Processed Data

Data with Recency, Frequency and Monetary features

Model Building and Clustering

We built models with 3, 4 and 5 clusters and compared their silhouette scores. The results are as follows:

Clusters 3

There is a stark difference in the monetary value of customers across clusters.
Cluster 2 contains the high-value customers who shop frequently and is certainly an important segment for any business.
Clusters 0 and 1 contain customer groups with low and medium spend.

Clusters 4

The high-value customers are subdivided into two groups: one with lower spend and lower frequency (cluster 0), and another with higher spend and higher frequency but lower recency (cluster 1).

Clusters 5

With 5 clusters we again have two subgroups of higher-spend customers, plus three subgroups of lower-spend customers with varying frequency and recency.
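
Descriptions like the ones above are typically read off a per-cluster summary of the RFM features. The sketch below shows one way such a summary could be produced. The rows and labels are invented for illustration and do not reproduce the project's results, and profile_clusters is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (recency_days, frequency, monetary) rows and their cluster labels.
rfm_rows = [(5, 12, 4000.0), (3, 10, 6000.0),
            (40, 3, 600.0), (60, 1, 200.0),
            (200, 1, 100.0), (300, 1, 50.0)]
labels = [2, 2, 1, 1, 0, 0]


def profile_clusters(rows, labels):
    """Return per-cluster size and mean recency, frequency and monetary value."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return {
        label: {
            "customers": len(members),
            "recency": mean(r[0] for r in members),
            "frequency": mean(r[1] for r in members),
            "monetary": mean(r[2] for r in members),
        }
        for label, members in groups.items()
    }


profile = profile_clusters(rfm_rows, labels)
# Order clusters from highest to lowest average spend.
ranked = sorted(profile, key=lambda label: profile[label]["monetary"], reverse=True)
for label in ranked:
    print(label, profile[label])
```

Sorting by average spend makes it easy to see which cluster holds the high-value customers and how its recency and frequency compare with the rest.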

Visualizing Clusters

Amount vs Frequency

Recency vs Amount

Recency vs Frequency

Conclusion

Going by mathematical metrics alone, the silhouette score is highest for 3 clusters, suggesting that 3 is the optimal number of clusters for this dataset. However, we also need to include business metrics and domain insights in the modelling process to obtain the best-suited, data-focussed solution for the business problem at hand :-)

