Week #7 in Machine Learning

Applications of Unsupervised Machine Learning Models

3 min read · Mar 21, 2022

Suppose you own a supermarket that offers loyalty cards to its customers. From these cards, you get customer demographic data. You also get their spending data, to which you assign a score based on parameters that you define from customer behavior.

Problem statement

You want to understand the customers according to their different attributes and group them.

Exploratory Data Analysis

There are 200 rows and 5 columns in our dataset.

df['Age'].hist();

There are 112 female customers and 88 male customers in the dataset.

ax = df['Gender'].value_counts().plot.bar()

df['Spending Score (1-100)'].hist(bins=[20, 40, 60, 80, 100]);

df['Annual Income (k$)'].hist();

df.isnull().sum()

CustomerID                0
Gender                    0
Age                       0
Annual Income (k$)        0
Spending Score (1-100)    0
dtype: int64
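The checks above can be reproduced end to end with a short script. This is a minimal sketch, not the original notebook: the article does not show how `df` was loaded, so the frame below is a synthetic stand-in with random values. Only the column names and the row count come from the article, so the printed counts will not match the 112/88 split reported above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the loyalty-card data. The values are random
# placeholders; only the column names and the row count come from the article.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "CustomerID": np.arange(1, n + 1),
    "Gender": rng.choice(["Female", "Male"], size=n),
    "Age": rng.integers(18, 71, size=n),
    "Annual Income (k$)": rng.integers(15, 138, size=n),
    "Spending Score (1-100)": rng.integers(1, 101, size=n),
})

n_rows, n_cols = df.shape                    # (rows, columns)
gender_counts = df["Gender"].value_counts()  # customers per gender
missing = df.isnull().sum()                  # missing values per column

print(f"{n_rows} rows, {n_cols} columns")
print(gender_counts)
print(missing)
```

With the real data, the `pd.DataFrame(...)` call would be replaced by whatever loads the loyalty-card export.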

Summary

A majority of the customers are aged between 20 and 40 years.
There are more female than male customers (112 and 88, respectively).
The dataset does not have missing values.

Let’s look at a heatmap
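The article does not say what the heatmap plots or which library drew it. A common choice is the pairwise Pearson correlation between the numeric columns, so the sketch below assumes that. It uses plain matplotlib and a synthetic stand-in for the data (random values, column names taken from the article), so the correlations it shows are meaningless and only the mechanics carry over.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in: random values, column names from the article.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Age": rng.integers(18, 71, size=n),
    "Annual Income (k$)": rng.integers(15, 138, size=n),
    "Spending Score (1-100)": rng.integers(1, 101, size=n),
})

# Pairwise Pearson correlation between the numeric columns.
corr = df.corr()

fig, ax = plt.subplots(figsize=(5, 4))
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)

# Write each coefficient inside its cell.
for i in range(corr.shape[0]):
    for j in range(corr.shape[1]):
        ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")

fig.colorbar(image, ax=ax, label="Pearson correlation")
fig.tight_layout()
# In a notebook the figure renders on its own; in a script, call fig.savefig(...).
```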


Clustering

Cluster on two features

Next, we look at how the data is distributed using a scatterplot. After that, we cluster the data using K-means and plot the results.
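The clustering step might look like the sketch below, using scikit-learn's `KMeans`. Several things here are assumptions: the article does not name the two features, so annual income and spending score are a guess; `n_init` and `random_state` are illustrative settings; and the data is a synthetic stand-in with random values. The five clusters and the `Label` and `df1_clusters` names follow the output shown in the article.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in: random values, column names from the article.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Annual Income (k$)": rng.integers(15, 138, size=n),
    "Spending Score (1-100)": rng.integers(1, 101, size=n),
})

# Assumed feature pair; the article does not say which two columns were used.
features = ["Annual Income (k$)", "Spending Score (1-100)"]

# Five clusters, matching the five labels (0-4) in the article's output.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
df["Label"] = kmeans.fit_predict(df[features])

# Number of customers assigned to each cluster.
df1_clusters = df["Label"].value_counts()
print(df1_clusters)
```

Cluster numbers are arbitrary identifiers, so a different seed can swap which group gets which label without changing the grouping itself.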

4    81
0    39
2    35
1    23
3    22
Name: Label, dtype: int64

df1_clusters.plot.bar();

Cluster on three features

2    79
0    39
1    36
3    23
4    23
Name: Label, dtype: int64

df1_clusters.plot.bar();
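Cluster sizes alone do not say what distinguishes the groups. One way to read them, sketched here under assumptions, is to average each feature per cluster. The three features (age, annual income, spending score) are a guess, since the article does not list them, and the data is again a synthetic stand-in with random values, so the resulting profile is illustrative only.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in: random values, column names from the article.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Age": rng.integers(18, 71, size=n),
    "Annual Income (k$)": rng.integers(15, 138, size=n),
    "Spending Score (1-100)": rng.integers(1, 101, size=n),
})

# Assumed feature set for the three-feature clustering.
features = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]
df["Label"] = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(df[features])

# Mean of each feature per cluster, plus the cluster size.
profile = df.groupby("Label")[features].mean().round(1)
profile["Customers"] = df["Label"].value_counts().sort_index()
print(profile)
```

Each row of `profile` describes one customer group, which is the kind of summary the problem statement asks for.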