Week #7 in Machine Learning

Applications of Unsupervised Machine Learning Models

Eliud Nduati
3 min read · Mar 21, 2022

Suppose you own a supermarket that offers loyalty cards to its customers. From these cards, you get customer demographic data. You also get their spending data, to which you assign a score based on parameters that you define from customer behavior.

Problem statement

You want to understand the customers according to their different attributes and group them.


Exploratory Data Analysis

There are 200 rows and 5 columns in our dataset.

df['Age'].hist();

[Figure: histogram of Age]

There are 112 Female customers and 88 Male customers in the dataset.

ax = df['Gender'].value_counts().plot.bar()

[Figure: bar chart of customer counts by Gender]

df['Spending Score (1-100)'].hist(bins=[20, 40, 60, 80, 100]);

[Figure: histogram of Spending Score (1-100)]

df['Annual Income (k$)'].hist();

[Figure: histogram of Annual Income (k$)]

df.isnull().sum()

CustomerID                0
Gender                    0
Age                       0
Annual Income (k$)        0
Spending Score (1-100)    0
dtype: int64

Summary

  • A majority of the customers are aged between 20 and 40 years.
  • There are more female than male customers (112 and 88, respectively).
  • The dataset does not have missing values.
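The checks behind this summary can be sketched as follows. This is a minimal illustration on a synthetic stand-in frame, not the article's dataset: only the column names and the 112/88 gender split come from the text, and the value ranges used to generate ages, incomes, and scores are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the loyalty-card data (NOT the article's dataset).
# Column names and the 112/88 gender split come from the text above;
# the numeric ranges are assumptions made purely for illustration.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "CustomerID": np.arange(1, n + 1),
    "Gender": ["Female"] * 112 + ["Male"] * 88,
    "Age": rng.integers(18, 71, size=n),
    "Annual Income (k$)": rng.integers(15, 138, size=n),
    "Spending Score (1-100)": rng.integers(1, 101, size=n),
})

# Number of rows and columns
print(df.shape)

# Customers per gender
gender_counts = df["Gender"].value_counts()
print(gender_counts)

# Missing values per column
missing = df.isnull().sum()
print(missing)

# Share of customers aged 20 to 40 (inclusive)
share_20_40 = df["Age"].between(20, 40).mean()
print(f"Share aged 20-40: {share_20_40:.0%}")
```

On the real data, the same calls would replace reading the numbers off the plots by eye.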

Let’s look at a heatmap.

[Figure: heatmap]
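The article does not show how the heatmap was produced. One common choice, assumed here, is a heatmap of the pairwise Pearson correlations between the numeric columns. The sketch below uses plain matplotlib and a synthetic stand-in frame, so the values it produces say nothing about the real data.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in with the article's numeric column names (assumed ranges).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Age": rng.integers(18, 71, size=200),
    "Annual Income (k$)": rng.integers(15, 138, size=200),
    "Spending Score (1-100)": rng.integers(1, 101, size=200),
})

# Pairwise Pearson correlation between the numeric columns
corr = df.select_dtypes("number").corr()

# Draw the correlation matrix as a colored grid
fig, ax = plt.subplots()
im = ax.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)
fig.colorbar(im, ax=ax, label="Pearson correlation")
fig.tight_layout()
```

Fixing the color scale to the range -1 to 1 keeps the colors comparable across datasets.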

Clustering

Cluster on two features


Next, we look at how the data is distributed using a scatterplot. After that, we cluster the data using K-means and plot the results.

[Figures: scatterplot of the data and K-means clustering results]
4    81
0    39
2    35
1    23
3    22
Name: Label, dtype: int64

df1_clusters.plot.bar();

[Figure: bar chart of df1_clusters]
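A minimal sketch of the two-feature clustering step, using scikit-learn's `KMeans`. The article does not say which two columns were used, so the income/spending-score pair here is an assumption. Five clusters matches the five labels in the output above. The data is a synthetic stand-in, so the cluster sizes will not match the article's counts.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in data (NOT the article's dataset).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Annual Income (k$)": rng.integers(15, 138, size=200),
    "Spending Score (1-100)": rng.integers(1, 101, size=200),
})

# Assumed feature pair; the article does not name the two columns it used.
features = ["Annual Income (k$)", "Spending Score (1-100)"]

# Fit K-means with 5 clusters and store each row's cluster label
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
df["Label"] = kmeans.fit_predict(df[features])

# Number of customers assigned to each cluster
label_counts = df["Label"].value_counts()
print(label_counts)
```

The label numbers themselves are arbitrary. Only the grouping they define carries meaning, so the same cluster can receive a different number from one run to the next.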

Cluster on three features

2    79
0    39
1    36
3    23
4    23
Name: Label, dtype: int64

df1_clusters.plot.bar();

[Figure: bar chart of df1_clusters]
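The article does not show how `df1_clusters` was built. One plausible reading, assumed here, is a table of per-cluster feature means, which is the kind of summary that suits a bar chart. The sketch clusters on three columns (adding `Age` is also an assumption) and computes that table on synthetic stand-in data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in data (NOT the article's dataset).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Age": rng.integers(18, 71, size=200),
    "Annual Income (k$)": rng.integers(15, 138, size=200),
    "Spending Score (1-100)": rng.integers(1, 101, size=200),
})

# Assumed feature triple; the article does not list the three columns it used.
features = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]

# Cluster on all three features at once
df["Label"] = KMeans(n_clusters=5, n_init=10, random_state=42).fit_predict(df[features])

# Hypothetical per-cluster summary: mean of each feature within each cluster
cluster_profile = df.groupby("Label")[features].mean()
print(cluster_profile.round(1))
```

Calling `cluster_profile.plot.bar()` on such a table would draw one group of bars per cluster, which makes it easy to describe each segment, for example as younger customers with high spending scores.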
