Week #7 in Machine Learning
Applications of Unsupervised Machine Learning Models
Mar 21, 2022
Suppose you own a supermarket that offers loyalty cards to its customers. These cards give you customer demographic data. They also give you spending data, from which you assign each customer a spending score based on parameters you define from their behavior.
Problem statement
You want to understand your customers according to their different attributes and group them accordingly.
Exploratory Data Analysis
There are 200 rows and 5 columns in our dataset.
df['Age'].hist();
There are 112 female customers and 88 male customers in the dataset.
ax = df['Gender'].value_counts().plot.bar()
df['Spending Score (1-100)'].hist(bins=[20, 40, 60, 80, 100]);
df['Annual Income (k$)'].hist();
df.isnull().sum()
CustomerID 0
Gender 0
Age 0
Annual Income (k$) 0
Spending Score (1-100) 0
dtype: int64
Summary
- A majority of the customers are aged between 20 and 40 years.
- There are more female than male customers (112 and 88, respectively).
- The dataset does not have missing values.
Let’s look at a heatmap
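The plotting call for the heatmap is not shown here, so the following is only a minimal sketch of one way to draw a correlation heatmap. It assumes the heatmap shows pairwise correlations of the numeric columns, uses plain matplotlib rather than whatever library produced the original figure, and builds a synthetic stand-in `df` with random values (not the real customer data) so that it runs on its own.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for the article's DataFrame (NOT the real data):
# same column names and 200 rows, but random values.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "CustomerID": np.arange(1, 201),
    "Gender": rng.choice(["Female", "Male"], size=200),
    "Age": rng.integers(18, 71, size=200),
    "Annual Income (k$)": rng.integers(15, 138, size=200),
    "Spending Score (1-100)": rng.integers(1, 101, size=200),
})

# Pairwise Pearson correlations between the numeric features.
numeric_cols = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]
corr = df[numeric_cols].corr()

# Draw the matrix as a heatmap and write each coefficient in its cell.
fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(numeric_cols)))
ax.set_yticks(range(len(numeric_cols)))
ax.set_xticklabels(numeric_cols, rotation=45, ha="right")
ax.set_yticklabels(numeric_cols)
for i in range(len(numeric_cols)):
    for j in range(len(numeric_cols)):
        ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")
fig.colorbar(im, ax=ax)
fig.tight_layout()
```

With the real `df` in place of the synthetic one, cells far from zero point to pairs of features that move together.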
Clustering
Cluster on two features
Next, we look at how the data is distributed using a scatterplot. After that, we cluster the data using k-means and plot the results.
4 81
0 39
2 35
1 23
3 22
Name: Label, dtype: int64
df1_clusters.plot.bar();
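The two-feature clustering step above can be sketched roughly as follows. Several things here are assumptions rather than the original code: the feature pair (annual income and spending score), the `random_state`, and the synthetic stand-in `df` with random values. The five clusters match the five labels in the counts above, and `df1_clusters` is assumed to hold the per-label counts. Because the data is synthetic, the cluster sizes will not match those counts.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Synthetic stand-in for the article's DataFrame (NOT the real data).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Age": rng.integers(18, 71, size=200),
    "Annual Income (k$)": rng.integers(15, 138, size=200),
    "Spending Score (1-100)": rng.integers(1, 101, size=200),
})

# Assumed feature pair; the article does not name the two columns it used.
features = ["Annual Income (k$)", "Spending Score (1-100)"]
df1 = df[features].copy()

# Fit k-means with 5 clusters and store each row's cluster label.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
df1["Label"] = kmeans.fit_predict(df1[features])

# Scatterplot of the two features, colored by cluster label.
fig, ax = plt.subplots()
ax.scatter(df1[features[0]], df1[features[1]], c=df1["Label"], cmap="viridis")
ax.set_xlabel(features[0])
ax.set_ylabel(features[1])

# Number of customers per cluster, as in the bar chart above.
df1_clusters = df1["Label"].value_counts()
print(df1_clusters)
```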
Cluster on three features
2 79
0 39
1 36
3 23
4 23
Name: Label, dtype: int64
df1_clusters.plot.bar();
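The three-feature run can be sketched the same way. The helper `cluster_sizes` below is hypothetical (it is not from the original notebook), the choice of `Age` as the third feature is an assumption, and the data is again a synthetic stand-in with random values, so the printed sizes will differ from the counts above.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in for the article's DataFrame (NOT the real data).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Age": rng.integers(18, 71, size=200),
    "Annual Income (k$)": rng.integers(15, 138, size=200),
    "Spending Score (1-100)": rng.integers(1, 101, size=200),
})


def cluster_sizes(data, features, k=5, seed=42):
    """Hypothetical helper: fit k-means on `features`, return cluster sizes."""
    model = KMeans(n_clusters=k, n_init=10, random_state=seed)
    labels = model.fit_predict(data[features])
    return pd.Series(labels, name="Label").value_counts()


# Assumed three-feature set: the two used before plus Age.
three_features = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]
sizes = cluster_sizes(df, three_features)
print(sizes)
```

K-means relies on Euclidean distance, so a feature with a wider numeric range pulls more weight in the grouping. Whether the original run rescaled the columns is not stated, and this sketch leaves them unscaled.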