Problem description
Segment customers into groups based on Airbnb user information
Data fields

image.png

I. Data preparation

1. Load the data

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# Load the data; the file name must be passed as a string
airbnb = pd.read_csv('airbnb_kmeans.csv')
airbnb.describe()
airbnb.info()

1.1 The age column contains outliers, such as "2 years old" and "2014 years old"

age.png
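
One way to surface such outliers without reading the full `describe()` table is to look only at the minimum and maximum and to count the rows outside a plausible range. The sketch below uses a small made-up `demo` frame rather than the real data; the bounds 5 and 100 are the ones used for filtering later in this post.

```python
import pandas as pd

# Hypothetical sample standing in for the real age column
demo = pd.DataFrame({"age": [2, 25, 31, 47, 2014, None]})

# The min and max reported by describe() expose implausible values at a glance
summary = demo["age"].describe()
print(summary[["min", "max"]])

# Rows outside the plausible range; missing ages compare as False and are not counted
implausible = demo[(demo["age"] <= 5) | (demo["age"] >= 100)]
print(len(implausible))
```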

1.2 date_account_created and date_first_booking are of object dtype, and gender is a string variable

image.png

2. Handling outliers

2.1 Cleaning age

# Keep only users older than 5 and younger than 100
airbnb = airbnb[airbnb['age']>5]
airbnb = airbnb[airbnb['age']<100]
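
The same filter can be written as a single boolean mask. The sketch below runs on a made-up `demo` frame and also shows a side effect worth knowing: rows with a missing age are dropped too, because comparisons against NaN evaluate to False. The `.copy()` is an optional addition that avoids chained-assignment warnings when new columns are added afterwards.

```python
import pandas as pd

# Hypothetical ages, including boundary values and a missing entry
demo = pd.DataFrame({"age": [2, 5, 25, 99, 100, 2014, None]})

# Both conditions combined; each side needs its own parentheses
mask = (demo["age"] > 5) & (demo["age"] < 100)
cleaned = demo[mask].copy()

print(cleaned["age"].tolist())
```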

2.2 Processing the date variables

# Convert the object columns to datetime
airbnb['date_account_created']=pd.to_datetime(airbnb['date_account_created'])
airbnb['date_first_booking']=pd.to_datetime(airbnb['date_first_booking'])

# New variable: number of years since the user registered
airbnb['year_date_account_created']=airbnb['date_account_created'].map(lambda x:2020-x.year)
# New variable: number of years since the user's first booking
airbnb['year_date_first_booking']=airbnb['date_first_booking'].map(lambda x:2020-x.year)
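
A vectorized alternative to the `map` with a lambda is the `.dt.year` accessor. This is a minimal sketch on made-up dates; `REFERENCE_YEAR` is a name introduced here for the hard-coded 2020. Missing dates become NaT after conversion and propagate as NaN in the result.

```python
import pandas as pd

# Hypothetical booking dates, one of them missing
demo = pd.DataFrame({"date_first_booking": ["2014-03-01", None, "2012-11-20"]})
demo["date_first_booking"] = pd.to_datetime(demo["date_first_booking"])

REFERENCE_YEAR = 2020  # same reference year as in the post

# .dt.year extracts the year for the whole column at once; NaT yields NaN
demo["year_date_first_booking"] = REFERENCE_YEAR - demo["date_first_booking"].dt.year
print(demo)
```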

2.3 Encoding the gender string variable

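# One-hot encode gender; get_dummies returns one column per category, so its result cannot be assigned to a single column
airbnb = pd.get_dummies(airbnb, columns=['gender'])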
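
To see what `pd.get_dummies` produces, here is a small sketch on a made-up frame. The gender values `MALE`, `FEMALE` and `OTHER` are assumptions for illustration; the real column may contain different categories. Passing `columns=` replaces the original column with one indicator column per category and leaves the other columns untouched.

```python
import pandas as pd

# Hypothetical rows; the category values are assumed, not taken from the dataset
demo = pd.DataFrame({
    "gender": ["MALE", "FEMALE", "FEMALE", "OTHER"],
    "age": [30, 25, 41, 36],
})

# One indicator column per category, named <column>_<category>
encoded = pd.get_dummies(demo, columns=["gender"])
print(encoded.columns.tolist())
```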

II. Modeling

1. Use the elbow method to choose the number of clusters

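from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
%matplotlib inline

# airbnb_5 is the feature subset used for clustering; its construction is not shown in this post
kmeans_score = []  # inertia of each fitted model
for i in range(1, 11):
    model = KMeans(n_clusters=i, random_state=10)
    model.fit(airbnb_5)
    kmeans_score.append(model.inertia_)  # inertia: sum of squared distances from each point to its nearest cluster center
plt.plot(range(1, 11), kmeans_score)
plt.xlabel('clusters')
plt.ylabel('inertia')
plt.show()  # use the elbow method to find the most suitable number of clusters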

The curve shows an elbow at n=3, so the number of clusters is set to 3.
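
The elbow can also be read numerically rather than from the plot, by looking at how much inertia each additional cluster removes. The sketch below is illustrative only: it uses synthetic data from `make_blobs` with three well-separated groups, not the Airbnb features, and sets `n_init=10` explicitly so the behaviour does not depend on the scikit-learn version.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with three well-separated groups (illustrative only)
X, _ = make_blobs(n_samples=300, centers=[[0, 0], [10, 10], [-10, 10]],
                  cluster_std=0.5, random_state=10)

inertias = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=10).fit(X)
    inertias.append(km.inertia_)

# Reduction in inertia gained by going from k-1 to k clusters
drops = -np.diff(inertias)
for k, drop in zip(range(2, 11), drops):
    print(k, round(float(drop), 1))
```

On data like this the reduction is large up to k=3 and small afterwards, which is the numeric counterpart of the bend in the curve.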

image.png

2. Fit three clusters


model = KMeans(n_clusters=3, random_state=10)  # note the two capital letters in KMeans
model.fit(airbnb_5)
result = model.predict(airbnb_5)
airbnb['K-means'] = result  # create the cluster label column
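
Fitting and predicting on the same data can be done in one call with `fit_predict`, and a quick look at the cluster sizes is a useful sanity check before interpreting anything. This is a sketch on synthetic data; the column names `x1` and `x2` are placeholders. Note that assigning the label array back to a DataFrame only works when it has exactly as many rows as the data that was clustered.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic features: three groups of 50, 30 and 20 points (illustrative only)
rng = np.random.default_rng(10)
features = pd.DataFrame(
    np.vstack([
        rng.normal(0, 0.5, size=(50, 2)),
        rng.normal(10, 0.5, size=(30, 2)),
        rng.normal(-10, 0.5, size=(20, 2)),
    ]),
    columns=["x1", "x2"],
)

model = KMeans(n_clusters=3, n_init=10, random_state=10)
labels = model.fit_predict(features)  # fit and label in one step

# Attach the labels and count the members of each cluster
labeled = features.assign(cluster=labels)
sizes = labeled["cluster"].value_counts().sort_index()
print(sizes)
```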

3. Inspect the cluster centers and interpret them

# Inspect the center of each cluster
model.cluster_centers_

[Figure: cluster centers]

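Interpretation:
1. Cluster 1 skews toward Android users who prefer to book from a PC
2. Cluster 2 books through the mobile web
3. Cluster 3 are Apple fans, characterized by rarely booking through the PC website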
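
A reading like the one above is easier to make when the raw `cluster_centers_` array is wrapped in a DataFrame that carries the feature names. The sketch below uses synthetic data; the column names `android`, `web_pc`, `web_mobile` and `ios` are hypothetical stand-ins, since the actual columns of `airbnb_5` are not listed in this post.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

cols = ["android", "web_pc", "web_mobile", "ios"]  # hypothetical feature names

# Three made-up user profiles with a little noise around each
rng = np.random.default_rng(10)
prototypes = np.array([
    [0.9, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.9],
])
features = pd.DataFrame(
    np.vstack([p + rng.normal(0, 0.05, size=(40, 4)) for p in prototypes]),
    columns=cols,
)

model = KMeans(n_clusters=3, n_init=10, random_state=10).fit(features)

# One row per cluster, one named column per feature
centers = pd.DataFrame(model.cluster_centers_, columns=features.columns).round(2)
print(centers)
```

Each row can then be read directly: the features with the highest values in a row describe what sets that cluster apart.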

image.png

Airbnb customer segmentation: a simple application of KMeans
