
K-Means Clustering with Scikit-Learn

언제나휴일 2020. 8. 6. 18:32

import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
import seaborn as sns

# Ten sample (x, y) points; building the frame in one call keeps both columns numeric.
df = pd.DataFrame({
    'x': [1, 1, 2, 2, 1, 2, 3, 4, 4, 5],
    'y': [4, 3, 5, 2, 12, 13, 12, 6, 8, 7],
})
print(df)

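# Scatter plot of the raw points (fit_reg=False turns off the regression line).
# Recent seaborn releases require x and y to be passed as keyword arguments.
sns.lmplot(x='x', y='y', data=df, fit_reg=False, scatter_kws={"s": 20})
plt.show()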

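# Use only the coordinate columns, so a re-run does not feed 'label' back in.
data_points = df[['x', 'y']].values
# n_init and random_state are set explicitly so repeated runs give the same result.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(data_points)
# Attach each point's cluster index (0, 1 or 2) as a new column.
df['label'] = kmeans.labels_
print(df)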

# Same scatter plot, now colored by the assigned cluster label.
sns.lmplot(x='x', y='y', data=df, fit_reg=False, scatter_kws={"s": 20}, hue='label')
plt.show()
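
The listing above stops at the colored plot. As a rough follow-up sketch (not part of the original post), the fitted model can also be inspected directly: `cluster_centers_` holds the learned centroids, `inertia_` is the within-cluster sum of squared distances, and `predict` assigns new points to the nearest centroid. The `n_init` and `random_state` values and the two `new_points` are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Same ten (x, y) points as in the listing above.
points = np.array([
    [1, 4], [1, 3], [2, 5], [2, 2],
    [1, 12], [2, 13], [3, 12],
    [4, 6], [4, 8], [5, 7],
])

# n_init and random_state are assumptions, added only for reproducibility.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(points)

print(kmeans.cluster_centers_)  # one (x, y) centroid per cluster
print(kmeans.inertia_)          # sum of squared distances to the nearest centroid

# Hypothetical unseen points, assigned to the nearest learned centroid.
new_points = np.array([[2, 12], [1, 2]])
new_labels = kmeans.predict(new_points)
print(new_labels)
```

The label numbers themselves are arbitrary and can differ between runs or library versions, so compare points by whether they share a label rather than by the label's value.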