Thread: How to create correct scatter plot for PCA? (/thread-23113.html), Python Forum (https://python-forum.io), Forum: Data Science (https://python-forum.io/forum-44.html)
How to create correct scatter plot for PCA? - LK91 - Dec-11-2019

I clustered my high-dimensional data with k-means in Python, and afterwards I wanted to build a scatter plot using PCA. But my plot looks very strange and I don't understand why (image in attachment). I also found that the PCA components have negative values. Can someone advise how to build a correct scatter plot?

My main steps:
1. Normalize the data
2. K-means clustering
3. Create the scatter plot

My code:

    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import MinMaxScaler

    # Normalize data (dd is my original DataFrame)
    scaler = MinMaxScaler()
    new2 = pd.DataFrame(scaler.fit_transform(dd))

    # K-means
    kmeans = KMeans(n_clusters=5)
    kmeans.fit(new2)
    clusters = kmeans.predict(new2)

    # PCA and scatter plot
    pca = PCA(n_components=2)
    principalComponents = pca.fit_transform(new2)
    principalDf = pd.DataFrame(data=principalComponents,
                               columns=['principal component 1', 'principal component 2'])
    # new2 has no 'Cluster' column, so attach the k-means labels directly
    finalDf = principalDf.assign(Cluster=clusters)

    fig = plt.figure(figsize=(10, 10))
    ax = fig.add_subplot(1, 1, 1)
    ax.set_xlabel('Principal Component 1', fontsize=15)
    ax.set_ylabel('Principal Component 2', fontsize=15)
    ax.set_title('2 component PCA', fontsize=20)

    # k-means labels are integers, so compare against ints rather than strings
    targets = [0, 1, 2, 3, 4]
    colors = ['red', 'blue', 'black', 'pink', 'green']
    for target, color in zip(targets, colors):
        indicesToKeep = finalDf['Cluster'] == target
        ax.scatter(finalDf.loc[indicesToKeep, 'principal component 1'],
                   finalDf.loc[indicesToKeep, 'principal component 2'],
                   c=color, s=50)
    ax.legend([str(t) for t in targets])
    ax.grid()
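
On the negative values: scikit-learn's `PCA` subtracts each feature's mean before projecting, so the component scores average to zero and some of them will be negative even when the MinMax-scaled input lies in [0, 1]. Below is a minimal sketch of the same three steps on synthetic data. The random `data` frame, the seed, `n_init=10`, and the column names `pc1`, `pc2`, `cluster` are assumptions for illustration only, not the poster's data or settings.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the real data (assumption: 300 rows, 6 numeric features)
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(300, 6)))

# 1. Normalize: every feature is rescaled to the range [0, 1]
scaled = MinMaxScaler().fit_transform(data)

# 2. Cluster: fit_predict returns one integer label (0..4) per row
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(scaled)

# 3. Project to 2D: PCA centers the data internally before projecting
scores = PCA(n_components=2).fit_transform(scaled)

# Keep scores and labels side by side so each point carries its cluster
result = pd.DataFrame(scores, columns=["pc1", "pc2"])
result["cluster"] = labels

# Centering means each component averages to (numerically) zero,
# so negative scores are expected rather than a sign of a bug
component_means = result[["pc1", "pc2"]].mean()
has_negative = bool((scores < 0).any())
print(component_means)
print("any negative scores:", has_negative)
```

With the labels stored as an integer column, a single call such as `plt.scatter(result["pc1"], result["pc2"], c=result["cluster"])` should be enough to color the points by cluster, without looping over targets.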