Sep-22-2020, 02:49 PM
I want to use the silhouette index to evaluate each item in the array (X) against each cluster (0, 1, 2). I use the array (X) as an example, but my real dataset is far larger.
The result I am looking for is a silhouette score for each sample of the dataset with respect to each cluster, as in the output shown after the code. I tried this code:
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.metrics import silhouette_samples
import pandas as pd
import numpy as np
from sklearn_extra.cluster import KMedoids
from sklearn.metrics.pairwise import euclidean_distances

X = np.array([0.85142858, 0.85566274, 0.85364912, 0.81536489, 0.84929932])
X = X.reshape(-1, 1)

kmedoids = KMedoids(n_clusters=3, random_state=0).fit(X)
cluster_labels = kmedoids.predict(X)

df = pd.DataFrame({'label': kmedoids.labels_[kmedoids.medoid_indices_],
                   'medoid': np.squeeze(X[kmedoids.medoid_indices_]),
                   'index': kmedoids.medoid_indices_})

for i in range(len(X)):
    print()
    print("item", i + 1, X[i])
    for n_clusters in clusters:
        silhouette_samples = silhouette_samples(X[i], n_clusters)
        print("For clusters =", n_clusters, " The average silhouette_score is :", silhouette_samples)
Output:
item 1 [0.85142858]
For clusters = 0 The average silhouette_score is : ???
For clusters = 1 The average silhouette_score is : ???
For clusters = 2 The average silhouette_score is : ???
item 2 [0.85566274]
For clusters = 0 The average silhouette_score is : ???
For clusters = 1 The average silhouette_score is : ???
For clusters = 2 The average silhouette_score is : ???
item 3 [0.85364912]
For clusters = 0 The average silhouette_score is : ???
For clusters = 1 The average silhouette_score is : ???
For clusters = 2 The average silhouette_score is : ???
item 4 [0.81536489]
For clusters = 0 The average silhouette_score is : ???
For clusters = 1 The average silhouette_score is : ???
For clusters = 2 The average silhouette_score is : ???
item 5 [0.84929932]
For clusters = 0 The average silhouette_score is : ???
For clusters = 1 The average silhouette_score is : ???
For clusters = 2 The average silhouette_score is : ???
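One possible reading of the table above is "the silhouette value each sample would get if it were assigned to each cluster in turn". The sketch below computes that with NumPy only. It is an assumption about what is being asked, not an existing scikit-learn function: `silhouette_per_cluster` is a hypothetical helper, and the `labels` array is a made-up assignment used for illustration (in practice it would come from `kmedoids.labels_`). For a sample's own cluster the value should coincide with what `sklearn.metrics.silhouette_samples` returns, including the convention that a singleton cluster scores 0.

```python
import numpy as np

def silhouette_per_cluster(X, labels):
    """Hypothetical helper: entry [i, j] of the returned array is the
    silhouette value of sample i if it were assigned to the j-th cluster."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    n = len(X)
    # Full pairwise Euclidean distance matrix (needs O(n^2) memory).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    scores = np.full((n, len(clusters)), np.nan)
    for i in range(n):
        others = np.arange(n) != i  # never compare a sample with itself
        # Mean distance from sample i to the *other* members of each cluster;
        # NaN marks a cluster that has no member besides sample i.
        mean_dist = np.array([
            dist[i, others & (labels == c)].mean()
            if np.any(others & (labels == c)) else np.nan
            for c in clusters
        ])
        for j in range(len(clusters)):
            a = mean_dist[j]               # cohesion with candidate cluster j
            rest = np.delete(mean_dist, j)
            if np.isnan(a):
                scores[i, j] = 0.0         # singleton cluster: score 0 by convention
                continue
            if np.all(np.isnan(rest)):
                continue                   # no other cluster to compare against
            b = np.nanmin(rest)            # nearest alternative cluster
            denom = max(a, b)
            scores[i, j] = (b - a) / denom if denom > 0 else 0.0
    return clusters, scores

X = np.array([0.85142858, 0.85566274, 0.85364912, 0.81536489, 0.84929932]).reshape(-1, 1)
labels = np.array([0, 1, 1, 2, 0])  # hypothetical labels, stand-in for kmedoids.labels_

clusters, scores = silhouette_per_cluster(X, labels)
for i in range(len(X)):
    print()
    print("item", i + 1, X[i])
    for j, c in enumerate(clusters):
        print("For cluster =", c, " the silhouette value is :", scores[i, j])
```

The highest value in each row points at the cluster the sample fits best under this definition. Because the full distance matrix is built in memory, a much larger dataset may need to be processed in chunks.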