Purpose: I want to plot feature importance for data prediction, training, and testing.
Runtime error: AttributeError: 'DataFrame' object has no attribute 'Articles'
Error:
Traceback (most recent call last):
  File "D:/Clustering/text-cluster-master/similarity.py", line 68, in <module>
    y = X.Articles.copy()
  File "D:\Python3.8.0\Python\lib\site-packages\pandas\core\generic.py", line 5460, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'Articles'
Code lines:
y = X.Articles.copy()
X.drop(['Articles'], axis=1, inplace=True)
You are not showing enough code.
Show where X is defined.
X is defined as
X = pd.read_csv(r"D:\\Clustering\\text-cluster-master\\Articles.csv", error_bad_lines=False)
X.head()
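A side note on the read_csv call above: `error_bad_lines` was deprecated in pandas 1.3 and removed in pandas 2.0, where `on_bad_lines='skip'` takes its place. A minimal sketch, assuming pandas 1.3 or newer; the in-memory sample stands in for Articles.csv and its rows are made up for illustration:

```python
import io
import pandas as pd

# Hypothetical sample data; the real contents of Articles.csv are not shown in the thread.
sample_csv = io.StringIO(
    "Articles,score\n"
    "a,1\n"
    "b,2,extra\n"   # malformed row: one field too many
    "c,3\n"
)

# on_bad_lines='skip' drops rows with too many fields instead of raising an error.
df = pd.read_csv(sample_csv, on_bad_lines="skip")
print(df)
```

Skipping bad rows silently can hide problems in the file, so it may be worth counting how many rows were dropped.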
The error is
Error:
AttributeError: 'DataFrame' object has no attribute 'Articles'
I don't see how X could have any attribute named Articles.
It has never been defined.
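For context, pandas does expose a column as an attribute, but only when the column label matches the attribute name exactly. So this error most likely means the loaded DataFrame has no column labelled exactly `Articles`. Printing the parsed labels is a quick way to check. A minimal sketch, assuming the header contains stray whitespace (only one possible cause; the real header of Articles.csv is not shown in the thread, and the sample data here is made up):

```python
import io
import pandas as pd

# Hypothetical stand-in for Articles.csv with padded header labels.
sample_csv = io.StringIO(" Articles ,length\nsports,120\npolitics,340\n")
X = pd.read_csv(sample_csv)

# Inspect the labels pandas actually parsed from the header row.
print(X.columns.tolist())

# Attribute access needs an exact label match, so strip whitespace from the labels.
X.columns = X.columns.str.strip()

if "Articles" in X.columns:
    # pop() removes the column from X and returns it as a Series.
    y = X.pop("Articles")
else:
    raise KeyError(f"No 'Articles' column; found {X.columns.tolist()}")
```

If the printed list shows a differently spelled label, or the first data row instead of a header, the fix belongs in the `read_csv` call or in the file itself rather than in the later lines.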
@Larz60+ have a look at the overall codes
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
matplotlib.style.use('ggplot') # Look Pretty
def plotDecisionBoundary(model, X, y):
    fig = plt.figure()
    ax = fig.add_subplot(111)
    padding = 0.6
    resolution = 0.0025
    colors = ['royalblue','forestgreen','ghostwhite']
    # Calculate the boundaries
    x_min, x_max = X[:, 0].min(), X[:, 0].max()
    y_min, y_max = X[:, 1].min(), X[:, 1].max()
    x_range = x_max - x_min
    y_range = y_max - y_min
    x_min -= x_range * padding
    y_min -= y_range * padding
    x_max += x_range * padding
    y_max += y_range * padding
    xx, yy = np.meshgrid(np.arange(x_min, x_max, resolution),
                         np.arange(y_min, y_max, resolution))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Plot the contour map
    cs = plt.contourf(xx, yy, Z, cmap=plt.cm.terrain)
    # Plot the test original points as well...
    for label in range(len(np.unique(y))):
        indices = np.where(y == label)
        plt.scatter(X[indices, 0], X[indices, 1], c=colors[label], label=str(label), alpha=0.8)
    p = model.get_params()
    plt.axis('tight')
    plt.title('K = ' + str(p['n_neighbors']))
X = pd.read_csv(r"D:\\Clustering\\text-cluster-master\\Articles.csv", error_bad_lines=False)
X.head()
y = X.Articles.copy()
X.drop(['Articles'], axis=1, inplace=True)
y = y.astype("category").cat.codes
X.fillna(X.mean(), inplace=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                    random_state=1)
normaliser = preprocessing.Normalizer().fit(X_train)
X_train_normalised = normaliser.transform(X_train)
X_train = pd.DataFrame(X_train_normalised)
X_test_normalised = normaliser.transform(X_test)
X_test = pd.DataFrame(X_test_normalised)
pca_reducer = PCA(n_components=2).fit(X_train_normalised)
X_train = pca_reducer.transform(X_train_normalised)
X_test = pca_reducer.transform(X_test_normalised)
knn = KNeighborsClassifier(n_neighbors=9)
knn.fit(X_train, y_train)
plotDecisionBoundary(knn, X_train, y_train)
print(knn.score(X_test, y_test))
plt.show()
Anldra12 Wrote:@Larz60+ have a look at the overall codes
FYI: Because of the volume of posts, it's important to provide enough code in your first post.
The error message shows the error on line 68, but that statement is now on line 58, so the two don't match.
The code should match the error message.