Python Forum

Full Version: AttributeError: 'DataFrame' object has no attribute 'Articles'
Purpose: I want to plot feature importance for data prediction, training, and testing.

Runtime error: AttributeError: 'DataFrame' object has no attribute 'Articles'

Error:
Traceback (most recent call last):
  File "D:/Clustering/text-cluster-master/similarity.py", line 68, in <module>
    y = X.Articles.copy()
  File "D:\Python3.8.0\Python\lib\site-packages\pandas\core\generic.py", line 5460, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'Articles'
The lines of code:

y = X.Articles.copy()
X.drop(['Articles'], axis=1, inplace=True)
You are not showing enough code.
Show where X is defined.
X is defined as
X = pd.read_csv(r"D:\\Clustering\\text-cluster-master\\Articles.csv", error_bad_lines=False)
X.head()
The error is
Error:
AttributeError: 'DataFrame' object has no attribute 'Articles'
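A side note on the read_csv call above: as far as I know, error_bad_lines was deprecated in pandas 1.3 and removed in pandas 2.0, with on_bad_lines="skip" as the replacement. The sketch below assumes a recent pandas and uses made-up in-memory CSV text instead of the real Articles.csv, so the column names and rows are hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for Articles.csv:
# the third data row has one field too many.
csv_text = "Articles,score\nfirst,1\nsecond,2\nthird,3,extra\nfourth,4\n"

# on_bad_lines="skip" (pandas 1.3 and later) drops rows with too many
# fields, which is what error_bad_lines=False did in older releases.
X = pd.read_csv(io.StringIO(csv_text), on_bad_lines="skip")

print(X.shape)
print(list(X.columns))
```

Skipping bad rows silently can hide a wrong delimiter or a broken header, so it is worth checking the parsed column names before relying on them.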
I don't see how X could have any attribute named Articles.
It has never been defined.
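For context: pandas lets you read a column as an attribute (X.Articles) only when a column with exactly that name exists in the DataFrame. A quick way to check is to print the parsed column names. The sketch below is an assumption about one possible cause (stray whitespace in the CSV header), not a diagnosis of the real file; the CSV text and its columns are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for Articles.csv: the header has a space before
# "Articles", so the parsed column name is " Articles", not "Articles".
csv_text = "id, Articles,score\n1,first text,0.5\n2,second text,0.7\n"
X = pd.read_csv(io.StringIO(csv_text))

# Step 1: look at the exact column names pandas parsed from the header.
print(list(X.columns))

# Step 2: attribute access needs an exact match, so test for one first.
has_articles = "Articles" in X.columns
print(has_articles)

# Step 3 (assumed fix): strip whitespace from the header, then take the
# target column by name. pop() removes it from X and returns it.
X.columns = X.columns.str.strip()
y = X.pop("Articles")

print(list(X.columns))
print(y.tolist())
```

If the printed column list does not contain the expected name at all, the file probably has a different header (or no header row), and the name used in the code has to be changed to match.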
@Larz60+ have a look at the full code
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib

from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier


matplotlib.style.use('ggplot') # Look Pretty


def plotDecisionBoundary(model, X, y):
  fig = plt.figure()
  ax = fig.add_subplot(111)

  padding = 0.6
  resolution = 0.0025
  colors = ['royalblue','forestgreen','ghostwhite']

  # Calculate the boundaries
  x_min, x_max = X[:, 0].min(), X[:, 0].max()
  y_min, y_max = X[:, 1].min(), X[:, 1].max()
  x_range = x_max - x_min
  y_range = y_max - y_min
  x_min -= x_range * padding
  y_min -= y_range * padding
  x_max += x_range * padding
  y_max += y_range * padding


  xx, yy = np.meshgrid(np.arange(x_min, x_max, resolution),
                       np.arange(y_min, y_max, resolution))


  Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
  Z = Z.reshape(xx.shape)

  # Plot the contour map
  cs = plt.contourf(xx, yy, Z, cmap=plt.cm.terrain)

  # Plot the test original points as well...
  for label in range(len(np.unique(y))):
    indices = np.where(y == label)
    plt.scatter(X[indices, 0], X[indices, 1], c=colors[label], label=str(label), alpha=0.8)

  p = model.get_params()
  plt.axis('tight')
  plt.title('K = ' + str(p['n_neighbors']))



X = pd.read_csv(r"D:\\Clustering\\text-cluster-master\\Articles.csv", error_bad_lines=False)
X.head()

y = X.Articles.copy()
X.drop(['Articles'], axis=1, inplace=True)



y = y.astype("category").cat.codes


X.fillna(X.mean(), inplace=True)


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                    random_state=1)


normaliser = preprocessing.Normalizer().fit(X_train)


X_train_normalised = normaliser.transform(X_train)
X_train = pd.DataFrame(X_train_normalised)

X_test_normalised = normaliser.transform(X_test)
X_test = pd.DataFrame(X_test_normalised)


pca_reducer = PCA(n_components=2).fit(X_train_normalised)

X_train = pca_reducer.transform(X_train_normalised)
X_test = pca_reducer.transform(X_test_normalised)


knn = KNeighborsClassifier(n_neighbors=9)
knn.fit(X_train, y_train)

plotDecisionBoundary(knn, X_train, y_train)


print(knn.score(X_test, y_test))

plt.show()
Anldra12 Wrote: @Larz60+ have a look at the full code

FYI: Because of the volume of posts, it's important to provide enough code on your first post.

The error message shows the error on line 68, but in the code posted here that statement is on line 58, so the two don't match.
The code should match the error message.