ValueError: Found input variables
#1
OS: Ubuntu 18.04
Python 3
Editors: PyCharm and Jupyter Lab

Hi all,
I looked at the existing thread with the same title, but my data seem to have the same length, and I still get this error.
Error:
Traceback (most recent call last):
  File "Regressor.py", line 103, in <module>
    rdg_regressor()
  File "Regressor.py", line 95, in rdg_regressor
    rdg.fit(rdg_poly_regression.fit_transform(salary_features_train), salary_labels_train)
  File "/home/ahmdwd/.local/lib/python3.6/site-packages/sklearn/linear_model/_ridge.py", line 766, in fit
    return super().fit(X, y, sample_weight=sample_weight)
  File "/home/ahmdwd/.local/lib/python3.6/site-packages/sklearn/linear_model/_ridge.py", line 547, in fit
    multi_output=True, y_numeric=True)
  File "/home/ahmdwd/.local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 765, in check_X_y
    check_consistent_length(X, y)
  File "/home/ahmdwd/.local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 212, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [24, 6]
Shapes printed for the features and labels:
Output:
(24, 1) (6, 1)
The full code is below for testing.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os

# Importing the dataset
os.chdir('/home/ahmdwd/Documents/ML Lcture/Salay_Data')
salary_data_frame = pd.read_csv('Salary_Data.csv')
salary_features = salary_data_frame.iloc[:, :-1].values
salary_labels = salary_data_frame.iloc[:, -1].values

# Reshape Features and Labels
salary_features = salary_features.reshape(-1, 1)
salary_labels = salary_labels.reshape(-1, 1)
# print(salary_features)
# print(salary_labels)


from sklearn.model_selection import train_test_split

salary_features_train, salary_labels_train, salary_features_test, salary_labels_test = train_test_split(
    salary_features, salary_labels, test_size=0.2, random_state=0, shuffle=False)

# Fitting Linear Regression to the dataSet
from sklearn.linear_model import LinearRegression

linear_regression = LinearRegression()
linear_regression.fit(salary_features, salary_labels)

# Fitting Polynomial Regression to the dataSet
from sklearn.preprocessing import PolynomialFeatures

poly_regression = PolynomialFeatures(degree=10)
poly_salary_features = poly_regression.fit_transform(salary_features)
linear_poly_regression = LinearRegression()
linear_poly_regression.fit(poly_salary_features, salary_labels)

# Evaluation
from sklearn.metrics import r2_score

error_one = r2_score(salary_labels, linear_regression.predict(salary_features))
error_two = r2_score(salary_labels, linear_poly_regression.predict(poly_regression.fit_transform(salary_features)))
print(f'R Squared For Linear Regression Is : {error_one} ')
print(f'R Squared For Polynomial Linear Regression Is : {error_two} ')

# Fitting Polynomial Regression to the dataSet With Degree Of 7.
poly_regression_7 = PolynomialFeatures(degree=7)
poly_salary_features_7 = poly_regression_7.fit_transform(salary_features)
linear_poly_regression = LinearRegression()
linear_poly_regression.fit(poly_salary_features_7, salary_labels)

error_one_7 = r2_score(salary_labels, linear_regression.predict(salary_features))
error_two_7 = r2_score(salary_labels, linear_poly_regression.predict(poly_regression_7.fit_transform(salary_features)))
print(f'R Squared For Linear Regression With Degree " 7 " Is : {error_one_7} ')
print(f'R Squared For Polynomial Linear Regression With Degree " 7 " Is : {error_two_7} ')


# Fitting Polynomial Regression to the dataSet With Degree as set Of List.
def best_degree_range():
    degree_list = []
    error_list = []
    for dgr in range(1, 21):
        degree_list.append(dgr)
        poly_regression_dgr = PolynomialFeatures(degree=dgr)
        poly_salary_features_dgr = poly_regression_dgr.fit_transform(salary_features)

        linear_poly_regression_degree = LinearRegression()
        linear_poly_regression_degree.fit(poly_salary_features_dgr, salary_labels)

        # Evaluation
        error_poly = r2_score(salary_labels,
                              linear_poly_regression_degree.predict(poly_regression_dgr.fit_transform(salary_features)))
        error_list.append(error_poly)

    error_list_max = max(error_list)
    print(error_list_max)

    for e, d in zip(error_list, degree_list):
        if e == error_list_max:
            print(f'Highest R Squared is {e}, and Degree For It Is {d}')
            best_degree = d
            print('----------------')
            return best_degree


best_degree_range()

# Fitting Polynomial Regression to the dataSet With Ridge Regression and Alpha is (1).
from sklearn.linear_model import Ridge
def rdg_regressor():
    print(salary_features_train.shape)
    print(salary_labels_train.shape)
    best_degree = best_degree_range()
    rdg = Ridge(alpha=1, normalize=True)
    rdg_poly_regression = PolynomialFeatures(degree=best_degree)

    rdg.fit(rdg_poly_regression.fit_transform(salary_features_train), salary_labels_train)

    plt.title('Alpha = 1')
    plt.plot(salary_labels_train, '.', rdg.predict(rdg_poly_regression.fit_transform(salary_features_train)), '-o')
    plt.show()
    print('------------------')


rdg_regressor()
#2
It seems the unlucky line in your code is line #13; try commenting it out.
Your dataset has 6 rows, since the shape of the label array is (6, 1). Reshaping the feature matrix to (-1, 1) was the mistake. Originally the feature matrix has shape (6, 4) (4 features, 6 rows); after reshaping it became (6*4, 1), because passing -1 to reshape means "find the appropriate dimension that keeps the same number of elements as the original array". So originally salary_features.shape was (6, 4), and after applying salary_features.reshape(-1, 1) you got salary_features.shape = (24, 1).
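To illustrate, here is a minimal sketch of that reshape effect, using a made-up (6, 4) feature matrix (the actual contents of Salary_Data.csv are not shown in the thread, so the numbers here are placeholders):

import numpy as np

# Hypothetical stand-in for the feature matrix: 6 rows, 4 feature columns
features = np.arange(24).reshape(6, 4)
labels = np.arange(6).reshape(-1, 1)        # (6, 1): one label per row

print(features.shape)                       # (6, 4)
print(features.reshape(-1, 1).shape)        # (24, 1): all 6*4 values stacked into one column

# scikit-learn's check_consistent_length compares the first dimension of X and y,
# so fitting with a (24, 1) X and a (6, 1) y raises:
# ValueError: Found input variables with inconsistent numbers of samples: [24, 6]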
#3
(Mar-03-2020, 02:08 PM)scidam Wrote: It seems the unlucky line in your code is line #13; try commenting it out.
Your dataset has 6 rows, since the shape of the label array is (6, 1). Reshaping the feature matrix to (-1, 1) was the mistake. Originally the feature matrix has shape (6, 4) (4 features, 6 rows); after reshaping it became (6*4, 1), because passing -1 to reshape means "find the appropriate dimension that keeps the same number of elements as the original array". So originally salary_features.shape was (6, 4), and after applying salary_features.reshape(-1, 1) you got salary_features.shape = (24, 1).

First, thanks for the support, my friend.
I tried commenting out lines 13 and 14, but I still get the same result. I guess it is about the Ridge regressor, because the file runs fine until this call at line 86:
best_degree_range()
#4
Thank you, pal. I found that the variable names were switched in the train_test_split unpacking.
It was:
salary_features_train, salary_labels_train, salary_features_test, salary_labels_test

and it must be:
salary_features_train, salary_features_test, salary_labels_train, salary_labels_test
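For reference, a minimal sketch of the corrected unpacking order (train_test_split returns X_train, X_test, y_train, y_test, in that order); the 30-row dummy data below is an assumption chosen only so that the shapes match the (24, 1) / (6, 1) seen above:

import numpy as np
from sklearn.model_selection import train_test_split

# Dummy data standing in for the salary dataset (30 rows assumed)
salary_features = np.arange(30).reshape(-1, 1)   # (30, 1)
salary_labels = np.arange(30).reshape(-1, 1)     # (30, 1)

# Correct unpacking order: X_train, X_test, y_train, y_test
salary_features_train, salary_features_test, salary_labels_train, salary_labels_test = train_test_split(
    salary_features, salary_labels, test_size=0.2, random_state=0, shuffle=False)

# Both train arrays now have the same number of samples
print(salary_features_train.shape, salary_labels_train.shape)   # (24, 1) (24, 1)
print(salary_features_test.shape, salary_labels_test.shape)     # (6, 1) (6, 1)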