How do I convert my data so it works with Pandas?

Oliver · Dec-11-2017, 04:09 PM

I was thinking of running a linear regression on a set of error data, and then using it to predict what the error count might be in some future week.

These data have two columns:
(1) "ErrorDate" -> week number (of year), and
(2) "ErrorCount" (how many errors did the system have in that week).

I would imagine these data are pretty noisy (random), but who knows?

Anyway, I tried to load this data and do a basic LinearRegression fit test with pandas and scikit-learn, but got an error.

ERROR: "ValueError: Expected 2D array, got 1D array instead:"

--
The code seems so simple, like it should work:
import pandas as pd
from sklearn.model_selection import train_test_split

# Read the 2-column CSV data into a pandas DataFrame
thedf = pd.read_csv("Errors.csv", sep=",")

# Each single-bracket selection below is a 1D pandas Series
X_train, X_test, y_train, y_test = train_test_split(
    thedf['ErrorCount'], thedf['ErrorDate'], random_state=0)

print(thedf.head())

>>>> Prints:
   ErrorDate  ErrorCount
0          1          80
1          2         118
2          3         249
3          4         397
4          5         159

So far, so good.

But, the shape is apparently wrong and I get the error noted above.

print("X_test shape: {}".format(X_test.shape))
print("y_test shape: {}".format(y_test.shape))

>>>> Prints:
X_test shape: (13,)
y_test shape: (13,)

--

So, I see the shape is the problem, but it's not clear to me from searches I did how to change it. This is probably a super simple question. I have a Pandas book on order but it won't arrive for another week.

Suggestions?

Thanks very much in advance,
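
For reference, here is a minimal sketch of one common way to get the 2D shape that the error message asks for. It is not taken from the thread. It assumes the week number (ErrorDate) is the feature and ErrorCount is the target, which is the reverse of the order passed to train_test_split above. It also uses only the five rows printed in the post in place of Errors.csv, so the fitted line itself is purely illustrative. The names model, week_10, prediction and X_alt are made up for the example.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-in for Errors.csv: only the five rows shown by head() in the post.
thedf = pd.DataFrame({
    "ErrorDate": [1, 2, 3, 4, 5],
    "ErrorCount": [80, 118, 249, 397, 159],
})

# Double brackets select a one-column DataFrame (2D), not a Series (1D).
X = thedf[["ErrorDate"]]   # feature matrix, shape (n_samples, 1)
y = thedf["ErrorCount"]    # the target is allowed to stay 1D

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# fit() expects X to be 2D: (n_samples, n_features)
model = LinearRegression().fit(X_train, y_train)

# Input for a prediction must be 2D as well, with the same column name.
week_10 = pd.DataFrame({"ErrorDate": [10]})
prediction = model.predict(week_10)

# Alternative for an existing 1D Series: reshape it into a single column.
X_alt = thedf["ErrorDate"].to_numpy().reshape(-1, 1)

print("X shape:", X.shape)
print("X_alt shape:", X_alt.shape)
```

Either route gives scikit-learn one column of features with one row per week. The double-bracket form keeps the column name, while reshape(-1, 1) works on any 1D array.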
