Posts: 17
Threads: 8
Joined: Mar 2018
Hi,
I am using the random forest method for regression.
I use the command below:
X_train,X_test,Y_train,Y_test=train_test_split(x,y,test_size=0.3,random_state=0)
With the above command, the data is split randomly, but I want to take the first 70% as the training set and the next 30% as the test set.
How can I do this?
Posts: 100
Threads: 3
Joined: Dec 2017
I assume you're using sklearn here.
The train_test_split function randomly breaks up your data; that is its purpose. Specifying random_state=0 only makes the random split reproducible, so you will always get the same output for the same input. If your data is already in the order you want, you can just split it up yourself using slicing.
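As a side note, here is a minimal sketch of one alternative, assuming a scikit-learn version whose train_test_split accepts the shuffle parameter. The arrays x and y below are made-up stand-ins for the real data, not the original poster's.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 1000 rows, 3 features, kept in original order.
x = np.arange(3000).reshape(1000, 3)
y = np.arange(1000)

# shuffle=False preserves row order, so the first 70% of the rows become the
# training set and the remaining 30% become the test set.
# random_state is omitted because nothing is shuffled.
X_train, X_test, Y_train, Y_test = train_test_split(
    x, y, test_size=0.3, shuffle=False
)

print(X_train.shape, X_test.shape)
```

With shuffling turned off, the result should match what plain slicing gives, so either approach works if the rows are already in the order you want.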
Posts: 17
Threads: 8
Joined: Mar 2018
Yes, I am using sklearn.
My call is as below:
X_train,X_test,Y_train,Y_test=train_test_split(x,y,test_size=0.3,random_state=0)
My data has 1000 samples, and I want to use the first 700 as training data and the next 300 as test data.
But with the above command, it splits randomly.
Posts: 100
Threads: 3
Joined: Dec 2017
(Mar-05-2018, 01:39 PM)Raj Wrote: Yes, I am using sklearn.
My call is as below:
X_train,X_test,Y_train,Y_test=train_test_split(x,y,test_size=0.3,random_state=0)
My data has 1000 samples, and I want to use the first 700 as training data and the next 300 as test data.
But with the above command, it splits randomly.
As I said, train_test_split() is implemented to break up the data randomly. If you don't want it random, don't use the function. x and y are numpy arrays, correct? If yes, you can just slice them: https://docs.scipy.org/doc/numpy/referen...exing.html
Posts: 17
Threads: 8
Joined: Mar 2018
Sir,
I could not find an example. Do you have the precise command (code) to do this?
Posts: 100
Threads: 3
Joined: Dec 2017
Here's a simple example of slicing a numpy array...
>>> import numpy as np
>>> dataset = np.array([[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7],[5,6,7,8]])
>>> dataset
array([[1, 2, 3, 4],
       [2, 3, 4, 5],
       [3, 4, 5, 6],
       [4, 5, 6, 7],
       [5, 6, 7, 8]])
>>> np.shape(dataset)
(5, 4)
>>> train_data = dataset[:3]
>>> train_data
array([[1, 2, 3, 4],
       [2, 3, 4, 5],
       [3, 4, 5, 6]])
>>> test_data = dataset[3:]
>>> test_data
array([[4, 5, 6, 7],
       [5, 6, 7, 8]])
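The same slicing can be applied to the 700/300 case from the question. This is only a sketch: x and y here are placeholder arrays with 1000 rows, and split is a name I made up for the cut-off index.

```python
import numpy as np

# Hypothetical stand-in data with 1000 samples (replace with your own x and y).
x = np.arange(2000).reshape(1000, 2)
y = np.arange(1000)

# Index where the training part ends: 70% of the rows, using integer
# arithmetic to avoid floating-point rounding surprises.
split = len(x) * 70 // 100

# Everything before the index is training data, everything from it on is test data.
X_train, X_test = x[:split], x[split:]
Y_train, Y_test = y[:split], y[split:]

print(X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)
```

Slicing x and y with the same index keeps each row of features paired with its target value.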