question about train_test_split()

fadi · Feb-28-2018, 08:14 AM

Hello

I am using the train_test_split function in the following code:

import pandas as pd
from sklearn.model_selection import train_test_split

# Load the data set for training and testing the logistic regression classifier
dataset = pd.read_csv(DATA_SET_PATH)

# Feature columns and the target column
training_features = ['TVnews', 'PID', 'age', 'educ', 'income']
target = 'vote'

# Train/test data split (70% of the rows go to training)
train_x, test_x, train_y, test_y = train_test_split(dataset[training_features], dataset[target], train_size=0.7)

print "train_x size :: ", train_x.shape
print "train_y size :: ", train_y.shape

print "test_x size :: ", test_x.shape
print "test_y size :: ", test_y.shape

Here is the output:

Output:
train_x size :  (35, 4)
train_y size :  (35L,)
test_x size :  (15, 4)
test_y size :  (15L,)

My question is: what does the L mean?

mpd · Mar-05-2018, 07:37 PM

It's just indicating that it's a long integer.

fadi · Mar-06-2018, 07:18 AM

(Mar-05-2018, 07:37 PM)mpd Wrote: It's just indicating that it's a long integer.

But my data is a mix of negative and positive fractional numbers. How can I avoid displaying the L?

mpd · Mar-06-2018, 12:40 PM

(Mar-06-2018, 07:18 AM)fadi Wrote:
(Mar-05-2018, 07:37 PM)mpd Wrote: It's just indicating that it's a long integer.

But my data is a mix of negative and positive fractional numbers. How can I avoid displaying the L?

Shape is just a tuple holding the number of samples and the number of features; it has nothing to do with what the data actually is.
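To illustrate that shape only describes dimensions, here is a minimal sketch using NumPy arrays. The data is made up (50 rows, 5 columns of mixed-sign fractional values), and a plain slice stands in for train_test_split; both are assumptions for illustration, not the original data set.

```python
import numpy as np

# Hypothetical data: 50 samples, 5 feature columns, values between -1 and 1.
rng = np.random.default_rng(0)
features = rng.uniform(-1.0, 1.0, size=(50, 5))
target = rng.integers(0, 2, size=50)

# A simple 70/30 slice stands in for train_test_split here (assumption).
n_train = len(features) * 7 // 10
train_x, test_x = features[:n_train], features[n_train:]
train_y, test_y = target[:n_train], target[n_train:]

# shape is a tuple of dimension sizes; the entries are always whole numbers,
# no matter what kind of values the arrays hold.
print(train_x.shape)
print(test_y.shape)
```

The entries of shape count rows and columns, so they stay integers even though every value in features is a fraction.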

This will print without the L:

print("test_y size = {0}".format(test_y.shape[0]))

Incidentally, Python 3 doesn't differentiate between ints and longs, so there is no 'L'. And unless you have a really good reason for using Python 2, you should be working with Python 3 anyway.
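For background: when Python 2 prints a tuple, it shows the repr() of each element, and the repr() of a long ends in L, whereas formatting the element on its own does not. A small sketch of a helper that formats each entry separately might look like this. The name describe_shape is hypothetical, and the shape tuples are typed in by hand rather than taken from the original data.

```python
def describe_shape(name, shape):
    """Build a size label from the individual entries of a shape tuple."""
    # Format each entry on its own instead of printing the whole tuple,
    # so the tuple's repr() (the source of the L in Python 2) is never used.
    dims = " x ".join("{0}".format(d) for d in shape)
    return "{0} size = {1}".format(name, dims)


print(describe_shape("train_x", (35, 4)))
print(describe_shape("test_y", (15,)))
```

This should behave the same under Python 2 and Python 3, since it relies only on str.format.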
