cross validate

amilie1234 · (This post was last modified: Feb-07-2017, 07:12 PM by ichabod801.)

Could somebody please help with the implementation of this code?

We are getting the following error on the line that is bold and underlined: "TypeError: slice indices must be integers or None or have an __index__ method"

Additionally, we are having a problem evaluating the predicted outcomes against the actual labels.

def crossValidate(dataset, folds):
    shuffle(dataset)
    results = []
    foldSize = len(dataset)/folds
#for i in range(0,len(dataset),foldSize):
    
    for i in range(folds):
        pass
        Train_This_Data = dataset[:i*foldSize] + dataset[(i+1) * foldSize:]

        Test_This_Data = dataset[i*foldSize:(i+1) * foldSize]
        
        x_train= trainClassifier(Train_This_Data)
        y_pred = predictLabels(Test_This_Data, x_train)
        
        print(y_pred[0:10])
        
        for row in Test_This_Data, :
            y_true = row[0]
            print(y_true)
            
    #sklearn.metrics.precision_recall_fscore_support(y_true, y_pred, beta=1.0, labels=None, pos_label=1, average=None, warn_for=('precision', 'recall', 'f-score'), sample_weight=None)
    return results

***ichabod801*** · (This post was last modified: Feb-07-2017, 07:17 PM by ichabod801.)

For multi-line code, use python tags, without formatting tags. To others: the formatting I took out, which indicated the error, was on the Train_This_Data = line.

Are you working in Python 3.x? len(dataset)/folds may be returning a float and causing the error you're seeing. I would check the foldSize value before the slicing.
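To make the point above concrete, here is a minimal sketch, assuming Python 3 division semantics. The toy `dataset` and the variable names are made up for illustration and are not from the original code. It shows that `/` produces a float that cannot be used as a slice index, while `//` produces an int that can.

```python
# Toy data standing in for the real dataset (an assumption for illustration).
dataset = list(range(10))
folds = 5

fold_size_float = len(dataset) / folds   # true division: a float under Python 3
fold_size_int = len(dataset) // folds    # floor division: an int

try:
    dataset[:fold_size_float]            # float slice index raises TypeError
    slice_error = None
except TypeError as exc:
    slice_error = str(exc)

first_fold = dataset[:fold_size_int]     # int slice index works

print(type(fold_size_float).__name__, "->", slice_error)
print(first_fold)
```

Printing `foldSize` and `type(foldSize)` just before the slicing line is a quick way to confirm whether this is what is happening.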

As for your predicted outcome problems, that sounds more like a model problem than a Python problem.

mcmxl22 · Feb-07-2017, 07:26 PM

My observations:

There is HTML code mixed with the Python code.

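On line 2, shuffle(dataset) uses a name that is not defined anywhere in the code shown; shuffle would have to be imported, for example with from random import shuffle.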

amilie1234 · Feb-09-2017, 12:51 PM

I fixed that problem, but now I am having a problem with train_set = dataset[:i*foldSize] + dataset[(i+1)*foldSize]
The error is TypeError: can only concatenate list (not "tuple") to list
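One likely cause, judging only from the line quoted above, is the missing colon in the second index: `dataset[(i+1)*foldSize]` picks out a single row, while `dataset[(i+1)*foldSize:]` returns a list of the remaining rows. A minimal sketch, assuming each row is a `(features, label)` tuple (the toy rows are made up):

```python
# Toy rows; the (features, label) tuple layout is an assumption.
dataset = [({"w": 1}, 0), ({"w": 2}, 1), ({"w": 3}, 0), ({"w": 4}, 1)]
fold_size = 2
i = 0

single_row = dataset[(i + 1) * fold_size]   # indexing: one element (a tuple here)
tail = dataset[(i + 1) * fold_size:]        # slicing: a list of rows

try:
    dataset[:i * fold_size] + single_row    # list + tuple raises TypeError
    concat_error = None
except TypeError as exc:
    concat_error = str(exc)

train_set = dataset[:i * fold_size] + tail  # list + list works

print(concat_error)
print(train_set)
```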

***sparkz_alot*** · Feb-09-2017, 01:37 PM

When posting an error, please post the entire Traceback within the "Error" tags (the little red "X" on the menu bar). Sometimes the error message can be misleading, so it is helpful to see the traceback in its entirety.

amilie1234 · (This post was last modified: Feb-09-2017, 06:30 PM by Larz60+.)

A new error occurs. I read through my cross validate function and it seems perfectly fine: it should cross validate the dataset and then classify those datasets.

Traceback (most recent call last):
  File "C:\Users\users\Desktop\python\template.py", line 147, in <module>
    cv_results = crossValidate(trainData,10)
  File "C:\Users\users\Desktop\python\template.py", line 93, in crossValidate
    d_train = trainClassifier(train_set)
  File "C:\Users\users\Desktop\python\template.py", line 76, in trainClassifier
    return SklearnClassifier(LinearSVC()).train(trainData)
  File "C:\Python27\lib\site-packages\nltk\classify\scikitlearn.py", line 117, in train
    self._clf.fit(X, y)
  File "C:\Python27\lib\site-packages\sklearn\svm\classes.py", line 213, in fit
    self.loss)
  File "C:\Python27\lib\site-packages\sklearn\svm\base.py", line 885, in _fit_liblinear
    " class: %r" % classes_[0])
ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0
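The ValueError says that the data handed to `fit` contained only the label 0. One way to check this before training is to count the labels in each training fold. This is a minimal sketch, assuming each row is a `(features, label)` pair; `label_counts` is a hypothetical helper and the toy rows are made up.

```python
from collections import Counter


def label_counts(rows, label_index=1):
    """Count how often each label occurs (hypothetical helper; assumes the
    label sits at position label_index of every row)."""
    return Counter(row[label_index] for row in rows)


# Toy training fold in which every row has label 0.
train_set = [({"w": 1}, 0), ({"w": 2}, 0), ({"w": 3}, 0)]
counts = label_counts(train_set)

print(counts)
if len(counts) < 2:
    print("training fold has only one class:", list(counts))
```

If the full dataset itself shows a single label, the problem is in how the labels are loaded rather than in the fold split.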

This is the new cross validate function:

def crossValidate(dataset,folds):
        shuffle(dataset)
        results = []
        foldSize = len(dataset)//folds
        for i in range(folds):
                # split data into train_set and test_set
                train_set = dataset[:i*foldSize] + dataset[(i+1)*foldSize:]
                test_set = dataset[i*foldSize:(i+1)*foldSize]
                # train classifier and predicted labels
                d_train = trainClassifier(train_set)
                y_pred = predictLabels(test_set, d_train)
                y_true = []
                for row in test_set:
                        y_true.append(row[1])
                        print(y_pred[0:10])
                        cv_results = sklearn.metrics.precision_recall_fscore_support(y_true,y_pred)
        return results.append(cv_results)
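Two things in the function as posted look unintended. The metrics call sits inside the per-row loop, so it runs before `y_true` is complete. Also, `return results.append(cv_results)` returns `None`, because `list.append` returns nothing. Below is a sketch of how the loop could be arranged. It is not the poster's final solution: `cross_validate`, `train_fn`, `predict_fn` and `score_fn` are hypothetical names, the toy majority-class helpers stand in for `trainClassifier` and `predictLabels`, and rows are assumed to be `(features, label)` pairs.

```python
from random import shuffle


def cross_validate(dataset, folds, train_fn, predict_fn, score_fn):
    """Hypothetical k-fold loop; rows are assumed to be (features, label) pairs."""
    data = list(dataset)              # copy so the caller's list is not reordered
    shuffle(data)
    results = []
    fold_size = len(data) // folds
    for i in range(folds):
        start, stop = i * fold_size, (i + 1) * fold_size
        train_set = data[:start] + data[stop:]
        test_set = data[start:stop]
        model = train_fn(train_set)
        y_pred = predict_fn(test_set, model)
        y_true = [label for _, label in test_set]
        # Score once per fold, after all true labels have been collected.
        results.append(score_fn(y_true, y_pred))
    return results                    # return the list, not the result of append()


# Toy stand-ins so the sketch runs without NLTK or scikit-learn.
def majority_train(rows):
    labels = [label for _, label in rows]
    return max(set(labels), key=labels.count)


def majority_predict(rows, model):
    return [model for _ in rows]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


rows = [({"x": n}, n % 2) for n in range(20)]
scores = cross_validate(rows, 5, majority_train, majority_predict, accuracy)
print(scores)
```

In the thread's setting, `score_fn` could be `sklearn.metrics.precision_recall_fscore_support`, which accepts `y_true` and `y_pred` in that order.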

amilie1234 · Feb-09-2017, 08:28 PM

It's done after 9 hours. Thank you.
