Nov-25-2017, 05:23 PM
Hello!
I am taking an introductory course in Python programming at university (3 weeks in) and have been assigned to make a student grading script.
I am having problems iterating over grade values found in a pandas data frame. The grade values consist of valid values (between -3.0 and 12) and invalid values (everything else, including NaN). The idea is to find occurrences of invalid values and track them in a 'memory' array of ones, with length len(original data), by setting the entry to zero for each row that holds an invalid value (see code).
I would love to upload files but I can't until I post 5 times.
The problematic code:
------------------------------------------------------------------------------------------------
for i in range(len(Grades)):
    if(i < -3.0 or i > 12.0):
        print("The grade value is out of range. Grade was {} in Line {}.".format(i, Rowcount))
        #modify -Valid- array per index, with zero if the if statement is satisfied.
        Valid[Rowcount]=0
    Rowcount+=1
------------------------------------------------------------------------------------------------
The Output:
------------------------------------------------------------------------------------------------
The grade value is out of range. Grade was 13 in Line 13.
The grade value is out of range. Grade was 14 in Line 14.
The grade value is out of range. Grade was 15 in Line 15.
The grade value is out of range. Grade was 16 in Line 16.
The grade value is out of range. Grade was 17 in Line 17.
The grade value is out of range. Grade was 18 in Line 18.
The grade value is out of range. Grade was 19 in Line 19.
------------------------------------------------------------------------------------------------
Thus it seems that the code is assigning the row number to -i- and testing that row number in the -if- statement, not the actual value in the dataframe.
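A small check of that suspicion, as a minimal sketch on a made-up data frame (the name `grades` and its values are placeholders, not the assignment data): `range(len(...))` yields row positions, so the comparison never sees a grade. Looking each cell up by position with `.iloc` would compare the stored value instead.

```python
import pandas as pd

# Hypothetical stand-in for the grade columns; the values are made up.
grades = pd.DataFrame({"a": [7.0, 15.0, -5.0], "b": [12.0, 2.0, 4.0]})

# range(len(grades)) produces the row positions 0, 1, 2, not the grades.
positions = list(range(len(grades)))

# Looking up each cell by position gives the actual grade value.
out_of_range = []
for row in range(grades.shape[0]):
    for col in range(grades.shape[1]):
        value = grades.iloc[row, col]
        if value < -3.0 or value > 12.0:
            out_of_range.append((row, col, value))
```

Note that a NaN cell would pass this test unnoticed, because both comparisons against NaN are False.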
I also tried using:
------------------------------------------------------------------------------------------------
for float in range(len(Grades)):
------------------------------------------------------------------------------------------------
This does not work either. I'm confused about how to set up for loops and apparently haven't understood them properly.
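As far as I can tell, the name of the loop variable does not change what it holds: `range` always yields integers, and calling the variable `float` only hides the built-in `float` type inside the loop. A short sketch, with a made-up list `sample` standing in for the grades:

```python
# Hypothetical grade values, used only for illustration.
sample = [7.0, 15.5, -5.0]

# The loop variable is an int no matter what it is called.
kinds = [type(i).__name__ for i in range(len(sample))]

# enumerate yields the position together with the stored value.
pairs = list(enumerate(sample))
```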
The entirety of the code can be found below...
Thank you very much in advance.
Regards
Spyder - Python 3.6.1 64bits, Qt 5.6.2, PyQt5 5.6 on Darwin on Mac OS 10.12.6
import numpy as np
import chardet
import pandas as pd

# Open and read csv file
with open('testfilex2.csv', 'rb') as f:
    # Detect encoding of csv file, assigning encoding to -Result-
    Result = chardet.detect(f.read())

# Use pandas to read csv file relative to above detected encoding
Data = pd.read_csv('testfilex2.csv', encoding=Result['encoding'], header=None)

# Display duplicates (student ids) found in pd.DataFrame -Data-
print(Data[Data.duplicated([0], keep=False)])

# Drop duplicated rows based on ID[0] and Names[1]
Data = Data.drop_duplicates([0], keep='last')
Data = Data.drop_duplicates([1], keep='last')

# Compute number of rows and columns
Columns = len(Data.columns)
Rows = len(Data.index)

# Create a selection of columns to group data
Ids = Data.iloc[:, 0]
Names = Data.iloc[:, 1]

# The amount of grades columns is unknown in the assignment, I must therefore
# create code that will work with x amount of columns
Grades = Data.iloc[:, range(2, Columns)]

# Create an array of ones for indexing, in order to remove rows from
# original df-Data by modifying this array with zeros based on below for/if loop.
Valid = np.ones(len(Grades))

# Create variable to keep track of rows, where invalid data might exist
Rowcount = 0

# My problems start here...
for i in range(len(Grades)):
    if(i < -3.0 or i > 12.0):
        print("The grade value is out of range. Grade was {} in Line {}.".format(i, Rowcount))
        # Modify -Valid- array per index, with zero if the if statement is satisfied.
        Valid[Rowcount] = 0
    Rowcount += 1

# Modify original df-Data by the Valid array, in order to attain valid data
Data = Data[Valid==1,:]
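One way the validity check might be done without an explicit loop, sketched on made-up data (the frame `data`, its values, and the names `in_range`, `valid` and `clean` are assumptions for illustration, not the assignment file). The grade columns are compared as a whole. NaN counts as invalid because any comparison with NaN is False. Only rows where every grade passes are kept.

```python
import numpy as np
import pandas as pd

# Hypothetical data laid out like a header=None csv:
# column 0 = id, column 1 = name, remaining columns = grades.
data = pd.DataFrame({
    0: ["s1", "s2", "s3", "s4"],
    1: ["Ann", "Bob", "Cy", "Di"],
    2: [7.0, 15.0, 4.0, -3.0],
    3: [12.0, 2.0, np.nan, 10.0],
})

grades = data.iloc[:, 2:]

# True where a grade lies in [-3, 12]; NaN compares False, so it is flagged.
in_range = (grades >= -3.0) & (grades <= 12.0)

# A row is valid only if every grade in it is in range.
valid = in_range.all(axis=1)

# Report each offending cell by row position.
for row, col in zip(*np.where(~in_range.to_numpy())):
    print("The grade value is out of range. Grade was {} in Line {}.".format(
        grades.iat[row, col], row))

# Boolean indexing with the mask keeps only the valid rows.
clean = data[valid]
```

Here `valid` plays the role of the -Valid- array of ones and zeros, and `data[valid]` is a form of row selection that pandas accepts for a boolean mask.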