Feb-15-2020, 12:34 AM
OK, so I've managed to get it working, although it might not be the most elegant solution. I build the new dataframe from data extracted from the old dataframe. Initially I got a lot of NaNs in the new dataframe. I traced this to an indexing issue: the data I extract from the old dataframe keeps its original index, which does not match the new dataframe's index, so pandas fills the unmatched rows with NaN. So before adding the extracted data, I needed to reindex it to align with the new dataframe's index. See code below. As I'm new to Python, I welcome comments on how to do this better or more efficiently. For some reason the reindexing takes quite a bit of processing time.
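The index mismatch can be reproduced in isolation. This is a minimal sketch with made-up values (the names `old`, `new`, and `second` are illustrative and not from the CSV), and it assumes the target dataframe has a plain 0..n-1 index:

```python
import pandas as pd

# Made-up long-format data: two CPIDs with two rows each.
old = pd.DataFrame({"CPID": [1, 1, 2, 2],
                    "Value": [10.0, 11.0, 20.0, 21.0]})

# New dataframe built from the first CPID's rows; its index is 0, 1.
new = pd.DataFrame({"first": old.Value[old.CPID == 1]})

# Rows for the second CPID keep their original index labels 2, 3.
second = old.Value[old.CPID == 2]

# Assignment aligns on index labels; none match, so every cell is NaN.
new["misaligned"] = second

# Dropping the old labels in one call gives a fresh 0, 1 index that lines up.
new["aligned"] = second.reset_index(drop=True)

print(new)
```

Renumbering with a single `reset_index(drop=True)` call avoids renaming one index label at a time, which is a likely source of the slowness when the columns are long.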
import pandas as pd

# read csv into dataframe
df = pd.read_csv("MBV2rawdata.csv")  # CPID, Date, Value, Class
# I need this in the format Date, CPID#1, CPID#2, ... , CPID#n, Class
cpid = df.CPID.unique()
sf = df.Date[df.CPID == 24021]  # sf is the newly created dataframe
sf = pd.DataFrame(sf)
i = 0
for colname in cpid:
    col = pd.DataFrame(df.Value[df.CPID == colname])  # get the column from df
    for j in range(len(col)):  # have to reindex col to align with sf's index
        col.rename(index={j + len(col)*i: j}, inplace=True)
    sf.insert(i+1, colname, col, True)  # add col to sf
    i += 1
sf['Class'] = df.Class[df.CPID == 24021]  # add the 'Class' values to sf
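For comparison, the same long-to-wide reshaping can probably be done with `DataFrame.pivot`, which handles the alignment internally. The sketch below is only an illustration, not a drop-in replacement: `to_wide` and `class_cpid` are hypothetical names, the demo data is made up, and it assumes each (Date, CPID) pair occurs exactly once and that the Class for a date can be taken from one reference CPID (as the script above does with 24021).

```python
import pandas as pd

def to_wide(df, class_cpid=None):
    """Reshape CPID/Date/Value/Class rows into Date, one column per CPID, Class."""
    # One row per Date, one column per CPID; raises if a (Date, CPID) pair repeats.
    wide = df.pivot(index="Date", columns="CPID", values="Value")
    # Assumption: take Class from a single reference CPID (default: first one seen).
    if class_cpid is None:
        class_cpid = df["CPID"].iloc[0]
    cls = df.loc[df["CPID"] == class_cpid].set_index("Date")["Class"]
    # Both sides are indexed by Date, so the assignment aligns row by row.
    wide["Class"] = cls
    return wide.reset_index()

# Made-up demo data in the same column layout as the CSV.
df_demo = pd.DataFrame({
    "CPID": [24021, 24021, 24022, 24022],
    "Date": ["2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02"],
    "Value": [1.0, 2.0, 3.0, 4.0],
    "Class": ["a", "b", "a", "b"],
})
wide = to_wide(df_demo)
print(wide)
```

If some (Date, CPID) pairs are duplicated in the real data, `pivot` raises an error; `pivot_table` with an explicit aggregation would be the alternative in that case.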