Jul-09-2022, 11:01 AM
I never use pandas for my puny needs, so I have no experience, but I was interested.
Perhaps the experts here can improve this! I'd like to know how to do this better!
I think this does what you want.
For small amounts of data, I would do this with openpyxl, I understand that better!
Perhaps the experts here can improve this! I'd like to know how to do this better!
I think this does what you want.
For small amounts of data, I would do this with openpyxl, I understand that better!
import pandas as pd import glob savepath = '/home/pedro/myPython/pandas/xl_files/' saveoutputpath = '/home/pedro/myPython/pandas/outputxl/' # I put an extra column in each excel file named country with either Canada, Mexico or USA filelist = glob.glob(savepath + "*.xlsx") df = pd.concat((pd.read_excel(f) for f in filelist)) #df = df.fillna('') # replace nan #df_usa = pd.read_excel(csv_file) # change the indexes to get unique indexes # df.index.size gets how many indexes there are indexes = [] for i in range(df.index.size): indexes.append(i) # now change the indexes pass a list to df.index # never good to have 2 indexes the same df.index = indexes # this gets the dictionary ok mydict = {i:[df.iloc[i, 0], df.iloc[i, 1], df.iloc[i,2]] for i in indexes} print(len(mydict)) # get a list of the ids mylist = df["id"].values.tolist() # get a set of the unique ids myset = set(mylist) #create new DataFrame df_out = pd.DataFrame(columns=['id', 'Canada', 'Mexico', 'USA'], index=range(0, len(myset))) # make a list of unique ids id_names = list(myset) id_names.sort() # populate the id column with id_names df_out["id"] = id_names # see how many rows and columns print(df_out.shape) # mydict[key][0] is the id column , mydict[key][2]]is the country for key in mydict.keys(): df_out.loc[df_out["id"] == mydict[key][0], mydict[key][2]] = "X" # put zeroes in all cells with NaN df_out.fillna(0) df_out.to_excel(saveoutputpath + 'my_saved_data.xlsx', sheet_name='mydata', index=True)df_out.fillna(0) seems to work in the Python shell, but is not saved to Excel, I don't know why!