Jun-09-2018, 08:56 PM
Your code worked OK as is. I tested the code on Windows 10 with Python 3.6.5
NOTES:
a. Constants should be all CAPS (e.g. SKEL)
b. There should be a space after the '#' in comments
c. All 'import' statements should be at the top of the file
d. Create the 'SubFolder' if necessary
e. Create a function that reads and processes the Excel File (replaces original code)
f. Process all .xlsx files (CASE INSENSITIVE) in the Data Folder
g.
Key lines in the code:
NOTES:
a. Constants should be all CAPS (e.g. SKEL)
b. There should be a space after the '#' in comments
c. All 'import' statements should be at the top of the file
d. Create the 'SubFolder' if necessary
e. Create a function that reads and processes the Excel File (replaces original code)
f. Process all .xlsx files (CASE INSENSITIVE) in the Data Folder
g.
if __name__ == '__main__':
Every module in python has a special attribute called __name__ . The value of the __name__ attribute is set to '__main__' when a module is run as the main program. Otherwise the value of __name__ is set to contain the name of the module.Key lines in the code:
#Line 85 - this is the baseline folder current_dir = pathlib.Path(".") #Line 94 - this defines the 'data folder' (underneath the baseline folder) # Use the 'tempdata' Subfolder # NOTE: The '/' operator creates the proper path in Windows too data_dir = full_path / "tempdata" #Line 101 - this defines the 'processed data folder' underneath the 'data folder' new_data_dir = data_dir / "processed_data_folder" #Line 102 - this creates the 'processed data folder' underneath the 'data folder' # If parents is true, any missing parents of this path are created as needed (otherwise Traceback error) # If exist_ok is true, no Traceback error if the Subfolder already exists pathlib.Path(new_data_dir).mkdir(parents=False, exist_ok=True) #Line 112 - this defines the 'output folder and file name combination' from 'pieces parts' new_name = new_data_dir / (obj.stem + " processed" + obj.suffix)Complete code follows:
#!/usr/bin/env python3 # -*- coding: utf-8 -*- # Reference: https://python-forum.io/Thread-Batch-processing-and-saving-into-new-folder # Reference: https://docs.python.org/3/library/pathlib.html import pandas as pd import numpy as np import xlsxwriter import pathlib # Modify the original code into a self contained function def process_excel_file(infile, outfile): # print(counter, infile) # print(counter, outfile) xl_file = pd.ExcelFile(infile) # Select spreadsheet data Data= xl_file.parse('Sheet1') #"parse" reads a sheet from the Excel file #Select values from Columns Skel = Data['Skeleton ID'] Brnch = Data['Branch length'] Euc = Data['Euclidean distance'] tort= Data['Branch length']/Data['Euclidean distance'] # make DataFrame (new spreadsheet) for the values DF_3 = pd.DataFrame({'Branch Number': Skel, 'Brnchln': Brnch, 'Eucl': Euc, 'Tort': tort}) DF_4 = DF_3[['Branch Number','Brnchln','Eucl','Tort']] DF_4.head() totalNumber = len(DF_4.dropna()) dic = DF_4.to_dict() dic2 = {} for key in dic.keys(): for key2 in dic[key]: dic2[key + ' ' + str(key2)] = dic[key][key2] key_names = ['Brnchln','Eucl','Tort'] numbers = range(totalNumber) titlearray=[infile.name] ordered_names = [] ordered_values = [] for i in range(totalNumber): for name in key_names: temp_name = name + ' ' + str(i) ordered_names.append(temp_name) ordered_values.append(dic2[temp_name]) #print(temp_name, dic2[temp_name]) array = np.array((ordered_names, ordered_values)) array.shape clmntitle=array[0,:] mesur=array[1,:] newclmntitle=np.insert(clmntitle, 0, 'Filename') newmesur=np.insert(mesur,0,titlearray) array3=np.array((newclmntitle,newmesur)) array3 array3.shape workbook = xlsxwriter.Workbook(outfile) worksheet = workbook.add_worksheet() row=0 col=0 worksheet.write_row(row, col, newclmntitle) row=1 col=0 worksheet.write_row(row, col, newmesur) workbook.close() return if __name__ == '__main__': # Get the Current Directory current_dir = pathlib.Path(".") print("Current Path: {}".format(current_dir)) # Convert the current path to a full path name full_path = current_dir.absolute() print("Full Path: {}".format(full_path)) # Use the 'tempdata' Subfolder # NOTE: The '/' operator creates the proper path in Windows too data_dir = full_path / "tempdata" print("Data Path: {}".format(data_dir)) # print(type(data_dir)) # Create the 'New Data Directory' as a subfolder of the 'tempdata' Subfolder # If parents is true, any missing parents of this path are created as needed (otherwise Traceback error) # If exist_ok is true, no Traceback error if the Subfolder already exists new_data_dir = data_dir / "processed_data_folder" pathlib.Path(new_data_dir).mkdir(parents=False, exist_ok=True) print("Output Path: {}".format(new_data_dir)) print("") # iterate over all objects in the data folder (process .xlsx files only - CASE INSENSITIVE file names) counter = 0 for obj in data_dir.iterdir(): if obj.is_file() and obj.suffix.lower() == ".xlsx": counter += 1 print("Processing {:04} - {}".format(counter, obj)) new_name = new_data_dir / (obj.stem + " processed" + obj.suffix) process_excel_file(obj, new_name)Typical output:
Output:Current Path: .
Full Path: C:\Users\Owner\Documents\Lew\WindowsBatchFiles\Python Lewis
Data Path: C:\Users\Owner\Documents\Lew\WindowsBatchFiles\Python Lewis\tempdata
Output Path: C:\Users\Owner\Documents\Lew\WindowsBatchFiles\Python Lewis\tempdata\processed_data_folder
Processing 0001 - C:\Users\Owner\Documents\Lew\WindowsBatchFiles\Python Lewis\tempdata\test1.xlsx
Processing 0002 - C:\Users\Owner\Documents\Lew\WindowsBatchFiles\Python Lewis\tempdata\test2.xlsX
Processing 0003 - C:\Users\Owner\Documents\Lew\WindowsBatchFiles\Python Lewis\tempdata\test3.xlsx
Lewis
To paraphrase: 'Throw out your dead' code. https://www.youtube.com/watch?v=grbSQ6O6kbs Forward to 1:00