Python Forum

Renaming PDF files using Excel data - Python

I have been trying to rename some PDF files in a specific folder. When I run the code, the first file in the "original" folder is moved to the "renamed" folder and is correctly renamed according to my rules. After that I get an error message: the code successfully retrieves the next name from the spreadsheet, but then tries to rename the same first PDF file that was already processed. Since that file no longer exists in the "original" folder, the rename fails.

Any ideas on how to get the code to select the next file, or a better way to solve my issue? Thank you.

import os, re
import xlrd

def rename_pdfs():

    path = r"C:\Users\...\original"

    for fname in os.listdir(path):
        excel_file = xlrd.open_workbook(r"C:\Users\...\data.xlsx")
        work_sheet = excel_file.sheet_by_index(0)
        for rownum in range(work_sheet.nrows):
            inv = work_sheet.cell_value(5+rownum, 4)
            for index in re.finditer("1718-", inv):
                rfr = inv[index.end():index.end() + 10]
                new_filename = work_sheet.cell_value(5+rownum, 1) + " " + "1718-" + rfr
                os.rename(path + "\\" + fname, r"C:\Users\...\renamed" + "\\" + new_filename + ".pdf")

rename_pdfs()
In the nested for loops, every iteration renames the same file, path + "\\" + fname, so the second rename fails because that file has already been moved. This is confusing.

Can you describe the rules without python code?
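To illustrate the point about the nested loops, here is a minimal sketch. It is not the original poster's code: the file names and target names are made up, and the two helper functions are hypothetical. It only shows which (file, new name) pairs each approach produces, without touching the disk.

```python
# Hypothetical stand-ins for os.listdir(path) and the names built from the sheet.
files = ["1.pdf", "2.pdf"]
new_names = ["Alice 1718-0001", "Bob 1718-0002"]


def nested_pairs(files, names):
    """Mimic the nested loops: every file is paired with every name."""
    return [(f, n) for f in files for n in names]


def zipped_pairs(files, names):
    """Pair the i-th file with the i-th name, so each file is renamed once."""
    return list(zip(files, names))


if __name__ == "__main__":
    print(nested_pairs(files, new_names))
    print(zipped_pairs(files, new_names))
```

With the nested version, the first file is paired with the first name and then again with the second name, which matches the reported error. Note also that os.listdir returns entries in arbitrary order, so sorting the list first is probably wise if the order of files matters.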
Hello...

The idea is to rename the PDFs in the "original" folder based on information in the Excel spreadsheet "data.xlsx", and save them to the "renamed" folder. I use regex to limit the data I want to use in the names, and xlrd to access the cells containing the data.
(Mar-13-2018, 09:43 PM) okanaira wrote: the idea is to rename the PDFs in folder "original" based on information in the Excel spreadsheet "data.xlsx", and save them to "renamed" folder. I use regex to limit the data I want to use in the names, and xlrd to access the cells containing the data.
This does not explain the rules. Which files are renamed, and how do you compute the new name?
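Reading the posted code, the naming rule appears to be: take the value in column 1 of a row, add a space, then "1718-" followed by the 10 characters that come after "1718-" in column 4 of the same row. The sketch below isolates that step so it can be checked without a spreadsheet. The function name build_new_names and the sample cell values are hypothetical.

```python
import re


def build_new_names(label, inv, prefix="1718-", width=10):
    """Return one candidate file name per occurrence of prefix in inv.

    label stands in for the column 1 cell, inv for the column 4 cell.
    """
    names = []
    for match in re.finditer(re.escape(prefix), inv):
        # Keep the characters that immediately follow the prefix.
        ref = inv[match.end():match.end() + width]
        names.append(f"{label} {prefix}{ref}")
    return names


if __name__ == "__main__":
    # Made-up cell values, only for illustration.
    print(build_new_names("ACME", "Invoice 1718-ABCDEFGHIJ extra"))
```

A cell that does not contain the prefix yields an empty list, so no rename would be attempted for that row.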
I solved my problem. Here is the solution:

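import os, re
import xlrd  # note: xlrd 2.0+ reads only .xls files, so .xlsx needs xlrd < 2.0

def rename_pdfs():

    # Source PDFs are expected to be named 1.pdf, 2.pdf, ... in this folder.
    path = r"C:\Users\...\original"
    excelFile = xlrd.open_workbook(r"C:\Users\...\data.xlsx")
    workSheet = excelFile.sheet_by_index(0)
    fileNum = 1

    # Data starts on the sixth row (index 5); stop at the last row of the sheet
    # so that cell_value is never asked for a row that does not exist.
    for rownum in range(5, workSheet.nrows):
        inv = workSheet.cell_value(rownum, 4)
        for index in re.finditer("1718-", inv):
            # Keep the 10 characters that follow the "1718-" prefix.
            rfr = inv[index.end():index.end() + 10]
            newFilename = workSheet.cell_value(rownum, 1) + " " + "1718-" + rfr
            os.rename(os.path.join(path, str(fileNum)+".pdf"), os.path.join(r"C:\Users\...\renamed", newFilename+".pdf"))
            fileNum += 1

if __name__ == "__main__":
    rename_pdfs()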
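For anyone adapting this, here is a hedged sketch of the rename step on its own, using pathlib and skipping files that are missing or whose target already exists instead of raising. It assumes, as the solution does, that the source files are named 1.pdf, 2.pdf, and so on. The function name rename_numbered and its parameters are hypothetical, and the list of new names is expected to come from the spreadsheet step.

```python
from pathlib import Path


def rename_numbered(src_dir, dst_dir, new_names):
    """Move 1.pdf, 2.pdf, ... from src_dir to dst_dir under the given names.

    Returns the target file names that were actually written.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for number, name in enumerate(new_names, start=1):
        source = src_dir / f"{number}.pdf"
        target = dst_dir / f"{name}.pdf"
        # Skip instead of failing when there is nothing to move
        # or when the move would overwrite an existing file.
        if not source.exists() or target.exists():
            continue
        source.rename(target)
        moved.append(target.name)
    return moved
```

Returning the list of moved files makes it easy to compare against the spreadsheet afterwards and spot rows that had no matching PDF.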