Python Forum

Python PDF merging from an Excel pandas for loop

I have an Excel sheet with some dropdown lists (working). Now I'm in Python, trying to read the data from the Excel sheet (an xlsx file) into a for loop (also working).

I have 3 columns, each holding a name that refers to a PDF file; all the PDF files are located in the same place. I need to merge the 3 arbitrary PDF files into one.

I can see that I can use PyPDF2, but how can I do it in my for loop, so that it reads the 3 values row by row and merges the files into one PDF per row?

This is my code at the moment, and I'm getting the right values from the xlsx sheet row by row.

import os
import pandas as pd
from PyPDF2 import PdfFileMerger

data = pd.read_excel(r'Resources\liste.xlsx', sheet_name='Ark1', skiprows=3)
dataread = pd.DataFrame(data)
for index, row in dataread.iterrows():
    print(index, row)
I can see (from the PyPDF2 reference) how to get the files into PyPDF2. My problem is that I'm getting 4 values from the Excel sheet row by row, e.g. Value1=u6AB, Value2=FUO0002, Value3=FUO0004, Value4=u34_driblinger.

I then have a location, c:\users\myuser\document\master\pdf\, which contains u6ABx.pdf, FUO0002_xxxxxxx.pdf and FUO0004_xxxxxxx.pdf. These 3 files are the ones I want to merge into u34_driblinger.pdf.

How can I do that, based on the example from the link? Something like:

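import glob

pdf_dir = r'c:\users\myuser\document\master\pdf'

for index, row in dataread.iterrows():
    # Assumption: the first three columns hold the source names, the fourth the output name
    *sources, target = [str(value) for value in row.iloc[:4]]
    merger = PdfFileMerger()  # a fresh merger for every row
    try:
        for name in sources:
            # The sheet value is only the start of the file name (u6AB -> u6ABx.pdf)
            matches = glob.glob(os.path.join(pdf_dir, name + '*.pdf'))
            if not matches:
                raise FileNotFoundError(f"problem with file {name}")
            merger.append(matches[0])
        merger.write(os.path.join(pdf_dir, target + '.pdf'))
    except Exception as exc:
        print(f"cant merge !! sorry: {exc}")
    else:
        print(f" {target} Merged !!! ")
    finally:
        merger.close()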
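
The file lookup is probably the trickiest part here, because the values in the sheet are only the beginning of each file name (u6AB stands for u6ABx.pdf, FUO0002 for FUO0002_xxxxxxx.pdf). Below is a minimal sketch of that step on its own, using only the standard library. The names find_pdf and plan_row are hypothetical helpers made up for illustration, and the sketch assumes each row has three source names followed by the output name, as in the example values above.

```python
from glob import escape
from pathlib import Path


def find_pdf(folder, prefix):
    """Return the single PDF in `folder` whose file name starts with `prefix`."""
    # escape() keeps characters such as [ or ? in the prefix from acting as wildcards
    matches = sorted(Path(folder).glob(f"{escape(prefix)}*.pdf"))
    if not matches:
        raise FileNotFoundError(f"no PDF starting with {prefix!r} in {folder}")
    if len(matches) > 1:
        names = ", ".join(p.name for p in matches)
        raise ValueError(f"{prefix!r} matches more than one file: {names}")
    return matches[0]


def plan_row(folder, values):
    """Split one row of values into (list of source paths, output path)."""
    # Assumption: the last value names the merged file, the others name the inputs
    *sources, target = values
    source_paths = [find_pdf(folder, name) for name in sources]
    return source_paths, Path(folder) / f"{target}.pdf"
```

Inside the iterrows loop, plan_row could be called with the four values of the current row. Each returned source path can then be handed to the merger as str(path), and the output path used for the write call. Raising on a missing or ambiguous prefix, rather than silently picking the first match, is a design choice that keeps a wrong file from ending up in the merged PDF.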