Python Forum
Combine 2 PDF pages into 1
#4
Here's something that I wrote a while back (I forgot about it; I'm in my mid 70s, so that's easy for me to do).

The PDF file used in the example is downloaded if it is not already available.
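The download-if-missing behaviour can be sketched on its own, roughly as follows. This is only an illustration: `fetch_bytes` and `ensure_file` are hypothetical helper names (not part of the script or of requests), and the `fetch` parameter is an assumption added so the logic can be exercised without network access.

```python
from pathlib import Path


def fetch_bytes(url, timeout=30):
    """Download url and return the response body as bytes."""
    import requests  # imported lazily so ensure_file can be tested offline

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of returning None
    return response.content


def ensure_file(path, url, fetch=fetch_bytes):
    """Return path, downloading it from url first if it does not exist yet."""
    path = Path(path)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch(url))
    return path
```

With the URL used in the script, a call might look like `ensure_file('data/pdf/l78.pdf', 'https://www.st.com/resource/en/datasheet/l78.pdf')`; a second call finds the file and skips the download.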

This expects a starting directory structure of:
Output:
PdfSplitter/
├── __init__.py
└── src
    └── __init__.py
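If creating that layout by hand is a nuisance, something like the following could do it. It is a minimal sketch using only pathlib; `make_layout` is a hypothetical name, and the default root of `PdfSplitter` is simply taken from the structure above.

```python
from pathlib import Path


def make_layout(root="PdfSplitter"):
    """Create root/ and root/src/, each with an empty __init__.py."""
    root = Path(root)
    src = root / "src"
    src.mkdir(parents=True, exist_ok=True)
    for package_dir in (root, src):
        # touch() creates the file if missing and leaves existing files alone
        (package_dir / "__init__.py").touch(exist_ok=True)
    return root
```

Running `make_layout()` from the parent directory is safe to repeat, since existing directories and files are left untouched.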
It was run from a virtual environment, but that's not necessary.
Make sure requests and pdfrw are installed:
pip install requests
pip install pdfrw

from there:
  1. cd to .../PdfSplitter/
  2. Add an empty __init__.py to the PdfSplitter directory (if it is not already there).
  3. Add an empty __init__.py to the src directory. Once the module from step 4 is added, src will contain:
    src/
        __init__.py
        pypdfsplit.py
  4. Add the following module to the src directory and name it pypdfsplit.py:
    from pathlib import Path
    from pdfrw import PdfReader, PdfWriter
    import requests
    import os
    import sys
    
    
    class Ppaths:
        def __init__(self, depth=0):
            os.chdir(os.path.abspath(os.path.dirname(__file__)))
            dir_depth = abs(depth)
    
            HomePath = Path(".")
    
            while dir_depth:
                HomePath = HomePath / ".."
                dir_depth -= 1
    
            rootpath = HomePath / ".."
    
            self.datapath = rootpath / "data"
            self.datapath.mkdir(exist_ok=True)
    
            self.csvpath = self.datapath / 'csv'
            self.csvpath.mkdir(exist_ok=True)
    
            self.pdfpath = self.datapath / 'pdf'
            self.pdfpath.mkdir(exist_ok=True)
    
            self.pdfsplitspath = self.pdfpath / 'splilts'
            self.pdfsplitspath.mkdir(exist_ok=True)
    
    
    class pypdfsplit:
        def __init__(self):
            self.ppath = Ppaths()
            self.pdf_reader = None
    
        def dispatch(self, pdffile, page_range=(1,)):
            self.pdf_reader = PdfReader(pdffile)
            self.split_pdf(pdffile, page_range)
    
        def split_pdf(self, pdffile, page_range):
            outbase = pdffile.stem
            for pagenum in page_range:
                # getPage() takes a 0-based index, so pagenum 1 is the second page
                page = self.pdf_reader.getPage(pagenum)
                # use a fresh writer for each page so every output file holds
                # only that page, not all pages added so far
                pdf_writer = PdfWriter()
                pdf_writer.addpage(page)
                outfile = self.ppath.pdfsplitspath / f"{outbase}{pagenum}.pdf"
                pdf_writer.write(outfile)
    
        def get_page(self, url, bin=True):
            page = None
            response = requests.get(url)
            if response.status_code == 200:
                if bin:
                    page = response.content
                else:
                    page = response.text
            return page
    
    
    def main():
        psp = pypdfsplit()
        mypdffile = psp.ppath.pdfpath / 'l78.pdf'
        if not mypdffile.exists():
            page_url = 'https://www.st.com/resource/en/datasheet/l78.pdf'
            page = psp.get_page(url=page_url, bin=True)
            if page:
                with mypdffile.open('wb') as fp:
                    fp.write(page)
            else:
                print(f"Can't load {page_url}")
                sys.exit(-1)
        
        myrange = [1,3,5]
        psp.dispatch(pdffile=mypdffile, page_range=myrange)
        
    
    if __name__ == '__main__':
        main()
  5. run from PdfSplitter directory: python src/pypdfsplit.py
  6. When done, the directory structure will look like:
    Output:
    PdfSplitter/
    ├── data
    │   ├── csv
    │   └── pdf
    │       ├── l78.pdf
    │       └── splilts
    │           ├── l781.pdf
    │           ├── l783.pdf
    │           └── l785.pdf
    ├── __init__.py
    └── src
        └── pypdfsplit.py
    Pages 1, 3 and 5 were split from the main PDF and stored in PdfSplitter/data/pdf/splilts (the spelling used in the script).
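
For comparison, here is a minimal sketch of the splitting step as a standalone function. It assumes 1-based page numbers (the script itself passes the page_range values straight to the reader) and writes one page per output file. `plan_splits` and `split_pages` are hypothetical names, not part of the script or of pdfrw; the only pdfrw pieces used are `PdfReader(...).pages`, `PdfWriter.addpage` and `PdfWriter.write`.

```python
from pathlib import Path


def plan_splits(stem, page_numbers, num_pages, out_dir):
    """Map 1-based page numbers to (0-based index, output path) pairs."""
    plan = []
    for number in page_numbers:
        if not 1 <= number <= num_pages:
            raise ValueError(f"page {number} is outside 1..{num_pages}")
        plan.append((number - 1, Path(out_dir) / f"{stem}{number}.pdf"))
    return plan


def split_pages(pdf_path, page_numbers, out_dir):
    """Write each requested page of pdf_path to its own file in out_dir."""
    from pdfrw import PdfReader, PdfWriter  # lazy import: only needed here

    pdf_path = Path(pdf_path)
    pages = PdfReader(str(pdf_path)).pages  # plain Python list of pages
    written = []
    for index, outfile in plan_splits(pdf_path.stem, page_numbers,
                                      len(pages), out_dir):
        writer = PdfWriter()  # fresh writer, so one page per file
        writer.addpage(pages[index])
        writer.write(str(outfile))
        written.append(outfile)
    return written
```

Keeping the page-number checks in `plan_splits` means a bad request such as page 0 fails with a clear ValueError before any file is written.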

Edit Jul13, 11:13 PM (UTF)
removed redundant import for pathlib


Messages In This Thread
Combine 2 PDF pages into 1 - by Cyberduke - Jul-13-2021, 10:13 AM
RE: Combine 2 PDF pages into 1 - by Larz60+ - Jul-13-2021, 11:08 AM
RE: Combine 2 PDF pages into 1 - by Cyberduke - Jul-13-2021, 11:42 AM
RE: Combine 2 PDF pages into 1 - by Larz60+ - Jul-13-2021, 05:30 PM
RE: Combine 2 PDF pages into 1 - by Pedroski55 - Jul-14-2021, 03:36 AM
RE: Combine 2 PDF pages into 1 - by Cyberduke - Jul-14-2021, 12:10 PM
RE: Combine 2 PDF pages into 1 - by Cyberduke - Jul-14-2021, 11:01 AM
RE: Combine 2 PDF pages into 1 - by Pedroski55 - Jul-15-2021, 12:23 AM
