Mar-07-2019, 11:58 PM
I have a Human Resources document, a lot of pages, more than 100. I need them as a pdf.
I can batch scan to pdf with my little Epson DS-510, great little scanner.
I can first scan all the odd pages (1, 3, 5, 7, 9, ...) to PDF, then the even pages (2, 4, 6, ...) to PDF.
Each page is just an image, not text.
I can get text with:
from PIL import Image
import pytesseract
print(pytesseract.image_to_string(Image.open('page1.jpg'), lang='chi_sim'))
I tried this and it works well. I'll have to figure out how to do that with each page of a PDF, but first I need to merge the two PDFs.
I have the module PyPDF2 so I think I should be able to merge the oddpages.pdf and the evenpages.pdf to allpages.pdf.
I think I need a file allpages.pdf, then append page1 of oddpages.pdf to allpages.pdf, then page1 of evenpages.pdf to allpages.pdf, page2 of oddpages.pdf to allpages.pdf, page2 of evenpages.pdf to allpages.pdf and so on.
However, I have never done this before, so I would appreciate any tips!
I looked here but it is not very clear to me as a non-geek.
I also looked here. There are 3 examples of merger, but I think they would just append the whole pdf to another pdf.
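For reference, the page-by-page interleave described above could be sketched roughly like this. It is only a sketch under some assumptions: it uses the newer PyPDF2 names (PdfReader, PdfWriter, add_page); older PyPDF2 releases call these PdfFileReader, PdfFileWriter and addPage instead. The function names interleave and merge_scans are made up for illustration, and the reverse_even option is a guess at a common case (even pages scanned back to front after flipping the stack), not something stated in the post.

```python
from itertools import zip_longest

_MISSING = object()  # marks "no page here" when one file is shorter


def interleave(odd_pages, even_pages):
    """Return pages ordered odd[0], even[0], odd[1], even[1], ...

    If one sequence is longer (e.g. the document ends on an odd page),
    the leftover pages are kept at the end.
    """
    merged = []
    for odd, even in zip_longest(odd_pages, even_pages, fillvalue=_MISSING):
        if odd is not _MISSING:
            merged.append(odd)
        if even is not _MISSING:
            merged.append(even)
    return merged


def merge_scans(odd_path="oddpages.pdf", even_path="evenpages.pdf",
                out_path="allpages.pdf", reverse_even=False):
    """Write out_path with the pages of the two scans interleaved."""
    # Imported here so interleave() can be tried without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    odd_pages = list(PdfReader(odd_path).pages)
    even_pages = list(PdfReader(even_path).pages)
    if reverse_even:
        # Assumption: the even side was scanned last page first.
        even_pages.reverse()

    writer = PdfWriter()
    for page in interleave(odd_pages, even_pages):
        writer.add_page(page)

    with open(out_path, "wb") as fh:
        writer.write(fh)
```

With both scans in the current folder, calling merge_scans() should produce allpages.pdf. The pages stay as images; nothing here does OCR.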