Hello,
I need to loop through a bunch of PDFs, each containing one or more articles.
I noticed that the titles can use different fonts, but they all seem to have the same size (19.5 points).
Can PyMuPDF (or some other library) grab all strings of a given size in a PDF, so I can build a master Table of Contents?
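In case it helps, here is a rough sketch of what I have in mind. It assumes PyMuPDF's page.get_text("dict") output, where each text span carries a "size" and a "text" key. The function names, the 0.1 pt tolerance, and the folder layout are my own guesses, not anything taken from the docs.

```python
from pathlib import Path

TITLE_SIZE = 19.5  # point size I see on the titles in my PDFs
TOLERANCE = 0.1    # assumed slack, since sizes are stored as floats


def titles_in_page_dict(page_dict, size=TITLE_SIZE, tol=TOLERANCE):
    """For each line, join the text of the spans whose font size matches."""
    titles = []
    for block in page_dict.get("blocks", []):
        # Image blocks have no "lines" key, so they are skipped here.
        for line in block.get("lines", []):
            parts = [
                span["text"]
                for span in line.get("spans", [])
                if abs(span["size"] - size) <= tol
            ]
            text = "".join(parts).strip()
            if text:
                titles.append(text)
    return titles


def build_toc(pdf_dir, size=TITLE_SIZE, tol=TOLERANCE):
    """Yield (file name, 1-based page number, title) for every PDF in a folder."""
    import fitz  # PyMuPDF; imported here so the helper above works without it

    for path in sorted(Path(pdf_dir).glob("*.pdf")):
        with fitz.open(str(path)) as doc:
            for page in doc:
                page_dict = page.get_text("dict")
                for title in titles_in_page_dict(page_dict, size, tol):
                    yield path.name, page.number + 1, title
```

I would then call it with something like `for name, page_no, title in build_toc("articles"): print(name, page_no, title)`, where "articles" is a placeholder folder name. A title that wraps over two lines would come out as two entries, so those would still need merging.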
Thank you.