Python Forum

How to switch table area coordinates in Python Camelot and Tabula-Py
Dear All,

I have obtained the coordinates of a table bounding box using Camelot, but I need to use tabula-py to extract the table data, because Camelot only extracts the first line of each table cell, even in lattice mode. When I define the same table region in tabula-py, two of the resulting coordinates are very different from the Camelot values (shown in the code sample below). The 2nd and 4th values in tabula are similar to the 1st and 3rd values in Camelot, but the other two are far apart. How can I translate the Camelot values into tabula values, please? I have tried scaling them proportionally and adding and subtracting offsets, but nothing has worked.

import tabula

# tabula-py expects area as (top, left, bottom, right)
df = tabula.read_pdf(pdf_path, lattice=True, area=(71, 627, 325, 1160), pages=page)

# but the Camelot bounding-box values for the same table are: 631, 518, 1154, 765
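
One possible explanation, offered as a sketch rather than a confirmed answer: Camelot reports boxes in PDF coordinate space, where the origin is the bottom-left corner of the page and y grows upwards, while tabula-py's area is (top, left, bottom, right) measured from the top-left corner. Under that assumption the x values carry over unchanged and each y value has to be subtracted from the page height. The helper name camelot_bbox_to_tabula_area is hypothetical, and the page height of 842 points is only a guess (it would fit an A3 landscape page); the real height should be read from the PDF itself.

```python
def camelot_bbox_to_tabula_area(bbox, page_height):
    """Convert a Camelot-style box (x1, y1, x2, y2), bottom-left origin,
    into a tabula-py area (top, left, bottom, right), top-left origin.

    Hypothetical helper; all values are assumed to be in PDF points.
    """
    x1, y1, x2, y2 = bbox
    # x is measured from the left edge in both conventions
    left, right = min(x1, x2), max(x1, x2)
    # y is flipped: the larger PDF y is nearer the top of the page
    top = page_height - max(y1, y2)
    bottom = page_height - min(y1, y2)
    return (top, left, bottom, right)


# page_height=842 is an assumption, not a value taken from the actual PDF
area = camelot_bbox_to_tabula_area((631, 518, 1154, 765), page_height=842)
print(area)  # (77, 631, 324, 1154)
```

With that assumed page height the result is close to the area found by hand, (71, 627, 325, 1160), which suggests the difference is a flipped y axis rather than a scale factor. The tuple can then be passed as area= to tabula.read_pdf.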