Python Forum
Install pdftables-api
#14
Thank you guys,

It finally works: the file has been converted to an HTML file.

But to be honest, I am disappointed with the result: a lot of characters are not recognized.
I don't know if it's possible to specify the encoding (such as UTF-8 for accented characters).
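
One thing that could be checked first is whether the characters are really lost in the conversion or only look wrong because the HTML file is being read with the wrong codec. Here is a minimal sketch, assuming the second case. The function name `read_html_text`, the file name `output.html`, and the list of candidate encodings are only assumptions for illustration; nothing here comes from pdftables-api itself.

```python
from pathlib import Path


def read_html_text(path, encodings=("utf-8", "cp1252")):
    """Decode a file by trying each candidate encoding in order.

    Returns a (text, encoding_used) tuple. The candidate list is an
    assumption; adjust it to whatever the source documents may use.
    """
    raw = Path(path).read_bytes()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Last resort: decode with the first candidate and mark every
    # undecodable byte with U+FFFD instead of raising an error.
    return raw.decode(encodings[0], errors="replace"), encodings[0]


# Hypothetical usage on the converted file:
# text, used = read_html_text("output.html")
# print(used)
```

If the accented characters come out right with one of the candidates, the problem was only on the reading side. If they are still wrong in every encoding, they were probably already damaged before the file was written, and no setting on the reading side will bring them back.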

I also have content that is not visible (transparent text). This is not a problem, because I am interested in recovering the text from the source code.
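
For recovering the text from the source code, a rough sketch with the standard library `html.parser` module might look like this. It keeps every text node regardless of styling, so transparent text is included, and it skips the contents of `<script>` and `<style>`. The names `TextCollector` and `html_to_text` are made up for this example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect all text nodes, ignoring <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Styling is never inspected, so invisible text is kept too.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text nodes of an HTML string, one per line."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Since the parser never looks at CSS, there is no way to filter out the invisible text with this sketch; that would need extra logic.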

I would also like to clean up the content. Some words have spaces between their letters; for example, the word "Information" is written as "I N F O R M A T I O N". This is the most inconvenient issue for me.
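
A possible heuristic for the spaced-out words, as a sketch only: treat any run of three or more single letters separated by exactly one space as one word and remove the spaces. The function name `collapse_spaced_letters` and the three-letter threshold are assumptions chosen for this example, not something taken from any library.

```python
import re

# Three or more single letters, each separated by exactly one space.
# [^\W\d_] matches any Unicode letter, so accented letters are covered.
_SPACED_RUN = re.compile(r"\b(?:[^\W\d_] ){2,}[^\W\d_]\b")


def collapse_spaced_letters(text):
    """Join runs like 'I N F O' into 'INFO'; leave normal words alone."""
    return _SPACED_RUN.sub(lambda m: m.group(0).replace(" ", ""), text)


print(collapse_spaced_letters("I N F O R M A T I O N"))  # INFORMATION
```

This has a known limit: if two spaced-out words follow each other with only a single space between them, they will be merged into one word, because nothing in the text marks where the first one ends. Genuine sequences of single-letter words (three or more in a row) would also be joined, so the output should be checked on real data.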


Messages In This Thread
Install pdftables-api - by MagicTrees - Oct-27-2020, 01:01 PM
RE: Install pdftables-api - by bowlofred - Oct-27-2020, 03:46 PM
RE: Install pdftables-api - by Larz60+ - Oct-27-2020, 04:46 PM
RE: Install pdftables-api - by bowlofred - Oct-27-2020, 04:59 PM
RE: Install pdftables-api - by Larz60+ - Oct-27-2020, 09:51 PM
RE: Install pdftables-api - by bowlofred - Oct-27-2020, 10:51 PM
RE: Install pdftables-api - by MagicTrees - Oct-28-2020, 04:48 PM
RE: Install pdftables-api - by bowlofred - Oct-28-2020, 05:10 PM
RE: Install pdftables-api - by MagicTrees - Oct-30-2020, 05:25 PM
RE: Install pdftables-api - by bowlofred - Oct-30-2020, 05:29 PM
RE: Install pdftables-api - by MagicTrees - Oct-31-2020, 03:33 PM
RE: Install pdftables-api - by Larz60+ - Oct-31-2020, 04:39 PM
RE: Install pdftables-api - by bowlofred - Oct-31-2020, 07:00 PM
RE: Install pdftables-api - by MagicTrees - Nov-06-2020, 11:21 AM