Python Forum
Read and tokenize doc/docx?
(Oct-25-2018, 05:32 PM)JP_ROMANO Wrote: So, we're going to bail on this approach - but surely there is some other way that python can, right out of the box, read a word document, right?
No, there is no out-of-the-box way. Your options are:
1. docx
2. pywin32
3. ctypes
4. Use a native XML library (a .docx file is just a bunch of zipped XML files), but then you need to implement all the XML handling on your own.

Since you don't say what this set of exceptions is, we cannot help with them. Also, if the problem is with the install, you can download a wheel from Gohlke and install it, e.g. pip install docx-0.2.4-py2.py3-none-any.whl

Also, docx is hosted on PyPI (https://python-docx.readthedocs.io/en/la...ml#install). I don't think there is a problem with the PyPI SSL certificate.


