Python Forum
.doc (word) readers
#1
Hi,
This time I came across a massive contingent of legacy .doc files (not .docx).
Unlike .xml files, which you can read easily with a choice of tools,
.doc files prove to be difficult.
Textract is reputed to do the job, but you need all sorts of additional software to make it work.
I tried the following, and it works in principle, but I cannot find a way to close the Word document.
So I end up with hundreds of documents open simultaneously.
import win32com.client
word = win32com.client.DispatchEx("Word.Application")
word.Visible = False  # does not seem to work, because word shows
wb = word.Documents.Open(docpath)
doc = word.ActiveDocument
text = doc.Range().Text
Does anybody know which object to close, and how: word, doc, or wb?
All I need is the text, never mind any font or formatting, just the text.
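For reference, one way this might be handled is sketched below. It assumes pywin32 and Word are installed on Windows; the helper names read_doc_text and read_many are made up for illustration and are not part of any library. The idea is to close each document with Close(False) (do not save changes) as soon as its text has been read, and to call Quit() on the Word application once at the very end.

```python
import os


def read_doc_text(word, docpath):
    """Return the plain text of one .doc file, then close that document.

    `word` is assumed to be a Word.Application COM object.
    """
    # Word's COM interface generally wants an absolute path.
    doc = word.Documents.Open(os.path.abspath(docpath))
    try:
        return doc.Range().Text
    finally:
        # Close only this document; False means "do not save changes".
        doc.Close(False)


def read_many(docpaths):
    """Read several .doc files through a single Word instance (Windows only)."""
    # Imported here so the module can be loaded on machines without pywin32.
    import win32com.client

    word = win32com.client.DispatchEx("Word.Application")
    word.Visible = False
    texts = {}
    try:
        for path in docpaths:
            texts[path] = read_doc_text(word, path)
    finally:
        # Shut down the Word process itself once every file has been read.
        word.Quit()
    return texts
```

In this sketch the object returned by Documents.Open is the document itself, so a separate ActiveDocument lookup should not be needed. Reusing one Word instance for the whole batch also avoids starting a new Word process per file.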
thx,
Paul
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.