.doc (word) readers - DPaul - Jan-10-2023

Hi,

This time I came across a massive contingent of legacy .doc files (not .docx). Unlike .xml files, which you can read easily with a choice of tools, .doc files prove to be difficult. Textract is reputed to do the job, but you need all sorts of strange software to make it work.

I tried this, and it works in principle, but I cannot find a way to close the Word document, so it opens hundreds simultaneously.

    import win32com.client
    word = win32com.client.DispatchEx("Word.Application")
    word.visible = False  # does not seem to work, because Word shows
    wb = word.Documents.Open(docpath)
    doc = word.ActiveDocument
    text = doc.Range().Text

Anybody know what to close, and how: word? doc? wb?

All I need is the text, never mind any font or formatting, just the text.

thx,
Paul
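One possible way to handle the closing, as a minimal sketch rather than a definitive answer. It assumes pywin32 on Windows with Word installed. The helper names `word_application`, `read_doc_text` and `extract_all` are made up for illustration and are not part of any library. In the snippet above, `wb` and `doc` should refer to the same document, so a single `Close` per file plus one `Quit` at the end ought to be enough.

```python
import os
from contextlib import contextmanager


@contextmanager
def word_application():
    """Start one hidden Word instance and always quit it afterwards."""
    import win32com.client  # pywin32, Windows only (assumed to be installed)

    word = win32com.client.DispatchEx("Word.Application")
    word.Visible = False  # documented property name is capitalised
    try:
        yield word
    finally:
        word.Quit()  # shuts down the Word process started above


def read_doc_text(word, docpath):
    """Open one .doc file, return its plain text, then close the document."""
    # Word resolves relative paths against its own working directory,
    # so pass an absolute path.
    # Positional arguments: FileName, ConfirmConversions, ReadOnly.
    doc = word.Documents.Open(os.path.abspath(docpath), False, True)
    try:
        return doc.Range().Text
    finally:
        doc.Close(False)  # False = do not save changes


def extract_all(paths):
    """Reuse a single Word instance for many files (hypothetical helper)."""
    with word_application() as word:
        return {path: read_doc_text(word, path) for path in paths}
```

Calling `extract_all(list_of_doc_paths)` would then keep at most one document open at a time, and the `finally` blocks make sure the document and Word itself are closed even if reading one file raises an error.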