Python Forum
How to transfer Text from one Word Document to another


How to transfer Text from one Word Document to another - konsular - Oct-08-2019

Hello there,

I'm a Python newcomer and I decided to make the following my first Python project. I'm a state official from Germany, where I help disabled people as an administrator. We do of course have a program that helps us with our outgoing paperwork.

The problem is as follows:
Our social workers develop a plan to help the disabled people, and I receive the documentation for it. Some of the data from this Word document has to go into my outgoing official letter, but it does not get transferred to my document automatically. You can picture the problem like this:

Document A (from the social worker, sent to me)
Date created 27.09.2019

Period of service 01.09.2019 - 29.02.2020

...

Document B (the document I have to create)
...referring to the document of 27.09.2019...

...we hereby grant the service from 01.09.2019 - 29.02.2020 ...



My idea for a solution:

In my opinion the code should look something like this:
#define variables
Date created = First line after 3rd word number

Period of service = 3rd line after 3rd word number

#insert them into my papers
Date created insert variable=first line after "of"
Date created insert variable.add_run(Date created)

[...]

-----------------------------

As I've said, the "docx" extension for Python works really well. But it can't define a place in the Word document as a variable so that I could insert something there. Neither can it "read" a document and pick out the things I need later for my papers.

So can you guys help me out? This really is a project close to my heart, because even though I'm not a professional programmer, my work (which I really love) gives me problems that I could solve with my future hobby.

Looking forward to your replies


RE: How to transfer Text from one Word Document to another - buran - Oct-08-2019

Without seeing the actual source document it's hard to suggest anything. Is the format always the same? Also, regular expressions (regex) would probably help.
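
As a rough illustration of the regex idea, here is a minimal sketch that pulls DD.MM.YYYY dates out of a piece of text. The sample text is taken from the Document A excerpt in the first post; the names DATE_PATTERN and find_dates are made up for this example, and it assumes the dates always appear in exactly that format.

```python
import re

# Matches dates written as DD.MM.YYYY, e.g. 27.09.2019
DATE_PATTERN = re.compile(r"\b\d{2}\.\d{2}\.\d{4}\b")


def find_dates(text):
    """Return every DD.MM.YYYY date found in text, in order of appearance."""
    return DATE_PATTERN.findall(text)


# Sample text modelled on the Document A excerpt from the first post
sample = "Date created 27.09.2019\nPeriod of service 01.09.2019 - 29.02.2020"

dates = find_dates(sample)
date_created = dates[0]                      # first date in the text
period_start, period_end = dates[1], dates[2]

print(date_created)
print(f"{period_start} - {period_end}")
```

Whether this is enough depends on the answer to the format question: if the order of the dates in the real document varies, picking them by position like this would not be reliable.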


RE: How to transfer Text from one Word Document to another - konsular - Oct-08-2019

(Oct-08-2019, 03:45 PM)buran Wrote: Without seeing the actual source document it's hard to suggest anything. Is the format always the same? Also, regular expressions (regex) would probably help.

Can I include a Word document here?


RE: How to transfer Text from one Word Document to another - buran - Oct-08-2019

You should be able to now. I have promoted you to the next user group, which allows you to add attachments.


RE: How to transfer Text from one Word Document to another - konsular - Oct-08-2019

I have anonymized and attached Document A and Document B.

Some text needs to be transferred from Document A to Document B.

As an example, I have marked in red the date Document A was created and the place in Document B where this date needs to go.


RE: How to transfer Text from one Word Document to another - buran - Oct-08-2019

Here is a very quick example. It can certainly be made much better, but it is something to start with.
The documents are built from tables, so I decided to use table properties to retrieve the desired info and then write it.
There may well be other possible solutions.

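from docx import Document  # provided by the third-party python-docx package

# read from source
source_docx = Document('Document A.docx')
table = source_docx.tables[0]  # first top-level table in the document
cell = table.rows[1].cells[0]  # second row, first cell
date_created = cell.text.split()[-1]  # last whitespace-separated word in the cell
print(date_created)

# write to destination
target_docx = Document('Document B.docx')
table = target_docx.tables[0]
cell = table.rows[2].cells[0]  # third row, first cell
paragraph = cell.paragraphs[-1]  # last paragraph in that cell
paragraph.add_run(f' {date_created}')  # append the date to the end of the paragraph
target_docx.save('Document B.docx')  # note: this overwrites Document B.docx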



RE: How to transfer Text from one Word Document to another - konsular - Oct-08-2019

Thanks for your reply!

Do you have a source where these commands are explained? Is this "vanilla" Python or docx?

I will try out the code and then reply accordingly. As I work full time, this might take a few days :)


RE: How to transfer Text from one Word Document to another - buran - Oct-08-2019

(Oct-08-2019, 05:06 PM)konsular Wrote: Do you have a source where these commands are explained? Is this "vanilla" Python or docx?

Python is a modular language. There are plenty of third-party packages that extend the standard library. In this case I am using the python-docx package. It provides a convenient high-level interface for working with docx files. Here is the documentation for this package: https://python-docx.readthedocs.io/en/latest/
So the package provides tools (classes, functions, etc.) that help you work with docx files without going into internal details. For example, source_docx = Document('Document A.docx') creates an instance of the Document class without you having to bother with details like opening the file, reading it and parsing the content.
The package, and my code too, also uses the standard library and core language features, e.g. the split() method of the built-in str class, or the import statement.
In any case, you need to be familiar with the language fundamentals and build from there.
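
To make the split() part concrete, here is a small sketch that uses only built-in string and list features. It assumes the cell text looks like the "Date created 27.09.2019" line from the first post; the real cell content may differ.

```python
# Assumed cell text, copied from the Document A excerpt in the first post
line = "Date created 27.09.2019"

words = line.split()      # split on whitespace: ['Date', 'created', '27.09.2019']
date_created = words[-1]  # index -1 means "the last item in the list"

print(words)
print(date_created)
```

This is all that cell.text.split()[-1] does in the example: cell.text comes from python-docx, while split() and the [-1] index are plain Python.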


RE: How to transfer Text from one Word Document to another - konsular - Oct-08-2019

Thanks for all your help so far!

I'm mainly using this user guide:

https://buildmedia.readthedocs.org/media/pdf/python-docx/latest/python-docx.pdf

I can't find your command

source_docx.tables

there. If I want to search for a specific word in a paragraph, would that mean I have to write something like this?

variable = source_docx.paragraph [0]


RE: How to transfer Text from one Word Document to another - buran - Oct-08-2019

It's the PDF version of the same documentation.
source_docx is an instance of the Document class. In your PDF, the Document class API is on pages 37-38. The tables property is the last one (on page 38):
Quote: tables A list of Table objects corresponding to the tables in the document, in document order. Note that only tables appearing at the top level of the document appear in this list; a table nested inside a table cell does not appear. A table within revision marks such as <w:ins> or <w:del> will also not appear in the list.
What it says is that source_docx.tables will return a list object (one of the standard Python object types).

(Oct-08-2019, 07:25 PM)konsular Wrote: variable = source_docx.paragraph [0]
Well, no.
source_docx.paragraphs (note the plural) will give you a list of Paragraph objects (just as source_docx.tables gives you a list of Table objects).
source_docx.paragraphs[0] will give you the first Paragraph object (the [0] notation is called indexing; indexes in Python are 0-based). To get the text of the paragraph you need source_docx.paragraphs[0].text. Note that the first paragraph is actually empty in this case.
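
For the original question about searching for a word in a paragraph, a minimal sketch could look like the following. It relies only on the paragraphs and text attributes described above. The function name paragraphs_containing and the stand-in document built with SimpleNamespace are assumptions made so the sketch runs without a .docx file; with python-docx you would pass the object returned by Document(...) instead.

```python
from types import SimpleNamespace


def paragraphs_containing(document, word):
    """Return (index, text) pairs for paragraphs whose text contains word."""
    return [
        (index, paragraph.text)
        for index, paragraph in enumerate(document.paragraphs)
        if word in paragraph.text
    ]


# Stand-in for a python-docx Document: it only mimics .paragraphs and .text
fake_document = SimpleNamespace(paragraphs=[
    SimpleNamespace(text=""),  # an empty first paragraph, as in the thread
    SimpleNamespace(text="Date created 27.09.2019"),
    SimpleNamespace(text="Period of service 01.09.2019 - 29.02.2020"),
])

print(paragraphs_containing(fake_document, "Period"))
```

The index in each pair is the number you would put between the square brackets, e.g. document.paragraphs[2].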

Because the document consists of tables, you need to use Table objects and find the text via the respective table, row and cell.
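
One way to locate the right table, row and cell is to walk all of them and report where a word occurs, instead of guessing indexes such as tables[0].rows[1].cells[0]. The sketch below does that. find_in_tables and fake_table are hypothetical helper names, and the stand-in objects only mimic the tables, rows, cells and text attributes of python-docx so that the code runs without a .docx file.

```python
from types import SimpleNamespace


def find_in_tables(document, word):
    """Yield (table_index, row_index, cell_index, text) for cells containing word."""
    for t, table in enumerate(document.tables):
        for r, row in enumerate(table.rows):
            for c, cell in enumerate(row.cells):
                if word in cell.text:
                    yield t, r, c, cell.text


def fake_table(rows):
    """Build a stand-in table from a list of rows, each a list of cell texts."""
    return SimpleNamespace(rows=[
        SimpleNamespace(cells=[SimpleNamespace(text=text) for text in row])
        for row in rows
    ])


# Stand-in document with one table; the cell contents are invented for the demo
fake_document = SimpleNamespace(tables=[
    fake_table([["Header"], ["Date created 27.09.2019"]]),
])

for hit in find_in_tables(fake_document, "Date created"):
    print(hit)
```

With a real file, the same function could be called as find_in_tables(Document('Document A.docx'), 'Date created'), assuming python-docx is installed and the text sits in a top-level table.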

From your questions I see that you lack a basic understanding of the language fundamentals. I strongly recommend getting familiar with the fundamentals, as well as gaining some basic understanding of object-oriented programming.