Python Forum
How to transfer text from one Word document to another
#1
Hello there,

I'm a Python newcomer and I decided to make the following my first Python project. I'm a state official from Germany and I help disabled people there as an administrator. We of course have a program which helps us with our outgoing papers.

The problem is as follows:
Our social workers develop a plan to help the disabled people. I get a documentation of that. Some data from this Word document has to be carried over into my outgoing official letter. This information doesn't get transferred to my document automatically. You can imagine the problem like this:

Document A (from the social worker, which goes to me)
Date created 27.09.2019

Period of service 01.09.2019 - 29.02.2020

...

Document B (the document I have to create)
...referring to the document of 27.09.2019...

...we hereby grant the service from 01.09.2019 - 29.02.2020 ...



My approach:

In my opinion the code should look something like this:
#define variables
Date created = First line after 3rd word number

Period of service = 3rd line after 3rd word number

#insert them into my papers
Date created insert variable=first line after "of"
Date created insert variable.add_run(Date created)

[...]

-----------------------------

As I've said, the "docx" extension for Python works really well, but it can't define a place in the Word file as a variable so that I could insert something there. Nor can it "read" a document and define things I need later for my papers.

So can you guys help me out? This really is a project close to my heart: even though I'm not a professional programmer, my work (which I really love) gives me problems that I could solve with my future hobby.

Looking forward to your replies :)
#2
Without seeing the actual source document it's hard to suggest anything. Is the format always the same? Also, a regular expression (regex) would probably help.
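
As a rough illustration of the regex idea, here is a minimal sketch that pulls the two values out of text shaped like the Document A excerpt in post #1. The label wording, the DD.MM.YYYY date pattern and the helper name extract_fields are assumptions made for illustration; the real document may be worded differently.

```python
import re

# Assumed date format: DD.MM.YYYY, as in the excerpt from post #1.
DATE = r"\d{2}\.\d{2}\.\d{4}"


def extract_fields(text):
    """Return the creation date and service period found in text (None if missing)."""
    created = re.search(rf"Date created\s+({DATE})", text)
    period = re.search(rf"Period of service\s+({DATE})\s*-\s*({DATE})", text)
    return {
        "date_created": created.group(1) if created else None,
        "period_start": period.group(1) if period else None,
        "period_end": period.group(2) if period else None,
    }


# Sample text copied from the Document A excerpt in post #1.
sample = "Date created 27.09.2019\n\nPeriod of service 01.09.2019 - 29.02.2020"
fields = extract_fields(sample)
print(fields)
```

A regex like this only helps if the labels are worded the same way in every document, which is why the question about the format matters.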
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

#3
(Oct-08-2019, 03:45 PM)buran Wrote: Without seeing the actual source document it's hard to suggest anything. Is the format always the same? Also, a regular expression (regex) would probably help.

Can I include a Word document here?
#4
You should be able to now. I have promoted you to the next user group, which allows you to add attachments.
#5
I have anonymized and added Document A and Document B.

Some text needs to be transferred from Document A to Document B.

As an example, I have marked in red the date Document A was created and the place in Document B where this date needs to go.

Attached Files

.docx   Document A.docx (Size: 52.01 KB / Downloads: 189)
.docx   Document B.docx (Size: 69.03 KB / Downloads: 122)
#6
Here is a very quick example. It can certainly be made much better, but it is something to start with.
The document is constructed from tables, so I decided to use the table properties to retrieve the desired info and then write it.
There may well be other possible solutions.

from docx import Document

# read from source
source_docx = Document('Document A.docx')
table = source_docx.tables[0]
cell = table.rows[1].cells[0]
date_created = cell.text.split()[-1]
print(date_created)

# write to destination
target_docx = Document('Document B.docx')
table = target_docx.tables[0]
cell = table.rows[2].cells[0]
paragraph = cell.paragraphs[-1]
paragraph.add_run(f' {date_created}')
target_docx.save('Document B.docx')
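
The row and cell indices above are tied to this one layout. If the layout might shift, one option is to look for the cell by a label it contains. Below is a minimal sketch, assuming the label text "Date created" (taken from the excerpt in post #1; the attached file may use different wording). find_cell_text is a hypothetical helper, and the stand-in objects only mimic the .tables, .rows, .cells and .text attributes used in the code above, so that the snippet runs without the attachment.

```python
from types import SimpleNamespace


def find_cell_text(document, label):
    """Return the text of the first top-level table cell containing label, else None."""
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                if label in cell.text:
                    return cell.text
    return None


# Stand-in for a parsed document (assumption: one table, the date in its second row).
fake_doc = SimpleNamespace(tables=[SimpleNamespace(rows=[
    SimpleNamespace(cells=[SimpleNamespace(text="Some heading")]),
    SimpleNamespace(cells=[SimpleNamespace(text="Date created 27.09.2019")]),
])])

cell_text = find_cell_text(fake_doc, "Date created")
# Same extraction step as above: the date is the last word in the cell.
date_created = cell_text.split()[-1] if cell_text else None
print(date_created)
```

With the real file you would pass Document('Document A.docx') in place of fake_doc.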
#7
Thanks for your reply!

Do you have a source where these commands are explained? Is this "vanilla" Python or docx?

I will try out the code and then reply accordingly. As I work full time this might take a few days :)
#8
(Oct-08-2019, 05:06 PM)konsular Wrote: Do you have a source where these commands are explained? Is this "vanilla" Python or docx?

Python is a modular language. There are plenty of third-party packages that extend the standard library. In this case I am using the python-docx package. It provides a convenient high-level interface for working with docx files. Here is the documentation for this package: https://python-docx.readthedocs.io/en/latest/
So the package provides tools (classes, functions, etc.) that help you work with docx files without going into internal details. For example, source_docx = Document('Document A.docx') creates an instance of the Document class without you having to bother with details like opening the file, reading it, parsing the content, etc.
Both the package and my code also rely on what ships with Python itself, e.g. the split() method of the built-in str class, or the import statement.
In any case you need to be familiar with the language fundamentals and build from there.
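
To make the split() step concrete, here is a small sketch that uses only what ships with Python. The cell text is an assumed value copied from the excerpt in post #1, and parsing the result with datetime is an optional extra check that is not part of the code in post #6.

```python
from datetime import datetime

cell_text = "Date created 27.09.2019"  # assumed cell content

words = cell_text.split()   # split on whitespace, giving a list of words
date_created = words[-1]    # [-1] picks the last item of the list

# Optional: confirm that the last word really is a DD.MM.YYYY date.
parsed = datetime.strptime(date_created, "%d.%m.%Y").date()

print(words)
print(date_created)
print(parsed.isoformat())
```

If the last word is not a valid date, strptime raises ValueError, which is a cheap way to notice that a document does not have the expected layout.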
#9
Thanks for all your help so far!

I'm mainly using this user guide:

https://buildmedia.readthedocs.org/media...n-docx.pdf

I can't find your command

source_docx.tables

there. If I want to search for a specific word in a paragraph, would I have to write something like this?

variable = source_docx.paragraph [0]
#10
It's the PDF version of the same documentation.
source_docx is an instance of the Document class. In your PDF the Document class API is on pages 37-38. The tables property is the last one (on page 38):
Quote: tables: A list of Table objects corresponding to the tables in the document, in document order. Note that only tables appearing at the top level of the document appear in this list; a table nested inside a table cell does not appear. A table within revision marks such as <w:ins> or <w:del> will also not appear in the list.
What it says is that source_docx.tables will return a list object (this is one of the standard Python object types).

(Oct-08-2019, 07:25 PM)konsular Wrote: variable = source_docx.paragraph [0]
Well, no.
source_docx.paragraphs (note the plural) will give you a list of Paragraph objects (just as source_docx.tables gives you a list of Table objects).
source_docx.paragraphs[0] will give you the first Paragraph object (the [0] notation is called indexing; indexes in Python are 0-based). To get the text of the paragraph you need source_docx.paragraphs[0].text. Note that the first paragraph is actually empty in this case.
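
A short sketch of searching paragraphs for a word, along the lines asked about in post #9. paragraphs_containing is a hypothetical helper; the stand-in objects only mimic the .paragraphs and .text attributes so that the snippet runs without the attachment, and the sample sentences are taken from the Document B excerpt in post #1.

```python
from types import SimpleNamespace


def paragraphs_containing(document, word):
    """Return (index, text) pairs for every paragraph whose text contains word."""
    return [
        (index, paragraph.text)
        for index, paragraph in enumerate(document.paragraphs)
        if word in paragraph.text
    ]


# Stand-in for a parsed document; the first paragraph is empty on purpose.
fake_doc = SimpleNamespace(paragraphs=[
    SimpleNamespace(text=""),
    SimpleNamespace(text="...referring to the document of 27.09.2019..."),
    SimpleNamespace(text="...we hereby grant the service from 01.09.2019 - 29.02.2020 ..."),
])

matches = paragraphs_containing(fake_doc, "referring")
print(matches)
```

The same loop works on a table cell, since a cell also has a paragraphs list (the code in post #6 uses cell.paragraphs).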

Because the document consists of tables, you need to use Table objects and find the text via the respective table, row and cell.

From your questions I see you lack a basic understanding of the language fundamentals. I strongly recommend getting familiar with the fundamentals, as well as gaining some basic understanding of object-oriented programming.