Python Forum
Replace a text/word in docx file using Python
#1
Hello all,
I have a few MS Word documents and need to replace a particular text in them. The text can appear in paragraphs, inside tables, or in the header or footer; wherever it appears, I need to replace every occurrence. I am new to Python and this is my first code, so please help me.
My code:
from docx import Document

search_text = "WARD"
replace_text = "ORGANIZATION"

with open(r'First.docx', 'r') as file:
    data = file.read()
    data = data.replace(search_text, replace_text)

with open(r'Second.docx', 'w') as file:
    file.write(data)

print("Text replaced")
The above program works with .txt files, but it does not work with .docx files, even after adding 'from docx import Document'. If I use .docx files, I get the following error.
Error:
=========================== RESTART: D:\Work\test.py ===========================
Traceback (most recent call last):
  File "D:\Work\test.py", line 14, in <module>
    with open(r'Content1.docx', 'r') as file:
FileNotFoundError: [Errno 2] No such file or directory: 'Content1.docx'
Thanks,
Dev.
#2
Several problems here, so this will never work.
You import Document from docx, but you never use it.
.docx is a special format, so you cannot use Python's open() to read or save it; you have to use python-docx to open and save it.
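
To illustrate why plain open() cannot work here: a .docx file is a ZIP archive of XML parts, not a text file, so reading it in text mode and writing the result back cannot produce a valid document. Below is a minimal sketch using only the standard library; the function name list_docx_parts is made up for illustration. (The FileNotFoundError in the first post is a separate matter: it means Content1.docx was not found in the directory the script ran from.)

```python
import zipfile


def list_docx_parts(path):
    """Return the names of the parts stored inside a .docx archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Typical use, with any existing Word file:
# print(list_docx_parts('demo.docx'))
```

For a normal Word file the list includes word/document.xml, which holds the body text; headers and footers live in their own parts.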

So here is a working demo: it replaces each word in the replace_word dictionary and saves the result to a new file, note_demo.docx.
# pip install python-docx
from docx import Document

doc = Document('demo.docx')
replace_word = {'WARD': 'ORGANIZATION', 'ebook': 'new_book'}
for word in replace_word:
    for p in doc.paragraphs:
        if p.text.find(word) >= 0:
            p.text = p.text.replace(word, replace_word[word])

doc.save('note_demo.docx')
There is also python-docx-replace, made for just this case.
#3
Hi,
Thanks for the help. Now I can search for different words and replace them, but the code works only for paragraphs. I also want to search for text in tables and replace it, and to do the same in headers and footers. The headers and footers may contain tables too. Please help.
Dev.
#4
What have you tried? From snippsat's example you know that the document is organized into parts. You know how to do a search and replace for the paragraphs; what other parts are there in the document?

From here it looks like there may be different headers and footers for each section.

https://python-docx.readthedocs.io/en/la...drftr.html

You'll probably have to loop through all the sections, replacing text in the header and footer for each.

Tables are constructed of cells, and cells can contain paragraphs.

https://python-docx.readthedocs.io/en/la...e.html#id1

Somebody else was doing the same thing.

https://stackoverflow.com/questions/2480...t-and-save
#5
Hi,

Thanks for your reply. Now I can replace text in paragraphs and within tables, but I am not able to replace text in the header and footer. This text may be in a table inside the header or footer. Can you help?
Regards,
Dev

from docx import Document
 
doc = Document('Input.docx')

# list of all words to be replaced, with its new word
replace_word = {'TABL':'SUD','HOUSE':'HOME','MYNAME':'DEV', 'KING':'OWNER','WARD':'AREA','TRY2':'HEA'}

# to replace words within paragraphs
for word in replace_word:
    for p in doc.paragraphs:
        if p.text.find(word) >= 0:
            p.text = p.text.replace(word, replace_word[word])

# to replace words within tables            
for word in replace_word:
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                for p in cell.paragraphs:
                     if p.text.find(word) >= 0:
                         p.text = p.text.replace(word, replace_word[word])
    
# to replace words within headers
for word in replace_word:
    for section in doc.sections:
      header = section.header   
      for table in doc.tables:
            for row in table.rows:
                for cell in row.cells:
                    for p in cell.paragraphs:
                        if p.text.find(word) >= 0:
                            p.text = p.text.replace(word, replace_word[word])

# to replace words within footers
for word in replace_word:
    for footer in doc.sections:
      footer = section.footer   
      for table in doc.tables:
            for row in table.rows:
                for cell in row.cells:
                    for p in cell.paragraphs:
                        if p.text.find(word) >= 0:
                            p.text = p.text.replace(word, replace_word[word])
doc.save('Output.docx')
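
A likely cause, offered as a hedged sketch rather than a tested fix for this exact document: the header loop above assigns header = section.header but then iterates doc.tables, which are the body tables, so the header is never touched. The footer loop has the same problem and also reads section.footer while its loop variable is named footer. The sketch below walks each container's own paragraphs and tables instead. The helper names (replace_in_paragraphs, replace_in_container, replace_in_docx) are made up for illustration; Document, .sections, .header, .footer, .paragraphs, .tables, .rows and .cells are the python-docx attributes already used in this thread.

```python
# pip install python-docx


def replace_in_paragraphs(paragraphs, replacements):
    """Apply every replacement to each paragraph; return how many paragraphs changed."""
    changed = 0
    for p in paragraphs:
        new_text = p.text
        for old, new in replacements.items():
            new_text = new_text.replace(old, new)
        if new_text != p.text:
            # Assigning p.text rewrites the paragraph as a single run,
            # so character formatting inside the paragraph may be lost.
            p.text = new_text
            changed += 1
    return changed


def replace_in_container(container, replacements):
    """Handle a body, header, footer or table cell: its paragraphs, then its tables."""
    changed = replace_in_paragraphs(container.paragraphs, replacements)
    for table in container.tables:
        for row in table.rows:
            for cell in row.cells:
                # A cell has its own paragraphs and may hold nested tables.
                changed += replace_in_container(cell, replacements)
    return changed


def replace_in_docx(src, dst, replacements):
    """Replace text in the body and in every section's header and footer."""
    # Imported here so the two helpers above can be used without python-docx.
    from docx import Document

    doc = Document(src)
    changed = replace_in_container(doc, replacements)
    for section in doc.sections:
        changed += replace_in_container(section.header, replacements)
        changed += replace_in_container(section.footer, replacements)
    doc.save(dst)
    return changed
```

With the dictionary above, the call would be replace_in_docx('Input.docx', 'Output.docx', replace_word). Replacements are applied in dictionary order, so a replacement value that happens to contain a later key would be replaced again. Separate first-page or even-page headers and footers are not covered by this sketch.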