Python Forum
python-docx regex: replace any word in docx text
#4
import re

from docx import Document


def replace(input_file, key, value, Numberlist, output_file):
    # input_file: path of the .docx to read; key: regex pattern to search for
    # value: replacement text; Numberlist: 1-based ordinals of the matches to change
    # output_file: path where the modified document is saved
    doc = Document(input_file)
    numberset = set(Numberlist)
    for p in doc.paragraphs:
        inline = p.runs
        match = re.finditer(key, p.text, re.IGNORECASE)  # find key, ignoring case
        # enumerate numbers the matches 1, 2, 3, ... within this paragraph
        for n, igkey in enumerate(match, 1):
            if n not in numberset:
                continue
            L_key = igkey.group()
            # Note: this replaces every occurrence of L_key in each run that contains it
            for run in inline:
                if L_key in run.text:
                    run.text = run.text.replace(L_key, value)
    doc.save(output_file)
The effect of enumerate(match, 1) is to iterate over pairs (1, matchobj), (2, matchobj), (3, matchobj), ... instead of bare match objects. Use the index n to skip the occurrences whose ordinal is not listed in Numberlist.
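To see the numbering on its own, here is a minimal sketch that works on a plain string rather than a docx file. The helper name replace_selected is hypothetical (it is not part of the code above or of python-docx), and it rebuilds the string from the match positions so that only the selected occurrences change.

```python
import re


def replace_selected(text, key, value, numbers):
    """Replace only the matches of `key` whose 1-based ordinal is in `numbers`.

    Hypothetical helper for illustration; it operates on a plain string.
    """
    wanted = set(numbers)
    pieces = []
    last = 0  # end position of the previous replaced match
    for n, m in enumerate(re.finditer(key, text, re.IGNORECASE), 1):
        if n not in wanted:
            continue  # this occurrence is not selected, leave it untouched
        pieces.append(text[last:m.start()])  # text before the match
        pieces.append(value)                 # the replacement itself
        last = m.end()
    pieces.append(text[last:])  # remainder after the last replaced match
    return "".join(pieces)


print(replace_selected("cat Cat CAT cat", "cat", "dog", [2, 4]))
# prints: cat dog CAT dog
```

Working from m.start() and m.end() avoids str.replace, which would change every copy of the matched text rather than just the numbered one.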

Edit: I realize that the numbering restarts in every paragraph, so the same ordinals are applied to each paragraph. This may not be what you want...
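If the ordinals should instead count across the whole document, one possible approach is to share a single counter between paragraphs. The sketch below is an assumption about what is wanted, not a tested docx solution: replace_across is a hypothetical name, and it takes a list of plain strings standing in for paragraph texts.

```python
import re
from itertools import count


def replace_across(paragraphs, key, value, numbers):
    """Replace the matches whose ordinal, counted over ALL paragraphs, is in `numbers`.

    Hypothetical helper; `paragraphs` is a list of plain strings.
    """
    wanted = set(numbers)
    counter = count(1)  # one counter shared by every paragraph

    def substitute(m):
        n = next(counter)  # ordinal of this match in the whole document
        return value if n in wanted else m.group()

    return [re.sub(key, substitute, text, flags=re.IGNORECASE) for text in paragraphs]


print(replace_across(["cat cat", "Cat cat"], "cat", "dog", [1, 3]))
# prints: ['dog cat', 'dog cat']
```

With python-docx the same callback could presumably be applied to each run's text instead of a plain string, with the caveat that a match split across two runs would not be found that way.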


Messages In This Thread
RE: python-docx regex: replace any word in docx text - by Gribouillis - Jun-18-2022, 09:03 AM
