Python Forum
python-docx regex : Browse the found words in turn from top to bottom
#1
I'm trying to finish a personal project and have run into a new problem: I want to go through the matching words from top to bottom, both inside and outside tables, so that I can replace specific occurrences. Here is my example docx file: [Image: q2mR4.png]. I used the following code to replace the word:
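import re

from docx import Document
from docx.document import Document as _Document
from docx.oxml.table import CT_Tbl
from docx.oxml.text.paragraph import CT_P
from docx.table import Table, _Cell
from docx.text.paragraph import Paragraph
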
def iter_block_items(parent):
    if isinstance(parent, _Document):
        parent_elm = parent.element.body
        # print(parent_elm.xml)
    elif isinstance(parent, _Cell):
        parent_elm = parent._tc
    else:
        raise ValueError("something's not right")

    for child in parent_elm.iterchildren():
        if isinstance(child, CT_P):
            yield Paragraph(child, parent)
        elif isinstance(child, CT_Tbl):
            yield Table(child, parent)
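
def replace_string(key, value, NumberList, countKey, p):
    # Replace, in paragraph p, the matches of key whose running number is in NumberList.
    length = len(key)
    tmp_padding = len(key) - len(value)
    matches = re.findall(key, p.text, re.IGNORECASE)
    lines = p.runs  # work run by run so that formatting is kept
    for j in range(len(lines)):
        padding = 0
        line = lines[j].text
        # Slide a window of len(key) characters over the text of the run.
        for i in range(len(line) - length + 1):
            text = line[i - padding : i + length - padding]
            if text in matches:
                if countKey in NumberList:
                    text = line.replace(text, value)
                    padding -= tmp_padding
                    lines[j].text = text
                countKey += 1  # count every match, replaced or not
    return countKey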

def replace(filename, key, value, numberList, output_file):
    countKey = 1  # number of the next match, counted from 1
    doc = Document(filename)
    for block in iter_block_items(doc):
        if isinstance(block, Paragraph):
            if re.findall(key, block.text, re.IGNORECASE):
                countKey = replace_string(key, value, numberList, countKey, block)
        else:
            for table in doc.tables:
                for row in table.rows:
                    # iter_unique_cells is a helper that is not shown in this post
                    for cell in iter_unique_cells(row):
                        for p in cell.paragraphs:
                            if re.findall(key, p.text, re.IGNORECASE):
                                countKey = replace_string(key, value, numberList, countKey, p)
    doc.save(output_file)


path = 'path of file docx'
replace(path, 'collum', 'table', [1, 3], 'test2.docx')
Here is the result: [Image: Xop7v.png]

Judging by the result, the matches seem to be counted in the tables first. How can I go through all the matching words in the document one by one, in order from top to bottom?
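
One likely cause, going only by the posted code: the `else` branch of `replace` runs whenever a table block is reached, but it then loops over every table in `doc.tables` instead of over `block` itself. When the first table is reached, the words in all later tables are therefore counted too, ahead of the paragraphs that sit between them, and the tables are processed again for each further table block. Below is a minimal sketch of one way to number the matches in reading order. It assumes python-docx 1.0 or later, where documents and cells provide `iter_inner_content()`, and it relies on the private `_tc` attribute to skip merged cells. The names `iter_paragraphs` and `replace_numbered` are made up for this example. Like the original, it does not find a word that is split across two runs.

```python
import re


def iter_paragraphs(container):
    """Yield every paragraph of a document or cell in reading order.

    A table is visited at the point where it occurs, so its paragraphs
    come between the paragraphs that precede and follow it.
    """
    for block in container.iter_inner_content():
        if hasattr(block, "rows"):  # a table; a paragraph has no rows
            seen = set()  # cell elements already visited in this table
            for row in block.rows:
                for cell in row.cells:
                    # A merged cell is reported once per grid position,
                    # so skip it after the first visit (_tc is private API).
                    if cell._tc in seen:
                        continue
                    seen.add(cell._tc)
                    # Recurse so that nested tables are handled as well.
                    yield from iter_paragraphs(cell)
        else:
            yield block


def replace_numbered(paragraphs, key, value, numbers):
    """Replace the matches of key whose 1-based position is in numbers.

    Returns the total number of matches that were counted.
    """
    pattern = re.compile(key, re.IGNORECASE)
    count = 0

    def substitute(match):
        nonlocal count
        count += 1
        # Keep the original text unless this occurrence was requested.
        return value if count in numbers else match.group(0)

    for paragraph in paragraphs:
        for run in paragraph.runs:
            # Assign only when the run contains a match, so other runs
            # are left untouched.
            if pattern.search(run.text):
                run.text = pattern.sub(substitute, run.text)
    return count


def replace(filename, key, value, numbers, output_file):
    # Imported here so that the two helpers above need only the
    # standard library.
    from docx import Document

    doc = Document(filename)
    replace_numbered(iter_paragraphs(doc), key, value, set(numbers))
    doc.save(output_file)
```

With these definitions, the call from the question would stay the same: `replace(path, 'collum', 'table', [1, 3], 'test2.docx')`. Counting in a single pass over one ordered stream of paragraphs keeps the occurrence numbers aligned with the order in which the words appear on the page.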