Python Forum
<b> followed by <b> before closing</b> - Printable Version


<b> followed by <b> before closing</b> - WJSwan - Mar-20-2023

I am trying to edit some HTML text in Python. I have an HTML file where there is sometimes a <b> tag (bold), and before it is closed with a </b> there is another <b>, e.g.:

<RF><b>1:4 <b>Hom wat is ... kom:</b>
The second <b> should not be there.

Is it possible to write a regex pattern to find such occurrences and to delete the spurious <b>?


RE: <b> followed by <b> before closing</b> - Axel_Erfurt - Mar-20-2023

Something like this:

import re

s = "<RF><b>1:4 <b>Hom wat is ... kom:</b>"

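# count is a mutable default argument: it is created once and keeps
# its value between calls, so it works as a running counter of matches.
def repl(match, count=[0]):
    x, = count          # number of <b> tags seen before this one
    count[0] += 1
    if x > 0:
        return ''       # every <b> after the first is removed
    return '<b>'        # the first <b> is kept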


print(re.sub('<b>', repl, s))
Output:
<RF><b>1:4 Hom wat is ... kom:</b>
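
Note that the counter above is never reset, so in a file with several bold passages every <b> after the very first one would be removed, including the legitimate ones. A minimal sketch of one way around that, assuming the tags are plain <b> and </b> with no attributes: track whether a <b> is currently open and drop only a <b> that appears before the matching </b>. The function name drop_nested_b and the second bold passage in the sample string are made up for illustration.

```python
import re

# Matches an opening or closing bold tag (assumes no attributes on <b>).
TAG = re.compile(r"</?b>", re.IGNORECASE)


def drop_nested_b(html: str) -> str:
    """Remove any <b> that appears while an earlier <b> is still open."""
    is_open = False

    def repl(match: re.Match) -> str:
        nonlocal is_open
        tag = match.group(0)
        if tag.lower() == "</b>":
            is_open = False      # bold run ends here
            return tag
        if is_open:
            return ""            # spurious <b> inside an open bold run
        is_open = True           # first <b> of a new bold run
        return tag

    return TAG.sub(repl, html)


s = "<RF><b>1:4 <b>Hom wat is ... kom:</b> text <b>2:1 next</b>"
print(drop_nested_b(s))
```

With this sample the first passage loses its extra <b>, while the second passage keeps its opening tag. For anything more irregular than this (attributes, other unclosed tags), an HTML parser is probably a safer choice than a regex.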