Python Forum
<b> followed by <b> before closing</b>
#1
I am trying to edit some HTML text in Python. I have an HTML file in which a <b> tag (bold) is sometimes followed by another <b> before the first one has been closed with </b>, e.g.:

<RF><b>1:4 <b>Hom wat is ... kom:</b>
The second <b> should not be there.

Is it possible to write a regex pattern to find such occurrences and to delete the spurious <b>?
#2
Something like this:

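import re

s = "<RF><b>1:4 <b>Hom wat is ... kom:</b>"


def make_repl():
    # Build a replacement function with a fresh counter for each re.sub call.
    # A mutable default argument (count=[0]) would keep its value between
    # calls, so a second re.sub would delete every <b>.
    seen = 0

    def repl(match):
        nonlocal seen
        seen += 1
        # Keep the first <b>, drop every later one.
        return '<b>' if seen == 1 else ''

    return repl


print(re.sub('<b>', make_repl(), s))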
Output:
<RF><b>1:4 Hom wat is ... kom:</b>
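Note that the snippet above keeps only the very first <b> in the string and drops all later ones, so it suits a single line like the example but not a file with several bold runs. A minimal sketch of a variant that also looks at </b> and only drops a <b> while an earlier one is still open could look like this. The function name drop_nested_b and the second half of the sample string are made up for illustration, and it assumes lowercase tags without attributes:

```python
import re

TAG = re.compile(r'</?b>')  # matches both the opening and the closing tag


def drop_nested_b(text):
    """Remove every <b> that occurs while an earlier <b> is still open."""
    is_open = False  # True between a kept <b> and the next </b>

    def repl(match):
        nonlocal is_open
        if match.group() == '</b>':
            is_open = False
            return '</b>'
        if is_open:
            return ''  # spurious <b> inside an already open bold run
        is_open = True
        return '<b>'

    return TAG.sub(repl, text)


# Hypothetical sample: the original line plus a second, invented bold run.
s = "<RF><b>1:4 <b>Hom wat is ... kom:</b> text <b>1:5 <b>more</b>"
print(drop_nested_b(s))
# <RF><b>1:4 Hom wat is ... kom:</b> text <b>1:5 more</b>
```

For anything messier than this (attributes, uppercase tags, other nested markup), an HTML parser is probably a safer choice than a regex.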