Python Forum

Full Version: Put the new line after regex pattern
Hi,
I have this code:

Quote:text = """MyCo Please have a look at this building’s premium. It looks to be a very high rate. <img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in"> The client has a few policies with MyCo as supporting business. 000000123 BonaNou Family T. Please revert back asap. Thanks Sta Sel: 000 00000 pos: [email protected] Lid van:Quanta Primary Cooperative Ltd.



SANTAM Please have a look at this building’s premium. It looks to be a very high rate. <img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in"> The client has a few policies with MyCo as supporting business. 000000123 BonaNou Family T. Please revert back asap. Thanks Sta Sel: 000 00000 pos: [email protected] Lid van:Quanta Primary Cooperative Ltd.





<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in"> Dear Mr Smith King thank you and your team for all the assistance throughout the years. Unfortunately, I have decided to depart from Origin and MyCo for personal reasons. I've attached the cancellation letter for my policy to be implemented immediately.Kind Regards """

pattern = re.compile(r'<(img.+.+)>', re.DOTALL)
My code matches all these patterns
Quote:<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in">

But I want to update my code so that it inserts a newline after that pattern. Please help. I want to have:
Quote:<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in">
Dear Mr Smith King thank you and your team for all the assistance throughout the years. Unfortunately, I have decided to depart from Origin and MyCo for personal reasons. I've attached the cancellation letter for my policy to be implemented immediately.Kind Regards
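A note on the pattern above: with re.DOTALL, the greedy `.+.+` will most likely run from the first `<img` to the last `>` in the whole text, so it produces one long match rather than one match per tag. A minimal sketch of the difference, using a shortened, hypothetical sample instead of the full text (the non-greedy `<img.*?>` is a suggested alternative, not the pattern from the post):

```python
import re

# Hypothetical, shortened sample: two <img> tags with text between them.
sample = 'A <img id="one" alt="first\n\npicture"> B <img id="two"> C'

greedy = re.compile(r'<(img.+.+)>', re.DOTALL)  # pattern from the post
lazy = re.compile(r'<img.*?>', re.DOTALL)       # non-greedy alternative (assumption)

greedy_matches = [m.group(0) for m in greedy.finditer(sample)]
lazy_matches = [m.group(0) for m in lazy.finditer(sample)]

print(len(greedy_matches))  # greedy: a single match spanning both tags
print(len(lazy_matches))    # non-greedy: one match per tag
```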
Please do not create new threads on the same or a closely related subject.
Instead, continue posting on the original thread.
Read carefully: https://python-forum.io/thread-40298-pos...#pid170817
(Jul-06-2023, 08:35 PM)Larz60+ Wrote: [ -> ]Please do not create new threads on same or very related subject.
Instead, continue posts on original thread.
read carefully https://python-forum.io/thread-40298-pos...#pid170817

Hi,

They are not related. In this thread I want to put a new line after my regex match.
Maybe try re.sub()? That is for replacing bits of a string with something else.

import re

mystring = """
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra Description automatically generated" style="width:9.1in; height:1.6833in">
Blablablabla
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra">
Description automatically generated" style="width:9.1in; height:1.6833in"
Blablablabla
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]">
Description automatically generated" style="width:9.1in; height:1.6833in
Blablablabla
"""

# Original pattern from the question, kept for reference:
# pattern = re.compile(r'<(img.+.+)>', re.DOTALL)

# Match only the closing '">' of a tag instead of the whole <img ...> tag.
mypat = re.compile(r'">')
# Put the same characters back, followed by a newline.
repl = '">\n'
# Signature: re.sub(pattern, repl, string, count=0, flags=0)
result = re.sub(mypat, repl, mystring)
print(result)
The above can also be done in your text editor with Ctrl+H (find and replace).
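Replacing every `">` also fires on any other tag that ends that way. If only the img tags should get a newline, one option is to match the whole tag and put it back with a backreference. This is just a sketch on a made-up sample string; the non-greedy pattern and the trimming of spaces after the tag are my assumptions, not something from the original code:

```python
import re

# Hypothetical sample; the tag contents are placeholders.
sample = 'Intro <img id="a" alt="x\n\ny" style="w:1in"> Dear Mr Smith <img id="b">Bye'

# Group 1 is the whole <img ...> tag (DOTALL lets it span line breaks).
# The trailing [ \t]* swallows spaces/tabs that directly follow the tag.
pattern = re.compile(r'(<img.*?>)[ \t]*', re.DOTALL)

# \1 re-inserts the tag unchanged; \n in the template becomes a newline.
result = pattern.sub(r'\1\n', sample)
print(result)
```

The text before each tag is left alone, so this only adds the newline after the match.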
(Jul-08-2023, 10:44 PM)Pedroski55 Wrote: [ -> ]Maybe try re.sub()? That is for replacing bits of a string with something else.


Hi, thanks for this,

I ran your code as is and it worked fine, but after I changed the input it didn't work as expected.

I put Blablablabla on the same line as the pattern we are looking for:

Quote:Blablablabla <img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra">
Description automatically generated" style="width:9.1in; height:1.6833in"

Here is my desired output: "Blablablabla" should be separated from the pattern, and the pattern should go on a new line:
Quote:Blablablabla
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra">
Description automatically generated" style="width:9.1in; height:1.6833in"
You asked for a newline after the pattern, not before. Should be an easy fix.
import re

text = """<img id="Picture_1">Blablablabla <img id="Picture_2">
Blablablabla<img id="Picture_1">Blablablabla
"""

matches = set(re.findall(r"<img.*?>", text, re.DOTALL))
for match in matches:
    text = text.replace(match, f"\n{match}\n")
print(text)
Of course, you get extra blank lines if a line starts or ends with the pattern.
Output:

<img id="Picture_1">
Blablablabla
<img id="Picture_2">

Blablablabla
<img id="Picture_1">
Blablablabla
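If the extra blank lines are unwanted, one possible approach is to split the text on the tags instead of replacing them, strip each piece, and join the non-empty pieces with newlines. A rough sketch using the same sample text; note that it also trims whitespace around the text between tags, which may or may not be what you want:

```python
import re

text = """<img id="Picture_1">Blablablabla <img id="Picture_2">
Blablablabla<img id="Picture_1">Blablablabla
"""

# The capture group makes re.split keep the tags in the resulting list.
parts = re.split(r'(<img.*?>)', text, flags=re.DOTALL)

# Strip surrounding whitespace and drop pieces that are empty afterwards.
lines = [part.strip() for part in parts if part.strip()]

# Every tag and every text chunk ends up on its own line, with no blank lines.
result = "\n".join(lines)
print(result)
```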