Python Forum
Put the new line after regex pattern - Printable Version


Put the new line after regex pattern - stahorse - Jul-06-2023

Hi,
I have this code:

Quote:text = """MyCo Please have a look at this building’s premium. It looks to be a very high rate. <img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in"> The client has a few policies with MyCo as supporting business. 000000123 BonaNou Family T. Please revert back asap. Thanks Sta Sel: 000 00000 pos: [email protected] Lid van:Quanta Primary Cooperative Ltd.



SANTAM Please have a look at this building’s premium. It looks to be a very high rate. <img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in"> The client has a few policies with MyCo as supporting business. 000000123 BonaNou Family T. Please revert back asap. Thanks Sta Sel: 000 00000 pos: [email protected] Lid van:Quanta Primary Cooperative Ltd.





<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in"> Dear Mr Smith King thank you and your team for all the assistance throughout the years. Unfortunately, I have decided to depart from Origin and MyCo for personal reasons. I've attached the cancellation letter for my policy to be implemented immediately.Kind Regards """

pattern = re.compile(r'<(img.+.+)>', re.DOTALL)
My pattern matches blocks like this one:
Quote:<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in">

However, I want to update my code so that it inserts a newline after each match. Please help. The output I want is:
Quote:<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in">
Dear Mr Smith King thank you and your team for all the assistance throughout the years. Unfortunately, I have decided to depart from Origin and MyCo for personal reasons. I've attached the cancellation letter for my policy to be implemented immediately.Kind Regards



RE: Put the new line after regex pattern - Larz60+ - Jul-06-2023

Please do not create new threads on the same or a closely related subject.
Instead, continue posting in the original thread.
Please read https://python-forum.io/thread-40298-post-170817.html#pid170817 carefully.


RE: Put the new line after regex pattern - stahorse - Jul-07-2023

(Jul-06-2023, 08:35 PM)Larz60+ Wrote: Please do not create new threads on the same or a closely related subject.
Instead, continue posting in the original thread.
Please read https://python-forum.io/thread-40298-post-170817.html#pid170817 carefully.

Hi,

They are not related. In this thread I want to put a newline after each regex match.


RE: Put the new line after regex pattern - Pedroski55 - Jul-08-2023

Maybe try re.sub()? That is for replacing bits of a string with something else.

import re

mystring = """
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra Description automatically generated" style="width:9.1in; height:1.6833in">
Blablablabla
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra">
Description automatically generated" style="width:9.1in; height:1.6833in"
Blablablabla
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]">
Description automatically generated" style="width:9.1in; height:1.6833in
Blablablabla
"""

# The pattern from the question, kept for reference:
# pattern = re.compile(r'<(img.+.+)>', re.DOTALL)
# Match the '">' that closes each tag and write it back followed by a newline.
mypat = re.compile(r'">')
repl = '">\n'
# Signature: re.sub(pattern, repl, string, count=0, flags=0)
result = re.sub(mypat, repl, mystring)
# Printing the whole string gives the same output as printing it line by line.
print(result)
The same replacement can be done with find-and-replace in your text editor (Ctrl+H).
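
For the original question (a newline after the whole tag rather than after every `">`), here is a minimal sketch, not a definitive answer. It assumes no attribute value contains a `>` character, and the sample text is made up for illustration. Note that the pattern in the question, `<(img.+.+)>` with re.DOTALL, is greedy, so it would run from the first `<img` to the last `>` in the text. The non-greedy `.*?` below stops at the first `>` instead.

```python
import re

# Hypothetical sample text; the alt attribute spans two lines, as in the question.
text = 'Intro <img id="Picture_1" alt="first line\nsecond line"> Dear Mr Smith'

# Group 1 is the whole tag: non-greedy .*? stops at the first '>' after '<img',
# and DOTALL lets '.' cross the line break inside the alt attribute.
# [ \t]* swallows spaces or tabs that directly follow the tag.
img_tag = re.compile(r"(<img.*?>)[ \t]*", re.DOTALL)

# \1 writes the tag back unchanged, then \n adds the newline.
result = img_tag.sub(r"\1\n", text)
print(result)
```

The text that followed the tag now starts on its own line, and the line break inside the tag itself is left untouched.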


RE: Put the new line after regex pattern - stahorse - Jul-10-2023

(Jul-08-2023, 10:44 PM)Pedroski55 Wrote: Maybe try re.sub()? That is for replacing bits of a string with something else.


Hi, thanks for this,

I ran your code as is and it worked fine, but when I changed the input it didn't work as expected.

I put Blablablabla on the same line as the pattern we are looking for:

Quote:Blablablabla <img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra">
Description automatically generated" style="width:9.1in; height:1.6833in"

Here's my desired output: "Blablablabla" should be separated from the pattern, and the pattern should start on a new line:
Quote:Blablablabla
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra">
Description automatically generated" style="width:9.1in; height:1.6833in"



RE: Put the new line after regex pattern - deanhystad - Jul-10-2023

You asked for a newline after the pattern, not before. Should be an easy fix.
import re

text = """<img id="Picture_1">Blablablabla <img id="Picture_2">
Blablablabla<img id="Picture_1">Blablablabla
"""

matches = set(re.findall(r"<img.*?>", text, re.DOTALL))
for match in matches:
    text = text.replace(match, f"\n{match}\n")
print(text)
Of course you get extra blank lines if a line starts with or ends with the pattern.
Output:

<img id="Picture_1">
Blablablabla
<img id="Picture_2">

Blablablabla
<img id="Picture_1">
Blablablabla
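
If the extra blank lines are unwanted, one possible variation is to let the regex swallow the whitespace around each tag and then collapse repeated newlines. This is only a sketch under two assumptions: blank lines elsewhere in the text do not need to be preserved, and no attribute value contains a `>`. The names `separated` and `cleaned` are just illustrative.

```python
import re

text = """<img id="Picture_1">Blablablabla <img id="Picture_2">
Blablablabla<img id="Picture_1">Blablablabla
"""

# Put every tag on its own line; \s* on both sides absorbs spaces and
# newlines that already surround the tag, so they are not doubled.
separated = re.sub(r"\s*(<img.*?>)\s*", r"\n\1\n", text, flags=re.DOTALL)

# Two adjacent tags would still leave an empty line between them, so
# collapse runs of newlines, then trim the leading and trailing ones.
cleaned = re.sub(r"\n{2,}", "\n", separated).strip()
print(cleaned)
```

Unlike the set-and-replace loop, this makes a single pass over the text, so a tag that occurs several times is handled each time it is matched.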