Python Forum

regular expression for a transformation
Hi

I cannot work out why the following works:

import re
text = "Follow up on the new sales *order*. Do not consider the cancelled *order*"
pattern = re.compile(r'\*(.*?)\*')
print("Before substitution: ", text)
text = pattern.sub(r"<b>\g<1><\\b>", text)
print("After substitution: ", text)
but the following generates a "bad escape \g" error:

import re
text = "Follow up on the new sales *order*. Do not consider the cancelled *order*"
pattern = re.compile(r'\*(.*?)\*')
print("Before substitution: ", text)
transform_pattern = re.compile(r"<b>\g<1><\\b>")
text = pattern.sub(transform_pattern, text)
print("After substitution: ", text)
Any suggestions?
Thanks
Well, the reason is that \g is not a valid escape sequence in a regex pattern...

Seriously, in the line:
text = pattern.sub(r"<b>\g<1><\\b>", text)
Here the r-string is not a regex. It is the replacement template, written as a raw string to avoid backslash hell; without the r prefix you would need to write "<b>\\g<1><\\\\b>".
This is more evident if you do not split the compile and the sub phases:
#      re.sub(pattern, replacement, text)
text = re.sub(r'\*(.*?)\*', r"<b>\g<1><\\b>", text)
In your second example you are trying to interpret '<b>\g<1><\\b>' as a regular expression:
transform_pattern = re.compile(r"<b>\g<1><\\b>")
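In that context \g is not valid: it only has a meaning inside a replacement template, so re.compile raises the "bad escape" error. The second argument of sub() should stay a plain string (or a function), not a compiled pattern.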
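A minimal sketch of the fix, reusing the names from the question: keep the replacement as a plain raw string instead of compiling it. The variable compile_error is only there for illustration, and the "bad escape" behaviour assumes a recent Python 3 (older versions were more lenient about unknown escapes in patterns).

```python
import re

text = "Follow up on the new sales *order*. Do not consider the cancelled *order*"
pattern = re.compile(r'\*(.*?)\*')

# The replacement is a template string, not a regex: do not compile it.
# \g<1> inserts group 1; \\ in the template produces one literal backslash.
transform_pattern = r"<b>\g<1><\\b>"

# The raw string is the same object value as the fully escaped version.
assert transform_pattern == "<b>\\g<1><\\\\b>"

result = pattern.sub(transform_pattern, text)
print("After substitution: ", result)

# Compiling the same template as a pattern fails, because \g is not
# a valid escape inside a regular expression.
try:
    re.compile(transform_pattern)
    compile_error = None
except re.error as exc:
    compile_error = str(exc)
print("Compile error: ", compile_error)
```

Note that the template as posted emits a literal backslash, so the closing tag comes out as <\b>. If an HTML closing tag was intended, the template would presumably be r"<b>\g<1></b>" instead.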