Python Forum
Regex: Remove all match plus one char before all - Printable Version


Regex: Remove all match plus one char before all - Alfalfa - Feb-21-2017

Hello,
I made this script to find all "|BS|" blocks in a string and simulate a backspace for each one, as if the string were being typed into a single-line text input field. I came up with this:

#!/usr/bin/python3
import re

if __name__ == '__main__':
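    string = "it |BS||BS||BS|this is one|BS||BS||BS|an example"  # |BS| as in Backspace
    # Raw strings keep "\|" from being treated as an invalid string escape.
    while re.search(r"\|BS\|", string):
        array = list(string)
        # Handle only the first match, then rescan the updated string.
        for m in re.finditer(r"\|BS\|", string):
            del array[m.start():m.end()]
            if m.start() > 0:
                del array[m.start() - 1]  # the character "typed" just before the marker
            string = ''.join(array)
            break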
    print(string)
which outputs "this is an example".

It works, but it feels very inefficient and I would like to improve it, either by avoiding the need for an array or by using a single-pass regex. I looked at the documentation for re but could not find a satisfying answer. Hopefully you can give me some tips.

Thanks!


RE: Regex: Remove all match plus one char before all - buran - Feb-21-2017

So actually you want to remove all |BS| as well as one or more optional spaces and one word before that?

#!/usr/bin/python3
import re

string = "it |BS||BS||BS|this is one|BS||BS||BS|an example" #|BS| as in Backspace
ptrn = re.compile(r'\w* ?\|BS\| ?')  # (r'(\w* ?)(?=\|BS\|)(?<=[\w* ?])(\|BS\|)*') - this was the initial pattern
print(ptrn.sub('', string))
The pattern may not be the best; someone else may suggest something better.


RE: Regex: Remove all match plus one char before all - Alfalfa - Feb-21-2017

Thanks for your reply! I want to remove every |BS|, as well as the single (optional) character preceding it.
Like so:

Quote:
"it |BS||BS||BS||BS|this is one|BS||BS||BS|an example"
"it|BS||BS||BS|this is one|BS||BS||BS|an example"
"i|BS||BS|this is one|BS||BS||BS|an example"
"|BS|this is one|BS||BS||BS|an example"
"this is one|BS||BS||BS|an example"
"this is on|BS||BS|an example"
"this is o|BS|an example"
"this is an example"



RE: Regex: Remove all match plus one char before all - buran - Feb-21-2017

I don't get it. Your code takes "it |BS||BS||BS|this is one|BS||BS||BS|an example" and returns "this is an example". You said it works (i.e. that's the desired result) but is inefficient. I gave you more efficient code using a single-line re.sub. Now you show something totally different as the desired output. Show us your code and ask a specific question.

To make it clearer:

#!/usr/bin/python3
import re

string = "it |BS||BS||BS|this is one|BS||BS||BS|an example" #|BS| as in Backspace
ptrn = re.compile(r'\w* ?\|BS\| ?')
print(ptrn.sub('', string))
print('\n---- all matches follow----\n')
for match in ptrn.finditer(string):
    print(match.group())
and the output:

Output:
this is an example

---- all matches follow----

it |BS|
|BS|
|BS|
one|BS|
|BS|
|BS|



RE: Regex: Remove all match plus one char before all - Alfalfa - Feb-21-2017

Sorry for the misunderstanding, your solution works well. It is 5 times faster than the code in the original post. The quote in my post was only meant to explain what I want to do, in case someone could think of something better, as you suggested in your post. Thank you again for your help; the code you provided is what I was looking for.

Oh, I just noticed that your pattern deletes the whole word instead of a single character. Like I said, I need to simulate a backspace event in a text field, so one |BS| should only remove a single (optional) character.
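
To illustrate the difference, here is a small sketch that runs the pattern from the earlier reply on a made-up string with a single marker. The sample string and the variable names are assumptions chosen for illustration, not taken from the posts.

```python
import re

# Pattern from the earlier reply: \w* grabs the whole word before the marker.
word_pattern = re.compile(r'\w* ?\|BS\| ?')

# Hypothetical sample containing a single backspace marker.
sample = "this is one|BS|an example"

result = word_pattern.sub('', sample)
print(repr(result))  # 'this is an example': the whole word "one" is gone

# A real backspace would only have removed the final "e" of "one".
expected = "this is onan example"
print(result == expected)  # False
```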


RE: Regex: Remove all match plus one char before all - buran - Feb-21-2017

(Feb-21-2017, 07:24 PM)Alfalfa Wrote: Oh, I just noticed that your pattern deletes the whole word instead of a single character. Like I said, I need to simulate a backspace event in a text field, so one |BS| should only remove a single (optional) character.

Well, you don't delete a single character: you delete "it " (that's 3 characters) and also "one" (that's 3 characters)...


RE: Regex: Remove all match plus one char before all - Alfalfa - Feb-21-2017

string = "it |BS||BS||BS|this is one|BS||BS||BS|an example"
Sorry, I should have been more explicit. There are 3 |BS| there, and so 3 characters to remove.


RE: Regex: Remove all match plus one char before all - buran - Feb-21-2017

I think there is some misunderstanding. Show us your input string. Does it really have |BS| strings in it?

Do you actually want to remove as many characters (including spaces) before the |BS| sequence as there are |BS| markers in it? It would be best to show the input string, your code, and the result, and to ask a specific question.


RE: Regex: Remove all match plus one char before all - Alfalfa - Feb-21-2017

The input string really does contain |BS|, but its content can be anything, so there is no typical string I can provide. Following the example I showed earlier, it should behave as follows:

Quote:
input: "it |BS|this is an example"
outpt: "itthis is an example"

input: "it |BS||BS|this is an example"
outpt: "ithis is an example"

input: "it |BS||BS||BS|this is an example"
outpt: "this is an example"

input: "it |BS||BS||BS||BS|this is an example"
outpt: "this is an example"
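
The behaviour in these examples can be traced one backspace at a time. The following is a minimal sketch, not code from the thread: the helper name backspace_steps and its token parameter are hypothetical, and it uses plain str.find and slicing rather than a regex so that every intermediate string can be inspected.

```python
def backspace_steps(text, token="|BS|"):
    """Yield the string before and after each backspace marker is applied."""
    yield text
    pos = text.find(token)
    while pos != -1:
        # Drop the marker and, if there is one, the single character before it.
        text = text[:max(pos - 1, 0)] + text[pos + len(token):]
        yield text
        pos = text.find(token)


# Four markers but only three characters in front of them:
# the last marker has nothing left to delete.
for step in backspace_steps("it |BS||BS||BS||BS|this is an example"):
    print(repr(step))
```

Rescanning from the start after every deletion keeps the sketch short, but it is only meant for checking the expected behaviour, not as an efficient solution.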



RE: Regex: Remove all match plus one char before all - buran - Feb-21-2017

So the requested result is actually: "For each |BS| group, remove as many characters before that group as there are |BS| markers in it."

#!/usr/bin/python3
import re

strings = ['it |BS||BS||BS|this is one|BS||BS||BS|an example',
           'it |BS|this is an example',
           'it |BS||BS|this is an example',
           'it |BS||BS||BS|this is an example',
           'it |BS||BS||BS||BS|this is an example']
ptrn = re.compile(r'[\w ]?\|BS\|')
for string in strings:
    print(string)
    # Apply one backspace per pass until no |BS| marker is left.
    while True:
        after_sub = ptrn.sub('', string, count=1)
        if string == after_sub:
            break
        string = after_sub
    print(string)
    print('\n')
Output:
it |BS||BS||BS|this is one|BS||BS||BS|an example
this is an example


it |BS|this is an example
itthis is an example


it |BS||BS|this is an example
ithis is an example


it |BS||BS||BS|this is an example
this is an example


it |BS||BS||BS||BS|this is an example
this is an example


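
For comparison, here is a sketch of a single-pass alternative that is not from the thread. It splits the string on the markers once and replays the pieces into a list used as a stack. The names BS_TOKEN and apply_backspaces are hypothetical. Note two differences from the loop above: it still uses a list (which the original post hoped to avoid), and it deletes any preceding character, whereas [\w ]? only deletes word characters and spaces.

```python
import re

# Capturing group, so re.split() keeps the markers in its result.
BS_TOKEN = re.compile(r'(\|BS\|)')


def apply_backspaces(text):
    """Simulate typing `text`, treating every |BS| marker as a backspace key."""
    typed = []  # characters currently in the simulated input field
    # With one capturing group, re.split() alternates: text, marker, text, ...
    for index, part in enumerate(BS_TOKEN.split(text)):
        if index % 2:          # odd positions hold the |BS| markers
            if typed:          # backspace on an empty field does nothing
                typed.pop()
        else:
            typed.extend(part)
    return ''.join(typed)


for sample in ['it |BS|this is an example',
               'it |BS||BS||BS||BS|this is an example',
               'it |BS||BS||BS|this is one|BS||BS||BS|an example']:
    print(apply_backspaces(sample))
```

Each character is appended and popped at most once, so the work grows linearly with the input length instead of rescanning the string after every substitution.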