Python Forum
Regex: Remove all match plus one char before all
#1
Hello,
I made this script to find all "|BS|" blocks in a string and simulate a backspace for each one, as if the string had been typed into a single-line text input field. I came up with this:

#!/usr/bin/python3
import re

if __name__ == '__main__':
    string = "it |BS||BS||BS|this is one|BS||BS||BS|an example" #|BS| as in Backspace
    while re.search(r"\|BS\|", string):
        array = list(string)
        for m in re.finditer(r"\|BS\|", string):
            del array[m.start():m.end()]
            if m.start()-1 >= 0:
                del array[m.start()-1]
            string = ''.join(array)
            break
    print(string)
which outputs "this is an example".

It works, but it feels very inefficient and I would like to improve it, either by avoiding the need for an array or by using a single-pass regex. I looked at the documentation for re but could not find a satisfying answer. Hopefully you can give me some tips.

Thanks!
Reply
#2
So actually you want to remove all |BS| as well as one or more optional spaces and one word before that?

#!/usr/bin/python3
import re

string = "it |BS||BS||BS|this is one|BS||BS||BS|an example" #|BS| as in Backspace
ptrn = re.compile(r'\w* ?\|BS\| ?') # (r'(\w* ?)(?=\|BS\|)(?<=[\w* ?])(\|BS\|)*') - this was the initial pattern
print(re.sub(ptrn, '', string))
The pattern may not be the best; someone else may suggest something better.
Reply
#3
Thanks for your reply! I want to remove every |BS|, along with the single character preceding it, if there is one.
Like so:

Quote:
"it |BS||BS||BS||BS|this is one|BS||BS||BS|an example"
"it|BS||BS||BS|this is one|BS||BS||BS|an example"
"i|BS||BS|this is one|BS||BS||BS|an example"
"|BS|this is one|BS||BS||BS|an example"
"this is one|BS||BS||BS|an example"
"this is on|BS||BS|an example"
"this is o|BS|an example"
"this is an example"
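The step-by-step reduction quoted above can be reproduced with a small helper. This is only a sketch: the name backspace_steps is made up for illustration, and it assumes each |BS| is applied left to right and removes at most one character before it.

```python
BS = "|BS|"  # backspace marker used in this thread


def backspace_steps(text, marker=BS):
    """Return every intermediate string, applying one marker at a time."""
    steps = [text]
    while marker in text:
        i = text.index(marker)      # leftmost remaining marker
        start = max(i - 1, 0)       # also drop the character before it, if any
        text = text[:start] + text[i + len(marker):]
        steps.append(text)
    return steps


for step in backspace_steps("it |BS||BS||BS||BS|this is one|BS||BS||BS|an example"):
    print(repr(step))
```

It prints one line per simulated key press, ending with 'this is an example'.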
Reply
#4
I don't get it. Your code takes "it |BS||BS||BS|this is one|BS||BS||BS|an example" and returns "this is an example". You said it works (i.e. that's the desired result) but is inefficient. I gave you more efficient code using a single-line re.sub. Now you show something totally different as the desired output. Show us your code and ask a specific question.

To make it more clear:

#!/usr/bin/python3
import re

string = "it |BS||BS||BS|this is one|BS||BS||BS|an example" #|BS| as in Backspace
ptrn = re.compile(r'\w* ?\|BS\| ?')
print(re.sub(ptrn, '', string))
print ('\n---- all matches follow----\n')
for match in re.finditer(ptrn, string):
    print(match.group())
and the output:

Output:
this is an example

---- all matches follow----

it |BS|
|BS|
|BS|
one|BS|
|BS|
|BS|
Reply
#5
Sorry for the misunderstanding, your solution works well. It is 5 times faster than the code in the original post. The quote I made was only meant to explain what I want to do, in case someone could think of something better, as you suggested in your post. Thank you again for your help; the code you provided is what I was looking for.

Oh, I just noticed your pattern deletes the whole word instead of a single character. Like I said, I need to simulate a backspace event in a text field, so one |BS| should only remove a single (optional) char.
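To make the difference concrete, here is a minimal comparison on a string with a single |BS|. The sample string and the variable names by_word and by_char are chosen for illustration; the hand-written slice just stands in for "one backspace" and is not a proposed solution.

```python
import re

word_ptrn = re.compile(r'\w* ?\|BS\| ?')   # pattern from post #2
sample = "it |BS|this is an example"        # one marker, so one character should go

# The regex swallows the whole preceding word plus the space.
by_word = word_ptrn.sub('', sample)

# One simulated backspace, done by hand: drop the marker and the character before it.
i = sample.index("|BS|")
by_char = sample[:max(i - 1, 0)] + sample[i + len("|BS|"):]

print(repr(by_word))   # 'this is an example'
print(repr(by_char))   # 'itthis is an example'
```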
Reply
#6
(Feb-21-2017, 07:24 PM)Alfalfa Wrote: Oh, I just noticed your pattern deletes the whole word instead of a single character. Like I said, I need to simulate a backspace event in a text field, so one |BS| should only remove a single (optional) char.

Well, you don't delete a single character: you delete "it " (that's 3 characters) and also "one" (that's 3 characters)...
Reply
#7
string = "it |BS||BS||BS|this is one|BS||BS||BS|an example"
Sorry, I should have been more explicit. There are 3 |BS| there, and 3 characters to remove.
Reply
#8
I think there is some misunderstanding. Show us your input string. Does it really have |BS| strings in it?

Do you actually want to remove as many characters (including spaces) before each |BS| sequence as there are |BS| markers in that sequence? The best would be to show the input string, your code and the result, and to ask a specific question.
Reply
#9
The input string really has |BS| in it, but its content can be anything, so there is no typical string I can provide. Following the example I showed earlier, it should behave as follows:

Quote:
input:  "it |BS|this is an example"
output: "itthis is an example"

input:  "it |BS||BS|this is an example"
output: "ithis is an example"

input:  "it |BS||BS||BS|this is an example"
output: "this is an example"

input:  "it |BS||BS||BS||BS|this is an example"
output: "this is an example"
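For what it's worth, the behaviour listed above can also be sketched without a regex and without a character array, by splitting on the marker. The function name apply_backspaces is hypothetical, and this assumes the marker is always the literal text |BS|.

```python
def apply_backspaces(text, marker="|BS|"):
    """Treat each marker as one backspace key press."""
    head, *rest = text.split(marker)
    for chunk in rest:
        # Each marker that was split away deletes the last character typed
        # so far (slicing an empty string is harmless), then typing resumes.
        head = head[:-1] + chunk
    return head


examples = [
    "it |BS|this is an example",
    "it |BS||BS|this is an example",
    "it |BS||BS||BS|this is an example",
    "it |BS||BS||BS||BS|this is an example",
]
for s in examples:
    print(repr(apply_backspaces(s)))
```

Consecutive markers produce empty chunks in the split, so each of them simply trims one more character from what has been built up so far.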
Reply
#10
So the requested result is actually: "For each group of |BS| markers, remove as many characters before that group as there are |BS| markers in it."

#!/usr/bin/python3
import re

strings = ['it |BS||BS||BS|this is one|BS||BS||BS|an example',
           'it |BS|this is an example',
           'it |BS||BS|this is an example',
           'it |BS||BS||BS|this is an example',
           'it |BS||BS||BS||BS|this is an example']
ptrn = re.compile(r'[\w ]?\|BS\|')
for string in strings:
   print(string)
   while True:
       after_sub = ptrn.sub('', string, count=1)
       if string == after_sub:
           break
       else:
           string = after_sub
   print(string)
   print('\n')
Output:
it |BS||BS||BS|this is one|BS||BS||BS|an example
this is an example


it |BS|this is an example
itthis is an example


it |BS||BS|this is an example
ithis is an example


it |BS||BS||BS|this is an example
this is an example


it |BS||BS||BS||BS|this is an example
this is an example
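One caveat, since the input was said to contain anything: [\w ] only covers word characters and spaces, so punctuation before a |BS| would be left in place. A possible variant, offered as a sketch rather than a definitive fix, uses .? with re.DOTALL so that any single character (including a newline) can be removed. The names any_char_bs and apply_backspaces_re are made up here.

```python
import re

# Any single optional character, followed by the marker.
any_char_bs = re.compile(r'.?\|BS\|', re.DOTALL)


def apply_backspaces_re(text):
    """Apply one backspace per loop iteration until no marker is left."""
    while True:
        text, n = any_char_bs.subn('', text, count=1)
        if n == 0:
            return text


print(repr(apply_backspaces_re("it |BS||BS|this is an example")))  # 'ithis is an example'
print(repr(apply_backspaces_re("a!|BS|b")))                        # 'ab'

# A single non-looping sub() is not enough: the second marker has no
# character left in front of it within the same pass.
print(repr(any_char_bs.sub('', "it |BS||BS|this is an example")))  # 'itthis is an example'
```

The loop with count=1 is what lets each |BS| see the result of the previous deletion, which is the same reason the code above re-runs the substitution until the string stops changing.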
Reply

