Feb-23-2017, 05:20 PM
As I am rather ignorant of re and repeated iterations for "linear" problem look ugly, I tried direct naive approach - just traverse through string, copy characters or delete last one if |BS| is found.
def naive(s=None, n=1000): if not s: s = 'it |BS||BS||BS|this is one|BS||BS||BS|an example' * n else: s = s * n size, ind = len(s), 0 accumulator = [] while ind < size: if (s[ind] == '|') and (ind+4 <= size) and (s[ind:ind+4] == "|BS|"): if accumulator: accumulator.pop() ind += 4 else: accumulator.append(s[ind]) ind += 1 return "".join(accumulator)Stolen buran's measuring script gives me:
Output:repeat 1000, short string s*1:
alfalfa --> 0.03940597499968135
buran1 --> 0.015949830999488768
buran2 --> 0.02403012300055707
ofnut --> 0.04103327599932527
naive --> 0.010694067999793333
repeat 1, long string s*1000
alfalfa --> 4.176700068000173
buran1 --> 2.2705873200002316
buran2 --> 3.068299023000691
ofnut --> 0.014837658000033116
naive --> 0.009339336000266485
repeat 1, very long string s*3000
alfalfa --> 40.76510821000011
buran1 --> 21.95231995500035
buran2 --> 29.43507453300026
ofnut --> 0.04528477199983172
naive --> 0.03035950000048615
So it seems that:- buran got faster PC
- naive is fastest, but it doesnt improve for longer strings as much as I would guess - using re and repeatly search entire string should be quadratic (and from timing it is), while traversing should be linear - perhaps accumulator "growing" and char/substring checking is rather expensive compared to optimalized re