Python Forum

i have a big string with 2 different delimiter characters. i want to split this big string into a list of strings at both delimiters. also, i want to keep each delimiter character at the beginning of the string it starts.

for example, with delimiters "XY": "XfooYbar" -> ["Xfoo","Ybar"]

i could make a function that does this character-by-character, but that might be too slow for the hundreds of millions of characters i expect this to work with. this project needs to be limited to what comes with Python and what i can include in the code.

Hi,
One easy solution:

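string = "XfooYbar"
pars1 = "X"
pars2 = "Y"
final = []
# Locate the first occurrence of each delimiter
xpos = string.find(pars1)
ypos = string.find(pars2)
# Slice so that each piece keeps its leading delimiter.
# Note: this only handles a single X piece followed by a single Y piece.
final.append(string[xpos:ypos])
final.append(string[ypos:])
print(final)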

Or a better solution with Regex:

import re
string = "XfooYbar"
regex = re.compile('^(X[^Y]+)(Y[^X]+)$')
final2 = []
m = regex.match(string)
[final2.append(items) for items in m.group(1, 2)]
print(final2)

Try this regex "[-.]+". The + after the character class treats consecutive delimiter characters as one. Remove the + if you do not want this.
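
To illustrate what the + changes, here is a small sketch using re.split. The sample string "a--b.c" is made up for the example and is not from this thread. Note that re.split drops the delimiters, so on its own this would not keep them at the start of each piece as the original question asks.

```python
import re

sample = "a--b.c"  # hypothetical test string, not from the thread

# With +, a run of consecutive delimiters counts as one separator.
merged = re.split(r"[-.]+", sample)

# Without +, every delimiter character splits on its own,
# which leaves an empty string between two adjacent delimiters.
single = re.split(r"[-.]", sample)

print(merged)
print(single)
```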

@BamBi25

i modified the test string:

import re
string = "XfooYbarXbooYfar"
regex = re.compile('^(X[^Y]+)(Y[^X]+)$')
final2 = []
m = regex.match(string)
[final2.append(items) for items in m.group(1, 2)]
print(final2)

which seems to have a problem:

Output:
lt2a/forums /home/forums 15> cat -n foo.py
     1	import re
     2	string = "XfooYbarXbooYfar"
     3	regex = re.compile('^(X[^Y]+)(Y[^X]+)$')
     4	final2 = []
     5	m = regex.match(string)
     6	[final2.append(items) for items in m.group(1, 2)]
     7	print(final2)
lt2a/forums /home/forums 16> py3 foo.py
Traceback (most recent call last):
  File "foo.py", line 6, in <module>
    [final2.append(items) for items in m.group(1, 2)]
AttributeError: 'NoneType' object has no attribute 'group'
lt2a/forums /home/forums 17>
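
The AttributeError comes from regex.match() returning None. The pattern is anchored with ^ and $ and describes exactly one X piece followed by one Y piece, and [^X]+ cannot run past the second X, so the longer string does not match at all. A minimal sketch of checking for that before calling group(); the helper name split_pair is made up for illustration:

```python
import re

# Same pattern as in the thread: one X piece, then one Y piece, nothing else.
regex = re.compile(r'^(X[^Y]+)(Y[^X]+)$')

def split_pair(string):
    """Return [X piece, Y piece], or None if the string does not have that exact shape."""
    m = regex.match(string)
    if m is None:
        # No match object, so there is no group() to call.
        return None
    return list(m.group(1, 2))

print(split_pair("XfooYbar"))
print(split_pair("XfooYbarXbooYfar"))
```

This only makes the failure explicit. It does not make the pattern handle repeated delimiters.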

Try this

re.findall(r'(X[^Y]+)(Y[^X]+)', string)
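
One thing to watch: because that pattern has two groups, re.findall returns a list of tuples, one (X piece, Y piece) pair per match, not the flat list asked for. A sketch that may be closer to the original request, assuming each piece is one delimiter character followed by any run of non-delimiter characters. The function name split_keep and its delimiters parameter are made up for this example, and delimiters is assumed to be non-empty.

```python
import re

def split_keep(string, delimiters="XY"):
    """Split at any delimiter character, keeping it at the start of its piece."""
    cls = re.escape(delimiters)
    # Either a delimiter plus the non-delimiters after it,
    # or leading text that comes before the first delimiter.
    pattern = re.compile("[{0}][^{0}]*|[^{0}]+".format(cls))
    # No groups in the pattern, so findall returns whole matches as a flat list.
    return pattern.findall(string)

print(split_keep("XfooYbarXbooYfar"))
```

For very large inputs, pattern.finditer(string) could be used in place of findall to walk the pieces one at a time without building the whole list first.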
