Python Forum
Split string between two different delimiters, with exceptions
#1
I couldn't really think of a good title for this, so, sorry!

Say I have some data like so:
:1:123456:2:name:42:3:30:4::5:somerandomdata:9:8
The important data is held between groups like :1:. Each group has a different number. I need to split the string at every group so I can access the data between them, but because of the issues listed below, a plain split is not enough.
In the above data I have tried to include all the edge cases:
- The group numbers are not in any specific order. Generally they are ascending, but some are out of order
- The group numbers are 1 or 2 characters long
- Some groups are empty, like :4::5:
- One group holds a time, which itself contains a colon: :42:3:30:4: (the time being 3:30). This breaks most existing split functions.
In the sample above, all of the data between the groups is hard-coded. In the PHP file this comes from, most of it comes from variables whose names explain what the data is. However, some of the values are just constants like 1, and I don't know what they do. In the sample I used :9:8 as the "unknown constant".

My current approach has a class full of values like this:
class DataValues:
    _ID = 1
    NAME = 2
    TIME = 42
    SOMETHING = 4
    DATA = 5
    CONSTANT = 9
The variables are listed in the same order as the groups appear in the data. To split, I do this:
import re

s = []
# Collect the group numbers defined on the class, in definition order
_vars = [getattr(DataValues, v) for v in vars(DataValues) if not v.startswith("__")]
for i in range(len(_vars) - 1):
    # Capture everything between one group marker and the next
    split = re.search("(:{}:)(.*)(:{}:)".format(_vars[i], _vars[i+1]), self.data)
    split = "" if split is None else split.group(2)
    s.append(split)
This works for the most part. It has one problem: for input like :2:Sidestep:3:VXBkYXRlOiBhZGRlZCB the regex does not split the string (it should have split between 2 and 3).
The code also relies on the class listing every value in exactly the order it appears in the data, so any change to the data would break it. It also forces me to define the unknown "constant" values, which makes the code messy because I end up with variables like UKNOWN_1 = 8.
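One way to drop the dependence on ordering is to split on the group markers themselves, but only for group numbers that are known to exist. The sketch below is just an illustration, not a tested solution for the real data: parse_groups and its known_keys argument are names made up here, and it assumes no value contains a known group number wrapped in colons. For example, if 30 were also a group number, the time 3:30 followed by :4: would be split incorrectly.

```python
import re

def parse_groups(data, known_keys):
    """Split `data` on ':<key>:' markers, for the given keys only."""
    alternation = "|".join(str(k) for k in sorted(known_keys))
    pattern = re.compile(":({}):".format(alternation))
    # With a capturing group, re.split keeps the matched keys in the result:
    # [text before the first marker, key, value, key, value, ...]
    parts = pattern.split(data)
    return {int(k): v for k, v in zip(parts[1::2], parts[2::2])}

sample = ":1:123456:2:name:42:3:30:4::5:somerandomdata:9:8"
fields = parse_groups(sample, {1, 2, 42, 4, 5, 9})
print(fields[42])  # the time, colon included
```

The result is a dict keyed by group number, so the order of the groups in the data no longer matters and empty groups simply map to an empty string.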

Could anyone help me out on this?
#2
I have never seen split used like this, and I'm surprised it works at all.

Are you parsing the class or the raw data?

I would start with something simple like:
foo = rawdata.split(':')
Then join or delete list slices to build the appropriate variables.
Here it splits as:
['', '1', '123456', '2', 'name', '42', '3', '30', '4', '', '5', 'somerandomdata', '9', '8']
Time can be assembled as
time = foo[6] + ':' + foo[7] for example, which gives '3:30'.
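The split-and-rejoin idea can be sketched without hard-coded indexes by walking the tokens. This is only a rough sketch under assumptions: parse_tokens and known_keys are hypothetical names, the set of group numbers must be supplied by hand, and a value piece that happens to equal a known group number (other than the first piece after a key) would be mistaken for the next key.

```python
def parse_tokens(rawdata, known_keys):
    """Walk the colon-separated tokens and rebuild key/value pairs."""
    tokens = rawdata.split(":")
    if tokens and tokens[0] == "":
        tokens = tokens[1:]  # drop the empty string before the leading colon
    fields = {}
    i = 0
    while i + 1 < len(tokens):
        key = tokens[i]
        # The token right after a key always belongs to its value.
        value_parts = [tokens[i + 1]]
        i += 2
        # Keep absorbing tokens until the next known key shows up.
        while i < len(tokens) and tokens[i] not in known_keys:
            value_parts.append(tokens[i])
            i += 1
        fields[key] = ":".join(value_parts)
    return fields

rawdata = ":1:123456:2:name:42:3:30:4::5:somerandomdata:9:8"
fields = parse_tokens(rawdata, {"1", "2", "42", "4", "5", "9"})
time = fields["42"]
```

Here the keys stay as strings because they come straight out of split, and the time is rejoined automatically because 30 is not in the set of known keys.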
#3
(Aug-24-2020, 08:05 AM)millpond Wrote: I have never seen split used like this. And surprised it works at all.
I am parsing the raw data from a string.
I tried your method at first (it seems to be what other people do too). It would work, but there is a flaw: the actual data has about 35 values, so you would have foo[0] all the way up to foo[35], and that makes the code very messy.

When I mentioned PHP, here's what I was talking about:

$response = "1:".$result["levelID"].":2:".$result["levelName"].":3:".$desc.":4:".$levelstring.":5:".$result["levelVersion"].":6:".$result["userID"].":8:10:9:".$result["starDifficulty"].":10:".$result["downloads"].":11:1:12:".$result["audioTrack"].":13:".$result["gameVersion"].":14:".$result["likes"].":17:".$result["starDemon"].":43:".$result["starDemonDiff"].":25:".$result["starAuto"].":18:".$result["starStars"].":19:".$result["starFeatured"].":42:".$result["starEpic"].":45:".$result["objects"].":15:".$result["levelLength"].":30:".$result["original"].":31:1:28:".$uploadDate. ":29:".$updateDate. ":35:".$result["songID"].":36:".$result["extraString"].":37:".$result["coins"].":38:".$result["starCoins"].":39:".$result["requestedStars"].":46:1:47:2:48:1:40:".$result["isLDM"].":27:$xorPass"
It's just a very long string of data.
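Once the values are in a dict keyed by group number, the unknown constants do not need their own variables. A minimal sketch, assuming the parsing has already been done some other way: FIELD_NAMES and name_fields are made-up names, the field names are copied from the PHP line above, and the sample values are invented for illustration.

```python
# Names taken from the PHP line; anything not listed stays under its number.
FIELD_NAMES = {
    1: "levelID",
    2: "levelName",
    5: "levelVersion",
    6: "userID",
    9: "starDifficulty",
    10: "downloads",
}

def name_fields(parsed):
    """Return a copy of `parsed` keyed by name wherever a name is known."""
    return {FIELD_NAMES.get(number, number): value
            for number, value in parsed.items()}

parsed = {1: "123", 2: "Sidestep", 8: "10", 9: "30"}  # made-up values
named = name_fields(parsed)
print(named["levelName"])
```

Group 8 has no known meaning, so it is left under its number instead of needing a placeholder name.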