Python Forum
Split string between two different delimiters, with exceptions
#1
I couldn't really think of a good title for this, so, sorry!

Say I have some data like so:
:1:123456:2:name:42:3:30:4::5:somerandomdata:9:8
The important data is held between groups like :1:. Each of those groups has a different number from the next. I need to split the string between all of those groups so I can access that data. But due to some issues with the data I'll explain, it's not as easy as just splitting.
In the above data I have tried to include all the edge cases:
- The group numbers are not in any specific order. Generally they are ascending, but some are out of order
- The group numbers are 1 or 2 characters long
- Some groups will be empty, like :4::5:
- One group holds a time, which has a colon in it: :42:3:30:4: (the time being 3:30). This will break most existing split functions.
In the data above, all actual data between the groups is hard coded. In the PHP file this comes from, most of that data is from many variables which explain what that data is. However some of the data is just constant values like 1 and I don't know what they do. In the data above I made :9:8 the "unknown constant".

My current approach has a class full of values like this:
class DataValues:
    _ID = 1
    NAME = 2
    TIME = 42
    SOMETHING = 4
    DATA = 5
    CONSTANT = 9
The variables are listed in the same order as they appear in the data. To split I do this:
import re

s = []
# Collect the values of the class attributes, skipping dunder names
_vars = [getattr(DataValues, v) for v in vars(DataValues) if not v.startswith("__")]
for i in range(len(_vars) - 1):
    # Capture whatever sits between one group marker and the next
    split = re.search("(:{}:)(.*)(:{}:)".format(_vars[i], _vars[i+1]), self.data)
    split = "" if split is None else split.group(2)
    s.append(split)
This works for the most part. It has one issue: in a string like :2:Sidestep:3:VXBkYXRlOiBhZGRlZCB the regex does not split at all (it should have split between 2 and 3).
The code also relies on the class listing every single value in the exact order of the data. If anything changed in the data, it would break. It also forces me to keep these unknown "constant" values around, which makes the code messy because I have to create variables like UKNOWN_1 = 8.
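One possible contributor to the regex trouble, not confirmed anywhere in this thread, is that `.*` is greedy: it runs to the last occurrence of the closing marker rather than the first. The sketch below uses a made-up sample string in which the value of group 9 happens to be 2, so ":2:" appears twice. Note also that `.` does not match a newline unless `re.DOTALL` is passed.

```python
import re

# Hypothetical sample: group 9 has the value "2", so ":2:" occurs twice.
sample = ":1:123456:2:name:9:2:5:data"

# Greedy: .* extends to the LAST ":2:" in the string.
greedy = re.search(r":1:(.*):2:", sample).group(1)

# Non-greedy: .*? stops at the FIRST ":2:" after ":1:".
lazy = re.search(r":1:(.*?):2:", sample).group(1)

print(greedy)
print(lazy)
```

Switching to `.*?` only addresses this one effect; it does not remove the dependence on the order of the class attributes.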

Could anyone help me out on this?
#2
I have never seen split used like this, and I'm surprised it works at all.

Are you parsing the class or the raw data?

I would start with something simple like:
foo = rawdata.split(':')
Then join or delete list slices to build the appropriate variables.
Here it splits as:
['', '1', '123456', '2', 'name', '42', '3', '30', '4', '', '5', 'somerandomdata', '9', '8']
Time can be assembled as
time = foo[6]+':'+foo[7] for example.
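As a rough sketch of how the "split, then rejoin" idea could be automated rather than indexed by hand: walk the tokens and treat one as a new group number only if it is in a set of known numbers and has not been used yet. The function name `parse_groups` and the `known_keys` set are assumptions made for illustration, not something from the thread.

```python
def parse_groups(raw, known_keys):
    """Turn ':key:value:key:value...' into a dict, keeping colons inside values."""
    tokens = raw.split(":")[1:]  # drop the empty string before the leading ':'
    result = {}
    i = 0
    while i < len(tokens):
        key = tokens[i]
        result[key] = None  # mark the key as seen
        # The token right after a key always belongs to its value, even if empty.
        value_parts = tokens[i + 1:i + 2]
        i += 2
        # Absorb further tokens until one is a known key that has not been seen.
        while i < len(tokens) and (tokens[i] not in known_keys or tokens[i] in result):
            value_parts.append(tokens[i])
            i += 1
        result[key] = ":".join(value_parts)
    return result


raw = ":1:123456:2:name:42:3:30:4::5:somerandomdata:9:8"
known = {"1", "2", "42", "4", "5", "9"}
parsed = parse_groups(raw, known)
print(parsed["42"])  # the time, with its colon restored
```

This is a heuristic: a time such as 3:4 would still be cut short if 4 is a group number that has not appeared yet, so it only helps when values rarely collide with unused keys.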
#3
(Aug-24-2020, 08:05 AM)millpond Wrote: I have never seen split used like this. And surprised it works at all.
I am parsing the raw data from a string.
I tried your method at first (it is what other people seem to use). It would work, but it has a flaw: the actual data has about 35 values, which means you would have foo[0] all the way up to foo[35], and that makes the code very messy.

When I mentioned PHP, here's what I was talking about:

$response = "1:".$result["levelID"].":2:".$result["levelName"].":3:".$desc.":4:".$levelstring.":5:".$result["levelVersion"].":6:".$result["userID"].":8:10:9:".$result["starDifficulty"].":10:".$result["downloads"].":11:1:12:".$result["audioTrack"].":13:".$result["gameVersion"].":14:".$result["likes"].":17:".$result["starDemon"].":43:".$result["starDemonDiff"].":25:".$result["starAuto"].":18:".$result["starStars"].":19:".$result["starFeatured"].":42:".$result["starEpic"].":45:".$result["objects"].":15:".$result["levelLength"].":30:".$result["original"].":31:1:28:".$uploadDate. ":29:".$updateDate. ":35:".$result["songID"].":36:".$result["extraString"].":37:".$result["coins"].":38:".$result["starCoins"].":39:".$result["requestedStars"].":46:1:47:2:48:1:40:".$result["isLDM"].":27:$xorPass"
It's just a very long string of data.
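For a string shaped like the PHP response (no leading colon, alternating number and value), one way to avoid foo[0] through foo[35] is to pair the tokens up into a dict and look values up by name. This is only a sketch under the assumption that no value contains a colon. `FIELD_NAMES` and `parse_response` are hypothetical names; the mapping reuses a few variable names from the PHP line, and the sample string is made up.

```python
# Partial, illustrative mapping from group number to a readable name.
FIELD_NAMES = {
    "1": "levelID",
    "2": "levelName",
    "5": "levelVersion",
    "10": "downloads",
}


def parse_response(response):
    """Pair up 'key:value:key:value...' tokens; assumes values contain no colon."""
    parts = response.split(":")
    pairs = dict(zip(parts[0::2], parts[1::2]))
    # Unmapped numbers get a generated name instead of a hand-written constant.
    return {FIELD_NAMES.get(k, "unknown_" + k): v for k, v in pairs.items()}


sample = "1:123:2:Sidestep:5:3:8:10:10:4500:11:1"
level = parse_response(sample)
print(level["levelName"])
print(level["unknown_8"])
```

Because lookups go through the number rather than the position, the order of the groups in the string no longer matters, and the constant groups need no dedicated variables.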