Manipulating Binary Data

I'm dealing with some Python at work and having a little trouble writing a good script.

I have a txt file with a bunch of hexadecimal data (thousands of lines), e.g. the 2 frames shown below:

AA08430022 AA08410234

The first 2 bytes of each 'frame' are the timestamp signal (in this case AA08). The rest of the frame (3 bytes) is the actual data.

I need to separate the data so that the timestamps can be in their own list and the data can be separated into 3 bins. This data needs to all be linked together at the end so that it is all sequential.

I need to get to a situation where this data is in a csv file in the following way:

COL                  BIN1            BIN2            BIN3
timestamp(frame1)    data(frame1)    data(frame1)    data(frame1)
timestamp(frame2)    data(frame2)    data(frame2)    data(frame2)

I'm not sure how to go about doing this.
I'm not expecting a full solution, but advice on which direction to go in would be great, as I'm at a dead end.


I would:
  • Read them in as strings
  • Split them into timestamp vs. data with string slicing
  • Create a dictionary, with the timestamps as the keys, and the values being lists of the data for that timestamp.*
  • Append each data point to the list for the appropriate timestamp
  • Once all the data is read, get the keys into a list and sort it.
  • Loop through the list, writing the keys and the data out to the file.
* You could maybe do this step as a list of lists, but I would only do that if you are sure the data you are reading in is in the order you want to output things in.
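
The steps above can be sketched roughly as follows. This is only a minimal sketch, assuming every frame is exactly 10 hex characters (4 for the timestamp, 6 for the three data bytes) and that frames are separated by whitespace. The function names `group_frames` and `write_rows` and the CSV column headers are made up for illustration.

```python
import collections
import csv
import io


def group_frames(text):
    """Group the data part of each frame under its timestamp.

    Assumes whitespace-separated frames of 10 hex characters:
    4 characters of timestamp followed by 6 characters (3 bytes) of data.
    """
    grouped = collections.defaultdict(list)
    for frame in text.split():
        timestamp, payload = frame[:4], frame[4:]
        grouped[timestamp].append(payload)
    return grouped


def write_rows(grouped, out_file):
    """Write one CSV row per frame, with the timestamps in sorted order."""
    writer = csv.writer(out_file)
    writer.writerow(['timestamp', 'bin1', 'bin2', 'bin3'])
    for timestamp in sorted(grouped):
        for payload in grouped[timestamp]:
            # Cut the 6-character payload into three 2-character bytes.
            writer.writerow([timestamp, payload[0:2], payload[2:4], payload[4:6]])


# An in-memory buffer stands in for the real output file here.
buffer = io.StringIO()
write_rows(group_frames('AA08430022 AA08410234'), buffer)
csv_text = buffer.getvalue()
```

With a real file you would pass something like `open('out.csv', 'w', newline='')` instead of the buffer; the `newline=''` argument is what the `csv` module documentation recommends for files handed to `csv.writer`.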
Craig "Ichabod" O'Brien - xenomind.com
Buddhist, biker, poet, coder, theist
Recommended Tutorials: BBCode, functions, classes, text adventures

Thanks for the response. Yeah, my initial idea was to treat it all as 1 large string:

with open('test_data.txt') as hexData:
    data = "".join(line.rstrip() for line in hexData)

This now creates a string 'data' with all the frames in 1 long row:

I know how to slice strings based on indices, but I'm not sure what to do when there is a recurring pattern (the timestamps). Some sort of for loop over the data that extracts patterns based on the values?
Would I include regex?

There is a nice and simple module just for this purpose (parsing binary data strings): see struct.
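
As a hedged illustration of the struct suggestion: `struct` works on bytes rather than on hex text, so the hex string has to be converted first (here with `bytes.fromhex`, available in Python 3). The format string `'>H3B'` is an assumption about the frame layout, namely one big-endian 16-bit timestamp followed by three unsigned bytes.

```python
import struct

# One frame as hex text, converted to 5 raw bytes.
frame = bytes.fromhex('AA08430022')

# '>' = big-endian, 'H' = unsigned 16-bit timestamp, '3B' = three unsigned bytes.
timestamp, bin1, bin2, bin3 = struct.unpack('>H3B', frame)
```

The values come back as integers, so `timestamp` is `0xAA08` rather than the string `'AA08'`; use `format(timestamp, '04X')` if the hex text is wanted in the output.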
I am using regex to match all the values in the string that contain the timestamp and put it into a list:

import re
match = re.findall(r'(AA08)', data)

I think I'll need to use this because some frames are a little corrupted, containing perhaps A0AA as a timestamp instead, so I'll need a way of identifying these corruptions and finding similar patterns.

I want to create a list of all the values that do NOT contain the timestamp, so that each list index in 'match' will correlate to the index of 'noMatch'... then I can go from there.

But it seems quite difficult to get any regex to work that returns a list of the parts of the string that do not contain AA08.

Any idea how to get that?
Since the timestamps can be the same for a bunch of frames, you may face some obstacles processing them. It depends on what you need. A list of tuples could be more convenient.
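
A list of tuples keeps the frames in their original order even when timestamps repeat. Here is a minimal sketch, assuming the frames were joined into one string and every frame is exactly 10 hex characters long; the name `frames` is only illustrative.

```python
import re

data = 'AA08430022AA08410234'

# Each match is one 10-character frame: 4 hex digits of timestamp,
# then 6 hex digits (3 bytes) of data.
# With two groups, findall returns a list of (timestamp, payload) tuples.
frames = re.findall(r'([0-9A-Fa-f]{4})([0-9A-Fa-f]{6})', data)

# Timestamps and payloads as two index-aligned lists.
timestamps = [timestamp for timestamp, _ in frames]
payloads = [payload for _, payload in frames]
```

Because this matches by position rather than by the literal text AA08, a frame with a corrupted timestamp such as A0AA still ends up in the first slot of its tuple and can be checked afterwards. It does not help if the corruption changes the length of a frame.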
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
You don't need to mess with regular expressions. Just chop them up and feed them into a defaultdict, and you're good to go.

import collections
data = collections.defaultdict(list)
for file_data in ('AA08430022', 'AA08410234', 'AB81130138'):
    timestamp = file_data[:4]
    datum = file_data[4:]
    # Append the data to the list kept for this timestamp.
    data[timestamp].append(datum)
>>> dict(data)
{'AA08': ['430022', '410234'], 'AB81': ['130138']}

Split them into timestamp vs. data with string slicing

That bit there is proving to be tricky for me.

I have tried the split method, to turn the long string into a list and separate all the timestamps:


This gives me ['AA08', data, 'AA08', data, etc..]

I want it to be [AA08+data, AA08+data]

It's probably a simple solution but I'm such a newbie with Python. I'm an embedded C engineer and don't deal with manipulating these kinds of data structures or methods often... :(
(Apr-22-2017, 07:46 PM)arsenal88 Wrote: I want it to be [AA08+data, AA08+data]
You could probably join the elements before they go into the list.
It would have been better if you had posted sample code that we can run.
You can of course also join the elements of a list that has already been made.
>>> lst = ['AA08', 'data', 'AA08', 'data']
>>> it = iter(lst)
>>> [''.join(each) for each in zip(it, it)]
['AA08data', 'AA08data']
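
Applied to the actual hex string, the same pairing idiom might look like the sketch below. The original split code was not posted, so the `re.split` call is an assumption about how a list of the form ['AA08', data, 'AA08', data] could have been produced. A fixed-width alternative is shown as well, assuming every frame is exactly 10 characters; `FRAME_LEN` is an illustrative name.

```python
import re

data = 'AA08430022AA08410234'

# Splitting on a captured group keeps the separator in the result:
# ['', 'AA08', '430022', 'AA08', '410234']
parts = re.split(r'(AA08)', data)

# Drop the empty string before the first timestamp, then pair up neighbours.
it = iter(parts[1:])
frames = [''.join(pair) for pair in zip(it, it)]

# Alternative that ignores the content entirely: cut every 10 characters.
FRAME_LEN = 10
fixed = [data[i:i + FRAME_LEN] for i in range(0, len(data), FRAME_LEN)]
```

Splitting on the literal AA08 goes wrong if those four characters ever show up inside the data bytes, or if a timestamp is corrupted, so the fixed-width version is likely the safer of the two when the frame length is reliable.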
