Manipulating Binary Data

arsenal88 · Apr-20-2017, 11:02 PM

Hi,

Dealing with some python at my work and having a little trouble writing a good script.

I have a txt file with a bunch of hexadecimal data (1000s of lines), e.g. the 2 frames shown below:

AA08430022 AA08410234

The first 2 bytes of each 'frame' are the timestamp signal (in this case AA08). The rest of the frame (3 bytes) is the actual data.

I need to separate the data so that the timestamps can be in their own list and the data can be separated into 3 bins. This data needs to all be linked together at the end so that it is all sequential.

I need to get to a situation where this data is in a csv file in the following way:

COL BIN1 BIN2 BIN3
timestamp(frame1) data(frame1) data(frame1) data(frame1)
timestamp(frame2) data(frame2) data(frame2) data(frame2)

I'm not sure how to go about doing this.
I'm not expecting a full solution, but advice on what direction to go in would be great as I'm at a dead end.

Thanks


***ichabod801*** · Apr-21-2017, 01:05 AM

I would:

Read them in as strings
Split them into timestamp vs. data with string slicing
Create a dictionary, with the timestamps as the keys, and the values being lists of the data for that timestamp.*
Append each data point to the list for the appropriate timestamp
Once all the data is read, get the keys into a list and sort it.
Loop through the list, writing the keys and the data out to the file.

* You could maybe do this step as a list of lists, but I would only do that if you are sure the data you are reading in is in the order you want to output things in.
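
The steps above can be sketched roughly as follows. This is only an illustration: the `io.StringIO` object is a hypothetical stand-in for the real text file, and it assumes every frame is 4 hex digits of timestamp followed by 6 hex digits of data, separated by whitespace.

```python
import collections
import io

# Hypothetical stand-in for the text file; a real script would use open().
hex_file = io.StringIO("AB81130138\nAA08430022 AA08410234\n")

# Steps 1-4: read strings, slice each frame, group the data by timestamp.
payloads_by_timestamp = collections.defaultdict(list)
for line in hex_file:
    for frame in line.split():
        payloads_by_timestamp[frame[:4]].append(frame[4:])

# Steps 5-6: sort the keys, then walk them in order.
output_lines = []
for timestamp in sorted(payloads_by_timestamp):
    for payload in payloads_by_timestamp[timestamp]:
        output_lines.append(timestamp + " " + payload)

print("\n".join(output_lines))
```

Sorting the keys as strings works here because all timestamps have the same width and case; mixed-case or variable-width hex would need converting with `int(timestamp, 16)` first.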

arsenal88 · Apr-21-2017, 09:16 AM

Thanks for the response. Yeah, my initial idea was to treat it all as 1 large string:

<python>with open('test_data.txt') as hexData:
    data = "".join(line.rstrip() for line in hexData)
</python>

This now creates a string 'data' with all the frames in 1 long row:
AA08430022AA08410234

I know how to slice strings based on indices, but I'm not sure how to do it when there is a recurring pattern (the timestamps). Some sort of for loop over the data that extracts patterns based on the values?
Would I include regex?

Cheers

volcano63 · Apr-21-2017, 09:21 AM

There is a nice and simple package just for this purpose - parsing binary data strings, see struct
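
A minimal sketch of what using `struct` could look like for one frame, assuming the hex text is first converted to raw bytes with `bytes.fromhex` and that the timestamp is a big-endian 16-bit value (the byte order is an assumption; the thread does not state it):

```python
import struct

frame_hex = "AA08430022"
raw = bytes.fromhex(frame_hex)  # 5 raw bytes

# ">H3B": big-endian (assumed), one unsigned 16-bit value,
# then three unsigned single bytes.
timestamp, bin1, bin2, bin3 = struct.unpack(">H3B", raw)

print(hex(timestamp), bin1, bin2, bin3)
```

This yields integers rather than hex strings, which may or may not be what the CSV should contain.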

arsenal88 · Apr-21-2017, 11:24 AM

I am using regex to match all the values in the string that contain the timestamp and put it into a list:

match = re.findall(r'(AA08)', data)

I think I'll need to use this because some frames are a little corrupted, containing perhaps A0AA as a timestamp instead, so I'll need a way of identifying these corruptions and finding similar patterns.

I want to create a list of all the values that do NOT contain the timestamp, so that each list index in 'match' will correlate to the index of 'noMatch'... then I can go from there.

But it seems quite difficult to get any regex to work that returns a list of the string that does not contain AA08

Any idea how to get that?
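
One possible way to get two index-aligned lists from a regex is to capture the timestamp and the data that follows it in the same pattern. This is a sketch under the assumption that every good frame is the literal `AA08` followed by exactly six hex digits; frames with a corrupted timestamp are simply skipped by this pattern and would need separate handling.

```python
import re

data = "AA08430022AA08410234"

# Each match is a (timestamp, payload) tuple: the literal timestamp
# followed by exactly six hex digits (3 bytes).
frames = re.findall(r"(AA08)([0-9A-Fa-f]{6})", data)

timestamps = [ts for ts, _ in frames]
payloads = [payload for _, payload in frames]

print(timestamps)
print(payloads)
```

Because both lists come from the same matches, `timestamps[i]` and `payloads[i]` always belong to the same frame.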

wavic · Apr-21-2017, 01:01 PM

Since the time stamps can be the same for a bunch of frames you may face some obstacles processing them. It depends on what you need. List of tuples could be more convenient.

***ichabod801*** · (This post was last modified: Apr-21-2017, 09:09 PM by ichabod801.)

You don't need to mess with regular expressions. Just chop them up and feed them in to a defaultdict, and you're good to go.

import collections
data = collections.defaultdict(list)
for file_data in ('AA08430022', 'AA08410234', 'AB81130138'):
    timestamp = file_data[:4]
    datum = file_data[4:]
    data[timestamp].append(datum)

Output:
>>> data
{'AA08': ['430022', '410234'], 'AB81': ['130138']}

arsenal88 · Apr-22-2017, 07:46 PM

Split them into timestamp vs. data with string slicing

That bit there is proving to be tricky for me.

I have tried the split method, to turn the long string into a list and separate all the timestamps:

data.split('AA08')

This gives me ['AA08', data, 'AA08', data, etc..]

I want it to be [AA08+data, AA08+data]

It's probably a simple solution but I'm such a newbie with python. I'm an embedded C engineer and don't often deal with manipulating these kinds of data structures or methods.. :(

***snippsat*** · Apr-22-2017, 08:42 PM

(Apr-22-2017, 07:46 PM)arsenal88 Wrote: I want it to be [AA08+data, AA08+data]

You could probably join the elements before they go into the list.
It would have been better if you had posted sample code that we can run.
You can of course also join the elements of a list that has already been made.

>>> lst = ['AA08', 'data', 'AA08', 'data']
>>> it = iter(lst)
>>> [''.join(each) for each in zip(it, it)]
['AA08data', 'AA08data']
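
If every frame has the same width, the long string could also be cut into whole frames directly by stepping through it, without splitting on the timestamp at all. A small sketch, assuming 10 hex characters per frame (`FRAME_WIDTH` is a name made up for this example):

```python
FRAME_WIDTH = 10  # assumed: 4 hex digits of timestamp + 6 of data

data = "AA08430022AA08410234"

# Slice the string at every FRAME_WIDTH-th index.
frames = [data[i:i + FRAME_WIDTH] for i in range(0, len(data), FRAME_WIDTH)]

print(frames)
```

This only holds while every frame really is the same length; a corrupted frame with missing or extra characters would shift all the frames after it.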

arsenal88 · (This post was last modified: Apr-25-2017, 12:30 PM by arsenal88.)

*Create a dictionary, with the timestamps as the keys, and the values being lists of the data for that timestamp.*

Ok so I've got up to this step.. My dictionary contains 1 key ('AA08') and then a list of all the data [data1, data2, data3 etc..] corresponding to that 1 key.

I need to separate the data into 3 bins, i.e. create a list of 3 values within each data list value...
Then I need to somehow put this into a csv in the format I described in my original post.

I have done the following:

file = open("test_data.csv", "w")
file.write("Timestamp,")
for h in range(0, 3):
    file.write("Bin " + str(h + 1) + ",")
file.write("\n")

That's my headers sorted... but for looping through the dictionary, I'm stuck! I'm not sure how to extract the dictionary key-value pairs into the csv format described originally.

Apologies if I'm asking seemingly trivial things.. tried for hours to get things working but I'm so unfamiliar with python and its data structures.
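
One way the remaining step might look, using the standard `csv` module. This is a sketch only: the dictionary contents are hypothetical sample values, each bin is assumed to be one byte (two hex characters), and `io.StringIO` stands in for the output file so the example is self-contained.

```python
import csv
import io

# Hypothetical dictionary in the shape described above.
data = {"AA08": ["430022", "410234"]}

# io.StringIO keeps the sketch self-contained; a real script would use
# open("test_data.csv", "w", newline="") instead.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["Timestamp", "Bin 1", "Bin 2", "Bin 3"])

for timestamp in sorted(data):
    for payload in data[timestamp]:
        # Assumed: each bin is one byte, i.e. two hex characters.
        bins = [payload[i:i + 2] for i in range(0, len(payload), 2)]
        writer.writerow([timestamp] + bins)

print(out.getvalue())
```

Letting `csv.writer` build each row avoids hand-placing commas and newlines, including the trailing comma that writing the header piece by piece leaves behind.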
