Python Forum
Manipulating Binary Data
#1
Hi,

I'm dealing with some Python at work and having a little trouble writing a good script.

I have a txt file with a bunch of hexadecimal data (thousands of lines), e.g. the 2 frames shown below:


AA08430022 AA08410234



The first 2 bytes of each 'frame' are the timestamp signal (in this case AA08). The rest of the frame (3 bytes) is the actual data.

I need to separate the data so that the timestamps can be in their own list and the data can be separated into 3 bins. This data needs to all be linked together at the end so that it is all sequential.

I need to get to a situation where this data is in a csv file in the following way:

COL                  BIN1            BIN2            BIN3
timestamp(frame1)    data(frame1)    data(frame1)    data(frame1)
timestamp(frame2)    data(frame2)    data(frame2)    data(frame2)


I'm not sure how to go about doing this.
I'm not expecting a full solution, but advice on which direction to go in would be great, as I'm at a dead end.

Thanks











#2
I would:
  • Read them in as strings
  • Split them into timestamp vs. data with string slicing
  • Create a dictionary, with the timestamps as the keys, and the values being lists of the data for that timestamp.*
  • Append each data point to the list for the appropriate timestamp
  • Once all the data is read, get the keys into a list and sort it.
  • Loop through the list, writing the keys and the data out to the file.
* You could maybe do this step as a list of lists, but I would only do that if you are sure the data you are reading in is in the order you want to output things in.
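The steps above could be sketched roughly as follows. This is only a minimal sketch, assuming each frame is a 10-character hex string whose first four characters are the timestamp. The function name `group_frames` and the sample frames are made up for illustration, and an in-memory buffer stands in for the output file.

```python
import io

def group_frames(frames):
    """Group frame payloads by timestamp (the first 4 hex characters)."""
    grouped = {}
    for frame in frames:
        timestamp, payload = frame[:4], frame[4:]
        # setdefault creates the list the first time a timestamp is seen
        grouped.setdefault(timestamp, []).append(payload)
    return grouped

# Hypothetical sample input; real data would come from the text file
frames = "AA08430022 AA08410234 AB81130138".split()
grouped = group_frames(frames)

out = io.StringIO()  # stands in for an open output file
for timestamp in sorted(grouped):
    for payload in grouped[timestamp]:
        out.write(timestamp + "," + payload + "\n")

print(out.getvalue())
```

Sorting the keys only orders the timestamps; the payloads under each timestamp stay in the order they were read.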
Craig "Ichabod" O'Brien - xenomind.com
I wish you happiness.
Recommended Tutorials: BBCode, functions, classes, text adventures
#3
Thanks for the response. Yeah, my initial idea was to treat it all as 1 large string:

<python>with open('test_data.txt') as hexData:
    # Strip the trailing newline from each line and join everything into one string
    data = "".join(line.rstrip() for line in hexData)
</python>


This now creates a string 'data' with all the frames in 1 long row:
AA08430022AA08410234


I know how to slice strings based on indices, but I'm not sure what to do when there is a recurring pattern (the timestamps). Some sort of for loop over the data that extracts patterns based on the values?
Would I include regex?
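If every frame really is the same length, a regex may not be needed; stepping through the string in fixed-size chunks is enough. A small sketch, assuming 10 hex characters per frame (2 timestamp bytes plus 3 data bytes) and no corrupted or missing characters. `FRAME_LEN` is a name chosen here for illustration.

```python
FRAME_LEN = 10  # assumption: 2 timestamp bytes + 3 data bytes = 10 hex characters

data = "AA08430022AA08410234"

# Step through the string one frame at a time
frames = [data[i:i + FRAME_LEN] for i in range(0, len(data), FRAME_LEN)]
print(frames)
```

This stops working as soon as a frame is shorter or longer than expected, which is where pattern matching becomes useful.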

Cheers
#4
There is a nice and simple package just for this purpose - parsing binary data strings, see struct
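Note that `struct` works on bytes rather than on hex text, so the hex string has to be converted first, for example with `bytes.fromhex` (Python 3). A hedged sketch, assuming the timestamp is a big-endian unsigned 16-bit value and each of the three data bytes is one bin; neither assumption is confirmed in this thread.

```python
import struct

frame_hex = "AA08430022"
raw = bytes.fromhex(frame_hex)  # 5 raw bytes

# ">H3B": big-endian, one unsigned 16-bit value, then three unsigned bytes
timestamp, bin1, bin2, bin3 = struct.unpack(">H3B", raw)
print(hex(timestamp), bin1, bin2, bin3)
```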
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you an advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you an advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
#5
I am using regex to match all the values in the string that contain the timestamp and put it into a list:

match = re.findall(r'(AA08)', data)

I think I'll need to use this because some frames are slightly corrupted, containing perhaps A0AA as a timestamp instead, so I'll need a way of identifying these corruptions and finding similar patterns.

I want to create a list of all the values that do NOT contain the timestamp, so that each index in 'match' will correspond to the same index in 'noMatch'... then I can go from there.

But it seems quite difficult to get any regex to work that returns a list of the parts of the string that do not contain AA08.


Any idea how to get that?
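One way to keep timestamps and data lined up is to capture both in the same pattern, so that `re.findall` returns them as pairs. A sketch, assuming each timestamp is followed by exactly six hex characters of data; the `A0AA` alternative and the sample string are only there to illustrate how a known corrupted timestamp might be tolerated.

```python
import re

# Made-up sample: the second frame carries a corrupted timestamp
data = "AA08430022A0AA410234AA08130138"

# Group 1: a known timestamp pattern; group 2: the six hex characters after it
pattern = re.compile(r"(AA08|A0AA)([0-9A-F]{6})")
pairs = pattern.findall(data)

timestamps = [ts for ts, _ in pairs]
payloads = [payload for _, payload in pairs]
print(timestamps)
print(payloads)
```

Because both lists come from the same matches, index `i` in `timestamps` always belongs to index `i` in `payloads`.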
#6
Since the time stamps can be the same for a bunch of frames you may face some obstacles processing them. It depends on what you need. List of tuples could be more convenient.
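A list of tuples keeps one entry per frame in file order, so frames with the same timestamp are not merged under one key. A minimal sketch with made-up frames, assuming the 4 + 6 character layout described earlier in the thread:

```python
# Made-up frames; note the repeated AA08 timestamp is not adjacent
frames = ["AA08430022", "AB81130138", "AA08410234"]

# One (timestamp, payload) tuple per frame, in the original order
records = [(frame[:4], frame[4:]) for frame in frames]

for timestamp, payload in records:
    print(timestamp, payload)
```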
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
#7
You don't need to mess with regular expressions. Just chop them up and feed them into a defaultdict, and you're good to go.

import collections
data = collections.defaultdict(list)
for file_data in ('AA08430022', 'AA08410234', 'AB81130138'):
    timestamp = file_data[:4]
    datum = file_data[4:]
    data[timestamp].append(datum)
Output:
>>> dict(data)
{'AA08': ['430022', '410234'], 'AB81': ['130138']}
#8
Split them into timestamp vs. data with string slicing


That bit there is proving to be tricky for me.

I have tried the split method, to turn the long string into a list and separate all the timestamps:

data.split('AA08')

This gives me ['AA08', data, 'AA08', data, etc..]

I want it to be [AA08+data, AA08+data]

It's probably a simple solution but I'm such a newbie with Python. I'm an embedded C engineer and don't deal with manipulating these kinds of data structures or methods often.. :(
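For reference, plain `str.split` drops the separator itself, so the timestamp has to be put back or kept some other way. One option is `re.findall` with a lookahead, which returns each frame with its timestamp still attached. A sketch, assuming `AA08` never occurs inside the data bytes (if it can, fixed-width slicing is safer):

```python
import re

data = "AA08430022AA08410234"

# str.split removes the separator and leaves an empty first element
parts = data.split("AA08")
print(parts)

# Match 'AA08' plus everything up to the next 'AA08' or the end of the string
frames = re.findall(r"AA08.*?(?=AA08|$)", data)
print(frames)
```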
#9
(Apr-22-2017, 07:46 PM)arsenal88 Wrote: I want it to be [AA08+data, AA08+data]
You could probably join the elements before they go into the list.
It would have been better if you had posted sample code that we can run.
You can of course also join the elements of a list that has already been made.
>>> lst = ['AA08', 'data', 'AA08', 'data']
>>> it = iter(lst)
>>> [''.join(each) for each in zip(it, it)]
['AA08data', 'AA08data']
#10
*Create a dictionary, with the timestamps as the keys, and the values being lists of the data for that timestamp.*

OK, so I've got up to this step. My dictionary contains 1 key ('AA08') and then a list of all the data [data1, data2, data3 etc.] corresponding to that 1 key.



I need to separate the data into 3 bins, i.e. create a list of 3 values within each data list value...
Then I need to somehow put this into a csv in the format I described in my original post.

I have done the following:

file = open("test_data.csv", "w")
# Header row: the timestamp column followed by the three bin columns
file.write("Timestamp,")
for h in range(0, 3):
    file.write("Bin " + str(h + 1) + ",")
file.write("\n")

That's my headers sorted... but when it comes to looping through the dictionary, I'm stuck! I'm not sure how to extract the dictionary key/value pairs into the csv format described originally.

Apologies if I'm asking seemingly trivial things.. tried for hours to get things working but I'm so unfamiliar with python and its data structures.
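One possible way to finish, using the standard `csv` module instead of hand-built strings. This is a sketch that assumes each data value is six hex characters and that each bin is one byte (two hex characters). The helper name `split_bins` and the sample dictionary are illustrative, and an in-memory buffer stands in for the real file.

```python
import csv
import io

def split_bins(payload, width=2):
    """Cut a hex payload into equal-width bins, e.g. '430022' -> ['43', '00', '22']."""
    return [payload[i:i + width] for i in range(0, len(payload), width)]

# Hypothetical dictionary in the shape described above
data = {"AA08": ["430022", "410234"]}

buffer = io.StringIO()  # stands in for open("test_data.csv", "w", newline="")
writer = csv.writer(buffer)
writer.writerow(["Timestamp", "Bin 1", "Bin 2", "Bin 3"])
for timestamp in sorted(data):
    for payload in data[timestamp]:
        # One CSV row per frame: the timestamp, then its three bins
        writer.writerow([timestamp] + split_bins(payload))

print(buffer.getvalue())
```

Letting `csv.writer` build each row avoids trailing commas and handles the line endings.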