Python Forum
parsing question




parsing question - ridgerunnersjw - Oct-09-2020

Can someone tell me how to parse this using '\x' and place each piece on its own line? I tried .split(b'\x') but this gives me an error.

Output:
b'\x00\xd0\xffp\x01p\x01 \x00 \xff\xa0\x00 \x01\x90\x00\xf0\x00\xf0\x00 \x01P\xfe\xf0\x00\xc0\xff\x80\x00\x10\x00\xc0\xff\xb0\x01\xb0\x01@\xff\xc0\x00`\x01`\x01P\x01\xd0\x01@\x01\xa0\x00\xd0\x01P\x01P\xff\xc0\x02\x10\x010\x00\x80\x01\x10\x000\x00\x90\x00\x10\xff\xf0\x00\x10\x00p\x02`\x01`\x01P\x01\x10\x00\x10\x00@\x010\x00\xd0\x01\x80\x02 \x02\x90\x02 \x02P\x01\xd0\x01@\x00\xd0\x00\xd0\x01\xf0\x03\x00\x02\x90\x02\xe0\x01\xc0\x00\xe0\x02\xa0\x03\xb0\x02P\x02\xc0\x03@\x02 \x020\x02@\x04\x10\x02\x90\x00\xd0\x02\xc0\x04`\x02\xc0\x02 \x02\x80\x03\x90\x04\x00\x02p\x02\xd0\x03\x00\x01\xe0\x02\x00\x03@\x03\xb0\x03\xf0\x04@\x05 \x03\xd0\x04\x90\x03`\x04@\x04\xa0\x04\xb0\x04\xf0\x04 \x050\x06\x90\x05\xd0\x03`\x05\x10\x06\x10\x06\xb0\x08P\x06@\x05\xb0\x05\xf0\x05\xf0\x04\xd0\x06\x10\x050\x05\x90\x06\xf0\x07\x10\x06\xf0\x07\x90\t \x07 \x06\x90\x07\xa0\x08\x10\x07`\x07p\x08p\x07\x10\t\xb0\x07\xf0\x06@\x07\xc0\x07\x90\x080\x08\x90\x07\xe0\x07\x80\t\x00\x06\xe0\x080\x06\xc0\x08 \x06\xd0\x08\x80\x06P\x07\xe0\x08\xa0\x08\xe0\tP\tp\t@\x080\t\x00\x08 \x08 \x07p\x07`\x08 \x07\x80\x07\x10\x06\x90\x08@\x080\x08\x80\x08\x00\x08\x00\x07@\x06\xf0\x07\xb0\x08\x10\x07\x80\x07\xe0\x06\xa0\x07\xa0\x06\xe0\tP\x07P\x07\xb0\x06\xe0\x06p\x05\xc0\x06\xc0\x06 \x06\x90\x05\xa0\x06\xd0\x07`\x05\x



RE: parsing question - bowlofred - Oct-09-2020

This is just the default way that Python prints bytes that aren't printable ASCII characters. If you don't care about printable vs non-printable and just want all the numbers, I'd do something like this:

b = b'\x00\xd0\xffp\x01p\x01 \x00 \xff\xa0\x00 \x01\x90\x00\xf0\x00\xf0\x00 \x01P'

for byte in b:
    print(format(byte, "x"))
Output:
0 d0 ff 70 1 70 1 20 0 20 ff a0 0 20 1 90 0 f0 0 f0 0 20 1 50
Do you need the bytes that are ASCII to remain in ASCII format?
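As a rough sketch of the same idea, the snippet below shows that iterating over a bytes object gives plain integers, and that bytes.hex() gives a two-digit hex view of every byte. The name sample is only illustrative (it is the first six bytes of the stream above), and the separator argument to hex() assumes Python 3.8 or newer.

```python
# Illustrative sample: the first six bytes from the stream above.
sample = b'\x00\xd0\xffp\x01p'

# Iterating over a bytes object yields plain integers (0-255).
values = list(sample)

# bytes.hex() with a separator (Python 3.8+) pads every byte to two hex digits.
hex_view = sample.hex(' ')

print(values)
print(hex_view)
```

Unlike format(byte, "x"), this keeps the leading zero, so every byte takes the same width.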


RE: parsing question - ridgerunnersjw - Oct-09-2020

I have a sensor that is feeding me that datastream. It represents I and Q data (int16), FFT data (uint16) and some misc (uint16).
I notice that in your solution you are losing characters, e.g. \xffp (third one). That will be important during processing.

At the moment the only thing I am sure of is that it looks like the '\x' can be parsed out and then everything else matters.
How would I do this?


RE: parsing question - bowlofred - Oct-09-2020

That's not one character "\xffp", that's two bytes. The first is '\xff' (which I print as "ff") and the second is "p" (which I print as 70).

>>> b'\xff' + b'\x70'
b'\xffp'
>>> len(b'\xffp')
2
If the bytes in the bytestring have some meaning (like every pair represents something), then that would change how you parse it. All I've done is take all the bytes and display them with their hex code. Python's default display of a bytes object does something similar, but if the byte is a printable ASCII character, it shows that character instead of the hex escape.
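A small sketch of the point above, under the assumption that the confusion comes from the printed form: the backslash and the "x" appear only in the way Python displays the bytes, not in the data, so there is nothing to split on (and b'\x' on its own is not a valid literal, because \x must be followed by two hex digits). The name pair is only illustrative.

```python
# Build two bytes from their numeric values; 0x70 is the ASCII code for "p".
pair = bytes([0xff, 0x70])

# The repr shows printable ASCII bytes as characters and the rest as \xNN escapes.
shown = repr(pair)

# There is no backslash byte in the data itself; it exists only in the repr.
has_backslash = b'\\' in pair

print(shown, len(pair), has_backslash)
```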


RE: parsing question - Skaperen - Oct-10-2020

@ridgerunnersjw: I assume each uint16 arrives as 2 bytes. Is that in big-endian order or little-endian order?


RE: parsing question - ridgerunnersjw - Oct-10-2020

The datasheet only calls out uint16. I have to believe that it's WYSIWYG, so:

\xea --> 00ea
\xf4p --> f470

If I can get this parsed I will try to plot the data to see if it makes sense. If the endianness is wrong then I'll swap.


RE: parsing question - bowlofred - Oct-10-2020

Something like this then:

import struct

b = b'\x00\xd0\xffp\x01p\x01 \x00 \xff\xa0\x00 \x01\x90\x00\xf0\x00\xf0\x00 \x01P'
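# Number of complete 16-bit values, plus any leftover byte.
size, mod = divmod(len(b), 2)
if mod:
    print("Odd number of bytes seen.  Ignoring final byte")
    b = b[:-1]
# '>' means big-endian; 'H' is an unsigned 16-bit integer, repeated size times.
uint16_data = struct.unpack(f'>{size}H', b)
for uint in uint16_data:
    # Print each value as four zero-padded hex digits.
    print(f"{uint:04x}")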
You can change the > to < to switch from big-endian to little-endian byte order.

Output:
00d0
ff70
0170
0120
0020
ffa0
0020
0190
00f0
00f0
0020
0150
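Since the I and Q data were described as int16, here is a minimal sketch of decoding the same sample as signed 16-bit values in both byte orders, so the two can be compared or plotted. Which byte order the sensor actually uses is not known from the thread, and treating the whole sample as signed is only an assumption for illustration.

```python
import struct

# Same sample bytes as in the post above.
b = b'\x00\xd0\xffp\x01p\x01 \x00 \xff\xa0\x00 \x01\x90\x00\xf0\x00\xf0\x00 \x01P'
count = len(b) // 2  # number of complete 16-bit values

# 'h' is a signed 16-bit integer; '>' is big-endian, '<' is little-endian.
big = struct.unpack(f'>{count}h', b[:count * 2])
little = struct.unpack(f'<{count}h', b[:count * 2])

print(big[:4])
print(little[:4])
```

With the signed format, a pair like ff70 comes out as a small negative number in big-endian order rather than a large positive one, which may be a useful sanity check once the data is plotted.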



RE: parsing question - ridgerunnersjw - Oct-10-2020

Thank you.
I look forward to experimenting with this. It definitely looks promising.

Thanks again!