Python Forum

Full Version: Decoding a serial stream
Hi,
I am trying to read a UART serial stream, and the data comes in as hex (I think). I’m not actually sure.

Below is a short snippet of what I am getting.

b'\t\x9c\x1d\xc8LX\t\xfdXh\t\xbf\x1e\x8a^X{|\xbe\x85'

I’m pretty sure the “\” is a delimiter. And I believe \n means new line and \t means tab, so those are ASCII. I am also guessing that anything starting with \x is a hex value. But that’s where I get lost. Hex \x9c is convertible. But what is \xfdXh?

If anyone can help, or knows of a Python command to parse this, thanks in advance.

Andy
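b'\t\x9c\x1d\xc8LX\t\xfdXh\t\xbf\x1e\x8a^X{|\xbe\x85' is a bytes object (the b prefix marks a bytes literal). When you print a bytes object, Python shows each byte as its ASCII character when the byte is printable, and as an escape otherwise: \t for tab, or \x followed by two hex digits. So \xfdXh is three bytes: 0xfd, then "X" (0x58), then "h" (0x68). Seeing \x escapes means the data contains bytes outside printable ASCII, so it is most likely binary numeric data or text in some encoding other than ASCII.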
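
To make the escape notation concrete, here is a minimal sketch that splits the \xfdXh fragment from the question into its individual bytes. The variable names are only for illustration.

```python
# The fragment the question asks about: one escaped byte followed by two
# printable ASCII characters, which makes three separate bytes.
fragment = b'\xfdXh'

values = list(fragment)                # iterating bytes yields integers 0-255
as_hex = [f'{v:02x}' for v in values]  # the same bytes as two-digit hex

print(len(fragment))  # 3
print(values)         # [253, 88, 104]
print(as_hex)         # ['fd', '58', '68']
```

The hex digits match the fd5868 run in the hexlify output for the full sample.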

If you want to see the bytes as hex values you can do this:
import binascii

x = b'\t\x9c\x1d\xc8LX\t\xfdXh\t\xbf\x1e\x8a^X{|\xbe\x85'
print(binascii.hexlify(x))
Output:
b'099c1dc84c5809fd586809bf1e8a5e587b7cbe85'
If you want to convert it to a Unicode string use .decode().
In reply to deanhystad (Mar-20-2021, 03:31 PM):
Thank you for that explanation.

I have used your code as a test. And I have added the decode. The decoded output simply removes the leading "b" and the single quote marks from either end. It didn't really convert it. Can you please explain how to convert this string into readable characters?

============== Code Start ==========================
import binascii

x = b'\t\x9c\x1d\xc8LX\t\xfdXh\t\xbf\x1e\x8a^X{|\xbe\x85'
print(binascii.hexlify(x))
q=binascii.hexlify(x)
print(q.decode())

=========== Code End ===============================

============== Output to Terminal ===============
b'099c1dc84c5809fd586809bf1e8a5e587b7cbe85'
099c1dc84c5809fd586809bf1e8a5e587b7cbe85
You do not decode the output of hexlify; you decode the original bytes object. You also need to know which encoding was used, because .decode() defaults to UTF-8.
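
As a rough sketch of what decoding the original bytes looks like, the snippet below tries the sample from this thread with a few codec settings. It only shows the mechanics; a decode that succeeds does not prove the device is sending text.

```python
x = b'\t\x9c\x1d\xc8LX\t\xfdXh\t\xbf\x1e\x8a^X{|\xbe\x85'

# UTF-8 is the default codec for .decode(). This sample is not valid UTF-8
# (0x9c cannot start a character), so a strict decode raises.
try:
    text = x.decode()
except UnicodeDecodeError as exc:
    text = None
    print('not valid UTF-8:', exc.reason)

# Latin-1 maps every byte value to a code point, so it never fails.
# That also means it cannot tell you whether the data really is text.
latin1 = x.decode('latin-1')
print(len(latin1))  # 20

# errors='replace' substitutes U+FFFD for undecodable bytes instead of raising.
lossy = x.decode('utf-8', errors='replace')
print('\ufffd' in lossy)  # True
```

If the strict UTF-8 decode fails and the Latin-1 result looks like noise, the stream is probably binary rather than text.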

What you really need is knowledge of what was sent. Is it 20 bytes? Five 32-bit numbers? A Unicode string, and if so, what encoding was used?
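
To illustrate how much the answer depends on the layout, here is a sketch that unpacks the same 20 bytes under a few candidate formats with the standard struct module. The layouts are guesses for demonstration only; nothing here is known about the actual device.

```python
import struct

x = b'\t\x9c\x1d\xc8LX\t\xfdXh\t\xbf\x1e\x8a^X{|\xbe\x85'
print(len(x))  # 20

# Candidate layouts (assumptions, not the device's real format).
# '<' = little-endian, '>' = big-endian,
# 'I' = unsigned 32-bit, 'H' = unsigned 16-bit, 'B' = unsigned 8-bit.
layouts = {
    'five uint32, little-endian': '<5I',
    'five uint32, big-endian': '>5I',
    'ten uint16, little-endian': '<10H',
    'twenty uint8': '<20B',
}

results = {}
for name, fmt in layouts.items():
    # Every candidate must consume exactly the bytes we have.
    assert struct.calcsize(fmt) == len(x)
    results[name] = struct.unpack(fmt, x)
    print(name, results[name])
```

Each layout produces plausible-looking numbers, which is why a datasheet or a known reference reading is needed to pick the right one.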
Ha! What is being sent is what I'm trying to figure out. Unfortunately I don't have a clue. It's a solar charger controller with a uart output. I'm trying to decode the stream so I can determine what is being sent.

Any ideas?
Nope. You need to find a datasheet, a library where someone has already decoded it, or you need to have some mad reconstruction skills. The values may simply be numerical values, but you don't know the size, the range, or the format.
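
If you do end up reconstructing the format by hand, one common first step is to look for byte values that repeat, since they may be frame markers or fields that change slowly. A minimal sketch on the sample from this thread follows; 20 bytes is far too little to conclude anything, so treat the result as a hint for what to look at in a longer capture.

```python
from collections import defaultdict

x = b'\t\x9c\x1d\xc8LX\t\xfdXh\t\xbf\x1e\x8a^X{|\xbe\x85'

# Record every offset at which each byte value occurs.
offsets = defaultdict(list)
for index, value in enumerate(x):
    offsets[value].append(index)

# Keep only the values that occur more than once.
repeated = {value: pos for value, pos in offsets.items() if len(pos) > 1}

for value, pos in sorted(repeated.items()):
    print(f'0x{value:02x} at offsets {pos}')
# 0x09 at offsets [0, 6, 10]
# 0x58 at offsets [5, 8, 15]
```

Comparing captures taken while a known quantity changes (for example, battery voltage) is usually more informative than staring at a single sample.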

Sometimes you get lucky and the data is a simple ASCII or UTF-8 text representation. But most of the time, for speed, it's a binary-formatted value. The next best thing is to identify the chipset and see whether someone has already written a driver or decoder module for it. Even if it's not in Python, translating a working driver is much easier than sussing it out blindly.
It could be something hideous like the way Transducer Electronic Data Sheet (TEDS) information is encoded. Those have a header that tells you what template is used to decode the remaining information. Nothing ends on a natural boundary. The template not only specifies format but also scaling and units information.
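
To show what "nothing ends on a natural boundary" means in practice, here is a small sketch that pulls fields of arbitrary bit width out of a byte string. The helper read_bits, the MSB-first bit order, and the 3-bit/7-bit layout are all hypothetical and are not the actual TEDS format.

```python
def read_bits(data: bytes, bit_offset: int, bit_count: int) -> int:
    """Return bit_count bits starting at bit_offset, counting MSB-first.

    The bit order is an assumption for this sketch; real formats differ.
    """
    if bit_offset < 0 or bit_count <= 0:
        raise ValueError('offset must be >= 0 and count must be > 0')
    total_bits = len(data) * 8
    shift = total_bits - bit_offset - bit_count
    if shift < 0:
        raise ValueError('field lies outside the data')
    as_int = int.from_bytes(data, 'big')
    return (as_int >> shift) & ((1 << bit_count) - 1)


# Hypothetical packed record: a 3-bit template id followed by a 7-bit value.
# Bits: 101 1001101 000000 (the trailing six bits are unused padding).
packed = bytes([0b10110011, 0b01000000])

template_id = read_bits(packed, 0, 3)
value = read_bits(packed, 3, 7)
print(template_id, value)  # 5 77
```

Note that the 7-bit value straddles the boundary between the two bytes, which is exactly why such formats cannot be decoded one byte at a time.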