Python Forum

Full Version: how to convert a string to hex
what is the best way to convert a string to hexadecimal?

the purpose is to get the character codes to see what is being read in from a file.

i already have command line tools that display the file in hexadecimal.  i want to match that up with what the python script is doing.  something that is suitable for making such tools might work.  i tried binascii.hexlify() but it wanted a bytes-like object.  i have a string for now.

for example i have a misbehaving script.  i want to drop in a slim piece of code to have it print the string in hex.
str.encode gives you the bytes representation of the string. From that you can use the hex method to get the hex values:

>>> s = 'The quick brown fox jumps over the lazy dog.'.encode('utf-8')
>>> s
b'The quick brown fox jumps over the lazy dog.'
>>> s.hex()
'54686520717569636b2062726f776e20666f78206a756d7073206f76657220746865206c617a7920646f672e'
That's Python 3.x of course. String handling was one of the big changes between 2.x and 3.x.
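For a slim drop-in while debugging, something along these lines might do. This is only a sketch: hex_dump and code_points are made-up helper names, and UTF-8 is assumed as the encoding, so pass a different one if the file uses something else.

```python
def hex_dump(text, encoding="utf-8"):
    """Return the encoded bytes of `text` as space-separated hex pairs."""
    data = text.encode(encoding)
    return " ".join(f"{byte:02x}" for byte in data)


def code_points(text):
    """Return each character's Unicode code point in U+XXXX form."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


print(hex_dump("fox"))      # 66 6f 78
print(hex_dump("ñ"))        # c3 b1 (one character, two UTF-8 bytes)
print(code_points("ñ☂"))    # U+00F1 U+2602
```

hex_dump shows what the bytes would look like on disk in the chosen encoding, while code_points shows what Python actually holds in the str.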
Once you convert to bytes, the decode method becomes available (it does not exist on a string).
>>> s = 'The quick brown fox jumps over the lazy dog.'.encode('utf-8')
>>> s
b'The quick brown fox jumps over the lazy dog.'
>>> # Back to string
>>> s.decode('utf-8')
'The quick brown fox jumps over the lazy dog.'
You can also just call encode() and decode() without arguments; both default to UTF-8.
>>> s = 'The quick brown fox jumps over the lazy dog.'
>>> s = s.encode()
>>> s
b'The quick brown fox jumps over the lazy dog.'
>>> s.decode()
'The quick brown fox jumps over the lazy dog.'
(Jan-22-2017, 11:52 AM)snippsat Wrote: Once you convert to bytes, the decode method becomes available (it does not exist on a string).

so what do i get out of this (a string) that i didn't have before (a string).  so why not just keep a copy of the original string?

s = 'The quick brown fox jumps over the lazy dog.'
b = s.encode()
now i have both.  or is there something special about the string result of s.encode().decode() ???
(Jan-23-2017, 01:43 AM)Skaperen Wrote: now i have both.  or is there something special about the string result of s.encode().decode() ???
You can of course do it like this if you need a copy.
Both now have their own space in memory.
>>> s = 'The quick brown fox jumps over the lazy dog.'
>>> b = s.encode()
>>> id(s)
64508296
>>> id(b)
64396640
>>> b
b'The quick brown fox jumps over the lazy dog.'
>>> s
'The quick brown fox jumps over the lazy dog.'

>>> help(id)
Help on built-in function id in module builtins:

id(...)
   id(object) -> integer
   
   Return the identity of an object.  This is guaranteed to be unique among
   simultaneously existing objects.  (Hint: it's the object's memory address.)
So a little more on this Unicode stuff; this was a big change for Python 3.
You can just write Unicode in Python 3.x and it works.
In Python 3, all strings are sequences of Unicode characters.
# Python 3.6
>>> print('Spicy jalapeño ☂')
Spicy jalapeño ☂

# Python 2.7
>>> print('Spicy jalapeño ☂')
Spicy jalapeño ☂
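To make the "sequence of Unicode characters" point concrete, here is a small sketch (Python 3, UTF-8 assumed) comparing the length of a str with the length of its encoded bytes:

```python
s = "jalapeño ☂"
b = s.encode("utf-8")

print(len(s))    # 10 characters
print(len(b))    # 13 bytes: ñ takes 2 bytes and ☂ takes 3 in UTF-8
print(s[6])      # ñ (indexing a str gives a character)
print(b[6:8])    # b'\xc3\xb1' (the same character as raw bytes)
```

That difference is why a hex dump of the file will show more bytes than the string has characters whenever non-ASCII text is involved.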
When taking in data from outside into Python 3.x,
we need to give it an encoding for it to become a string.
So open() was given a new encoding parameter, for example encoding='utf-8'.
with open('some_file', encoding='utf-8') as f:
    print(f.read())
There is also an errors parameter for reading a file with malformed encoding, for example errors='ignore' or errors='replace'.
with open('some_file', encoding='utf-8', errors='ignore') as f:
    print(f.read())
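The same error handlers can be tried without a file by calling decode() on bytes directly. A rough sketch follows; the variable names are just for illustration, and errors='backslashreplace' on decoding needs Python 3.5 or later:

```python
raw = "jalapeño".encode("utf-8")    # b'jalape\xc3\xb1o'

# Default (strict) handling raises on the first non-ASCII byte.
strict_failed_at = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    strict_failed_at = exc.start    # index of the offending byte

ignored = raw.decode("ascii", errors="ignore")             # bad bytes dropped
replaced = raw.decode("ascii", errors="replace")           # bad bytes become U+FFFD
escaped = raw.decode("ascii", errors="backslashreplace")   # bad bytes shown as \xNN

print(strict_failed_at)    # 6
print(ignored)             # jalapeo
print(escaped)             # jalape\xc3\xb1o
```

For debugging, backslashreplace is probably the handiest of these, since it keeps the hex value of every byte that failed to decode.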
So save Spicy jalapeño ☂ to a file (as UTF-8) and read it back in:
with open('uni.txt', encoding='utf-8') as f:
    print(f.read()) # Spicy jalapeño ☂

# Read in as ASCII, but ignore any UnicodeDecodeError
with open('uni.txt', encoding='ascii', errors='ignore') as f:
    print(f.read()) # Spicy jalapeo

# Read in as ASCII, and replace undecodable bytes
with open('uni.txt', encoding='ascii', errors='replace') as f:
    print(f.read()) # Spicy jalape��o ���
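To match this up with a command-line hex dump, the file can also be opened in binary mode so that no decoding happens at all. A minimal sketch, using a temporary directory so it runs on its own (the uni.txt name is reused from above, everything else is illustrative):

```python
import os
import tempfile

text = "Spicy jalapeño ☂"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "uni.txt")

    # Save the text as UTF-8.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Read it back as raw bytes; "rb" mode skips decoding entirely.
    with open(path, "rb") as f:
        raw = f.read()

hex_pairs = " ".join(f"{byte:02x}" for byte in raw)
print(hex_pairs)
# 53 70 69 63 79 20 6a 61 6c 61 70 65 c3 b1 6f 20 e2 98 82
```

The c3 b1 pair is ñ and e2 98 82 is ☂, which are exactly the bytes that the ascii reads above had to drop or replace.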