Python Forum
string to hex and back again - Printable Version



string to hex and back again - Skaperen - Apr-27-2017

i want to convert a string to hexadecimal

'Skaperen' -> '536b61706572656e'
'Python' -> '507974686f6e'

and back again

'536b61706572656E' -> 'Skaperen'
'507974686f6E' -> 'Python'

what is the best and most Pythonic way to do this not involving single characters, bytes or bytearray in both Python2 and Python3 (does not have to be same code)?  i only need character values from 0 to 255 to work, as if i were using bytes but i am using strings and need everything to work with exactly 2 hexadecimal digits per character regardless of value.

i have converted strings to hexadecimal like this before (typing this from vague memory).  how Pythonic is this?

def str2hex(s):
    return ''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in s])

i've been trying to make a reverse of this but have not figured out how to write a for loop that takes 2 digits per cycle.

any bright ideas?
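One possible way to get the "2 digits per cycle" loop is to step an index through the hex string with range(0, len(x), 2) and slice two characters at a time. The sketch below is only an illustration: the names str_to_hex and hex_to_str are made up here, and it assumes every character has a value from 0 to 255 (anything higher would produce more than 2 digits).

```python
def str_to_hex(s):
    # Exactly two lowercase hex digits per character (assumes ord(c) <= 255).
    return ''.join('{:02x}'.format(ord(c)) for c in s)


def hex_to_str(x):
    # Step through the hex string two digits at a time.
    # int(..., 16) accepts upper and lower case digits alike.
    return ''.join(chr(int(x[i:i + 2], 16)) for i in range(0, len(x), 2))


print(str_to_hex('Skaperen'))      # 536b61706572656e
print(hex_to_str('507974686f6E'))  # Python
```

Both functions use only str, ord(), chr() and int(), so they should behave the same in Python 2 and Python 3 for values 0 to 255.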


RE: string to hex and back again - volcano63 - Apr-27-2017

I am not sure if that's what you are looking for, but array may be your friend
In [3]: array.array('b', b'Skaperen')
Out[3]: array('b', [83, 107, 97, 112, 101, 114, 101, 110])




RE: string to hex and back again - Skaperen - Apr-27-2017

(Apr-27-2017, 04:01 AM)volcano63 Wrote: I am not sure if that's what you are looking for, but array may be your friend
In [3]: array.array('b', b'Skaperen')
Out[3]: array('b', [83, 107, 97, 112, 101, 114, 101, 110])


but...... can it be used with non-byte stuff?


RE: string to hex and back again - volcano63 - Apr-27-2017

(Apr-27-2017, 04:34 AM)Skaperen Wrote: but...... can it be used with non-byte stuff?

In [7]: array.array('I', 'Skaperen'.encode())
Out[7]: array('I', [1885432659, 1852142181])

IMHO, struct and array are the tools for processing stream data - for example, if you want to parse a protocol packet. Otherwise, you have to invent bells and whistles...

Here's a little more involved example with struct - split string into unequal-size fields
In [15]: struct.unpack('2h2b2s', b'Skaperen')
Out[15]: (27475, 28769, 101, 114, b'en')

But you have to type-cast (for lack of a better word) strings to bytes for those to work
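A rough sketch of that "type-cast" step in Python 3, assuming the string only holds values from 0 to 255 so that the latin-1 codec can map each character to exactly one byte. The '<' prefix in the struct format is an addition here (it is not in the session above); it pins little-endian order so the numbers do not depend on the machine.

```python
import array
import struct

s = 'Skaperen'
raw = s.encode('latin-1')      # str -> bytes, one byte per character (0..255 only)

# 'B' is unsigned char, so values 128..255 stay positive (unlike 'b').
codes = array.array('B', raw)
print(codes.tolist())          # [83, 107, 97, 112, 101, 114, 101, 110]

# Two shorts, two signed bytes, then a 2-byte string, little-endian.
fields = struct.unpack('<2h2b2s', raw)
print(fields)                  # (27475, 28769, 101, 114, b'en')
```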


RE: string to hex and back again - wavic - Apr-27-2017

In [1]: s = 'Skaperen'

In [2]: s.encode()
Out[2]: b'Skaperen'

In [3]: s.encode().hex()
Out[3]: '536b61706572656e'

In [4]: bytes.fromhex('536b61706572656e').decode('utf-8')
Out[4]: 'Skaperen'



RE: string to hex and back again - Skaperen - Apr-27-2017

(Apr-27-2017, 07:10 AM)wavic Wrote:
In [1]: s = 'Skaperen'

In [2]: s.encode()
Out[2]: b'Skaperen'

In [3]: s.encode().hex()
Out[3]: '536b61706572656e'

In [4]: bytes.fromhex('536b61706572656e').decode('utf-8')
Out[4]: 'Skaperen'

i already know that for some values in a string, the .encode() method will either return different (the result of encoding) values, yielding a wrong hexadecimal, or will raise an exception.  i've run into this, already.


RE: string to hex and back again - wavic - Apr-27-2017

Some example?
How could the hexadecimal be wrong? UTF-8 (the default, since no encoding is specified in the encode() call) is just a bunch of numbers, and turning each of them into hexadecimal is routine.


RE: string to hex and back again - Skaperen - Apr-27-2017

(Apr-27-2017, 07:38 AM)wavic Wrote: Some example?
How could the hexadecimal be wrong? UTF-8 (the default, since no encoding is specified in the encode() call) is just a bunch of numbers, and turning each of them into hexadecimal is routine.

an example is anything beyond plain ASCII in Unicode.  the hexadecimal of the utf-8 is, by definition, different from the hexadecimal of its originating unicode code point.  do you have an example where they are the same, with the character value being 128 or higher?
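To make the difference concrete, here is a small Python 3 sketch (bytes.hex() needs 3.5 or later) using the character with value 233. It also shows latin-1, which is not mentioned above but maps values 0 to 255 to single bytes one-to-one, so its hex matches the character values.

```python
s = '\xe9'   # 'é', character value 233 (0xe9)

code_point_hex = '{:02x}'.format(ord(s))
utf8_hex = s.encode('utf-8').hex()
latin1_hex = s.encode('latin-1').hex()

print(code_point_hex)  # e9    the character value itself
print(utf8_hex)        # c3a9  UTF-8 needs two bytes for this character
print(latin1_hex)      # e9    latin-1 keeps one byte per character

# Round trip of all 256 values through latin-1 and hex.
every = ''.join(chr(v) for v in range(256))
back = bytes.fromhex(every.encode('latin-1').hex()).decode('latin-1')
print(back == every)   # True
```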

some pending (a work in progress) code i have put together to do lots of type conversion without any encoding:

from __future__ import print_function
import sys
ver = sys.version_info
out = sys.stdout.flush

def bytearray_to_bytes(a):
    return bytes(a)

def bytearray_to_hex(a):
    return ''.join([('0'+hex(c).split('x')[1])[-2:] for c in a])

def bytearray_to_str(b):
    return ''.join([chr(c) for c in b])

def bytes_to_bytearray(b):
    # same call works in both Python 2 and Python 3
    return bytearray(b)

def bytes_to_hex(b):
    if ver[0] < 3:
        return ''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in b])
    else:
        return ''.join([('0'+hex(c).split('x')[1])[-2:] for c in b])

def bytes_to_str(b):
    if ver[0] < 3:
        return ''.join([chr(ord(c)) for c in b])
    else:
        return ''.join([chr(c) for c in b])

def hex_to_bytearray(x):
    if ver[0] < 3:
        return bytearray(''.join([chr(int(x[i:i+2],16)) for i in range(0,len(x),2)]))
    else:
        return bytearray.fromhex(x)

def hex_to_bytes(x):
    if ver[0] < 3:
        return bytes(''.join([chr(int(x[i:i+2],16)) for i in range(0,len(x),2)]))
    else:
        return bytes.fromhex(x)

def hex_to_str(x):
    return ''.join([chr(int(x[i:i+2],16)) for i in range(0,len(x),2)])

def str_to_bytes(b):
    if ver[0] < 3:
        return bytes(b)
    else:
        return bytes.fromhex(''.join([hex(ord(c)).replace('x','0')[-2:] for c in b]))

def str_to_bytearray(b):
    if ver[0] < 3:
        return bytearray(b)
    else:
        # pad to 2 digits so values below 16 do not break fromhex()
        return bytearray.fromhex(''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in b]))

def str_to_hex(s):
    return ''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in s])

def test_byteconv(args):

    u = [v for v in range(256)]
    u = [100]+u
    
    for v in u:
        print(repr(v))
        out()
    
        aa = bytearray([v]*80)
        bb = bytes(aa)
        ss = chr(v)*80
    
        xx = bytearray_to_hex(aa)
        if xx != bytes_to_hex(bb):
            print('bytes_to_hex(',repr(v),') failed')
            out()
        if xx != str_to_hex(ss):
            print('str_to_hex(',repr(v),') failed')
            out()
    
        print('aa',repr(bytearray_to_hex(aa)))
        out()
        print('bb',repr(bytes_to_hex(bb)))
        out()
        print('ss',repr(str_to_hex(ss)))
        out()
    
        print('end',repr(v))
        out()
    
    print('end')
    out()
    
    with open('/dev/urandom','rb') as f:
        for n in range(200):
            bb=f.read(40)
            xx=bytes_to_hex(bb)
            print(xx)
            out()
            aa=hex_to_bytearray(xx)
            ss=hex_to_str(xx)

            if xx != str_to_hex(ss):
                print('str_to_hex() failed')
                out()

            if bb !=bytearray_to_bytes(aa):
                print('bytearray_to_bytes() failed')
                out()

            if xx != bytearray_to_hex(aa):
                print('bytearray_to_hex() failed')
                out()

            if ss != bytearray_to_str(aa):
                print('bytearray_to_str() failed')
                out()

            if aa != bytes_to_bytearray(bb):
                print('bytes_to_bytearray() failed')
                out()

            if xx != bytes_to_hex(bb):
                print('bytes_to_hex() failed')
                out()
                return 1  # fail() was never defined; stop with a failure status instead
                
    print('end')
    out()

    return 0


def main(args):
    return test_byteconv(args)


if __name__ == '__main__':
    try:
        result=main(sys.argv)
        sys.stdout.flush()
    except BrokenPipeError:
        result=99
    except KeyboardInterrupt:
        print('')
        result=98
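    # compare ints by value; "is" on int literals relies on CPython caching
    if result is None or result is True or (result is not False and result == 0):
        exit(0)
    if result is False or (result is not True and result == 1):
        exit(1)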
    if isinstance(result,str):
        print(result,file=sys.stderr)
        exit(2)
    try:
        exit(int(result))
    except ValueError:
        print(str(result),file=sys.stderr)
        exit(3)
    except TypeError:
        exit(4)
# EOF
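For the bytes and bytearray cases only, the standard library's binascii module may cover much of the above in both Python 2 and Python 3. This is a sketch, not a replacement for the code above: the names to_hex and from_hex are made up for illustration, and it does not handle the plain-str case.

```python
import binascii


def to_hex(data):
    # data: bytes or bytearray; hexlify always emits 2 digits per byte.
    return binascii.hexlify(data).decode('ascii')


def from_hex(x):
    # Accepts upper or lower case hex digits; returns bytes.
    return binascii.unhexlify(x)


print(to_hex(b'Skaperen'))                 # 536b61706572656e
print(from_hex('507974686f6E'))            # b'Python' on Python 3
print(to_hex(bytearray([0, 15, 255])))     # 000fff
```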



RE: string to hex and back again - Ofnuts - Apr-27-2017

In slow-mo with plenty of print(...). Works in both v2 and V3 (assume starting "string" is a unicode for v2):

# -*- coding: utf-8 -*-
import codecs

x=u'Déjà'
bx=codecs.encode(x,'utf-8')
print("bx:",bx)
hx=codecs.encode(bx,'hex')
print("hx:",hx)

bx2=codecs.decode(hx,'hex')
print("bx2:",bx2)
x2=codecs.decode(bx2,'utf-8')
print("x2:",x2)
print(x2,type(x2))



RE: string to hex and back again - wavic - Apr-28-2017

He gets the values from /dev/urandom, so they are not always in a range that can be decoded and printed.

But you can use 'replace', 'backslashreplace' or 'ignore' as the second (errors) argument to decode: https://docs.python.org/3.5/howto/unicode.html
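A short Python 3 sketch of what those error handlers do with bytes that are not valid UTF-8 ('backslashreplace' works for decoding from Python 3.5 on). The sample bytes are made up for illustration. The latin-1 line is an extra for comparison: that codec accepts every byte value, so it never raises.

```python
raw = b'caf\xe9 \xff'   # \xe9 and \xff are not valid UTF-8 here

replaced = raw.decode('utf-8', 'replace')           # bad bytes become U+FFFD
ignored = raw.decode('utf-8', 'ignore')             # bad bytes are dropped
escaped = raw.decode('utf-8', 'backslashreplace')   # bad bytes become \xNN text
latin1 = raw.decode('latin-1')                      # every byte maps to one character

print(repr(replaced))  # 'caf\ufffd \ufffd'
print(repr(ignored))   # 'caf '
print(repr(escaped))   # 'caf\\xe9 \\xff'
print(repr(latin1))    # 'café ÿ'
```

Only the latin-1 result can be turned back into the original bytes; the other three lose or rewrite information.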