string to hex and back again
Python Forum (https://python-forum.io), General Coding Help, Thread: string to hex and back again (/thread-3058.html)

string to hex and back again - Skaperen - Apr-27-2017

I want to convert a string to hexadecimal:

    'Skaperen' -> '536b61706572656e'
    'Python' -> '507974686f6e'

and back again:

    '536b61706572656E' -> 'Skaperen'
    '507974686f6E' -> 'Python'

What is the best and most Pythonic way to do this, not involving single characters, bytes or bytearray, in both Python 2 and Python 3 (it does not have to be the same code)? I only need character values from 0 to 255 to work, as if I were using bytes, but I am using strings and need everything to work with exactly 2 hexadecimal digits per character regardless of value.

I have converted strings to hexadecimal like this before (typing this from vague memory). How Pythonic is this?

    def str2hex(s):
        return ''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in s])

I've been trying to make a reverse of this but have not figured out how to do a for loop of 2 digits per cycle. Any bright ideas?

RE: string to hex and back again - volcano63 - Apr-27-2017

I am not sure if that's what you are looking for, but array may be your friend:

    In [3]: array.array('b', b'Skaperen')
    Out[3]: array('b', [83, 107, 97, 112, 101, 114, 101, 110])

RE: string to hex and back again - Skaperen - Apr-27-2017

(Apr-27-2017, 04:01 AM)volcano63 Wrote: I am not sure if that's what you are looking for, but array may be your friend

But... can it be used with non-byte stuff?

RE: string to hex and back again - volcano63 - Apr-27-2017

(Apr-27-2017, 04:34 AM)Skaperen Wrote: But... can it be used with non-byte stuff?

    In [7]: array.array('I', 'Skaperen'.encode())
    Out[7]: array('I', [1885432659, 1852142181])

IMHO, struct and array are the tools to process stream data, for example when you want to parse a protocol packet. Otherwise, you have to invent bells and whistles...

Here's a slightly more involved example with struct, splitting a string into unequal-size fields:

    In [15]: struct.unpack('2h2b2s', b'Skaperen')
    Out[15]: (27475, 28769, 101, 114, b'en')

But you have to type-cast (for lack of a better word) strings for those to work.

RE: string to hex and back again - wavic - Apr-27-2017

    In [1]: s = 'Skaperen'
    In [2]: s.encode()
    Out[2]: b'Skaperen'
    In [3]: s.encode().hex()
    Out[3]: '536b61706572656e'
    In [4]: bytes.fromhex('536b61706572656e').decode('utf-8')
    Out[4]: 'Skaperen'

RE: string to hex and back again - Skaperen - Apr-27-2017

(Apr-27-2017, 07:10 AM)wavic Wrote:

    In [1]: s = 'Skaperen'
    In [2]: s.encode()
    Out[2]: b'Skaperen'
    In [3]: s.encode().hex()
    Out[3]: '536b61706572656e'
    In [4]: bytes.fromhex('536b61706572656e').decode('utf-8')
    Out[4]: 'Skaperen'

I already know that for some values in a string, the .encode() method will either return different values (the result of encoding), yielding a wrong hexadecimal, or will raise an exception. I've run into this already.

RE: string to hex and back again - wavic - Apr-27-2017

Can you give an example? How could the hexadecimal be wrong? UTF-8 (the default, since no encoding is specified in the encode() call) is just a bunch of numbers, and turning each of them into hexadecimal is routine.

RE: string to hex and back again - Skaperen - Apr-27-2017

(Apr-27-2017, 07:38 AM)wavic Wrote: Some example?

An example is anything beyond plain ASCII in Unicode. The hexadecimal of the UTF-8 is, by definition, different than its originating Unicode. Do you have an example where they are the same, with the character value being 128 or higher?

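The difference described here can be seen with a single character. The sketch below is not from the thread: it takes 'é' (code point 233) and compares the hex of the code point itself with the hex of its UTF-8 bytes. The helper name codepoint_hex, the use of binascii.hexlify, and the latin-1 codec are assumptions for illustration; latin-1 is brought in only because it maps code points 0 to 255 one-to-one onto single bytes.

```python
import binascii

def codepoint_hex(s):
    """Two hex digits per character, taken from the code point (0-255 only)."""
    return ''.join('{:02x}'.format(ord(c)) for c in s)

s = u'\u00e9'  # 'é', code point 233 (0xe9)

as_codepoints = codepoint_hex(s)
as_utf8 = binascii.hexlify(s.encode('utf-8')).decode('ascii')
as_latin1 = binascii.hexlify(s.encode('latin-1')).decode('ascii')

print(as_codepoints)  # e9
print(as_utf8)        # c3a9 (two bytes, so four hex digits)
print(as_latin1)      # e9
```

For plain ASCII the three results are identical, which is why the encode() approach looks correct on 'Skaperen' and only diverges from 128 upward.
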
Some pending (work in progress) code I have put together to do lots of type conversion without any encoding:

    from __future__ import print_function
    import sys

    ver = sys.version_info
    out = sys.stdout.flush

    try:
        BrokenPipeError
    except NameError:
        BrokenPipeError = IOError  # Python 2 has no BrokenPipeError

    def bytearray_to_bytes(a):
        return bytes(a)

    def bytearray_to_hex(a):
        return ''.join([('0'+hex(c).split('x')[1])[-2:] for c in a])

    def bytearray_to_str(b):
        return ''.join([chr(c) for c in b])

    def bytes_to_bytearray(b):
        # bytearray() accepts bytes in both Python 2 and Python 3
        return bytearray(b)

    def bytes_to_hex(b):
        if ver[0] < 3:
            return ''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in b])
        else:
            return ''.join([('0'+hex(c).split('x')[1])[-2:] for c in b])

    def bytes_to_str(b):
        if ver[0] < 3:
            return ''.join([chr(ord(c)) for c in b])
        else:
            return ''.join([chr(c) for c in b])

    def hex_to_bytearray(x):
        if ver[0] < 3:
            return bytearray(''.join([chr(int(x[i:i+2],16)) for i in range(0,len(x),2)]))
        else:
            return bytearray.fromhex(x)

    def hex_to_bytes(x):
        if ver[0] < 3:
            return bytes(''.join([chr(int(x[i:i+2],16)) for i in range(0,len(x),2)]))
        else:
            return bytes.fromhex(x)

    def hex_to_str(x):
        # step through the hex string 2 digits at a time
        return ''.join([chr(int(x[i:i+2],16)) for i in range(0,len(x),2)])

    def str_to_bytes(b):
        if ver[0] < 3:
            return bytes(b)
        else:
            return bytes.fromhex(''.join([hex(ord(c)).replace('x','0')[-2:] for c in b]))

    def str_to_bytearray(b):
        if ver[0] < 3:
            return bytearray(b)
        else:
            # pad to 2 digits so values below 16 do not break fromhex()
            return bytearray.fromhex(''.join([('0'+hex(ord(c))[2:])[-2:] for c in b]))

    def str_to_hex(s):
        return ''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in s])

    def test_byteconv(args):
        u = [v for v in range(256)]
        u = [100]+u
        for v in u:
            print(repr(v))
            out()
            aa = bytearray([v]*80)
            bb = bytes(aa)
            ss = chr(v)*80
            xx = bytearray_to_hex(aa)
            if xx != bytes_to_hex(bb):
                print('bytes_to_hex(',repr(v),') failed')
                out()
            if xx != str_to_hex(ss):
                print('str_to_hex(',repr(v),') failed')
                out()
            print('aa',repr(bytearray_to_hex(aa)))
            out()
            print('bb',repr(bytes_to_hex(bb)))
            out()
            print('ss',repr(str_to_hex(ss)))
            out()
            print('end',repr(v))
            out()
        print('end')
        out()
        with open('/dev/urandom','rb') as f:
            for n in range(200):
                bb=f.read(40)
                xx=bytes_to_hex(bb)
                print(xx)
                out()
                aa=hex_to_bytearray(xx)
                ss=hex_to_str(xx)
                if xx != str_to_hex(ss):
                    print('str_to_hex() failed')
                    out()
                if bb != bytearray_to_bytes(aa):
                    print('bytearray_to_bytes() failed')
                    out()
                if xx != bytearray_to_hex(aa):
                    print('bytearray_to_hex() failed')
                    out()
                if ss != bytearray_to_str(aa):
                    print('bytearray_to_str() failed')
                    out()
                if aa != bytes_to_bytearray(bb):
                    print('bytes_to_bytearray() failed')
                    out()
                if xx != bytes_to_hex(bb):
                    print('bytes_to_hex() failed')
                    out()
                    return 1  # the post called an undefined fail() here
        print('end')
        out()
        return 0

    def main(args):
        return test_byteconv(args)

    if __name__ == '__main__':
        try:
            result=main(sys.argv)
            sys.stdout.flush()
        except BrokenPipeError:
            result=99
        except KeyboardInterrupt:
            print('')
            result=98
        # None and True mean success, False means failure;
        # integer results (including 0 and 1) fall through to exit(int(result))
        if result is None or result is True:
            exit(0)
        if result is False:
            exit(1)
        if isinstance(result,str):
            print(result,file=sys.stderr)
            exit(2)
        try:
            exit(int(result))
        except ValueError:
            print(str(result),file=sys.stderr)
            exit(3)
        except TypeError:
            exit(4)
    # EOF

RE: string to hex and back again - Ofnuts - Apr-27-2017

In slow motion, with plenty of print(...) calls. Works in both v2 and v3 (assuming the starting "string" is a unicode in v2):

    # -*- coding: utf-8 -*-
    import codecs

    x=u'Déjà'
    bx=codecs.encode(x,'utf-8')
    print("bx:",bx)
    hx=codecs.encode(bx,'hex')
    print("hx:",hx)
    bx2=codecs.decode(hx,'hex')
    print("bx2:",bx2)
    x2=codecs.decode(bx2,'utf-8')
    print("x2:",x2)
    print(x2,type(x2))

RE: string to hex and back again - wavic - Apr-28-2017

He gets the values from /dev/urandom, so they are not always in the necessary range to print them. But he can use 'replace', 'backslashreplace' or 'ignore' as a second argument to decode: https://docs.python.org/3.5/howto/unicode.html
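
The error handlers named in the last reply can be tried on a short byte string. This is a minimal sketch, assuming Python 3.5 or later (where 'backslashreplace' also works for decoding); the sample bytes and variable names are made up for illustration and do not come from the thread.

```python
raw = b'Sk\xffap'  # 0xff can never appear in valid UTF-8

# The default handler, 'strict', raises on the bad byte.
try:
    raw.decode('utf-8')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

replaced = raw.decode('utf-8', 'replace')          # 'Sk\ufffdap'
ignored = raw.decode('utf-8', 'ignore')            # 'Skap'
escaped = raw.decode('utf-8', 'backslashreplace')  # 'Sk\\xffap'
```

Note that 'replace' and 'ignore' discard the original byte value, so a string decoded that way cannot be turned back into the same hexadecimal.
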