Python Forum
string to hex and back again
#1
i want to convert a string to hexadecimal

'Skaperen' -> '536b61706572656e'
'Python' -> '507974686f6e'

and back again

'536b61706572656E' -> 'Skaperen'
'507974686f6E' -> 'Python'

what is the best and most Pythonic way to do this not involving single characters, bytes or bytearray in both Python2 and Python3 (does not have to be same code)?  i only need character values from 0 to 255 to work, as if i were using bytes but i am using strings and need everything to work with exactly 2 hexadecimal digits per character regardless of value.

i have converted strings to hexadecimal like this before (typing this from vague memory).  how Pythonic is this?

def str2hex(s):
    return ''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in s])
i've been trying to make a reverse of this but have not figured out how to do a for loop of 2 digits per cycle.

any bright ideas?
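One way to loop over 2 digits per cycle is range() with a step of 2. This is only a minimal sketch, assuming character values stay within 0 to 255; the name hex2str and the '%02x' formatting are illustrative choices, not something taken from the post above.

```python
def str2hex(s):
    # '%02x' always yields exactly two lowercase hex digits for values 0..255
    return ''.join('%02x' % ord(c) for c in s)

def hex2str(x):
    # step through the string two digits at a time; int(..., 16) accepts either case
    return ''.join(chr(int(x[i:i+2], 16)) for i in range(0, len(x), 2))

print(str2hex('Skaperen'))      # 536b61706572656e
print(hex2str('507974686f6E'))  # Python
```

Neither function goes through bytes or bytearray, so the same code should behave the same way on Python 2 and Python 3 for this value range.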
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
I am not sure if that's what you are looking for, but array may be your friend
In [3]: array.array('b', b'Skaperen')
Out[3]: array('b', [83, 107, 97, 112, 101, 114, 101, 110])

In [4]:
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
#3
(Apr-27-2017, 04:01 AM)volcano63 Wrote: I am not sure if that's what you are looking for, but array may be your friend
In [3]: array.array('b', b'Skaperen')
Out[3]: array('b', [83, 107, 97, 112, 101, 114, 101, 110])

In [4]:

but...... can it be used with non-byte stuff?
#4
(Apr-27-2017, 04:34 AM)Skaperen Wrote: but...... can it be used with non-byte stuff?

In [7]: array.array('I', 'Skaperen'.encode())
Out[7]: array('I', [1885432659, 1852142181])

In [8]:
IMHO, struct and array are the tools for processing stream data, e.g. if you want to parse a protocol packet. Otherwise, you have to invent your own bells and whistles...

Here's a slightly more involved example with struct: splitting a string into unequal-size fields
In [15]: struct.unpack('2h2b2s', b'Skaperen')
Out[15]: (27475, 28769, 101, 114, b'en')

In [16]:
But you have to type-cast (for lack of a better word) strings to bytes for those to work
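The struct example above can be made byte-order independent and reversed with pack(). A small sketch for Python 3, assuming little-endian fields (the '<' prefix is an addition here, chosen because it reproduces the values shown above) and latin-1 as the "type-cast" from str to bytes:

```python
import struct

packet = 'Skaperen'.encode('latin-1')   # str -> bytes, one byte per character

# '<' pins little-endian with no padding, so the result does not depend on the machine
fmt = '<2h2b2s'
fields = struct.unpack(fmt, packet)
print(struct.calcsize(fmt))   # 8
print(fields)                 # (27475, 28769, 101, 114, b'en')

# pack() is the inverse of unpack() and gives back the original bytes
restored = struct.pack(fmt, *fields)
print(restored == packet)     # True
```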
#5
In [1]: s = 'Skaperen'

In [2]: s.encode()
Out[2]: b'Skaperen'

In [3]: s.encode().hex()
Out[3]: '536b61706572656e'

In [4]: bytes.fromhex('536b61706572656e').decode('utf-8')
Out[4]: 'Skaperen'
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
#6
(Apr-27-2017, 07:10 AM)wavic Wrote:
In [1]: s = 'Skaperen'

In [2]: s.encode()
Out[2]: b'Skaperen'

In [3]: s.encode().hex()
Out[3]: '536b61706572656e'

In [4]: bytes.fromhex('536b61706572656e').decode('utf-8')
Out[4]: 'Skaperen'

i already know that for some character values in a string, the .encode() method will either return different byte values (the result of the encoding), yielding the wrong hexadecimal, or raise an exception.  i've run into this already.
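The two failure modes described here can be reproduced in a few lines. A sketch for Python 3 (bytes.hex() needs 3.5 or later); the sample string is made up for illustration:

```python
s = 'caf\xe9'   # the last character has value 0xe9 (233)

# UTF-8, the default for encode(), turns that one character into two bytes
utf8_hex = s.encode().hex()
print(utf8_hex)     # 636166c3a9

# the plain character values give exactly two digits per character
value_hex = ''.join('%02x' % ord(c) for c in s)
print(value_hex)    # 636166e9

# a stricter codec raises instead of changing the values
try:
    s.encode('ascii')
    raised = False
except UnicodeEncodeError as err:
    raised = True
    print('encode failed:', err.reason)
```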
#7
Some example?
How could the hexadecimal be wrong? utf-8 (the default when no encoding is passed to encode()) is just a bunch of numbers, and turning each of them into hexadecimal is routine.
#8
(Apr-27-2017, 07:38 AM)wavic Wrote: Some example?
How could the hexadecimal be wrong? utf-8 (the default when no encoding is passed to encode()) is just a bunch of numbers, and turning each of them into hexadecimal is routine.

an example is anything beyond plain ASCII in Unicode.  the hexadecimal of the utf-8 is, by definition, different than its originating unicode.  do you have an example where they are the same, with the character value being 128 or higher?
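For character values 128 to 255 there is one codec where the bytes and the character values do coincide: latin-1 maps code points 0..255 to the identical byte values. A hedged sketch for Python 3.5+; it does go through bytes internally, and the function names are only illustrative:

```python
def str_to_hex(s):
    # latin-1 maps code points 0..255 to the same byte values, so nothing is altered
    return s.encode('latin-1').hex()

def hex_to_str(x):
    # bytes.fromhex() accepts upper- and lowercase digits
    return bytes.fromhex(x).decode('latin-1')

every = ''.join(chr(v) for v in range(256))
hexed = str_to_hex(every)
print(len(hexed) == 2 * len(every))   # True
print(hex_to_str(hexed) == every)     # True
print(str_to_hex('\x80\xff'))         # 80ff
```

Any character above 255 makes encode('latin-1') raise UnicodeEncodeError, which fits the "0 to 255 only" requirement rather than silently producing different digits.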

some pending (a work in progress) code i have put together to do lots of type conversion without any encoding:

from __future__ import print_function
import sys
ver = sys.version_info
out = sys.stdout.flush

def bytearray_to_bytes(a):
    return bytes(a)

def bytearray_to_hex(a):
    return ''.join([('0'+hex(c).split('x')[1])[-2:] for c in a])

def bytearray_to_str(b):
    return ''.join([chr(c) for c in b])

def bytes_to_bytearray(b):
    # the same call works on Python 2 and Python 3
    return bytearray(b)

def bytes_to_hex(b):
    if ver[0] < 3:
        return ''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in b])
    else:
        return ''.join([('0'+hex(c).split('x')[1])[-2:] for c in b])

def bytes_to_str(b):
    if ver[0] < 3:
        return ''.join([chr(ord(c)) for c in b])
    else:
        return ''.join([chr(c) for c in b])

def hex_to_bytearray(x):
    if ver[0] < 3:
        return bytearray(''.join([chr(int(x[i:i+2],16)) for i in range(0,len(x),2)]))
    else:
        return bytearray.fromhex(x)

def hex_to_bytes(x):
    if ver[0] < 3:
        return bytes(''.join([chr(int(x[i:i+2],16)) for i in range(0,len(x),2)]))
    else:
        return bytes.fromhex(x)

def hex_to_str(x):
    return ''.join([chr(int(x[i:i+2],16)) for i in range(0,len(x),2)])

def str_to_bytes(b):
    if ver[0] < 3:
        return bytes(b)
    else:
        return bytes.fromhex(''.join([hex(ord(c)).replace('x','0')[-2:] for c in b]))

def str_to_bytearray(b):
    if ver[0] < 3:
        return bytearray(b)
    else:
        # '%02x' pads values below 16 so every character yields two digits
        return bytearray.fromhex(''.join(['%02x' % ord(c) for c in b]))

def str_to_hex(s):
    return ''.join([('0'+hex(ord(c)).split('x')[1])[-2:] for c in s])

def test_byteconv(args):

    u = [v for v in range(256)]
    u = [100]+u
    
    for v in u:
        print(repr(v))
        out()
    
        aa = bytearray([v]*80)
        bb = bytes(aa)
        ss = chr(v)*80
    
        xx = bytearray_to_hex(aa)
        if xx != bytes_to_hex(bb):
            print('bytes_to_hex(',repr(v),') failed')
            out()
        if xx != str_to_hex(ss):
            print('str_to_hex(',repr(v),') failed')
            out()
    
        print('aa',repr(bytearray_to_hex(aa)))
        out()
        print('bb',repr(bytes_to_hex(bb)))
        out()
        print('ss',repr(str_to_hex(ss)))
        out()
    
        print('end',repr(v))
        out()
    
    print('end')
    out()
    
    with open('/dev/urandom','rb') as f:
        for n in range(200):
            bb=f.read(40)
            xx=bytes_to_hex(bb)
            print(xx)
            out()
            aa=hex_to_bytearray(xx)
            ss=hex_to_str(xx)

            if xx != str_to_hex(ss):
                print('str_to_hex() failed')
                out()

            if bb !=bytearray_to_bytes(aa):
                print('bytearray_to_bytes() failed')
                out()

            if xx != bytearray_to_hex(aa):
                print('bytearray_to_hex() failed')
                out()

            if ss != bytearray_to_str(aa):
                print('bytearray_to_str() failed')
                out()

            if aa != bytes_to_bytearray(bb):
                print('bytes_to_bytearray() failed')
                out()

            if xx != bytes_to_hex(bb):
                print('bytes_to_hex() failed')
                out()
                return 1  # abort the test run with a failing exit status
                
    print('end')
    out()

    return 0


def main(args):
    return test_byteconv(args)


if __name__ == '__main__':
    try:
        result=main(sys.argv)
        sys.stdout.flush()
    except BrokenPipeError:
        result=99
    except KeyboardInterrupt:
        print('')
        result=98
    # compare singletons by identity only; the integers 0 and 1 fall through to exit(int(result))
    if result is None or result is True:
        exit(0)
    if result is False:
        exit(1)
    if isinstance(result,str):
        print(result,file=sys.stderr)
        exit(2)
    try:
        exit(int(result))
    except ValueError:
        print(str(result),file=sys.stderr)
        exit(3)
    except TypeError:
        exit(4)
# EOF
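The bytes and hex helpers above could probably be condensed with the standard binascii module, which exists in both Python 2 and Python 3. This is a sketch under that assumption, not a drop-in replacement for the script; the helper names mirror the ones above only for readability:

```python
import binascii

def bytes_to_hex(b):
    # hexlify() returns bytes on Python 3 and str on Python 2
    h = binascii.hexlify(bytes(b))
    return h if isinstance(h, str) else h.decode('ascii')

def hex_to_bytes(x):
    # unhexlify() accepts upper- and lowercase digits
    return binascii.unhexlify(x)

data = bytes(bytearray(range(256)))
hexed = bytes_to_hex(data)
print(hexed[:16])                    # 0001020304050607
print(hex_to_bytes(hexed) == data)   # True
```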
#9
In slow-mo with plenty of print(...). Works in both v2 and v3 (assuming the starting "string" is a unicode object in v2):

# -*- coding: utf-8 -*-
from __future__ import print_function
import codecs

x=u'Déjà'
bx=codecs.encode(x,'utf-8')
print("bx:",bx)
hx=codecs.encode(bx,'hex')
print("hx:",hx)

bx2=codecs.decode(hx,'hex')
print("bx2:",bx2)
x2=codecs.decode(bx2,'utf-8')
print("x2:",x2)
print(x2,type(x2))
Unless noted otherwise, code in my posts should be understood as "coding suggestions", and its use may require more neurones than the two necessary for Ctrl-C/Ctrl-V.
Your one-stop place for all your GIMP needs: gimp-forum.net
#10
He gets the values from /dev/urandom, so they are not always in the range needed to decode and print them.

But you can pass 'replace', 'backslashreplace' or 'ignore' as the second argument (errors) to decode: https://docs.python.org/3.5/howto/unicode.html
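A quick sketch of what those three error handlers do with bytes that are not valid UTF-8 (Python 3.5+ is assumed, since 'backslashreplace' only works for decoding from that version on; the sample bytes are made up):

```python
raw = b'ok\xff\xfe'   # the last two bytes are not valid UTF-8

replaced = raw.decode('utf-8', 'replace')
ignored = raw.decode('utf-8', 'ignore')
escaped = raw.decode('utf-8', 'backslashreplace')

# ascii() keeps the output printable on any terminal
print(ascii(replaced))   # 'ok\ufffd\ufffd'
print(ascii(ignored))    # 'ok'
print(ascii(escaped))    # 'ok\\xff\\xfe'
```

All three avoid the exception, but 'replace' and 'ignore' lose the original byte values, so the result cannot be turned back into the same hexadecimal.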