I have some code that generates MD5 hashes from IPv6 addresses, then checks them against a list of known MD5 hashes. While trying to speed it up, I profiled it and found that the string conversion was chewing up a lot of CPU time. Each IPv6 address has to be converted to a string and then to bytes before it can be fed to _hashlib.
So, I attempted to speed it up. Here's some code documenting my attempt:
from _hashlib import openssl_md5 as hashMD5
from ipaddress import IPv6Address as IPv6

starting_ip = '2001:4958::'
ip = IPv6(starting_ip)
aa = 208000000000

hashgen = hashMD5((b'%u' % (ip+aa))).hexdigest()
hashgen2 = hashMD5(('%s' % (ip+aa)).encode('utf-8')).hexdigest()

print(hashgen)
print(hashgen2)
Output:
d6f76fb9ca27fdae847af8ea2f3797e2
6e217802558e0534bfb91f694e045f5e
I know the second one (hashgen2) is correct, but why is the first one (hashgen) not returning the correct MD5 hash? If Python 3.5.2 uses Unicode by default, then specifying the 'b' string literal should implicitly encode it as Unicode, right? What am I doing wrong?
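Ah, figured it out. The 'u' in b'%u' has nothing to do with Unicode string literals. It is a printf-style conversion type for integers (an obsolete alias for %d). So hashgen is the MD5 of the address's decimal integer value, not of its text form, which is why the two hashes differ.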
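For anyone else who hits this, here is a minimal sketch of the difference. It uses the public hashlib.md5 rather than _hashlib.openssl_md5, and the variable names as_text and as_decimal are just mine for illustration. It assumes the known hashes were computed over the textual form of the address, as hashgen2 is.

```python
from hashlib import md5
from ipaddress import IPv6Address

# Same starting address and offset as in the question.
ip = IPv6Address('2001:4958::') + 208000000000

# Text form of the address as bytes (what hashgen2 hashes).
as_text = str(ip).encode('ascii')

# Decimal digits of the address's integer value.
# b'%u' is an obsolete alias for b'%d', so both produce these bytes.
as_decimal = b'%d' % int(ip)

print(as_text)
print(as_decimal)

# Different input bytes, so the digests differ.
print(md5(as_text).hexdigest() == md5(as_decimal).hexdigest())
```

So if the reference hashes are over the text form, the str-then-encode step can't be skipped by switching to a bytes format with %u; at best it can be written as str(ip).encode('ascii'), since the text form of an IPv6 address is plain ASCII.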