Python Forum
Convert file of hex strings to binary file
#1
Hello,
I have a file that has several hex string values, separated by the newline character. E.g. the file looks like:

dd5bda81
ae0ac495
b97a7664
...
I can easily parse this file and find e.g. the 10th string based on the line number.

However I want to convert the file to binary to save disk space.
I was thinking of using binascii.unhexlify(), but I'm not sure of the best way to handle the ordering of the values: should I just concatenate the byte arrays? Note that the original file can be huge, maybe several gigabytes in size, and I'm not sure how efficient it would be to look up the billionth value.
#2
If the source file has line endings as in your example, you can process it line by line.
Here is a short example:

import binascii

with open("source.hex") as fd_in, open("destination.bin", "wb") as fd_out:
    for line in fd_in:
        # rstrip() removes the trailing newline before decoding
        chunk = binascii.unhexlify(line.rstrip())
        fd_out.write(chunk)
  • Open the source file in text read mode and the output file in binary write mode. The example does both in a single with statement.
  • Iterate over the lines: for line in fd_in
  • Strip trailing whitespace (including the newline): line.rstrip()
  • Convert the hex string into bytes (binary data) with binascii.unhexlify
  • Write the resulting bytes to fd_out
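On the ordering question: concatenating the bytes in line order should be enough, as long as every value has the same width. Here is a minimal sketch, assuming each line holds exactly 8 hex digits (4 bytes) as in the example from the first post. The names read_value and RECORD_SIZE are made up for illustration. With fixed-width records, the n-th value sits at byte offset n * 4, so even the billionth value is a single seek away:

```python
import binascii

RECORD_SIZE = 4  # assumption: every value is 8 hex digits = 4 bytes


def read_value(path, index, record_size=RECORD_SIZE):
    """Return the value at zero-based `index` as a hex string."""
    with open(path, "rb") as fd:
        # Jump straight to the record instead of reading everything before it.
        fd.seek(index * record_size)
        chunk = fd.read(record_size)
    if len(chunk) != record_size:
        raise IndexError(f"no complete record at index {index}")
    return binascii.hexlify(chunk).decode("ascii")
```

For example, read_value("destination.bin", 9) would return the 10th value. If the lines can have different lengths, this offset arithmetic no longer works and you would need a separate index of offsets.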
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#3
Sometimes it makes more sense to write a C program. Am I going to get kicked off the forum now?
#4
Hm, why?

Try your luck.

If you get it right, you can then turn it into a Python extension module written in C.
#5
My test:

[deadeye@nexus ~]$ dd if=/dev/urandom of=random.bin bs=1M count=64
64+0 records in
64+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.815292 s, 82.3 MB/s
[deadeye@nexus ~]$ python file2hex.py 
[deadeye@nexus ~]$ md5sum random.bin random2.bin 
929b3a89653f956721743a93955e2ec2  random.bin
929b3a89653f956721743a93955e2ec2  random2.bin
Code:
from binascii import hexlify, unhexlify


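def file2hex(input_file, output_file):
    """Write input_file as hex text, up to 20 bytes (40 hex digits) per line."""
    with open(input_file, "rb") as fd_in, open(output_file, "wb") as fd_out:
        # The walrus operator (:=) requires Python 3.8 or newer.
        while chunk := fd_in.read(20):
            fd_out.write(hexlify(chunk))
            fd_out.write(b"\n")


def hex2file(input_file, output_file):
    """Convert a file of hex lines back into raw bytes."""
    with open(input_file, "rb") as fd_in, open(output_file, "wb") as fd_out:
        for line in fd_in:
            # rstrip() drops the trailing newline before decoding.
            fd_out.write(unhexlify(line.rstrip()))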



file2hex("random.bin", "random.hex")
hex2file("random.hex", "random2.bin")
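The md5sum comparison from the shell session can also be done from Python. This is only a sketch: file_md5 is a made-up helper name, and the file names are assumed to be the ones from the test above.

```python
import hashlib


def file_md5(path, block_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in blocks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fd:
        while block := fd.read(block_size):
            digest.update(block)
    return digest.hexdigest()
```

After a successful round trip, file_md5("random.bin") == file_md5("random2.bin") should be True.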