Python Forum
how to read a text file as bytes
#1
i want to read a file as a list of bytes objects, one for each line, where '\n' or '\r' or '\r\n' or even the unlikely '\n\r' marks the end of a line. the file might be so large that two copies of it in memory would exhaust memory, so i need a way that is not "read in the whole file and do a big split". some of the files contain UTF-8 byte sequences with non-ASCII values. i am using Python3. what is a good Pythonic way to do this? can the os module be avoided?
#2
Iterating over a file opened in binary mode yields one bytes object per line, using '\n' as the only separator, so you can do something like:
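with open('input.bin', 'rb') as fd:
    for line in fd:
        # Each line still ends with its '\n' (except possibly the last one); drop it
        line = line.rstrip(b'\n')
        for sub_line in line.split(b'\r'):
            # Take into account any single '\r'
            if not sub_line:
                # Empty pieces also come from '\r\n' and '\n\r' pairs. If you want to
                # deal with zero length lines as well, this must be improved...
                continue
            # Do something with the lines
            pass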
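This is a good approach if you can guarantee that splitting on '\n' is safe for the input format. A file that uses only '\r' as its line ending, for instance, would come back as one huge line.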
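If splitting on '\n' cannot be guaranteed to be safe, one possible alternative is to read the file in fixed-size chunks and split on all four terminators yourself. This is only a rough sketch: iter_byte_lines and its chunk_size parameter are names made up for illustration, and it assumes that '\r\n' and '\n\r' should each count as a single line ending.

```python
import re

# Two-byte terminators come first so that '\r\n' and '\n\r' each count once
EOL = re.compile(rb'\r\n|\n\r|\n|\r')


def iter_byte_lines(path, chunk_size=65536):
    """Yield each line of the file as bytes, without its terminator."""
    buf = b''
    with open(path, 'rb') as fd:
        while True:
            chunk = fd.read(chunk_size)
            if not chunk:
                break
            buf += chunk
            start = 0
            for m in EOL.finditer(buf):
                # A terminator touching the end of the buffer may be the first
                # half of a two-byte pair, so wait for the next chunk
                if m.end() == len(buf):
                    break
                yield buf[start:m.start()]
                start = m.end()
            # Keep only the unfinished tail for the next round
            buf = buf[start:]
    # End of file: whatever is left can be split without waiting
    start = 0
    for m in EOL.finditer(buf):
        yield buf[start:m.start()]
        start = m.end()
    if start < len(buf):
        # Last line had no terminator
        yield buf[start:]
```

Used as "for line in iter_byte_lines('input.bin'):", this keeps roughly one chunk plus the current unfinished line in memory. Empty lines are yielded as b''. A single extremely long line still has to fit in memory, and the buffer is rescanned on every chunk while such a line is being collected.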
Another option is to use a memory map.
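A minimal sketch of the memory map idea, using the standard mmap and re modules. The function name iter_mmap_lines is made up for illustration, and treating '\r\n' and '\n\r' as single terminators is an assumption taken from the question.

```python
import mmap
import re

# Two-byte terminators come first so that '\r\n' and '\n\r' each count once
EOL = re.compile(rb'\r\n|\n\r|\n|\r')


def iter_mmap_lines(path):
    """Yield each line of the file as bytes, without its terminator."""
    with open(path, 'rb') as fd:
        # An empty file cannot be memory mapped; seek() returns the file size here
        if fd.seek(0, 2) == 0:
            return
        # Length 0 maps the whole file; the OS pages it in on demand
        with mmap.mmap(fd.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = 0
            for m in EOL.finditer(mm):
                # Slicing the map copies only this one line into a bytes object
                yield mm[start:m.start()]
                start = m.end()
            if start < len(mm):
                # Last line had no terminator
                yield mm[start:]
```

The map itself is not a second copy of the file on the Python heap, but the file does have to fit in the address space of the process, which may matter on 32-bit builds. The file size is taken from seek() so the os module is not needed.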