Python Forum
how to read a text file as bytes
#1
I want to read a file as a list of bytes objects, one per line, where '\n', '\r', '\r\n', or even the unlikely '\n\r' marks the end of a line. The file might be so large that two copies of it in memory would exhaust memory, so I need a way that is not "read in the whole file and do a big split". Some of the files contain UTF-8 byte sequences with non-ASCII values. I am using Python 3. What is a good Pythonic way to do this? Can the os module be avoided?
#2
Iterating over a file opened in binary mode yields one line at a time, split on b'\n' only (the trailing b'\n' is kept), so you can do something like:
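with open('input.bin', 'rb') as fd:
    for line in fd:
        # Each line still ends with b'\n' (except maybe the last), so drop it
        # first; otherwise '\r\n' would leave a stray b'\n' piece behind.
        line = line.rstrip(b'\n')
        for sub_line in line.split(b'\r'):
            # Take into account any single '\r'
            if not sub_line:
                # Empty pieces come from '\r\n' and '\n\r' pairs, but also from
                # truly empty lines; keeping those needs more work.
                continue
            # Do something with sub_line (a bytes object)
            pass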
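If the input format guarantees that splitting on '\n' is safe, this is a good way. If a file uses only '\r' as terminator, though, the whole file comes back as a single "line" and ends up in memory at once.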
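When that guarantee is missing, one possible approach is to read fixed-size chunks and split them yourself. The sketch below is only an illustration: the name iter_lines_chunked, the chunk_size parameter, and the choice to treat '\r\n' and '\n\r' each as a single terminator are my assumptions, not something from the question. Note that a sequence like '\n\r\n' is inherently ambiguous; here it is read as '\n\r' followed by '\n'.

```python
import re

# Two-byte pairs are listed first so they win over single bytes.
_EOL = re.compile(rb'\r\n|\n\r|\n|\r')

def iter_lines_chunked(path, chunk_size=64 * 1024):
    """Yield each line as bytes, without its terminator (hypothetical helper)."""
    buf = b''
    with open(path, 'rb') as fd:
        while True:
            chunk = fd.read(chunk_size)
            if not chunk:
                break
            buf += chunk
            start = 0
            for m in _EOL.finditer(buf):
                # A terminator touching the end of the buffer may be the
                # first half of a two-byte pair, so wait for more data.
                if m.end() == len(buf):
                    break
                yield buf[start:m.start()]
                start = m.end()
            # Keep only the unfinished tail for the next round.
            buf = buf[start:]
    # End of file: whatever is left can be split without waiting.
    start = 0
    for m in _EOL.finditer(buf):
        yield buf[start:m.start()]
        start = m.end()
    if start < len(buf):
        yield buf[start:]
```

Only one chunk plus the current unfinished line is held in memory, and the os module is not needed. Empty lines are kept as b'' here, which differs from the loop above.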
Another option is to use a memory map (the mmap module).
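A minimal sketch of the memory-map idea, assuming a read-only mapping and a regular expression to find terminators (the re module accepts an mmap object as a bytes-like subject). The name iter_lines_mmap is hypothetical, and treating '\r\n' and '\n\r' each as one terminator is my assumption about what is wanted.

```python
import mmap
import re

# Two-byte pairs are listed first so they win over single bytes.
_EOL = re.compile(rb'\r\n|\n\r|\n|\r')

def iter_lines_mmap(path):
    """Yield each line as bytes, without its terminator (hypothetical helper)."""
    with open(path, 'rb') as fd:
        # mmap cannot map an empty file; seek(0, 2) returns the file size.
        if fd.seek(0, 2) == 0:
            return
        with mmap.mmap(fd.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = 0
            for m in _EOL.finditer(mm):
                # Slicing the map copies just this one line into a bytes object.
                yield mm[start:m.start()]
                start = m.end()
            # Last line without a terminator, if any.
            if start < len(mm):
                yield mm[start:]
```

The operating system pages the file in as needed, so the file is not copied into Python memory as a whole; only the lines you keep references to are. On a 32-bit Python, a very large file may still not fit in the address space.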