pickle or txt - Python Forum (https://python-forum.io) - Thread: /thread-38018.html
pickle or txt - DPaul - Aug-22-2022

Hi,

After an OCR session, I have very large files for people to search for data (prayer cards). I "dump" them both in text format (.txt) and in binary, using pickle. So far, so good.

Now I need to read the data. With pickle it is pickle.load(...) and I get the whole thing as a list that I can go through record by record. That is OK.

My traditional way of reading the .txt file is:

```python
with open('sourcefile', 'r') as source:
    for idx, line in enumerate(source):
        ...  # code
```

In both cases, after scanning through all the data, I just close the file and continue with the results.

The pickle file is much smaller than my .txt file, but here is my question: is the pickle load(...) method more taxing on the computer's memory than the enumerate(...) method, even if I can empty the list after using it? Any pros or cons?

thx,
Paul

RE: pickle or txt - Gribouillis - Aug-22-2022

Pickle can store several objects by successive calls to dump. These objects can be retrieved one by one, which solves your memory issue.

```python
from pathlib import Path
import pickle

this_dir = Path(__file__).parent
idata = ['some spam', '4 slices of ham', '12 eggs']
filename = this_dir / 'idata.pkl'

# pickle a sequence of objects one by one
with filename.open('wb') as ofh:
    pkl = pickle.Pickler(ofh)
    for x in idata:
        pkl.dump(x)

# unpickle a sequence of objects one by one
with filename.open('rb') as ifh:
    pkl = pickle.Unpickler(ifh)
    try:
        while True:
            # load() returns the next pickled object in the file
            x = pkl.load()
            print(x)
    except EOFError:
        # load() raises EOFError once the end of the file is reached
        pass
```
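For the search use case in the question, the one-by-one loading above could be wrapped in a generator, so that scanning the pickle file looks much like the `enumerate(source)` loop over the text file. This is only a minimal sketch: the helper names `dump_records`, `iter_records` and `search`, the file name `cards.pkl` and the sample records are all made up for illustration, and it uses the module-level `pickle.dump`/`pickle.load` functions rather than the `Pickler`/`Unpickler` objects.

```python
import pickle
import tempfile
from pathlib import Path


def dump_records(path, records):
    """Write each record as its own pickle, one after another, in one file."""
    with open(path, 'wb') as ofh:
        for record in records:
            # each pickle.dump() call writes one complete pickle to the file
            pickle.dump(record, ofh)


def iter_records(path):
    """Yield the pickled records one at a time instead of loading a full list."""
    with open(path, 'rb') as ifh:
        while True:
            try:
                yield pickle.load(ifh)
            except EOFError:
                # pickle.load() raises EOFError at the end of the file
                return


def search(path, needle):
    """Return (index, record) pairs for the text records that contain needle."""
    return [(idx, rec) for idx, rec in enumerate(iter_records(path))
            if needle in rec]


# hypothetical sample data standing in for the OCR'd cards
cards = ['some spam', '4 slices of ham', '12 eggs']
demo_path = Path(tempfile.mkdtemp()) / 'cards.pkl'
dump_records(demo_path, cards)

hits = search(demo_path, 'ham')
print(hits)  # [(1, '4 slices of ham')]
```

With this layout the reader holds one record at a time, and only the list of matches grows. A single `pickle.load` of one big pickled list, by contrast, has to rebuild the entire list in memory before the first record can be inspected.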
RE: pickle or txt - DPaul - Aug-22-2022

Gribouillis wrote: Pickle can store several objects by successive calls to dump. These objects can be retrieved one by one, which solves your memory issue.

OK, thanks, I'll try this method to understand what it does!

thx,
Paul

RE: pickle or txt - DPaul - Aug-22-2022

Gribouillis wrote: These objects can be retrieved one by one, which solves your memory issue.

Python is wonderful.

thx,
Paul