Python Forum
[solved] how to speed-up huge data in an ascii file ?
#1
I haven't paid any attention to this topic so far, but now I'm wondering how to quickly write a huge amount of data to an ASCII file (that's the current specification!).

The following snippet mimics what I'm trying to do:

import io
import os
import time

import numpy as np

BufferSize = io.DEFAULT_BUFFER_SIZE
print(f"Default buffer size={BufferSize}")

path = os.getcwd()
FileName = "myFile.txt"
stem = os.path.splitext(FileName)[0]  # "myFile"

n = 1_000_000  # 10_000_000
M = np.random.random((n, 3))

# default buffering (open() already buffers writes when no size is given)
t1_0 = time.perf_counter()
with open(os.path.join(path, stem + '_0.txt'), 'w') as f:
    for i in range(n):
        f.write(f" X={M[i, 0]}, Y={M[i, 1]}, Z={M[i, 2]}\n")
t1_1 = time.perf_counter()
print(f"With loops, duration={t1_1-t1_0}")

# explicit buffer size (in bytes)
t2_0 = time.perf_counter()
TestValue = 2**10
with open(os.path.join(path, stem + '_1.txt'), 'w', buffering=TestValue) as f:
    for i in range(n):
        f.write(f" X={M[i, 0]}, Y={M[i, 1]}, Z={M[i, 2]}\n")
t2_1 = time.perf_counter()
print(f"With buffering, duration={t2_1-t2_0}")
One can identify at least 2 main issues:
  1. the use of a loop
  2. the write function is called on each loop iteration, which is time-consuming
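One possible way to address both issues is to build the text for many rows at once and hand it to the file in a few large write calls. The following is only a minimal sketch under that assumption, not a measured benchmark; the helper name write_xyz and its chunk_size parameter are made up for illustration and do not come from the original post.

```python
import os
import tempfile


def write_xyz(filename, rows, chunk_size=100_000):
    """Write (x, y, z) rows as ' X=..., Y=..., Z=...' lines, one write per chunk.

    `rows` is any sliceable sequence of 3-element rows (hypothetical helper).
    """
    with open(filename, "w") as f:
        for start in range(0, len(rows), chunk_size):
            chunk = rows[start:start + chunk_size]
            # Build the text of the whole chunk in memory, then write it once.
            f.write("".join(f" X={x}, Y={y}, Z={z}\n" for x, y, z in chunk))


# Small demonstration with plain Python floats.
rows = [(0.5, 0.25, 0.125), (1.0, 2.0, 3.0)]
out_path = os.path.join(tempfile.gettempdir(), "myFile_chunked.txt")
write_xyz(out_path, rows, chunk_size=1)

with open(out_path) as f:
    lines = f.read().splitlines()
print(lines)
```

With the NumPy array from the snippet above, the call would be write_xyz(some_path, M.tolist()). Converting with tolist() first yields plain Python floats, which typically format faster than indexing the array element by element, but the actual gain should be timed on the real data.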

I'm currently trying to understand how to use buffering, but in practice it remains unclear to me which value to use. Any general advice on how to write huge data to an ASCII file?
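As a rough aid for the buffering question, here is a small sketch that times the same per-line writes with different values of the buffering argument of open(). In text mode, -1 selects the default policy, 1 means line buffering, and an integer greater than 1 sets the size in bytes of the underlying binary buffer (the text layer keeps its own buffering on top of that). The helper timed_write and the test sizes are assumptions chosen for illustration; timings depend on the machine and the disk, so no expected numbers are given.

```python
import os
import tempfile
import time


def timed_write(filename, lines, buffering=-1):
    """Write `lines` one call at a time and return the elapsed seconds."""
    start = time.perf_counter()
    with open(filename, "w", buffering=buffering) as f:
        for line in lines:
            f.write(line)
    return time.perf_counter() - start


# Synthetic lines in the same layout as the original snippet.
lines = [f" X={i * 0.5}, Y={i * 0.25}, Z={i * 0.125}\n" for i in range(100_000)]

tmp_dir = tempfile.mkdtemp()
durations = {}
for size in (-1, 2**10, 2**20):  # default, 1 KiB, 1 MiB
    target = os.path.join(tmp_dir, f"buffer_{size}.txt")
    durations[size] = timed_write(target, lines, buffering=size)
    print(f"buffering={size}: {durations[size]:.3f} s")
```

The files produced are identical whatever the buffer size; only the number of low-level writes changes. Since open() already buffers by default, a larger buffer alone may not change much when the time is dominated by formatting the numbers.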

Thanks

P.


Messages In This Thread
[solved] how to speed-up huge data in an ascii file ? - by paul18fr - May-15-2023, 07:31 PM
