Python Forum
Pigz inside python - Reading compressed .gz file much faster
#1
Hello Pythoners-

I am a Linux admin, and one of our users was wondering how to make the script below faster using pigz or any other multi-threading method. I have no Python experience. Can someone please show how to make the part below a bit faster? She said it currently takes around 45 minutes to parse one compressed .gz file that is 1 GB in size.

if infile.endswith(".gz"):
    data = gzip.open(infile, 'rb')
else:
    data = open(infile, "r")
outfile = infile.split(".txt")[0] + "_step1.gz"
outdata = gzip.open(outfile, "wb")

## take line by line
for line in data:
    line1 = line.rstrip()
    if line.startswith("@"):
        ....
        ....
        ....
        ....
        ....
outdata.close()
data.close()
print ">Output file: " + outfile  # end of run
Thank you. This is not a homework task. This is a biology lab's problem.
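
For reference, one possible way to hand the compression work to pigz is to run it as a child process and read from or write to its pipe. The following is only a minimal sketch in Python 3 (the script above is Python 2), assuming pigz is on the PATH; it falls back to the standard gzip module when pigz is not found. The helper names open_text_for_read, open_gz_for_write and copy_matching_lines are hypothetical, and copy_matching_lines is just a stand-in for the elided loop body, not the lab's actual processing.

```python
import gzip
import io
import shutil
import subprocess
from contextlib import contextmanager


@contextmanager
def open_text_for_read(path):
    """Yield a text stream of lines from `path` (hypothetical helper).

    .gz files are decompressed by a `pigz -dc` child process when pigz is
    installed, otherwise by the gzip module. Other files are opened as-is.
    """
    pigz = shutil.which("pigz")
    if path.endswith(".gz") and pigz:
        proc = subprocess.Popen([pigz, "-dc", path], stdout=subprocess.PIPE)
        stream = io.TextIOWrapper(proc.stdout, encoding="utf-8")
        try:
            yield stream
        finally:
            stream.close()
            proc.wait()
    elif path.endswith(".gz"):
        with gzip.open(path, "rt", encoding="utf-8") as stream:
            yield stream
    else:
        with open(path, "r", encoding="utf-8") as stream:
            yield stream


@contextmanager
def open_gz_for_write(path, threads=4):
    """Yield a text stream whose contents end up gzip-compressed in `path`.

    Uses `pigz -c -p <threads>` when available (thread count is an assumed
    default), otherwise the gzip module.
    """
    pigz = shutil.which("pigz")
    if pigz:
        with open(path, "wb") as raw:
            proc = subprocess.Popen(
                [pigz, "-c", "-p", str(threads)],
                stdin=subprocess.PIPE,
                stdout=raw,
            )
            stream = io.TextIOWrapper(proc.stdin, encoding="utf-8", newline="\n")
            try:
                yield stream
            finally:
                stream.close()  # flushes and signals end of input to pigz
                proc.wait()
    else:
        with gzip.open(path, "wt", encoding="utf-8", newline="\n") as stream:
            yield stream


def copy_matching_lines(infile, outfile, prefix="@"):
    """Placeholder for the real per-line work: copy lines starting with `prefix`.

    Returns the number of lines written.
    """
    written = 0
    with open_text_for_read(infile) as data, open_gz_for_write(outfile) as outdata:
        for line in data:
            if line.startswith(prefix):
                outdata.write(line)
                written += 1
    return written
```

Calling copy_matching_lines(infile, outfile) would then take the place of the open/loop/close section. As far as I know, pigz parallelizes compression well but decompression is still mostly single-threaded, so the gain is likely larger on the write side. If the run stays near 45 minutes, the per-line work inside the loop is probably the real bottleneck, and profiling it would be the next step.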


Messages In This Thread
Pigz inside python - Reading compressed .gz file much faster - by jsmith7279 - Dec-21-2017, 07:21 PM

