Dec-21-2017, 07:21 PM
Hello Pythoners-
I am a Linux admin, and one of our users was wondering how to make the script below faster using pigz or any other multi-threading method. I have no Python experience. Can someone please suggest how to make the part below a bit faster? She said it currently takes around 45 minutes to parse one compressed .gz file that is 1 GB in size.
    if infile.endswith(".gz"):
        data = gzip.open(infile, 'rb')
    else:
        data = open(infile, "r")
    outfile = infile.split(".txt")[0] + "_step1.gz"
    outdata = gzip.open(outfile, "wb")
    ## take line by line
    for line in data:
        line1 = line.rstrip()
        if line.startswith("@"):
            ....
            ....
            ....
            ....
            ....
    outdata.close()
    data.close()
    print ">Output file: "+ outfile
    # end of run

Thank you. This is not a homework task. This is a biology lab's problem.
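For anyone picking this up, here is a minimal sketch of the pigz idea mentioned above: instead of gzip.open, the script pipes data through an external pigz process. It is a sketch under assumptions, not a tested fix for this script. It assumes Python 3.7+ (the original snippet is Python 2) and that pigz is on the PATH; if pigz is missing, it falls back to the gzip module. The names open_gz_read, open_gz_write, and copy_marked_lines are made up for illustration, and the "@" filter only stands in for the elided loop body. As far as I know, pigz parallelizes compression but decompresses mostly on a single thread, so most of the gain, if any, would be on the output side.

```python
import gzip
import shutil
import subprocess
from contextlib import contextmanager


@contextmanager
def open_gz_read(path):
    """Yield a text-mode file object for a .gz file (hypothetical helper)."""
    pigz = shutil.which("pigz")
    if pigz is None:
        # pigz is not installed: fall back to the standard-library gzip module.
        with gzip.open(path, "rt") as handle:
            yield handle
        return
    # "pigz -dc FILE" decompresses FILE to stdout.
    proc = subprocess.Popen([pigz, "-dc", path],
                            stdout=subprocess.PIPE, text=True)
    try:
        yield proc.stdout
    finally:
        proc.stdout.close()
        proc.wait()


@contextmanager
def open_gz_write(path, threads=4):
    """Yield a text-mode file object that writes gzip data to path."""
    pigz = shutil.which("pigz")
    if pigz is None:
        with gzip.open(path, "wt") as handle:
            yield handle
        return
    with open(path, "wb") as raw:
        # "pigz -c -p N" compresses stdin to stdout using N threads.
        proc = subprocess.Popen([pigz, "-c", "-p", str(threads)],
                                stdin=subprocess.PIPE, stdout=raw, text=True)
        try:
            yield proc.stdin
        finally:
            proc.stdin.close()
            proc.wait()


def copy_marked_lines(infile, outfile, marker="@"):
    """Placeholder for the real loop: keep only lines starting with marker."""
    kept = 0
    with open_gz_read(infile) as data, open_gz_write(outfile) as outdata:
        for line in data:
            if line.startswith(marker):
                outdata.write(line)
                kept += 1
    return kept
```

A call would look like copy_marked_lines("sample.txt.gz", "sample_step1.gz"), where both file names are placeholders. Before switching to this, it may be worth timing the loop body on its own, since the per-line work rather than gzip could be what takes the 45 minutes.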