Aug-29-2018, 10:46 AM
(This post was last modified: Aug-29-2018, 11:41 AM by Destry23000.)
Hi,
I'm working with DNA and am trying to run a simple program to clean up the sequences I get from online databases. I need to convert all lower case letters to uppercase ones and ignore anything that isn't an A, T, C, or G. I want this new string in a new global variable. Unfortunately the code I currently have (below), which does work, is EXTREMELY slow. Does anyone know a better/faster/more efficient means of doing this? I currently have it running over the human genome which is 6.6 billion characters long and it has been going for 7 days and counting.
def CleanText(Text):
    Gen = ""
    global Genome
    Genome = Gen
    for i in Text:
        if i == "a":
            Gen += "A"
        elif i == "t":
            Gen += "T"
        elif i == "c":
            Gen += "C"
        elif i == "g":
            Gen += "G"
    Genome = Gen
Sorry, I forgot to format it correctly.