Fasta Files - Printable Version

Fasta Files - johnny_sav1992 - Jul-25-2020

Hello everybody,

I'm new to programming and this is the first time I've used Python. I'm working on code that should read a FASTA file and delete the header of each sequence.
This is my code to read the file:

def read_fasta(inputfile):
    # The with block closes the file automatically, so no explicit close() is needed.
    with open(inputfile, 'r') as f:
        return f.readlines()

fasta_file=read_fasta('SELEX_100_reads.txt')

print(fasta_file)

The output for fasta_file looks like this:

Output:
['@DBV2SVN1:110:B:7:1101:1456:2092\n', 'CTAAAAAGCGAGTGCGNCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNANNNNNNCNNNNNNNNAAACANNAAGGTAAGAAACAAGCACAGATGAGAGC\n', '\n', '+\n', '#####################################################################################################\n', '\n', '@DBV2SVN1:110:B:7:1101:2491:2141\n', 'AAGTGAGCAAACAGAAACATAGTGCGGAGTGGGAAAATGAGACTCAAAAAAAGAGTGTGGGTATTCAGTAGGGGATATTAGGCCACAATACGAAAGAGCAA\n', '\n', '+\n', '#####################################################################################################\n', '\n', '@DBV2SVN1:110:B:7:1101:2924:2130\n'......]

It's a list with a header for each sequence. I just want the DNA sequences (the lines starting with CTAAAA or AAGTGAGCAA) as a list.
Can anyone help me with that?
Thanks a lot

Cheers,
John

RE: Fasta Files - ndc85430 - Jul-25-2020

So how do you think you can approach the problem?

RE: Fasta Files - johnny_sav1992 - Jul-25-2020

I would do it with a loop and skip every line that is not a sequence.

Like this:

for i in file:
    list_=[]
    if ... ='A' or 'T' : 
      new_list=append.list_      .... (if the line starts with A, T, C, or G, then append it to my list)

I don't know how to write that as code.

RE: Fasta Files - Yoriz - Jul-25-2020

You could use the following:

From https://docs.python.org/3/library/stdtypes.html#str.startswith:
str.startswith(prefix[, start[, end]])
Return True if string starts with the prefix, otherwise return False. prefix can also be a tuple of prefixes to look for. With optional start, test string beginning at that position. With optional end, stop comparing string at that position.
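
As a minimal sketch of how the tuple form of str.startswith could be applied here: the example below filters a list like the one in the first post and keeps only the lines that begin with a nucleotide letter. The sample lines are shortened copies of the posted output, and the names BASES and extract_sequences are hypothetical, chosen only for illustration. N is included as a prefix because the posted reads contain it.

```python
# Shortened copies of the lines shown in the first post's output.
lines = [
    '@DBV2SVN1:110:B:7:1101:1456:2092\n',
    'CTAAAAAGCGAGTGCGNCNNNN\n',
    '\n',
    '+\n',
    '######################\n',
    '\n',
    '@DBV2SVN1:110:B:7:1101:2491:2141\n',
    'AAGTGAGCAAACAGAAACATAG\n',
]

# Hypothetical name: the prefixes that mark a sequence line.
BASES = ('A', 'C', 'G', 'T', 'N')


def extract_sequences(lines):
    """Return the lines that start with a nucleotide letter, without newlines."""
    # startswith accepts a tuple and returns True if any prefix matches.
    return [line.strip() for line in lines if line.startswith(BASES)]


sequences = extract_sequences(lines)
print(sequences)
```

This relies on the quality lines in the posted sample starting with '#'. The posted records (an '@' header, a '+' separator, and a quality line) look like FASTQ rather than FASTA, and in FASTQ data a quality line can itself begin with a letter, so a prefix test alone may not be reliable for other files.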

RE: Fasta Files - Larz60+ - Jul-25-2020

Here are links to some FASTA code I wrote. I can't remember exactly what's there, but I expect I dealt with headers somewhere in the following:

Fasta search: https://python-forum.io/Thread-How-can-I-make-a-faster-search-algorithm?pid=77344#pid77344
Extract slice: https://python-forum.io/Thread-Extracting-a-portion-of-a-text-document?pid=68997#pid68997
Merge Fasta files: https://python-forum.io/Thread-How-to-link-two-python-scripts?pid=38527#pid38527
Read Fasta files: https://python-forum.io/Thread-Python-help-with-module-function-return-dictionary?pid=2761#pid2761