Fasta Files

johnny_sav1992 · (This post was last modified: Jul-25-2020, 05:03 PM by Yoriz.)

Hello everybody,

I'm new to programming and this is the first time I've used Python. I'm working on code that should read a FASTA file and delete the header of each sequence.
This is my code to read the file:

def read_fasta(inputfile):
    # The with block closes the file automatically, so no explicit close() is needed.
    with open(inputfile, 'r') as f:
        return f.readlines()

fasta_file=read_fasta('SELEX_100_reads.txt')  

print(fasta_file)

The contents of fasta_file look like this:

Output:
['@DBV2SVN1:110:B:7:1101:1456:2092\n', 'CTAAAAAGCGAGTGCGNCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNANNNNNNCNNNNNNNNAAACANNAAGGTAAGAAACAAGCACAGATGAGAGC\n', '\n', '+\n', '#####################################################################################################\n', '\n', '@DBV2SVN1:110:B:7:1101:2491:2141\n', 'AAGTGAGCAAACAGAAACATAGTGCGGAGTGGGAAAATGAGACTCAAAAAAAGAGTGTGGGTATTCAGTAGGGGATATTAGGCCACAATACGAAAGAGCAA\n', '\n', '+\n', '#####################################################################################################\n', '\n', '@DBV2SVN1:110:B:7:1101:2924:2130\n'......]

It's a list that includes the header of each sequence. I only want the DNA sequences (the lines beginning CTAAAA or AAGTGAGCAA, for example) as a list.
Can anyone help me with that?
Thanks a lot.

Cheers,
John

ndc85430 · Jul-25-2020, 05:00 PM

So how do you think you can approach the problem?

johnny_sav1992 · (This post was last modified: Jul-25-2020, 05:21 PM by Yoriz.)

I would do it with a loop and ignore every line that is not a sequence.

Something like this:

list_ = []
for i in file:
    if ... == 'A' or 'T':
        list_.append(i)      .... (if the line starts with A, T, C, or G, append it to my list)

I don't know how to write that as code.

**Yoriz** · Jul-25-2020, 05:28 PM

You could use the following:

https://docs.python.org/3/library/stdtyp...startswith Wrote:
str.startswith(prefix[, start[, end]])
Return True if string starts with the prefix, otherwise return False. prefix can also be a tuple of prefixes to look for. With optional start, test string beginning at that position. With optional end, stop comparing string at that position.
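As a rough illustration of how the tuple form could be applied to the list posted above, here is a minimal sketch. The function name extract_sequences and the shortened sample lines are made up for this example, and including 'N' among the prefixes is an assumption based on the N characters visible in the posted reads.

```python
def extract_sequences(lines):
    """Return the lines that start with a nucleotide letter, newline removed."""
    # Assumption: sequence lines start with A, C, G, T or N (unknown base).
    bases = ("A", "C", "G", "T", "N")
    sequences = []
    for line in lines:
        # str.startswith accepts a tuple and is True if any prefix matches.
        if line.startswith(bases):
            sequences.append(line.strip())
    return sequences


# Shortened, hypothetical sample in the same layout as the posted output.
sample = [
    "@DBV2SVN1:110:B:7:1101:1456:2092\n",
    "CTAAAAAGCGAGTGCGNC\n",
    "\n",
    "+\n",
    "##################\n",
    "\n",
    "@DBV2SVN1:110:B:7:1101:2491:2141\n",
    "AAGTGAGCAAACAGAAAC\n",
]

print(extract_sequences(sample))
```

With the real data this would be called as extract_sequences(fasta_file). One caveat: the posted file has the four-part layout (an @ header, the sequence, a + line, a quality line) usually associated with FASTQ, and quality lines in such files can contain letters. If a quality line happens to start with A, C, G, T or N, this prefix test would wrongly keep it, so picking the line that follows each @ header may be more robust.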

**Larz60+** · Jul-25-2020, 11:04 PM

Here are links to some FASTA code I wrote. I can't remember exactly what's in each one, but I expect I dealt with headers somewhere in the following:

Fasta search: https://python-forum.io/Thread-How-can-I...4#pid77344
Extract slice: https://python-forum.io/Thread-Extractin...7#pid68997
Merge Fasta files: https://python-forum.io/Thread-How-to-li...7#pid38527
Read Fasta files: https://python-forum.io/Thread-Python-he...61#pid2761
