Python Forum

Full Version: Python help with module function return dictionary
(Oct-15-2016, 07:43 PM) Larz60+ Wrote: Hello again,

I would like to take a closer look at how this data is exactly laid out.
Is this from the NCBI BLAST database? If so, which file?

I would rather be working with an actual file.

Larz60+

Hello. I don't actually know whether it's from the NCBI database or not. I got the file from someone else, and I don't know where they got it.

How do you attach or send a file here? I can't seem to find the option.

(Oct-15-2016, 07:35 PM) wavic Wrote: Does this work? It's not what I proposed. It's step by step. If the file doesn't contain anything else...

def get_data(f):
    data = f.read().split()
    ecoli = dict()
    e_name = None

    for row in data:
        if row.startswith(">"):
            e_name = row.strip(">")
            ecoli[e_name] = ""
        else:
            ecoli[e_name] = "{}{}".format(ecoli[e_name], row)

    return ecoli

I got an error on data = f.read().split(), so I changed it to data = open(f).read().split(). That got rid of the error, but the output was only the values, and not in a dictionary. :-/
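The error most likely comes from how get_data is called. Assuming it was given the file name as a string, f.read() fails because a str has no read() method; the function expects an already opened file object. A minimal sketch of the intended call, with io.StringIO standing in for the real file and made-up sample records:

```python
import io


def get_data(f):
    """Build {name: sequence} from an open FASTA-style file object."""
    ecoli = {}
    e_name = None
    for row in f.read().split():
        if row.startswith(">"):
            e_name = row.lstrip(">")
            ecoli[e_name] = ""
        else:
            ecoli[e_name] += row
    return ecoli


# io.StringIO behaves like an open text file; the records are invented.
sample = io.StringIO(">seq1\nMKT\nAYI\n>seq2\nGGA\n")
result = get_data(sample)
print(result)

# With a real file, pass the file object rather than the name:
# with open("Ecoli.prot.fasta") as in_file:
#     result = get_data(in_file)
```

Keeping the open() call outside the function leaves get_data unchanged and lets the with block close the file.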
Probably not a good idea as size could be an issue.
If the name of the file hasn't changed, I should be able to find it.
What name?

Here is the BLAST help file location: https://blast.ncbi.nlm.nih.gov/Blast.cgi...=BlastHelp
(Oct-15-2016, 08:22 PM) Larz60+ Wrote: Probably not a good idea as size could be an issue.
If the name of the file hasn't changed, I should be able to find it.
What name?

Here is the BLAST help file location: https://blast.ncbi.nlm.nih.gov/Blast.cgi...=BlastHelp

The name of the file is "Ecoli.prot.fasta"
OK,

This should do the trick. It's not the most efficient code, but you can clean it up. It works, that's the important thing.

def read_fasta(filename=None):
    table_dict = {}
    update_dict = False
    if filename is not None:
        name = ''
        value = ''
        with open(filename, 'r') as f:
            for line in f:
                line = line.strip()

                # startswith() is safe on blank lines; line[0] would raise IndexError
                if line.startswith(">"):
                    if update_dict:
                        table_dict[name] = value
                        value = ''
                    name = line[1:]
                else:
                    update_dict = True
                    if len(line):
                        value += line

            if len(value):
                table_dict[name] = value

    # Return the dictionary so callers can use it, instead of only printing it
    return table_dict


if __name__ == '__main__':
    print(read_fasta('Ecoli.prot.fasta'))
Larz60+
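As one possible cleanup of the function above, here is a minimal sketch that returns the dictionary, skips blank lines, and collects sequence lines in a list before joining them once per record. The name read_fasta_dict and the sample records are made up for illustration; it assumes every record starts with a ">" header line and that all lines up to the next header belong to that record.

```python
import os
import tempfile


def read_fasta_dict(filename):
    """Return {header: sequence} for a FASTA-style file (hypothetical helper)."""
    records = {}
    name = None
    chunks = []
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one
                if name is not None:
                    records[name] = "".join(chunks)
                name = line[1:]
                chunks = []
            elif name is not None:
                chunks.append(line)
    # Store the last record in the file
    if name is not None:
        records[name] = "".join(chunks)
    return records


# Made-up sample data written to a temporary file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1 first record\nMKT\nAYI\n\n>seq2\nGGA\n")

proteins = read_fasta_dict(tmp.name)
os.remove(tmp.name)
print(proteins)
```

Joining a list once per record avoids rebuilding the string on every line, which matters more as the sequences get longer.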
Quote: How do you attach or send a file here? I can't seem to find it in here

I posted only the function which is supposed to do the job.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys
import pprint

def get_data(f):
    data = f.read().split()
    ecoli = dict()
    e_name = None

    for row in data:
        if row.startswith(">"):
            e_name = row.strip(">")
            ecoli[e_name] = ""
        else:
            ecoli[e_name] = "{}{}".format(ecoli[e_name], row)

    return ecoli

def main():

    with open("ecoli.txt") as in_file:
        pprint.pprint(get_data(in_file))

if __name__ == '__main__':
    sys.exit(main())
Try @Larz60's solution first. He is a real programmer. I code for fun.
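One thing to watch with the split()-based get_data: split() with no arguments breaks on any whitespace, not only on newlines. A header line that contains spaces is therefore cut into several tokens, and the extra tokens end up appended to the sequence. Whether Ecoli.prot.fasta has such headers is not known here, so the sample string below is made up:

```python
sample = ">seq1 hypothetical protein\nMKTAYI\n"

tokens = sample.split()      # splits on spaces and newlines alike
lines = sample.splitlines()  # splits on line boundaries only

print(tokens)
print(lines)
```

If the headers do contain descriptions, iterating line by line keeps each header in one piece.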