Python Forum
Python help with module function return dictionary



Python help with module function return dictionary - tebirkes - Oct-15-2016

I don't know what is wrong with my script; it only returns the last line from a file with some E. coli data.
It should return a dictionary where the keys are the names of the proteins and the sequences are the values.
def read_fasta(str):
    file =  open(str).read()
    holder = []
    table_dict = {}
    for line in file.split('>'):
        #table_dict[line.split()[0]] = line.splitlines()[1:]
        table_dict["Name"] = (line.splitlines()[:1])
        table_dict["Value"] = (line.splitlines()[1:])
    print table_dict
In another Python file I import this function and test whether it works:
import module

module.read_fasta("Ecoli.prot.fasta")



RE: Python help with module function return dictionary - Larz60+ - Oct-15-2016

What is the file structure?
Sample data?


RE: Python help with module function return dictionary - Yoriz - Oct-15-2016

On each iteration of your loop, the dictionary overwrites the same two keys. Would it be preferable to use a list instead? Otherwise, on each loop the key names will need changing to something unique within the dictionary.

If you want the values of the two keys to be lists, first assign each of them an empty list outside the loop, then append to each key's list inside the loop.
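
To illustrate the point about overwritten keys, here is a minimal sketch. The `records` list and all variable names are made up for the example and are not taken from the original script; the sequences are truncated placeholders.

```python
# Hypothetical records standing in for parsed FASTA entries.
records = [("YBGC_ECOLI", "MNTT"), ("ZIPA_ECOLI", "MMQD")]

# Reusing the same two keys: each pass replaces the previous values,
# so only the last record survives.
overwritten = {}
for name, seq in records:
    overwritten["Name"] = name
    overwritten["Value"] = seq

# Using the protein name itself as the key keeps every record.
by_name = {}
for name, seq in records:
    by_name[name] = seq

# Alternative: keep two fixed keys, but make their values lists that are
# created before the loop and appended to inside it.
as_lists = {"Name": [], "Value": []}
for name, seq in records:
    as_lists["Name"].append(name)
    as_lists["Value"].append(seq)
```

The second form (name as key, sequence as value) is the one that matches the dictionary described in the first post.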


RE: Python help with module function return dictionary - tebirkes - Oct-15-2016

(Oct-15-2016, 12:54 PM)Larz60+ Wrote: what is the file structure?
sample data?

The data contains amino acid sequences for all known proteins in E. coli. It looks something like this:

>YBGC_ECOLI

and then some lines containing the sequence of amino acids:
MNTTLFRWPVRVYYEDTDAGGVVYHASYVAFYERARTEMLRHHH
FSQQALMAERVAFVVRKMTVEYYAPARLDDMLEIQTEITSMRGTSL
VFTQRIVNAENTLLNEAEVLVVCVDPLKMKPRALPKSIVAEFKQ



RE: Python help with module function return dictionary - wavic - Oct-15-2016

Hello!

Instead of splitting and slicing, I'd check whether the row starts with '>', strip the '>' from it and use the result as the dict key, then build the value from the data rows with a list comprehension.


RE: Python help with module function return dictionary - tebirkes - Oct-15-2016

(Oct-15-2016, 12:58 PM)Yoriz Wrote: On each iteration of your loop, the dictionary overwrites the same two keys. Would it be preferable to use a list instead? Otherwise, on each loop the key names will need changing to something unique within the dictionary.

If you want the values of the two keys to be lists, first assign each of them an empty list outside the loop, then append to each key's list inside the loop.

I've tried, but it either gives me an error or again only the last line. I'm supposed to create a function that reads the file, then creates and returns a dictionary.
This is my output: 
{'Name': ['ZIPA_ECOLI'], 'Value': ['MMQDLRLILIIVGAIAIIALLVHGFWTSRKERSSMFRDRPLKRMKSKRDDDSYDEDVEDDEGVGEVRVHR', 'VNHAPANAQEHEAARPSPQHQYQPPYASAQPRQPVQQPPEAQVPPQ..... ']}


RE: Python help with module function return dictionary - ichabod801 - Oct-15-2016

What is the error you are getting (full text please), and what are the last few lines of the file?


RE: Python help with module function return dictionary - tebirkes - Oct-15-2016

(Oct-15-2016, 05:34 PM)ichabod801 Wrote: What is the error you are getting (full text please), and what are the last few lines of the file?

The last few lines of the file are:


Output:
>ZIPA_ECOLI
MMQDLRLILIIVGAIAIIALLVHGFWTSRKERSSMFRDRPLKRMKSKRDDDSYDEDVEDDEGVGEVRVHR
VNHAPANAQEHEAARPSPQHQYQPPYASAQPRQPVQQPPEAQVPPQHAPHPAQPVQQPAYQPQPEQPLQQ
PVSPQVAPAPQPVHSAPQPAQQAFQPAEPVAAPQPEPVAEPAPVMDKPKRKEAVIIMNVAAHHGSELNGE
ALLNSIQQAGFIFGDMNIYHRHLSPDGSGPALFSLANMVKPGTFDPEMKDFTTPGVTIFMQVPSYGDELQ
NFKLMLQSAQHIADEVGGVVLDDQRRMMTPQKLREYQDIIREVKDANA
The error that I sometimes get is:

Error:
", line 8, in read_fasta
    table_dict[list] = line.splitlines()
TypeError: unhashable type: 'list'
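
The TypeError above suggests that a list was used as a dictionary key; keys must be hashable, and lists are not. Here is a small sketch of the difference, using a made-up, shortened record string rather than the real file (the names `record`, `lines` and `table` are only for illustration):

```python
# A made-up record, as it might look after splitting the file text on '>'.
record = "ZIPA_ECOLI\nMMQDLR\nVNHAPA\n"
lines = record.splitlines()

table = {}
error_message = None
try:
    # lines[:1] is a one-element list, and a list cannot be a dict key.
    table[lines[:1]] = lines[1:]
except TypeError as exc:
    error_message = str(exc)

# lines[0] is a string, which is hashable, so it works as a key.
# Joining the remaining lines gives the sequence as a single string.
table[lines[0]] = "".join(lines[1:])
```

Indexing with `[0]` instead of slicing with `[:1]` is the only change needed to turn the key from a list into a string.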



RE: Python help with module function return dictionary - wavic - Oct-15-2016

Does this work? It's not what I proposed earlier; it goes step by step instead. It assumes the file doesn't contain anything else...

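def get_data(f):
    # f is an open file object; split() breaks the text on any whitespace
    data = f.read().split()
    ecoli = dict()
    e_name = None

    for row in data:
        if row.startswith(">"):
            # header row: start a new entry keyed by the protein name
            e_name = row.strip(">")
            ecoli[e_name] = ""
        else:
            # sequence row: append it to the current protein's sequence
            ecoli[e_name] = "{}{}".format(ecoli[e_name], row)

    return ecoli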
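
One way to try the function above without the real file is to feed it an in-memory file object. The sketch below repeats a condensed copy of the function so that it runs on its own. The sample text is shortened from the data posted earlier in the thread, and `io.StringIO` stands in for the opened file; both are assumptions made for the example, as is Python 3.

```python
import io

def get_data(f):
    # Condensed copy of the function above, repeated so this runs standalone.
    ecoli = {}
    e_name = None
    for row in f.read().split():
        if row.startswith(">"):
            e_name = row.strip(">")
            ecoli[e_name] = ""
        else:
            ecoli[e_name] += row
    return ecoli

# Shortened sample in the layout described earlier in the thread.
sample = (
    ">YBGC_ECOLI\n"
    "MNTTLFRWPVRVYYEDTDAGGVVYHASYVAFYERARTEMLRHHH\n"
    "FSQQALMAERVAFVVRKMTVEYYAPARLDDMLEIQTEITSMRGTSL\n"
    ">ZIPA_ECOLI\n"
    "MMQDLRLILIIVGAIAIIALLVHGFWTSRKERSSMFRDRPLKRMKSKRDDDSYDEDVEDDEGVGEVRVHR\n"
)

result = get_data(io.StringIO(sample))
for name in sorted(result):
    print(name, len(result[name]))
```

With the real data the call would look like `with open("Ecoli.prot.fasta") as f: data = get_data(f)`. Note that because the text is split on all whitespace, a header line carrying a description after the name would have the extra words appended to the sequence, so this only suits files whose headers are a single token.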



RE: Python help with module function return dictionary - Larz60+ - Oct-15-2016

Hello again,

I would like to take a closer look at exactly how this data is laid out.
Is this from the NCBI BLAST database, and if so, which file?

I would rather work with an actual file.

Larz60+