Sep-28-2018, 08:06 AM
Hey guys,
I have a huge spreadsheet that I am attempting to search through for some specific data.
On the one hand I have IDs like this:
Y00988-11
G01024-14
Z01933-13
And on the other hand I have a massive spreadsheet (CSV) in the following format:
Run,Sample,Source,Rate,
DFT,G01024-14,A,High
DFT,U04424-15,B,Low
TFF,T64673-18,A,Low
RRT,I01324-14,A,High
RRT,J01624-14,A,High
...
I'm trying to extract both the 'Sample' ID and the 'Run'.
I read the CSV spreadsheet into a list of dictionaries using the built-in csv.DictReader, but I'm having trouble extracting the elements I am interested in.
import csv
import sys

# sequences of interest
dataset=sys.argv[1]
# CSV spreadsheet
database=sys.argv[2]

sampleIDs=[]
with open(dataset, 'r') as file:
    for line in file:
        line.strip('\n')
        sampleIDs.append(line)
file.close()

seq_Dict=[]
finalList=['init']
with open(database, 'rb') as csvfile:
    reader=csv.DictReader(csvfile, delimiter='\t')
    for line in reader:
        seq_Dict.append(line)
csvfile.close()

for element in seq_Dict:
    for key, value in element.items():
        if element['Sample'] in sampleIDs:
            finalList.pop()
            finalList.append(element['Sample']+" "+element['Run'])

for i in finalList:
    print(i)

This script returns only the info for the last ID in my sampleIDs, so I can see that each pass through the loop is overwriting the result of the previous iteration.
So I did try to use deepcopy, but that didn't seem to work.
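For reference, here is a minimal sketch of one way the lookup described above might be written. It rests on a few assumptions that are not confirmed by the post: that the file is comma-separated as in the sample shown (the script above passes delimiter='\t'), that Python 3 is in use, and that a helper called find_runs (a hypothetical name) is acceptable. Two things in the script above look likely to matter: line.strip('\n') returns a new string rather than changing line, so the stored IDs keep their trailing newline, and finalList.pop() before every append means the list never holds more than one entry.

```python
import csv
import io


def find_runs(id_lines, csv_file, delimiter=","):
    """Return 'Sample Run' strings for rows whose Sample is in id_lines.

    find_runs is a hypothetical helper; the comma delimiter is an assumption
    based on the sample rows shown in the post.
    """
    # str.strip() returns a new string, so the result has to be kept.
    # A set gives fast membership tests on a large spreadsheet.
    wanted = {line.strip() for line in id_lines if line.strip()}

    matches = []
    for row in csv.DictReader(csv_file, delimiter=delimiter):
        if row["Sample"] in wanted:
            # Append every hit; nothing is popped, so earlier hits are kept.
            matches.append(row["Sample"] + " " + row["Run"])
    return matches


# Small in-memory demo built from the rows quoted in the post.
ids = io.StringIO("Y00988-11\nG01024-14\nZ01933-13\n")
table = io.StringIO(
    "Run,Sample,Source,Rate,\n"
    "DFT,G01024-14,A,High\n"
    "DFT,U04424-15,B,Low\n"
    "TFF,T64673-18,A,Low\n"
)
for hit in find_runs(ids, table):
    print(hit)
```

With real files, the same function could be called as find_runs(open(dataset), open(database, newline='')), ideally inside with blocks so both files are closed automatically. No deepcopy is needed, since each match is stored as a new string.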