Hello,
*sorry about the subject, unable to edit it*
I am new to Python and trying to figure out a way to start. I have a large section of code that was generated by someone else and I now need to edit the code to add some filters. I was thinking of writing a separate code rather than to mess around with the original one. So I have several excel sheets, each having 11 tabs. I need to take data from the first tab and write the output to the second tab. The first tab looks like this, starting from Row 3
FID IID LOCUS CHROM POS REF ALT ZG freq type subtype consequence percent type1 type2 type3 type4 type5 type6 type7 type8 Score type9 type10 type11 type12 type13
A_1 B1 ABC 7 1000 A G yes 150 snp ts Random 68 546 3532 253 4000 5 123 200 0E0
A_2 B2 APP 21 23 C T yes 36 snp ts Random 2 547 5435 353 6000 6 345 200 0E0
A_3 B3 APOE 19 200 A G yes 0 snp ts Random 99 234 6456 5235 26994 9 1 200 0E0
The filters that needs to be created are
1) keep anything greater than 'POS' 5 and less than 160 AND
2) From an external csv file, read the file with headers CHROM/POS/REF/ALT/SCORE and create a new column on the this sheet called SCORE to write only the ones with score 0 or 1 or 2 (out of 6) matching CHROM/POS/REF/ALT/
3) Keep the ones with 'percent' more than 5% for those with score 0
I have written a shell/bash script for this by converting the sheet to a text file, which I am happy to share if it helps.
*sorry about the subject, unable to edit it*
I am new to Python and trying to figure out a way to start. I have a large section of code that was generated by someone else and I now need to edit the code to add some filters. I was thinking of writing a separate code rather than to mess around with the original one. So I have several excel sheets, each having 11 tabs. I need to take data from the first tab and write the output to the second tab. The first tab looks like this, starting from Row 3
FID IID LOCUS CHROM POS REF ALT ZG freq type subtype consequence percent type1 type2 type3 type4 type5 type6 type7 type8 Score type9 type10 type11 type12 type13
A_1 B1 ABC 7 1000 A G yes 150 snp ts Random 68 546 3532 253 4000 5 123 200 0E0
A_2 B2 APP 21 23 C T yes 36 snp ts Random 2 547 5435 353 6000 6 345 200 0E0
A_3 B3 APOE 19 200 A G yes 0 snp ts Random 99 234 6456 5235 26994 9 1 200 0E0
The filters that needs to be created are
1) keep anything greater than 'POS' 5 and less than 160 AND
2) From an external csv file, read the file with headers CHROM/POS/REF/ALT/SCORE and create a new column on the this sheet called SCORE to write only the ones with score 0 or 1 or 2 (out of 6) matching CHROM/POS/REF/ALT/
3) Keep the ones with 'percent' more than 5% for those with score 0
I have written a shell/bash script for this by converting the sheet to a text file, which I am happy to share if it helps.