count string occurrences of 2nd file in lines of first

showkat · (This post was last modified: Mar-01-2018, 07:36 AM by micseydel.)

I need to generate the permutations of the nucleotides A, T, G and C for a given length: di-composition (e.g. AA, AT, AG, AC), tri-composition (AAA, AAT, AAC, AAG), tetra, penta and so on (one length at a time). Then, in a second file that contains sequences with associated values, I need to count the occurrences of each permutation in every sequence. I have generated the permutation list. Now I need to loop through the sequences only (splitting each sequence from its value), count each of the permutations generated above, and write the output to a new file. But I'm getting counts for only the first sequence and not for the other sequences.

The logic of the programme I tried to follow is:

Generate the permutations of ATCG in file1 (e.g. AT AG AC AA ...)
Read the generated file1 and the sequence#value file (DNA_seq_val.txt)
Read the sequences and separate the sequences from the values
Loop through the sequences for the permutations and write their counts with the value (each separated by a comma) to the results file.

Input test file (DNA_seq_val.txt):
AAAATTTT#99
CCCCGGGG#77
ATATATCGCGCG#88
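
The steps above, applied to this sample input, can be sketched entirely in memory as follows. This is only a minimal sketch, not the posted program: the names `kmers`, `count_line` and `records` are hypothetical, and the sequences are hard-coded instead of being read from DNA_seq_val.txt. It uses `str.count`, which counts non-overlapping occurrences (so "AA" is counted twice in "AAAA").

```python
from itertools import product


def kmers(k, alphabet="ACGT"):
    """Return every length-k string over the alphabet, in product() order."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def count_line(record, kmer_list):
    """Turn 'SEQ#VALUE' into 'c1,c2,...,VALUE<TAB>SEQ'."""
    seq, value = record.strip().split("#")
    # str.count() counts non-overlapping occurrences of each k-mer
    counts = [str(seq.count(kmer)) for kmer in kmer_list]
    return ",".join(counts + [value]) + "\t" + seq


# Hypothetical stand-in for the lines of DNA_seq_val.txt
records = ["AAAATTTT#99", "CCCCGGGG#77", "ATATATCGCGCG#88"]

dimers = kmers(2)  # use kmers(3) for tri-composition, and so on
for record in records:
    print(count_line(record, dimers))
```

Because the k-mers are kept in a list, they can be reused for every sequence, and each count is written as one whole number followed by a comma.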

Output I got:
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
77 CCCCGGGG
88 ATATATCGCGCG
Output needed:
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,77 CCCCGGGG
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,88 ATATATCGCGCG
(where x = the corresponding counts, as in the first line)

My code is below:

from itertools import product
import os

f2 = open('TRYYY', 'a')

#********Generate the permutations start********
per = product('ACGT', repeat=2)  # ACGT = nucleotides; 2 = dinucleotides (replace 2 with 3 for trinucleotides, and so on)
f = open('myfile', 'w')
p = ""
for p in per:
    p = "".join(p)
    f.write(p + "\n")
f.close()

#********Generate the permutations ENDS********

with open('DNA_seq_val.txt', 'r+') as SEQ, open('myfile', 'r+') as TET: #open two files
	SEQ_lines = sum(1 for line in open('DNA_seq_val.txt'))		#count lines in sequences file
	#print (SEQ_lines)
	compo_lines = sum(1 for line in open('myfile'))		#count lines in composition
	#print (compo_lines)
	for lines in SEQ:
		line,val1 = lines.split("#")
		val2 = val1.rstrip('\n')
		val = str(val2)
		line = line.rstrip('\n')
		length =len(line)
		#print (line)		
		#print (val)
		LIN = line, val
		#print (LIN)
		newstr = "".join((line))
		print (newstr)
		#while True:		# infinte loop
		for PER in TET:
			#print (line)
			PER = PER.rstrip('\n')
			length2 =len(PER)
			#print (length2)
			#print (line)
#			print (PER)
			C_PER  = str(line.count(PER))
#			print (C_PER)
			for R in C_PER:
				R1 = "".join(R)
				f2.write(R1+ ",")
		f2.write(val,)
		f2.write('\t')
		f2.write(line)
		f2.write('\n')
	#exit()
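
A note on the code above: a file object can be iterated only once per pass. After the inner `for PER in TET:` loop has run for the first sequence, `TET` is at end of file, so for the later sequences the loop body never executes and only the value and sequence are written. The sketch below shows this behaviour with an in-memory file (`io.StringIO` is used here only as a stand-in for 'myfile'; the variable names are hypothetical). Calling `seek(0)` before each pass, or reading the lines into a list once, are two possible ways around it.

```python
import io

# In-memory stand-in for the permutation file; a real file object behaves the same way
kmer_file = io.StringIO("AA\nAC\nAG\n")

first_pass = [line.rstrip("\n") for line in kmer_file]   # reads all three lines
second_pass = [line.rstrip("\n") for line in kmer_file]  # already at end of file

kmer_file.seek(0)  # rewind to the start before iterating again
third_pass = [line.rstrip("\n") for line in kmer_file]

print(first_pass)   # ['AA', 'AC', 'AG']
print(second_pass)  # []
print(third_pass)   # ['AA', 'AC', 'AG']
```

Separately, `for R in C_PER:` iterates over the characters of the count string, so a count of 10 or more would be written as separate digits ("1,0,"); writing `C_PER + ","` in one call would avoid that.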

showkat · Mar-01-2018, 11:25 AM

That is all, actually. Sorry for my naivety; I'm new to programming and to this forum.
The inputs and outputs are provided above the code.
