Python Forum

Full Version: Browse a graph
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Hello everyone,

I am creating an algorithm that will allow me to find all the chains of size 4 interactions in this tree taking into account the fact that we can go back in the tree (the interactions are two-way) (and you can not find the same number twice in a chain).

                                 1
                                -  -
                               -     -
                             6       4
                              -        -
                                 -      -
                                     -   9
                                        -   -
                                       -     -
                                     33       66
I managed to generate size chains 4 (here is what my algorithm currently finds me)

1 6 9 66
1 6 9 4
1 6 9 33
1 4 9 6
1 4 9 66
1 4 9 33
4 1 6 9
4 9 6 1
6 1 4 9
6 9 4 1
9 6 1 4
9 4 1 6
33 9 6 1
33 9 4 1
66 9 6 1
66 9 4 1

However I would also need the "intermediate chains", that is to say the chains that are "blocked" before reaching the size of 4, for example:

4-9-33
6-9-33
9-33
9-66
66-9-33
33-9-66
6-9-66
ect ...

it must not be much to change / add, I did research on potential functions (including intertools) for this but I can not find anything that solves my problem ...

Here is the file I'm working on:

1 4
1 6
4 9
6 9
9 33
9 66
4 1
6 1
9 4
9 6
33 9
66 9

Here is my current algorithm:

import sys
sys.setrecursionlimit(1000000)
from collections import defaultdict
dd = defaultdict(set)
  
with open ("C:/Users/lveillat/Desktop/Données stage/Fichiers tests/testchaines3.txt","r") as f1:
    for ligne in f1:
        lp = ligne.rstrip('\n').split(" ")
        prot1 = lp[0] #I select the first protein of each interactions
        prot2 = lp[1] #I select the second protein of each interactions
        dd[prot1].add(prot2) #I create my dictionary with key the first prot of the interaction and in values the sets of prots with which it can interact
    print(dd)
  
def chain(maillon, pathway, limite=4): 
    next_ = maillon.get(pathway[-1], None) #next_ = We add a link to the existing chain according to the last protein of the pathway
  
    if next_ is None or len(pathway) >= limite : #If there is no protein found interacting with the last protein of the pathway, or if the size exceeds the limit then we move to the next chain
        yield pathway 
  
    else: #If we still find proteins interacting with the last protein of the pathway and if the size limit is not reached
        for m in next_: # for an interaction in the set of possible pathway interactions [-1]
            if m not in pathway: #to avoid ending up with repetitions of the same proteins in the same chain
                yield from chain(maillon, pathway + [m])
 
for k in dd: # For each prot of the dico
    for z in chain(dd, pathway = [k]): 
        print (' '.join(z))
and here is my current release:

defaultdict (<class 'set'>, {'1': {'6', '4'}, '4': {'1', '9'}, '6': {'1', '9' }, '9': {'6', '66', '4', '33'}, '33': {'9'}, '66': {'9'}})
1 6 9 66
1 6 9 4
1 6 9 33
1 4 9 6
1 4 9 66
1 4 9 33
4 1 6 9
4 9 6 1
6 1 4 9
6 9 4 1
9 6 1 4
9 4 1 6
33 9 6 1
33 9 4 1
66 9 6 1
66 9 4 1

Thank you for your help