Python Forum
How to merge strings through graphs
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
How to merge strings through graphs
#1
Having a list of strings and knowing that all of them have to be merged as they have a common prefix or suffix
for a simple example, having
list=['red', 'allure', 'all']
the output should be
out='allured'
.
I tried many recursive and iterated methods, but all of them, at some point, as the input list gets longer strings got more complicated, do fail as the merge order is confusing.
All I know is that the best solution could be combining all the strings together, and the vertexes are the right final string.As I am not familiar with graphs at all, could someone please suggest me how to implement this problem through them?
Reply
#2
It is not clear to me what you really want. You do realize the order of the list will change the result? Do you realize a fit is not always possible? Or do you want to test all elements of the list so far? And when is there a match? Is a match of one character sufficient or do you want to merge the element with the most characters fitting?
And what do you mean when you say your algorithm fails? Do you get an error or is the result not what you expected?
Reply
#3
Forgive me, you're right,I was not clear enough at all. The idea is given by an italian game "the talking crow" in which a list of words is given and all of its words are supposed to get concatenated, as they all have a prefix that is the suffix of another word or vice versa. If a word has a suffix common as a prefix to more than one word, the one which begins with the longest common slice, gets to be concatenated (match).
So by that, I can know when my algorithm fails, it does not work right for all cases, tbe order of the words merged in the final string does not match the solution. Still, forgive me for my unclearness

(Jun-28-2020, 08:49 AM)Den Wrote: Forgive me, you're right,I was not clear enough at all. The idea is given by an italian game "the talking crow" in which a list of words is given and all of its words are supposed to get concatenated, as they all have a prefix that is the suffix of another word or vice versa. If a word has a suffix common as a prefix to more than one word, the one which begins with the longest common slice, gets to be concatenated (match).
So by that, I can know when my algorithm fails, it does not work right for all cases, tbe order of the words merged in the final string does not match the solution. Still, forgive me for my unclearness

for example, a simple chase:
Original_list=['ottuso', 'iodato', 'coniglio']
the final string of the merged words is:
Output='conigliodatottuso'
and the order of the merged word, based from the original order is:
[2,1,0]
Reply
#4
I haven't figured out how to generate the graph, but this should find all possible solutions. It is not elegant.
def merge(prefix, suffix):
    """Return a word thaat is prefix+suffix with matching letters only
    appearing once.   For example, "taco" and "comply" have matching
    letters "co" and the merge is "tocomply"
    """
    end = min(len(prefix), len(suffix))
    for i in range(end, 0, -1):
        if prefix[-i:] == suffix[:i]:
            return prefix + suffix[i:]
    return None

def replace(words, prefix, suffix, merge):
    """Make new words list with prefix and suffix replaced by tne
    merged word
    """
    newlist = words[:]
    newlist.remove(words[prefix])
    newlist.remove(words[suffix])
    newlist.append(merge)
    return newlist
    
def solve(words):
    """Brute force recursive solution to The Talking Crow game"""
    if len(words) == 1:
        print(words[0]) # Found a solution!
    for i in range(0, len(words)-1):
        for j in range(i+1, len(words)):
            word = merge(words[i], words[j])
            if word:
                solve(replace(words, i, j, word))
            word = merge(words[j], words[i])
            if word:
                solve(replace(words, j, i, word))

solve(['ottuso', 'iodato', 'coniglio'])
Reply
#5
(Jun-28-2020, 05:51 PM)deanhystad Wrote: I haven't figured out how to generate the graph, but this should find all possible solutions. It is not elegant.
def merge(prefix, suffix):
    """Return a word thaat is prefix+suffix with matching letters only
    appearing once.   For example, "taco" and "comply" have matching
    letters "co" and the merge is "tocomply"
    """
    end = min(len(prefix), len(suffix))
    for i in range(end, 0, -1):
        if prefix[-i:] == suffix[:i]:
            return prefix + suffix[i:]
    return None

def replace(words, prefix, suffix, merge):
    """Make new words list with prefix and suffix replaced by tne
    merged word
    """
    newlist = words[:]
    newlist.remove(words[prefix])
    newlist.remove(words[suffix])
    newlist.append(merge)
    return newlist
    
def solve(words):
    """Brute force recursive solution to The Talking Crow game"""
    if len(words) == 1:
        print(words[0]) # Found a solution!
    for i in range(0, len(words)-1):
        for j in range(i+1, len(words)):
            word = merge(words[i], words[j])
            if word:
                solve(replace(words, i, j, word))
            word = merge(words[j], words[i])
            if word:
                solve(replace(words, j, i, word))

solve(['ottuso', 'iodato', 'coniglio'])
Thank you for the tips! The reasoning is similar to my code, I've tried this out, but it still does not work for cases with longer and more complicated lists
Reply
#6
I suspect the problem is there may be multiple ways to merge some strings and picking the longest overlap (which my code does) may not give the best results. Merge may need to be rewritten as a generator and an additional for loop used to check all possible merges.
def merge(prefix, suffix):
    """Return a word thaat is prefix+suffix with matching letters only
    appearing once.   For example, "taco" and "comply" have matching
    letters "co" and the merge is "tocomply"
    """
    end = min(len(prefix), len(suffix))
    for i in range(end, 0, -1):
        if prefix[-i:] == suffix[:i]:
            yield prefix + suffix[i:]

def replace(words, prefix, suffix, merge):
    newlist = words[:]
    newlist.remove(words[prefix])
    newlist.remove(words[suffix])
    newlist.append(merge)
    return newlist
    
def solve(words):
    if len(words) == 1:
        print(words[0])
    for i in range(0, len(words)-1):
        for j in range(i+1, len(words)):
            for word in  merge(words[i], words[j]):
                solve(replace(words, i, j, word))
            for word in merge(words[j], words[i]):
                solve(replace(words, j, i, word))

solve(['ottuso', 'iodato', 'coniglio'])
It would be greatly appreciated if you would share a word list that has a solution but fails.
Reply
#7
(Jun-28-2020, 10:06 PM)deanhystad Wrote: I suspect the problem is there may be multiple ways to merge some strings and picking the longest overlap (which my code does) may not give the best results. Merge may need to be rewritten as a generator and an additional for loop used to check all possible merges.
def merge(prefix, suffix):
    """Return a word thaat is prefix+suffix with matching letters only
    appearing once.   For example, "taco" and "comply" have matching
    letters "co" and the merge is "tocomply"
    """
    end = min(len(prefix), len(suffix))
    for i in range(end, 0, -1):
        if prefix[-i:] == suffix[:i]:
            yield prefix + suffix[i:]

def replace(words, prefix, suffix, merge):
    newlist = words[:]
    newlist.remove(words[prefix])
    newlist.remove(words[suffix])
    newlist.append(merge)
    return newlist
    
def solve(words):
    if len(words) == 1:
        print(words[0])
    for i in range(0, len(words)-1):
        for j in range(i+1, len(words)):
            for word in  merge(words[i], words[j]):
                solve(replace(words, i, j, word))
            for word in merge(words[j], words[i]):
                solve(replace(words, j, i, word))

solve(['ottuso', 'iodato', 'coniglio'])
It would be greatly appreciated if you would share a word list that has a solution but fails.
Of course, here are some examples(to see if all of the strings are merged, it is helpful to enumerate the original list and to produce a sequence of numbers in which each number corresponds to the initial word in the list. I used dictionaries to optimize the timing, but the struggle is still there):
list1=['xgtjYkgnXKFlHn',
 'xcKlFXdmrJuGBj',
 'xbTXJwtxPI',
 'wtxPINSDCLk',
 'wBQVGlNLxcKlF',
 'vIDrHDiNNkGX',
 'tXscIAkfnLhFDxg',
 'tDujNHoYrxpx',
 'rxpxQKpLrdR',
 'qysLnwRnopuiC',
 'qtxVjjsQtRntqy',
 'qTUiDMpgMbUlg',
 'pPfDHrIjBMYTPQ',
 'pLrdROsvUvUK',
 'onfAfJFmVy',
 'nPSwFNdIsYHh',
 'mWpdNrdxlyOQnJ',
 'mNodRbGdNkHX',
 'ldBqdHNNtl',
 'kDvtpRsoRV',
 'jrFsOyXQvAx',
 'iJdWbyFKobu',
 'iCvMJDvuyVckUq',
 'hMpEpPSXdW',
 'gTVGXOIVcEwtxPI',
 'dNkHXvmwvPhM',
 'bitslkgLwrBLVOX',
 'YHhTikUNadrUvID',
 'WBXsIYdOvPy',
 'VToOpqQSxTC',
 'UyUSLJMFtTldB',
 'UlgADUoRFLaKrTl',
 'UchIJHHWaQvTnPS',
 'UKwoQYiJTMQv',
 'RoRoLpLWimWpd',
 'QvLbGpthUy',
 'QnJbhNnHonfAf',
 'QVbdjkVpstDu',
 'PlVwIqtxVj',
 'PQmKNbSfPl',
 'OvPyGetXsc',
 'NtlcOXCpEsYHh',
 'NkGXJrbWaNcoti',
 'LVOXqgTVGX',
 'KrTldonoTFQVbd',
 'IWwRMMATwBEDxT',
 'FmVyuhHlfgb',
 'FlHnNUchIJ',
 'EsYHhQmNod',
 'EDxTpjKBFR',
 'CmAXVtcxyaoIp',
 'CjrMWFTlJwBQVG']

list2=['°$OwStvCI.&VYBxJxSuTN*XR?Gv',
 'y^#=pNcvRyq._UXQ:"Ps#msGhx!;',
 'y"WpTl,(@uM,a&nrGSGxR°!+StOC',
 'x!;!VgpLJxO+kTqoI_eJEWOLh',
 '[email protected]^J$_BP$=J+Ffx)OLDu',
 'vDdU=vcN?v@XR^^jUsOjQ@?,.Qg',
 'ub;tBHsw°PeI:=D=UyhlWF&.w',
 'tw!iDTHsHL+)XIFpMva@(dqK*ig$',
 'tOCsHdupn;bH&ykl$(DqBf.ndLwl*a',
 'sdbqdSmK*H&qm)JuF?.$SXXC!°N',
 'sHkrcrh(QURDq;%+A°FMBOG,jg',
 'qq@k)nDTV(SoP&kh(eVgL,O&^lf;eI',
 'o%Eo°AhuQEGY"gSud(PTyA^DXsJr',
 'n?LLO°;KIiHkek.%(Q*VI#$uVUNo=',
 'l*apH#;[email protected],QL"X',
 'l#Me@)+.YyS_Aoj,!^mnA%n$MhNC',
 'ig$%K°B$sdMmJgNgp*kSHRJ°o%',
 'hNCSV#BqtuoBi)B°Ocwn&g:,&r)b$-',
 'hNCKV@+g):y#dii°=Cs;Xrg"C@B',
 'flhqopIajp%?JkbeXgY)gu.)LEa^I',
 'flhDn+ye$,AEl_ylBH_jT!,*hvLdw',
 'eNMv&Q#°bL.=EY+)GMDDQN&@n',
 'eISBe_Nbne-cBPW:nGlGkjjwWkd*',
 'c.vJh?#,*neHdr(CVGtuF$.:R',
 '_BxaJTtB*xxiVoD,IrRNYvQ*rH#oB',
 '_BNaa°("NQH_s=$rlumGm?AolKJU%X:',
 '^YKrD#yV&c+txRIVWrn*AS_Pk,?',
 'X^qqhp*evwu:$owd#e-CWFOl_Rv;Jd',
 'VpLyf°)^:;wPRfc_RfMQDW?O^Y',
 'V,k%:yEAocud+qTJnBq&AJkhUW$VpL',
 'T°&j-K@%vnS(ajJ(n_R(BWingaP$V,',
 'S)$,K_hXcxV:)!+Db(r@SVUfk&',
 'R!ySX_agBsbl*":jiLq-n;ujD#?JX',
 'QgPfSGy%YjrX.AYUa$nhkLAusHk',
 'PTXW*Rj-hXHLfd,Nnp#:BcG*V,-hNC',
 'P)?wLmc&PAyXSpSiCtWQjG=bQyTUvy',
 'OtWaB;mtMO+Th@fDHBILQc@rso?eN',
 'OLhT.KP:j.+G)mg°%hnl"aNleBbf*E',
 'KNtWC):IsCtenH.qM°v@DdR@F+Iq?G',
 'JrdA°_$&=sUN("-o^SamfCe.PA',
 'JXi)T!(x°M#DTrQTVF*&lF#=!Asd',
 'I$ttN)l-°EP(I*Cp;=-*%v+bFI;oa',
 'GuFCCdC:sUn:"Il-;_SdlGDS)$',
 'FWVcX%F(&-NQ°jr;SCxspUQLhnMay"',
 'E$lRIBGinwGFAf$BQWnArVv)hUT°',
 'DhISs.GIeDd-:$s.."o=&))BwAc',
 'DAdbjw!I(^Xu_&nCy)cmGDE$EK_B',
 'A!jyU@pe°H.#+ky@ABH$vtM*Tu_BN',
 '@B%,mgcwHOtH:)UcQ=*QGcJbgPTX',
 '?shpvPTa?GNRify=f,A?(#FVR@-(',
 '=$l:CJLocwT&MA,j)fctgd)Po_EpkR',
 ';oaXNwmN*,#oiFM(=e°j)k°CEur$k',
 ';JdDxJ!MN°K;O$=tw_gtYM_Mflh',
 ';-yAXq-SELM-I,^xeg$f°$)DA',
 ':Rb!-dY==@O+R^t!k?o^RR°^"YcXD',
 '.wSR:hT°Chih:C:,kXVP-Dc*+B#',
 '-DMxeq°B^-G^UVVo"G_cuaREkhMflh',
 ',jgbY"EL;=L^v*KpNlY*uSHe;-',
 '+B#vV=GljgoUUg;k=sfdi*!-D',
 '*sKbuEJ-ct%I^a*YPD":y@$_A(RR(t',
 '*n;@&hmc@#F&&p%Bj&FyVm^guOtW',
 ')KdLx+&hN°Wk_yv+sbf^lygc$_qq',
 '(D%MklBitTLJ#UsJRRHdV:fVuwwK',
 '&i*cy%e#TBaO-,@[email protected]',
 '%X:=M*RCBnL?)*^tGSH&.?EaP)',
 '$kIn#+AgSy:°kS__xv,Jjo(D#:IeFW',
 '$-weivcvf_o:ExonqAJCui%°$',
 '#ovN"jlHsYTDm=MhsYu#:kLmJ?h*n',
 '#oBd^V#(DiJ&rHnGLke,;C&@st)K',
 '!°NC%Cl%P?CDTf!_;.MOFhD*+B#']

list3=['t%vw;$GSG*&Q!&I?v*sqk)WJL',
 'p&lE°jqHf&ekh$@+It%$Yb^h^*Vo:',
 'o:.jq,Ew=*JfjTxG,OjPisML#TvWAD',
 'n*KqK-Wax^C+D%lTS!PEQ;+#sCQ',
 'm.jK@fyp"@gmSmB=NXGbnC^jxXVp',
 'k?=.=Na".EBBwGe@"ngbCMWTd',
 'iq#;uE,lOg?L-,u:KbBEPVR=xuFt',
 'fyohhG+oe)jVBUV@)"B:)*KSd(Pm',
 '_AHHQFj*wtaisDAqIW;UpJ+@Y##',
 'Ycm.oXIyC=UfK!&=#$V#^^u"He',
 'Y=D#(FwKY;n$d&F(Kqo;tdtWG',
 'XdV@klgS^+FReN,NDb=HMfU*,',
 'WAD-u$D#qrOwuEAdj??@?JheY',
 'Vpwdt&FR.Xsn,j+cW^SwhR@Xh^pU',
 'VE$:Wh@l#aSo*c@-pX)LWukYWUTl',
 'UTl#ftb°"S-@RMp#.+WkmVMy;ac)',
 'UEYm^WTdgF,Dwkh_uTF@WuFv,IX',
 'Td(*=RLrijibeqbC#xkLrsDkmGHu',
 'QU(:sqLPNa***_&mBI)$;vF).T&;',
 'Pm*n(A)f$ewG°DrpI)Ca$oLSkI*dXd',
 'LXfCQcroU^xUUo;)@QU.G"wwy_',
 'KX*vpaoyDxD@maij(=,;l=cMoi',
 'Hu=Q,(c;YnPn:NO!gIBapKQ:;j',
 'HeUig?&j)!Tw°NLMv!l$PaQ^=@;KX*',
 'GfHgjm&,kiRSwBv@o?tpCBMe!Jcq%r',
 'G$U$&a;yHftvC,WqtQFHIGQ;$L',
 'FtG-wPOMop@VniPmHjbP;O-"??Nu!;',
 'FRIgRGKP^_xtlME(:PSb;Vn#C',
 'COQ"(iIj@s)OMVu=It")ab!j;.Y',
 'Bx,ty&WO**iBSTHrJQ%SVmu@yRxBV',
 ';j?jPOQqEGktJjW)SfhPB"mGfH',
 ';Uhii%q%+vVebS:w?:QaDqk?=',
 ';$L;(i+°NUNFBRy',
 ',IXgb.:Xf!KvOwaGfflLjcK(+mA,-',
 ',HQ#k+HV-sM_kJgcpyXKcX""t%',
 ',-sKy%blq$FAHye,!"uq*gG;IEn',
 ')°jI=,KK^?rnKqWMl_uTIH)fy',
 '%reE(;XJLa*E!BAeNgoO-GBx,',
 '##s=^U.aLcNubR!=IdqyaSSlFRI',
 '!;yE#bUUl"ypT*o_n:RN°YiYaMp']
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Trying to understand strings and lists of strings Konstantin23 2 758 Aug-06-2023, 11:42 AM
Last Post: deanhystad
  Plot several graphs on the same row Menthix 1 1,032 Mar-18-2022, 05:47 PM
Last Post: deanhystad
  Splitting strings in list of strings jesse68 3 1,758 Mar-02-2022, 05:15 PM
Last Post: DeaD_EyE
  grouped bar graphs tobiasfw 1 1,402 Feb-22-2022, 02:25 PM
Last Post: deanhystad
  Tranforming into .exe with Matplotlib (graphs) ericvrocha 0 2,910 Oct-14-2019, 06:54 PM
Last Post: ericvrocha
  Little Help With Graphs ericvrocha 5 3,176 Oct-14-2019, 12:07 PM
Last Post: buran
  Finding multiple strings between the two same strings Slither 1 2,514 Jun-05-2019, 09:02 PM
Last Post: Yoriz
  lists, strings, and byte strings Skaperen 2 4,217 Mar-02-2018, 02:12 AM
Last Post: Skaperen
  Drawing graphs with matplotlib JRod 5 4,706 Jul-31-2017, 10:00 AM
Last Post: JRod

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020