Posts: 7
Threads: 3
Joined: Oct 2019
Having a list of strings and knowing that all of them have to be merged as they have a common prefix or suffix
for a simple example, having
list=['red', 'allure', 'all'] the output should be out='allured' .
I tried many recursive and iterated methods, but all of them, at some point, as the input list gets longer strings got more complicated, do fail as the merge order is confusing.
All I know is that the best solution could be combining all the strings together, and the vertexes are the right final string.As I am not familiar with graphs at all, could someone please suggest me how to implement this problem through them?
Posts: 582
Threads: 1
Joined: Aug 2019
It is not clear to me what you really want. You do realize the order of the list will change the result? Do you realize a fit is not always possible? Or do you want to test all elements of the list so far? And when is there a match? Is a match of one character sufficient or do you want to merge the element with the most characters fitting?
And what do you mean when you say your algorithm fails? Do you get an error or is the result not what you expected?
Posts: 7
Threads: 3
Joined: Oct 2019
Jun-28-2020, 08:49 AM
(This post was last modified: Jun-28-2020, 09:14 AM by Den.)
Forgive me, you're right,I was not clear enough at all. The idea is given by an italian game "the talking crow" in which a list of words is given and all of its words are supposed to get concatenated, as they all have a prefix that is the suffix of another word or vice versa. If a word has a suffix common as a prefix to more than one word, the one which begins with the longest common slice, gets to be concatenated (match).
So by that, I can know when my algorithm fails, it does not work right for all cases, tbe order of the words merged in the final string does not match the solution. Still, forgive me for my unclearness
(Jun-28-2020, 08:49 AM)Den Wrote: Forgive me, you're right,I was not clear enough at all. The idea is given by an italian game "the talking crow" in which a list of words is given and all of its words are supposed to get concatenated, as they all have a prefix that is the suffix of another word or vice versa. If a word has a suffix common as a prefix to more than one word, the one which begins with the longest common slice, gets to be concatenated (match).
So by that, I can know when my algorithm fails, it does not work right for all cases, tbe order of the words merged in the final string does not match the solution. Still, forgive me for my unclearness
for example, a simple chase:
Original_list=['ottuso', 'iodato', 'coniglio'] the final string of the merged words is:
Output='conigliodatottuso' and the order of the merged word, based from the original order is:
[2,1,0]
Posts: 6,779
Threads: 20
Joined: Feb 2020
I haven't figured out how to generate the graph, but this should find all possible solutions. It is not elegant.
def merge(prefix, suffix):
"""Return a word thaat is prefix+suffix with matching letters only
appearing once. For example, "taco" and "comply" have matching
letters "co" and the merge is "tocomply"
"""
end = min(len(prefix), len(suffix))
for i in range(end, 0, -1):
if prefix[-i:] == suffix[:i]:
return prefix + suffix[i:]
return None
def replace(words, prefix, suffix, merge):
"""Make new words list with prefix and suffix replaced by tne
merged word
"""
newlist = words[:]
newlist.remove(words[prefix])
newlist.remove(words[suffix])
newlist.append(merge)
return newlist
def solve(words):
"""Brute force recursive solution to The Talking Crow game"""
if len(words) == 1:
print(words[0]) # Found a solution!
for i in range(0, len(words)-1):
for j in range(i+1, len(words)):
word = merge(words[i], words[j])
if word:
solve(replace(words, i, j, word))
word = merge(words[j], words[i])
if word:
solve(replace(words, j, i, word))
solve(['ottuso', 'iodato', 'coniglio'])
Posts: 7
Threads: 3
Joined: Oct 2019
(Jun-28-2020, 05:51 PM)deanhystad Wrote: I haven't figured out how to generate the graph, but this should find all possible solutions. It is not elegant.
def merge(prefix, suffix):
"""Return a word thaat is prefix+suffix with matching letters only
appearing once. For example, "taco" and "comply" have matching
letters "co" and the merge is "tocomply"
"""
end = min(len(prefix), len(suffix))
for i in range(end, 0, -1):
if prefix[-i:] == suffix[:i]:
return prefix + suffix[i:]
return None
def replace(words, prefix, suffix, merge):
"""Make new words list with prefix and suffix replaced by tne
merged word
"""
newlist = words[:]
newlist.remove(words[prefix])
newlist.remove(words[suffix])
newlist.append(merge)
return newlist
def solve(words):
"""Brute force recursive solution to The Talking Crow game"""
if len(words) == 1:
print(words[0]) # Found a solution!
for i in range(0, len(words)-1):
for j in range(i+1, len(words)):
word = merge(words[i], words[j])
if word:
solve(replace(words, i, j, word))
word = merge(words[j], words[i])
if word:
solve(replace(words, j, i, word))
solve(['ottuso', 'iodato', 'coniglio']) Thank you for the tips! The reasoning is similar to my code, I've tried this out, but it still does not work for cases with longer and more complicated lists
Posts: 6,779
Threads: 20
Joined: Feb 2020
I suspect the problem is there may be multiple ways to merge some strings and picking the longest overlap (which my code does) may not give the best results. Merge may need to be rewritten as a generator and an additional for loop used to check all possible merges.
def merge(prefix, suffix):
"""Return a word thaat is prefix+suffix with matching letters only
appearing once. For example, "taco" and "comply" have matching
letters "co" and the merge is "tocomply"
"""
end = min(len(prefix), len(suffix))
for i in range(end, 0, -1):
if prefix[-i:] == suffix[:i]:
yield prefix + suffix[i:]
def replace(words, prefix, suffix, merge):
newlist = words[:]
newlist.remove(words[prefix])
newlist.remove(words[suffix])
newlist.append(merge)
return newlist
def solve(words):
if len(words) == 1:
print(words[0])
for i in range(0, len(words)-1):
for j in range(i+1, len(words)):
for word in merge(words[i], words[j]):
solve(replace(words, i, j, word))
for word in merge(words[j], words[i]):
solve(replace(words, j, i, word))
solve(['ottuso', 'iodato', 'coniglio']) It would be greatly appreciated if you would share a word list that has a solution but fails.
Posts: 7
Threads: 3
Joined: Oct 2019
(Jun-28-2020, 10:06 PM)deanhystad Wrote: I suspect the problem is there may be multiple ways to merge some strings and picking the longest overlap (which my code does) may not give the best results. Merge may need to be rewritten as a generator and an additional for loop used to check all possible merges.
def merge(prefix, suffix):
"""Return a word thaat is prefix+suffix with matching letters only
appearing once. For example, "taco" and "comply" have matching
letters "co" and the merge is "tocomply"
"""
end = min(len(prefix), len(suffix))
for i in range(end, 0, -1):
if prefix[-i:] == suffix[:i]:
yield prefix + suffix[i:]
def replace(words, prefix, suffix, merge):
newlist = words[:]
newlist.remove(words[prefix])
newlist.remove(words[suffix])
newlist.append(merge)
return newlist
def solve(words):
if len(words) == 1:
print(words[0])
for i in range(0, len(words)-1):
for j in range(i+1, len(words)):
for word in merge(words[i], words[j]):
solve(replace(words, i, j, word))
for word in merge(words[j], words[i]):
solve(replace(words, j, i, word))
solve(['ottuso', 'iodato', 'coniglio']) It would be greatly appreciated if you would share a word list that has a solution but fails. Of course, here are some examples(to see if all of the strings are merged, it is helpful to enumerate the original list and to produce a sequence of numbers in which each number corresponds to the initial word in the list. I used dictionaries to optimize the timing, but the struggle is still there):
list1=['xgtjYkgnXKFlHn',
'xcKlFXdmrJuGBj',
'xbTXJwtxPI',
'wtxPINSDCLk',
'wBQVGlNLxcKlF',
'vIDrHDiNNkGX',
'tXscIAkfnLhFDxg',
'tDujNHoYrxpx',
'rxpxQKpLrdR',
'qysLnwRnopuiC',
'qtxVjjsQtRntqy',
'qTUiDMpgMbUlg',
'pPfDHrIjBMYTPQ',
'pLrdROsvUvUK',
'onfAfJFmVy',
'nPSwFNdIsYHh',
'mWpdNrdxlyOQnJ',
'mNodRbGdNkHX',
'ldBqdHNNtl',
'kDvtpRsoRV',
'jrFsOyXQvAx',
'iJdWbyFKobu',
'iCvMJDvuyVckUq',
'hMpEpPSXdW',
'gTVGXOIVcEwtxPI',
'dNkHXvmwvPhM',
'bitslkgLwrBLVOX',
'YHhTikUNadrUvID',
'WBXsIYdOvPy',
'VToOpqQSxTC',
'UyUSLJMFtTldB',
'UlgADUoRFLaKrTl',
'UchIJHHWaQvTnPS',
'UKwoQYiJTMQv',
'RoRoLpLWimWpd',
'QvLbGpthUy',
'QnJbhNnHonfAf',
'QVbdjkVpstDu',
'PlVwIqtxVj',
'PQmKNbSfPl',
'OvPyGetXsc',
'NtlcOXCpEsYHh',
'NkGXJrbWaNcoti',
'LVOXqgTVGX',
'KrTldonoTFQVbd',
'IWwRMMATwBEDxT',
'FmVyuhHlfgb',
'FlHnNUchIJ',
'EsYHhQmNod',
'EDxTpjKBFR',
'CmAXVtcxyaoIp',
'CjrMWFTlJwBQVG']
list2=['°$OwStvCI.&VYBxJxSuTN*XR?Gv',
'y^#=pNcvRyq._UXQ:"Ps#msGhx!;',
'y"WpTl,(@uM,a&nrGSGxR°!+StOC',
'x!;!VgpLJxO+kTqoI_eJEWOLh',
'[email protected]^J$_BP$=J+Ffx)OLDu',
'vDdU=vcN?v@XR^^jUsOjQ@?,.Qg',
'ub;tBHsw°PeI:=D=UyhlWF&.w',
'tw!iDTHsHL+)XIFpMva@(dqK*ig$',
'tOCsHdupn;bH&ykl$(DqBf.ndLwl*a',
'sdbqdSmK*H&qm)JuF?.$SXXC!°N',
'sHkrcrh(QURDq;%+A°FMBOG,jg',
'qq@k)nDTV(SoP&kh(eVgL,O&^lf;eI',
'o%Eo°AhuQEGY"gSud(PTyA^DXsJr',
'n?LLO°;KIiHkek.%(Q*VI#$uVUNo=',
'l*apH#;[email protected],QL"X',
'l#Me@)+.YyS_Aoj,!^mnA%n$MhNC',
'ig$%K°B$sdMmJgNgp*kSHRJ°o%',
'hNCSV#BqtuoBi)B°Ocwn&g:,&r)b$-',
'hNCKV@+g):y#dii°=Cs;Xrg"C@B',
'flhqopIajp%?JkbeXgY)gu.)LEa^I',
'flhDn+ye$,AEl_ylBH_jT!,*hvLdw',
'eNMv&Q#°bL.=EY+)GMDDQN&@n',
'eISBe_Nbne-cBPW:nGlGkjjwWkd*',
'c.vJh?#,*neHdr(CVGtuF$.:R',
'_BxaJTtB*xxiVoD,IrRNYvQ*rH#oB',
'_BNaa°("NQH_s=$rlumGm?AolKJU%X:',
'^YKrD#yV&c+txRIVWrn*AS_Pk,?',
'X^qqhp*evwu:$owd#e-CWFOl_Rv;Jd',
'VpLyf°)^:;wPRfc_RfMQDW?O^Y',
'V,k%:yEAocud+qTJnBq&AJkhUW$VpL',
'T°&j-K@%vnS(ajJ(n_R(BWingaP$V,',
'S)$,K_hXcxV:)!+Db(r@SVUfk&',
'R!ySX_agBsbl*":jiLq-n;ujD#?JX',
'QgPfSGy%YjrX.AYUa$nhkLAusHk',
'PTXW*Rj-hXHLfd,Nnp#:BcG*V,-hNC',
'P)?wLmc&PAyXSpSiCtWQjG=bQyTUvy',
'OtWaB;mtMO+Th@fDHBILQc@rso?eN',
'OLhT.KP:j.+G)mg°%hnl"aNleBbf*E',
'KNtWC):IsCtenH.qM°v@DdR@F+Iq?G',
'JrdA°_$&=sUN("-o^SamfCe.PA',
'JXi)T!(x°M#DTrQTVF*&lF#=!Asd',
'I$ttN)l-°EP(I*Cp;=-*%v+bFI;oa',
'GuFCCdC:sUn:"Il-;_SdlGDS)$',
'FWVcX%F(&-NQ°jr;SCxspUQLhnMay"',
'E$lRIBGinwGFAf$BQWnArVv)hUT°',
'DhISs.GIeDd-:$s.."o=&))BwAc',
'DAdbjw!I(^Xu_&nCy)cmGDE$EK_B',
'A!jyU@pe°H.#+ky@ABH$vtM*Tu_BN',
'@B%,mgcwHOtH:)UcQ=*QGcJbgPTX',
'?shpvPTa?GNRify=f,A?(#FVR@-(',
'=$l:CJLocwT&MA,j)fctgd)Po_EpkR',
';oaXNwmN*,#oiFM(=e°j)k°CEur$k',
';JdDxJ!MN°K;O$=tw_gtYM_Mflh',
';-yAXq-SELM-I,^xeg$f°$)DA',
':Rb!-dY==@O+R^t!k?o^RR°^"YcXD',
'.wSR:hT°Chih:C:,kXVP-Dc*+B#',
'-DMxeq°B^-G^UVVo"G_cuaREkhMflh',
',jgbY"EL;=L^v*KpNlY*uSHe;-',
'+B#vV=GljgoUUg;k=sfdi*!-D',
'*sKbuEJ-ct%I^a*YPD":y@$_A(RR(t',
'*n;@&hmc@#F&&p%Bj&FyVm^guOtW',
')KdLx+&hN°Wk_yv+sbf^lygc$_qq',
'(D%MklBitTLJ#UsJRRHdV:fVuwwK',
'&i*cy%e#TBaO-,@[email protected]',
'%X:=M*RCBnL?)*^tGSH&.?EaP)',
'$kIn#+AgSy:°kS__xv,Jjo(D#:IeFW',
'$-weivcvf_o:ExonqAJCui%°$',
'#ovN"jlHsYTDm=MhsYu#:kLmJ?h*n',
'#oBd^V#(DiJ&rHnGLke,;C&@st)K',
'!°NC%Cl%P?CDTf!_;.MOFhD*+B#']
list3=['t%vw;$GSG*&Q!&I?v*sqk)WJL',
'p&lE°jqHf&ekh$@+It%$Yb^h^*Vo:',
'o:.jq,Ew=*JfjTxG,OjPisML#TvWAD',
'n*KqK-Wax^C+D%lTS!PEQ;+#sCQ',
'm.jK@fyp"@gmSmB=NXGbnC^jxXVp',
'k?=.=Na".EBBwGe@"ngbCMWTd',
'iq#;uE,lOg?L-,u:KbBEPVR=xuFt',
'fyohhG+oe)jVBUV@)"B:)*KSd(Pm',
'_AHHQFj*wtaisDAqIW;UpJ+@Y##',
'Ycm.oXIyC=UfK!&=#$V#^^u"He',
'Y=D#(FwKY;n$d&F(Kqo;tdtWG',
'XdV@klgS^+FReN,NDb=HMfU*,',
'WAD-u$D#qrOwuEAdj??@?JheY',
'Vpwdt&FR.Xsn,j+cW^SwhR@Xh^pU',
'VE$:Wh@l#aSo*c@-pX)LWukYWUTl',
'UTl#ftb°"S-@RMp#.+WkmVMy;ac)',
'UEYm^WTdgF,Dwkh_uTF@WuFv,IX',
'Td(*=RLrijibeqbC#xkLrsDkmGHu',
'QU(:sqLPNa***_&mBI)$;vF).T&;',
'Pm*n(A)f$ewG°DrpI)Ca$oLSkI*dXd',
'LXfCQcroU^xUUo;)@QU.G"wwy_',
'KX*vpaoyDxD@maij(=,;l=cMoi',
'Hu=Q,(c;YnPn:NO!gIBapKQ:;j',
'HeUig?&j)!Tw°NLMv!l$PaQ^=@;KX*',
'GfHgjm&,kiRSwBv@o?tpCBMe!Jcq%r',
'G$U$&a;yHftvC,WqtQFHIGQ;$L',
'FtG-wPOMop@VniPmHjbP;O-"??Nu!;',
'FRIgRGKP^_xtlME(:PSb;Vn#C',
'COQ"(iIj@s)OMVu=It")ab!j;.Y',
'Bx,ty&WO**iBSTHrJQ%SVmu@yRxBV',
';j?jPOQqEGktJjW)SfhPB"mGfH',
';Uhii%q%+vVebS:w?:QaDqk?=',
';$L;(i+°NUNFBRy',
',IXgb.:Xf!KvOwaGfflLjcK(+mA,-',
',HQ#k+HV-sM_kJgcpyXKcX""t%',
',-sKy%blq$FAHye,!"uq*gG;IEn',
')°jI=,KK^?rnKqWMl_uTIH)fy',
'%reE(;XJLa*E!BAeNgoO-GBx,',
'##s=^U.aLcNubR!=IdqyaSSlFRI',
'!;yE#bUUl"ypT*o_n:RN°YiYaMp']
|