Feb-07-2017, 04:21 PM
This is my problem statement: American states are identified by 2-letter abbreviations, and there are 50 of them. Find the combinations of abbreviations that form valid English words.
Find valid 4-letter, 6-letter, 8-letter and 10-letter words. No abbreviation should be used more than once in a word.
Below is the code I used for finding 10-letter words. The others are all very similar. The script ran pretty quickly for 4- and 6-letter words, took almost 40 minutes for 8-letter words, and for 10-letter words I let it run for 2 hours and it still didn't complete.
words.txt is a file I downloaded from here
svnwebdotfreebsddotorgslashcsrgslashshareslashdictslashwordsquestionmarkrevision=61569ampersandview=co
Would appreciate your insight on how to speed it up. I think it's almost all CPU and not much I/O. It is executing the 2 if's 50^5 times, right?
states1 = ["AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN","IA","KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ","NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT","VA","WA","WV","WI","WY"]
states2 = ["AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN","IA","KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ","NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT","VA","WA","WV","WI","WY"]
states3 = ["AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN","IA","KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ","NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT","VA","WA","WV","WI","WY"]
states4 = ["AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN","IA","KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ","NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT","VA","WA","WV","WI","WY"]
states5 = ["AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN","IA","KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ","NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT","VA","WA","WV","WI","WY"]

wordFile = open('words.txt')
temp = wordFile.read().splitlines()
outputFile = open('outfile10LetterWords.txt', 'w')

for state1 in states1:
    for state2 in states2:
        for state3 in states3:
            for state4 in states4:
                for state5 in states5:
                    if (state5 != state4 and state5 != state3 and state5 != state2 and state5 != state1
                            and state4 != state3 and state4 != state2 and state4 != state1
                            and state3 != state2 and state3 != state1
                            and state2 != state1):
                        word = state1 + state2 + state3 + state4 + state5
                        if word.lower() in temp:
                            outputFile.write(word + '\n')

outputFile.close()
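
For reference on the counts: the outer if runs 50^5 = 312,500,000 times, and the inner one runs 50*49*48*47*46 = 254,251,200 times, once for every ordering of five distinct abbreviations. Since temp is a list, each "in temp" test is a linear scan of the whole word list, which is probably where most of the time goes. One possible way around this, shown only as a rough sketch and not a tested solution for this exact words.txt, is to turn the problem around: read each dictionary word once, cut it into 2-letter chunks, and check that every chunk is a distinct state abbreviation. The names STATES, is_state_word and find_state_words are made up for this illustration.

```python
# Sketch: scan the word list once instead of generating every state combination.
STATES = frozenset([
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
    "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
    "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
    "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
    "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY",
])


def is_state_word(word, states=STATES):
    """Return True if word splits into distinct 2-letter state abbreviations."""
    word = word.upper()
    if not word or len(word) % 2 != 0:
        return False
    chunks = [word[i:i + 2] for i in range(0, len(word), 2)]
    # Every chunk must be an abbreviation, and none may be repeated.
    return all(c in states for c in chunks) and len(set(chunks)) == len(chunks)


def find_state_words(words, lengths=(4, 6, 8, 10)):
    """Group matching words by length; words can be any iterable of strings."""
    found = {n: [] for n in lengths}
    for raw in words:
        w = raw.strip()
        if len(w) in found and is_state_word(w):
            found[len(w)].append(w.upper())
    return found


# Small in-memory demo; for the real file one could pass open('words.txt').
demo = find_state_words(["mane", "coin", "mama", "hello", "cat"])
print(demo[4])
```

With this layout the work grows with the number of words in the file rather than with 50^5, and the same pass covers the 4-, 6-, 8- and 10-letter cases. If the nested loops are kept instead, changing temp to set(temp) should already make each lookup much cheaper.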