Problem: Check if a list contains a word and then continue with the next word

Mangono · (This post was last modified: Aug-11-2021, 04:20 PM by Mangono.)

I'm new to Python and while I have some experience with Java, I haven't coded in a long time. I'm currently trying to adapt a word search solver from wordsearch_solver by aphrx to find offensive words in a puzzle and append them to a list. I need a little help with the logic.

The solver should be part of a word search puzzle generator I already have. The solver works by checking "if word in row" (horizontally, vertically or diagonally) and appending the word to a list (fWords) if it's true. As long as the list (fWords) contains words, the puzzle should be regenerated.
However, I've run into the problem that some offensive words are part of words that were purposely placed in the puzzle, e.g. hell in hello, tit in title or ass in assemble. Since these words were placed on purpose, they will always be in the puzzle, so the code gets stuck in the loop, regenerating the puzzle until the list of offensive words is empty, which it never will be.
I've provided a sample grid (grid), a list of words hidden in the puzzle (wordlist) and a list of offensive words (badwords) that should be avoided. In my code, the loop it gets stuck in is under "make_puzzle". For better demonstration I also have a screenshot of the grid/puzzle:

https://imgur.com/a/NJ6cxjb

grid = [[
        'IZCTFXORREKETTRHSKZD'], ['HZADCBYLXYMWOENGYHXI'],
        ['MUHLSFGHHTIULSPUYKPN'], ['IYKXCJFKTWOEENOCQGGD'],
        ['NNWBGTOKLSHZGZSZFFVW'], ['MPIHEAHJMLJHUPLADQWB'],
        ['OWIRTPLMZJYBGZGXQHDD'], ['APOLUARPVYBRLWTFUDMJ'],
        ['OLPZROPYQUQHKBNKEUKK'], ['WRMZQPFGVDNDNRBDHOHW'],
        ['WAUSQEZEWTZVZQIEIQVQ'], ['QLCTRHLCBLAYCJLZZLMN'],
        ['OCCBUDTLJFDXJLMKVNEF'], ['DBQHGLEMPTHXOAQVMZDM'],
        ['RYGFSRIRTGIICICTGIPZ'], ['DAECZQHDDCETDRSTOEZH'],
        ['NVUNGMSHTWTFLQXBTCQK'], ['UIDEOZMOEYRBSEPFBYOC'],
        ['ZNDCJEHVUAKXTSIMLVOG'], ['ZDIPBYTCWWQTHUCMSEPN']]
badwords = ['TIT', 'HELL', 'WTF']
wordlist = ['HELLO', 'TITLE', 'FOUR', 'ONE', 'TWO']
fWords = []
# nrows is hard-coded in this example because the grid doesn't change
nrows = 20


def check_word(row, word, wlist, dir):
    if word in str(row):
        print(word + ' is in ' + str(row) + ' Direction: ' + dir)
        fWords.append(word)
        return True


def find_word(p2, bwords, rows, wlist):
    for word in bwords:
        find_horizontal(p2, word, wlist)
        find_vertical(p2, word, wlist)
        find_diagonal(p2, word, rows, wlist)


def find_horizontal(p2, word, wlist):
    dir = 'horizontal'
    dir_r = 'horizontal_reverse'
    for row in enumerate(p2):
        check_word(str(row), word, wlist, dir)
        row_r = str(row)[::-1]
        check_word(str(row_r), word, wlist, dir_r)
    return False


def find_vertical(p2, word, wlist):
    dir = 'vertical'
    dir_r = 'vertical_reverse'
    for char in range(len(p2[0][0])):
        temp = []
        for col in range(len(p2)):
            temp.append(p2[col][0][char])
        temp = ''.join(temp)
        temp_r = temp[::-1]
        check_word(str(temp), word, wlist, dir)
        check_word(str(temp_r), word, wlist, dir_r)
    return False


def find_diagonal(p2, word, rows, wlist):
    dir = 'diagonal'
    for a in range(0, len(p2[0][0])):
        temp = [[] for i in range(8)]
        ranges = [[] for i in range(8)]
        i = 0
        while ((a - i) >= 0) and (i < len(p2)):
            coords = [[i, a - i], [(rows - 1) - i, a - i], [(rows - 1) - i, (rows - 1) - (a - i)], [i, (rows - 1) - (a - i)]]
            for cx, c in enumerate(coords):
                temp[cx].append(p2[c[0]][0][c[1]])
                ranges[cx].append((c[0], c[1]))
                ranges[cx + 4].append((c[1], c[0]))
            i += 1

        for ti in range(4):
            temp[ti] = ''.join(temp[ti])
            temp[ti + 4] = temp[ti][::-1]

        for t in enumerate(temp):
            check_word(str(t), word, wlist, dir)
    return False


def checker(grid, nrows, wlist):
    find_word(grid, badwords, nrows, wlist)
    return fWords


# START OF THE PROGRAM
# def _make_puzzle(...)
checker(grid, nrows, wordlist)

# def make_puzzle(...) - This is where the puzzle loops
while fWords:
    print(fWords)
    fWords.clear()
    # The puzzle grid would get regenerated here: grid = _make_puzzle(*args, **kwargs)
if not fWords:
    if grid:
        pass
        # The final grid without offensive words would be returned here: return grid

Output:

TIT is in (1, 'MIELTITJCZPRLIPN') Direction: diagonal
TIT is in (5, 'NPILRPZCJTITLEIM') Direction: diagonal
HELL is in (6, 'DMUHELLOIEWEVT') Direction: diagonal
WTF is in (7, ['APOLUARPVYBRLWTFUDMJ']) Direction: horizontal
WTF is in (16, ['NVUNGMSHTWTFLQXBTCQK']) Direction: horizontal
['TIT', 'TIT', 'HELL', 'WTF', 'WTF']

The part that needs fixing is def check_word. The logic of how to implement the changes I want is a little beyond me. I got a little lost with all the nested loops and if statements. I think I have to:
1. Check if a word from badwords is in the current row being checked.
2. Check if a word_wordlist from wordlist is in the current row being checked.
I've tried the following code in different variations:

def check_word(row, word, wlist, dir):
    if word in str(row):
        for word_wordlist in wlist:
            if (word not in word_wordlist) or (word not in word_wordlist[::-1]):
                print(word + ' is in ' + str(row) + ' Direction: ' + dir)
                fWords.append(word)
                return True
            elif (word in word_wordlist) or (word in word_wordlist[::-1]):
                continue

Also:

def check_word(row, word, wlist, dir):
    for word_wordlist in wlist:
        if word_wordlist in str(row):
            continue
        if (word_wordlist not in str(row)) and (word in str(row)):
            print(word + ' is in ' + str(row) + ' Direction: ' + dir)
            fWords.append(word)
            return True

I've tried rearranging the loops and the if statements, but the nesting is a little confusing to me. Essentially, if a word is found as part of a word_wordlist, it should skip that word and continue with the next word, but I'm not sure how to implement that properly.
Another approach I've thought of is appending the row to a temporary list if the row contains a word_wordlist from wordlist, and then skipping rows that are identical to the items in the temporary list. The problem there is that the row could still contain other offensive words, and by skipping that row they would not be flagged/added to the list fWords, e.g. a b c T I T e f g H E L L O. Since this row contains hello, it would get skipped and tit would not get added to fWords.
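
A variation on the skip-the-row idea that keeps the rest of the row searchable might look like the sketch below: only the intended words are blanked out, and the offensive words are then looked for in what remains. This is a minimal illustration, assuming each row has already been joined into a plain string. The names mask_intended_words and bad_words_in and the '#' filler are made up for this example and are not part of wordsearch_solver. Intended words are masked in both reading directions because the solver also checks reversed rows.

```python
def mask_intended_words(line, intended_words, filler="#"):
    """Blank out every intended word (forwards or backwards) found in line."""
    for word in intended_words:
        for form in (word, word[::-1]):
            line = line.replace(form, filler * len(form))
    return line


def bad_words_in(line, bad_words, intended_words):
    """Return the bad words that are still present after masking."""
    masked = mask_intended_words(line, intended_words)
    return [bad for bad in bad_words if bad in masked]


badwords = ['TIT', 'HELL', 'WTF']
wordlist = ['HELLO', 'TITLE', 'FOUR', 'ONE', 'TWO']

# HELLO is masked, so HELL is ignored, but the stand-alone TIT is still caught.
print(bad_words_in('ABCTITEFGHELLO', badwords, wordlist))    # ['TIT']
# Here TIT only occurs inside the intended word TITLE.
print(bad_words_in('NPILRPZCJTITLEIM', badwords, wordlist))  # []
```

Because the words are masked one after another, two intended words that overlap in the same row can hide each other, so this sketch is probably only good enough when intended words never share letters within a line.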

I've been working on this for around a week trying different variations and approaches but I think I might be approaching this wrong. Any guidance is appreciated.

jefsummers · Aug-11-2021, 06:17 PM

Thinking about this, I don't have an answer but would point out that you need to be careful of words that overlap as well. So, if the good words list has she and hello, bad words list has hell, you have to pull both the she and hello out of ['sssshelloxxx'] and not flag the hell out of it, but you would want it to flag in ['xxxhellxxshelloyy']. This is not trivial.
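
One possible way to handle the overlap case is to compare positions instead of whole rows: record where every good word and every bad word occurs, and flag a bad word only if at least one of its occurrences is not completely inside a single good-word occurrence. The sketch below is only an illustration of that idea under the assumption that each line is a plain string; the function names occurrences and flagged_bad_words are hypothetical.

```python
def occurrences(line, word):
    """Return (start, end) for every occurrence of word in line, overlaps included."""
    spans = []
    start = line.find(word)
    while start != -1:
        spans.append((start, start + len(word)))
        start = line.find(word, start + 1)
    return spans


def flagged_bad_words(line, bad_words, good_words):
    """Return bad words with at least one occurrence outside every good word."""
    good_spans = [span for good in good_words for span in occurrences(line, good)]
    flagged = []
    for bad in bad_words:
        for b_start, b_end in occurrences(line, bad):
            covered = any(g_start <= b_start and b_end <= g_end
                          for g_start, g_end in good_spans)
            if not covered:
                flagged.append(bad)
                break  # one uncovered occurrence is enough to flag this word
    return flagged


good = ['she', 'hello']
bad = ['hell']
print(flagged_bad_words('sssshelloxxx', bad, good))       # []
print(flagged_bad_words('xxxhellxxshelloyy', bad, good))  # ['hell']
```

For a solver that also reads lines backwards, the reversed form of each good word would presumably have to be added to good_words as well.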

palladium · Aug-12-2021, 04:25 PM

Sounds like you are trying to filter out bad words and bad words only. If that is the case, regex is probably the way to go. I made the following snippet (warning: may contain offensive material, but kinda need to do it to illustrate my point):

import re

string = ['assassin', 'title','titassembled', 'ass', 'tit', 'asstit', 'hello']

def find_bad_word(string):
    bad_word_list = ['ass', 'tit']
    bad_iter = iter(bad_word_list)
    # Strip the first bad word out of the string
    badword = next(bad_iter)
    cleaned = re.sub(badword, '', string)

    # Nothing left over: the string consisted only of bad words
    if len(cleaned) == 0 or cleaned == badword:
        return 'found bad word'

    # Keep stripping the remaining bad words, one at a time
    while len(cleaned) != 0 or badword in cleaned:
        try:
            badword = next(bad_iter)
            cleaned = re.sub(badword, '', cleaned)
            if len(cleaned) == 0 or cleaned == badword:
                return 'found bad word'

        except StopIteration:
            # Ran out of bad words with text left over: the string is okay
            return string


for i in string:
    print(find_bad_word(i))

The above will print:

assassin
title
titassembled
found bad word
found bad word
found bad word
hello

I am sure there is a more elegant way to do it but it's past midnight now so yeah.....

The above example assumes phrases like 'hellwtf' constitute bad words and need to be filtered out, whereas 'hellwtfs' is okay.
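
A more compact way to express roughly the same assumption might be a single pattern that matches only when the whole string is made of bad words placed back to back. This is a sketch rather than a drop-in replacement: only_bad_words is a hypothetical name, and it can disagree with the snippet above on edge cases, because the snippet removes the bad words one after another and a removal can join leftover letters into a new bad word.

```python
import re


def only_bad_words(text, bad_words):
    """True if text consists entirely of one or more bad words back to back."""
    pattern = "(?:" + "|".join(re.escape(word) for word in bad_words) + ")+"
    return re.fullmatch(pattern, text) is not None


samples = ['assassin', 'title', 'titassembled', 'ass', 'tit', 'asstit', 'hello']
for sample in samples:
    print('found bad word' if only_bad_words(sample, ['ass', 'tit']) else sample)
```

For the seven sample strings this prints the same lines as the snippet above. With ['hell', 'wtf'] as the bad words, 'hellwtf' is reported and 'hellwtfs' is not.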
