Python Forum
Problem: Check if a list contains a word and then continue with the next word
#1
I'm new to Python and while I have some experience with Java, I haven't coded in a long time. I'm currently trying to adapt a word search solver from wordsearch_solver by aphrx to find offensive words in a puzzle and append them to a list. I need a little help with the logic.

The solver should be part of a word search puzzle generator I already have. The solver works by checking "if word in row" (horizontally, vertically or diagonally) and appending the word to a list (fWords) if it's true. As long as fWords contains words, the puzzle should be regenerated.
However, I've run into the problem that some offensive words are part of words that were purposely placed in the puzzle, e.g. hell in hello, tit in title or ass in assemble. Since these words were placed on purpose, they will always be in the puzzle, so the code gets stuck in the loop, regenerating the puzzle until the list of offensive words is empty, which it never will be.
I've provided a sample grid (grid), a list of words hidden in the puzzle (wordlist) and a list of offensive words (badwords) that should be avoided. In my code, the loop that gets stuck is under "make_puzzle". For better demonstration I also have a screenshot of the grid/puzzle:
    https://imgur.com/a/NJ6cxjb

grid = [[
        'IZCTFXORREKETTRHSKZD'], ['HZADCBYLXYMWOENGYHXI'],
        ['MUHLSFGHHTIULSPUYKPN'], ['IYKXCJFKTWOEENOCQGGD'],
        ['NNWBGTOKLSHZGZSZFFVW'], ['MPIHEAHJMLJHUPLADQWB'],
        ['OWIRTPLMZJYBGZGXQHDD'], ['APOLUARPVYBRLWTFUDMJ'],
        ['OLPZROPYQUQHKBNKEUKK'], ['WRMZQPFGVDNDNRBDHOHW'],
        ['WAUSQEZEWTZVZQIEIQVQ'], ['QLCTRHLCBLAYCJLZZLMN'],
        ['OCCBUDTLJFDXJLMKVNEF'], ['DBQHGLEMPTHXOAQVMZDM'],
        ['RYGFSRIRTGIICICTGIPZ'], ['DAECZQHDDCETDRSTOEZH'],
        ['NVUNGMSHTWTFLQXBTCQK'], ['UIDEOZMOEYRBSEPFBYOC'],
        ['ZNDCJEHVUAKXTSIMLVOG'], ['ZDIPBYTCWWQTHUCMSEPN']]
badwords = ['TIT', 'HELL', 'WTF']
wordlist = ['HELLO', 'TITLE', 'FOUR', 'ONE', 'TWO']
fWords = []
# nrows is hard-coded in this example because the grid doesn't change
nrows = 20


def check_word(row, word, wlist, dir):
    if word in str(row):
        print(word + ' is in ' + str(row) + ' Direction: ' + dir)
        fWords.append(word)
        return True


def find_word(p2, bwords, rows, wlist):
    for word in bwords:
        find_horizontal(p2, word, wlist)
        find_vertical(p2, word, wlist)
        find_diagonal(p2, word, rows, wlist)


def find_horizontal(p2, word, wlist):
    dir = 'horizontal'
    dir_r = 'horizontal_reverse'
    for row in enumerate(p2):
        check_word(str(row), word, wlist, dir)
        row_r = str(row)[::-1]
        check_word(str(row_r), word, wlist, dir_r)
    return False


def find_vertical(p2, word, wlist):
    dir = 'vertical'
    dir_r = 'vertical_reverse'
    for char in range(len(p2[0][0])):
        temp = []
        for col in range(len(p2)):
            temp.append(p2[col][0][char])
        temp = ''.join(temp)
        temp_r = temp[::-1]
        check_word(str(temp), word, wlist, dir)
        check_word(str(temp_r), word, wlist, dir_r)
    return False


def find_diagonal(p2, word, rows, wlist):
    dir = 'diagonal'
    for a in range(0, len(p2[0][0])):
        temp = [[] for i in range(8)]
        ranges = [[] for i in range(8)]
        i = 0
        while ((a - i) >= 0) and (i < len(p2)):
            coords = [[i, a - i], [(rows - 1) - i, a - i], [(rows - 1) - i, (rows - 1) - (a - i)], [i, (rows - 1) - (a - i)]]
            for cx, c in enumerate(coords):
                temp[cx].append(p2[c[0]][0][c[1]])
                ranges[cx].append((c[0], c[1]))
                ranges[cx + 4].append((c[1], c[0]))
            i += 1

        for ti in range(4):
            temp[ti] = ''.join(temp[ti])
            temp[ti + 4] = temp[ti][::-1]

        for t in enumerate(temp):
            check_word(str(t), word, wlist, dir)
    return False


def checker(grid, nrows, wlist):
    find_word(grid, badwords, nrows, wlist)
    return fWords


# START OF THE PROGRAM
# def _make_puzzle(...)
checker(grid, nrows, wordlist)

# def make_puzzle(...) - This is where the puzzle loops
while fWords:
    print(fWords)
    fWords.clear()
    # The puzzle grid would get regenerated here: grid = _make_puzzle(*args, **kwargs)
if not fWords:
    if grid:
        pass
        # The final grid without offensive words would be returned here: return grid
Output:
TIT is in (1, 'MIELTITJCZPRLIPN') Direction: diagonal
TIT is in (5, 'NPILRPZCJTITLEIM') Direction: diagonal
HELL is in (6, 'DMUHELLOIEWEVT') Direction: diagonal
WTF is in (7, ['APOLUARPVYBRLWTFUDMJ']) Direction: horizontal
WTF is in (16, ['NVUNGMSHTWTFLQXBTCQK']) Direction: horizontal
['TIT', 'TIT', 'HELL', 'WTF', 'WTF']
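A side note on the output: the tuples such as (7, ['APOLUARPVYBRLWTFUDMJ']) show up because find_horizontal and find_diagonal loop over enumerate(...) and turn each (index, row) pair into a string before searching it. Below is a rough sketch, not the original solver's method, of collecting every row, column and diagonal as a plain string first. The helper name iter_lines and the 3x3 grid are made up for illustration; the sketch assumes the same layout as the post, where each row is a one-element list holding a string.

```python
from collections import defaultdict


def iter_lines(grid):
    """Yield (direction, text) for every row, column and diagonal of the grid.

    Assumes the layout from the post: each row is a one-element list
    holding a string, and all rows have the same length.
    """
    rows = [row[0] for row in grid]
    for text in rows:
        yield 'horizontal', text
    for c in range(len(rows[0])):
        yield 'vertical', ''.join(text[c] for text in rows)
    # Cells on the same diagonal share r - c (top-left to bottom-right)
    # or r + c (top-right to bottom-left).
    down, up = defaultdict(list), defaultdict(list)
    for r, text in enumerate(rows):
        for c, ch in enumerate(text):
            down[r - c].append(ch)
            up[r + c].append(ch)
    for cells in list(down.values()) + list(up.values()):
        yield 'diagonal', ''.join(cells)


small_grid = [['ABC'], ['DEF'], ['GHI']]  # made-up grid, not the one from the post
for direction, text in iter_lines(small_grid):
    # Searching both text and text[::-1] covers the reverse directions.
    print(direction, text)
```

With plain strings, a check like "word in text" no longer sees the index, brackets or quotes that str() adds.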
The part that needs fixing is def check_word. The logic of how to implement the changes I want is a little beyond me. I got a little lost with all the nested loops and if statements. I think I have to
1. Check if a word from badwords is in the current row being checked.
2. Check if a word_wordlist from wordlist is in the current row being checked.
I've tried the following code in different variations:

def check_word(row, word, wlist, dir):
    if word in str(row):
        for word_wordlist in wlist:
            if (word not in word_wordlist) or (word not in word_wordlist[::-1]):
                print(word + ' is in ' + str(row) + ' Direction: ' + dir)
                fWords.append(word)
                return True
            elif (word in word_wordlist) or (word in word_wordlist[::-1]):
                continue
Also:
def check_word(row, word, wlist, dir):
    for word_wordlist in wlist:
        if word_wordlist in str(row):
            continue
        if (word_wordlist not in str(row)) and (word in str(row)):
            print(word + ' is in ' + str(row) + ' Direction: ' + dir)
            fWords.append(word)
            return True
I've tried rearranging the loops and the if statements but the nesting is a little confusing to me. Essentially, if a word is found as part of a word_wordlist, it should skip that word and continue with the next word but I'm not sure how to implement that properly.
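One way to read the first attempt: the condition (word not in word_wordlist) or (word not in word_wordlist[::-1]) is true for nearly every pair, because a word almost never occurs in both a string and its reverse, so the function flags the bad word on the first wordlist entry it looks at. A minimal sketch of the "is this bad word part of any intended word" test using any(); the helper name is_part_of_intended_word is hypothetical.

```python
def is_part_of_intended_word(word, wlist):
    """Return True if word occurs inside any intended word, forwards or reversed."""
    return any(word in w or word in w[::-1] for w in wlist)


badwords = ['TIT', 'HELL', 'WTF']
wordlist = ['HELLO', 'TITLE', 'FOUR', 'ONE', 'TWO']

# The condition from the first attempt is True even for HELL vs HELLO,
# because 'HELL' is not in the reversed string 'OLLEH'.
print(('HELL' not in 'HELLO') or ('HELL' not in 'HELLO'[::-1]))

for bad in badwords:
    print(bad, is_part_of_intended_word(bad, wordlist))
```

On its own this test is too coarse: TIT and HELL would then never be flagged anywhere in the grid, even where they appear by accident, so the position of the match within the row still has to be taken into account.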
Another approach I've thought of is appending the row to a temporary list if it contains a word_wordlist from wordlist, and then skipping rows that are identical to the items in the temporary list. The problem with that, though, is that the row could still contain other offensive words, and by skipping the row they would not be flagged/added to fWords. For example, in a b c T I T e f g H E L L O the row contains hello, so it would get skipped and tit would not get added to fWords.

I've been working on this for around a week trying different variations and approaches but I think I might be approaching this wrong. Any guidance is appreciated.
#2
Thinking about this, I don't have an answer but would point out that you need to be careful of words that overlap as well. So, if the good words list has she and hello, bad words list has hell, you have to pull both the she and hello out of ['sssshelloxxx'] and not flag the hell out of it, but you would want it to flag in ['xxxhellxxshelloyy']. This is not trivial.
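One possible way to handle the cases above is to compare positions instead of whole rows: record where each good word sits in the line, then flag a bad word only if its occurrence is not fully inside one of those spans. This is a sketch under assumptions, not a tested solution for the generator: the names find_all, both_directions and uncovered_bad_words are hypothetical, each line is checked on its own, and any occurrence of a good word counts as intended.

```python
def find_all(text, sub):
    """Yield the start index of every (possibly overlapping) occurrence of sub."""
    start = text.find(sub)
    while start != -1:
        yield start
        start = text.find(sub, start + 1)


def both_directions(word):
    """Return the word and its reverse, without duplicates for palindromes."""
    return list(dict.fromkeys([word, word[::-1]]))


def uncovered_bad_words(line, badwords, goodwords):
    """Return the bad words that occur in line outside every good-word occurrence."""
    # (start, end) spans of every good word found in the line, in both directions.
    good_spans = [
        (s, s + len(form))
        for good in goodwords
        for form in both_directions(good)
        for s in find_all(line, form)
    ]
    flagged = []
    for bad in badwords:
        for form in both_directions(bad):
            for s in find_all(line, form):
                e = s + len(form)
                # Covered means the whole bad word lies inside one good word.
                if not any(gs <= s and e <= ge for gs, ge in good_spans):
                    flagged.append(bad)
    return flagged


print(uncovered_bad_words('sssshelloxxx', ['hell'], ['she', 'hello']))
print(uncovered_bad_words('xxxhellxxshelloyy', ['hell'], ['she', 'hello']))
```

The first call returns an empty list and the second returns ['hell'], matching the two cases described above. A bad word that straddles two adjacent good words without being inside either would still be flagged, which may or may not be what is wanted.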
#3
Sounds like you are trying to filter out bad words and bad words only. If that is the case, regex is probably the way to go. I made the following snippet (warning: may contain offensive material, but I kinda need it to illustrate my point):

import re

string = ['assassin', 'title','titassembled', 'ass', 'tit', 'asstit', 'hello']

def find_bad_word(string):
    bad_word_list = ['ass', 'tit']
    bad_iter = iter(bad_word_list)
    badword = next(bad_iter)
    # Strip every occurrence of the first bad word
    cleaned = re.sub(badword, '', string)

    # Nothing left over means the string was made up of bad words only
    if len(cleaned) == 0 or cleaned == badword:
        return 'found bad word'

    while len(cleaned) != 0 or badword in cleaned:
        try:
            badword = next(bad_iter)
            cleaned = re.sub(badword, '', cleaned)
            if len(cleaned) == 0 or cleaned == badword:
                return 'found bad word'

        except StopIteration:
            # Ran out of bad words with text still left: return the input unchanged
            return string


for i in string:
    print(find_bad_word(i))
The above will print:
Output:
assassin
title
titassembled
found bad word
found bad word
found bad word
hello
I am sure there is a more elegant way to do it but it's past midnight now so yeah.....

The above example assumes that phrases like 'hellwtf', which are made up entirely of bad words, need to be filtered out, whereas 'hellwtfs' is okay.
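The same assumption ("the string consists of nothing but bad words") can probably be written more directly with a single pattern and fullmatch. This is only a sketch and not a drop-in replacement for the snippet above: the names BAD_WORDS and ONLY_BAD are made up here, and edge cases such as the empty string behave differently.

```python
import re

BAD_WORDS = ['ass', 'tit']  # same sample list as in the snippet above

# Matches only if the whole string is one or more bad words back to back.
ONLY_BAD = re.compile('(?:' + '|'.join(map(re.escape, BAD_WORDS)) + ')+')


def find_bad_word(text):
    """Return 'found bad word' when text consists solely of bad words, else text."""
    return 'found bad word' if ONLY_BAD.fullmatch(text) else text


for item in ['assassin', 'title', 'titassembled', 'ass', 'tit', 'asstit', 'hello']:
    print(find_bad_word(item))
```

For the seven sample strings this prints the same results as the output shown above. re.escape is there so that a bad word containing regex metacharacters is still matched literally.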