Thread: Remove a sentence if it contains a word. - Python Forum (https://python-forum.io), General Coding Help
Remove a sentence if it contains a word. - lokhtar - Feb-10-2020

I have a paragraph, contained in a string variable, that looks like this:

Quote:This is an example of a paragraph that I have. I would like to remove any sentences containing certain words, for example the word bad, or naughty. If it has bad, I don't want it. If it doesn't, I want to keep it.

I want the string to become:

Quote:This is an example of a paragraph that I have. If it doesn't, I want to keep it.

I could use any help! Thank you in advance!

RE: Remove a sentence if it contains a word. - Larz60+ - Feb-10-2020

What have you tried so far?

RE: Remove a sentence if it contains a word. - lokhtar - Feb-10-2020

    str_to_clean = re.sub("^.*\b(bad|naughty)\b.*$", "", str_to_clean, flags=re.IGNORECASE)

RE: Remove a sentence if it contains a word. - Larz60+ - Feb-10-2020

You can also use existing packages. Here's one that claims to be much faster than regex: https://pypi.org/project/better-profanity/

RE: Remove a sentence if it contains a word. - lokhtar - Feb-11-2020

Thanks! I looked into that, but it simply replaces the words - I need to remove the whole sentence. I have it working where I go through the paragraph sentence by sentence (each '.' starts a new loop iteration) and then combine the strings that don't contain those words. It works, but it seems like an extremely inelegant solution.

RE: Remove a sentence if it contains a word. - lokhtar - Feb-11-2020

    import re

    str_to_clean = "This is an example of a paragraph that I have. I would like to remove any sentences containing certain words, for example the word bad, or naughty. If it has bad, I don't want it. If it is naughty, I do not want it. If it doesn't, I want to keep it."
    cleaned_str = ""
    for sentence in str_to_clean.split("."):
        if not (re.search("bad|naughty", sentence, flags=re.IGNORECASE)):
            cleaned_str = cleaned_str + sentence
    print(cleaned_str)

The above works, but it seems...not the best.

RE: Remove a sentence if it contains a word. - stullis - Feb-11-2020

Since you split the string on ".", you need to reinsert the periods. Changing cleaned_str to a list and using str.join() will get that done.

    import re

    str_to_clean = "This is an example of a paragraph that I have. I would like to remove any sentences containing certain words, for example the word bad, or naughty. If it has bad, I don't want it. If it is naughty, I do not want it. If it doesn't, I want to keep it."
    cleaned_str = []
    # Keep only the sentences that contain neither word.
    for sentence in str_to_clean.split("."):
        if not (re.search("bad|naughty", sentence, flags=re.IGNORECASE)):
            cleaned_str.append(sentence)
    # Rejoin with "." to restore the periods removed by split().
    print(".".join(cleaned_str))
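
A possible refinement of the loop above, offered as a sketch rather than a definitive answer. Two things stand out in the thread. First, in the original re.sub attempt the pattern is not a raw string, so "\b" is a backspace character rather than a regex word boundary; and even with a raw string, ^.*$ would span the whole one-line paragraph instead of a single sentence. Second, the plain pattern "bad|naughty" also matches inside longer words such as "badge". The version below assumes that a sentence ends in ".", "!" or "?" followed by whitespace, which is a simplification (abbreviations like "e.g. this" would still be split). The function name remove_sentences and its parameters are made up for illustration.

```python
import re


def remove_sentences(text, words):
    """Return text without the sentences that contain any of the given words."""
    if not words:
        return text
    # Raw strings so that \b is a regex word boundary, not a backspace.
    # re.escape guards against words that contain regex metacharacters.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    # Assumption: a sentence ends with ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Keep the sentences with no match and rejoin them with single spaces.
    return " ".join(s for s in sentences if not pattern.search(s))


str_to_clean = (
    "This is an example of a paragraph that I have. I would like to remove "
    "any sentences containing certain words, for example the word bad, or "
    "naughty. If it has bad, I don't want it. If it is naughty, I do not "
    "want it. If it doesn't, I want to keep it."
)
print(remove_sentences(str_to_clean, ["bad", "naughty"]))
# This is an example of a paragraph that I have. If it doesn't, I want to keep it.
```

Because the split uses a lookbehind, each sentence keeps its own punctuation, so nothing has to be reinserted when joining.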