Python Forum
Remove a sentence if it contains a word.
#1
I have a paragraph, contained in a string variable, that looks like this:

Quote: This is an example of a paragraph that I have. I would like to remove any sentences containing certain words, for example the word bad, or naughty. If it has bad, I don't want it. If it doesn't, I want to keep it.

I want the string to become:

Quote: This is an example of a paragraph that I have. If it doesn't, I want to keep it.

I could use any help!

Thank you in advance!
#2
What have you tried so far?
#3
str_to_clean = re.sub(r"^.*\b(bad|naughty)\b.*$", "", str_to_clean, flags=re.IGNORECASE)
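
A note on this attempt: the pattern needs to be a raw string, because in a normal string "\b" is a backspace character rather than a regex word boundary. Even as a raw string, "^.*" and ".*$" are greedy and "." matches everything except a newline, so on a one-line paragraph the match covers the whole text, not one sentence. A minimal sketch of both effects, using a made-up sample text (the variable names are only for illustration):

```python
import re

# Made-up sample paragraph on a single line (an assumption for this demo).
text = "This is fine. This one is bad. Keep this too."

# Non-raw string: "\b" is a backspace character, so the pattern never matches
# and the text comes back unchanged.
no_raw = re.sub("^.*\b(bad|naughty)\b.*$", "", text, flags=re.IGNORECASE)

# Raw string: r"\b" is a word boundary, but "^.*" and ".*$" swallow the whole
# one-line paragraph, so everything is removed, not just one sentence.
raw = re.sub(r"^.*\b(bad|naughty)\b.*$", "", text, flags=re.IGNORECASE)

print(repr(no_raw))
print(repr(raw))
```

So this pattern only helps if each sentence sits on its own line and re.MULTILINE is set; otherwise the text has to be split into sentences first.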
#4
You can also use an existing package.
Here's one that claims to be much faster than regex: https://pypi.org/project/better-profanity/
#5
Thanks! I looked into that, but it simply replaces the words, and I need to remove the whole sentence. I have it working by going through the paragraph sentence by sentence (splitting on each '.') and then combining the sentences that don't contain those words. It works, but it seems like an extremely inelegant solution.
#6
import re

str_to_clean = "This is an example of a paragraph that I have. I would like to remove any sentences containing certain words, for example the word bad, or naughty. If it has bad, I don't want it. If it is naughty, I do not want it. If it doesn't, I want to keep it."

cleaned_str = ""

for sentence in str_to_clean.split("."):
    if not (re.search("bad|naughty", sentence, flags=re.IGNORECASE)):
        cleaned_str = cleaned_str + sentence

print(cleaned_str)
The above works, but it seems...not the best.
#7
Since you split the string on ".", you need to reinsert the periods. Changing cleaned_str to a list and using str.join() will get that done.

import re

str_to_clean = "This is an example of a paragraph that I have. I would like to remove any sentences containing certain words, for example the word bad, or naughty. If it has bad, I don't want it. If it is naughty, I do not want it. If it doesn't, I want to keep it."
 
cleaned_str = []
 
for sentence in str_to_clean.split("."):
    if not (re.search("bad|naughty", sentence, flags=re.IGNORECASE)):
        cleaned_str.append(sentence)

print(".".join(cleaned_str))
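
One caveat with the loop above: "bad|naughty" also matches inside longer words such as "badge", and splitting on "." ignores sentences that end in "!" or "?". A possible variant, as a rough sketch only: remove_sentences is a hypothetical helper name (not from this thread), and it assumes sentences end in ".", "!" or "?" followed by whitespace.

```python
import re

def remove_sentences(text, words):
    """Drop every sentence that contains one of `words` as a whole word."""
    # \b...\b restricts matches to whole words; re.escape guards against
    # regex metacharacters in the word list.
    banned = re.compile(
        r"\b(?:" + "|".join(map(re.escape, words)) + r")\b",
        re.IGNORECASE,
    )
    # Split on the whitespace that follows ".", "!" or "?". The lookbehind
    # keeps the punctuation attached to its sentence.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return " ".join(s for s in sentences if not banned.search(s))

paragraph = (
    "This is an example of a paragraph that I have. "
    "I would like to remove any sentences containing certain words, "
    "for example the word bad, or naughty. "
    "If it has bad, I don't want it. "
    "If it doesn't, I want to keep it."
)

print(remove_sentences(paragraph, ["bad", "naughty"]))
```

Because the punctuation stays with each sentence, nothing has to be reinserted when joining. The split is naive, though: abbreviations such as "e.g." would be treated as sentence ends, so real-world text may need a proper sentence tokenizer.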