Python Forum
Case Insensitive Censor Function
#1
Hello,
I am trying to write a case-insensitive censor function that takes the input (text, banned), where banned is a list of banned words to censor from the text. I have the basic censor function written, but I'm having trouble making it case insensitive. My code is below.

def censor2(text, banned):
    for i in banned:
        if i in text:
            text = text.replace(i,'*'*len(i))
    return text
This function works for the most part, except when a banned word appears in the text with different casing. Note that if the input is ('Hello World', ['hello']), I cannot simply insert the line text = text.lower() before the for loop, as this would return ***** world instead of ***** World. I'd really appreciate any advice on making this function case insensitive without lowercasing all of the output text!
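To make the problem concrete, here is a minimal sketch of the two behaviours described above. The name censor_lowered is hypothetical and exists only for illustration; censor2 is the function from this post with the loop variable renamed.

```python
def censor2(text, banned):
    # Original, case-sensitive version from the post.
    for word in banned:
        if word in text:
            text = text.replace(word, '*' * len(word))
    return text


def censor_lowered(text, banned):
    # Hypothetical variant: lowercasing first makes the match succeed,
    # but the case of every other word in the text is lost as well.
    return censor2(text.lower(), banned)


print(censor2('Hello World', ['hello']))         # Hello World (nothing censored)
print(censor_lowered('Hello World', ['hello']))  # ***** world ("World" lost its capital)
```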
#2
You use i as the index in both the first and the second loop, so the second loop overrides the i of the first.

It runs if you're lucky, but it is not going to deliver what you expect in the first loop.

You need to rename either the first or the second 'i'.
#3
(Jan-14-2021, 08:38 PM)Larz60+ Wrote: you use i for index in both first and second loop, so the second loop overrides i of the first


No, he doesn't. The first i is indeed in a loop:
    for i in banned:
but the second is just a test of whether that first i is present in the text:
        if i in text:
The question is about preserving the original case in text.
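As a small sketch of the point about the if test, using made-up sample values: the in operator inside an if only checks for a substring and never rebinds the loop variable.

```python
banned = ['hello', 'idiot']   # sample data, assumed for illustration
text = 'hello World'

seen = []
for i in banned:
    # "i in text" is a substring test that evaluates to True or False;
    # it does not start a second loop and does not change i.
    found = i in text
    seen.append((i, found))

print(seen)  # [('hello', True), ('idiot', False)]
```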
I would use regular expressions with IGNORECASE in something like:
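import re

def censor2(text, banned):
    for i in banned:
        # Escape the word so it is matched literally, ignoring case.
        replace = re.compile(re.escape(i), re.IGNORECASE)
        # Mask of asterisks with the same length as the banned word.
        substitute = '*'*len(i)
        text = replace.sub(substitute, text)
    return text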
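A possible variation, offered only as a sketch and not as the definitive way to do it: all banned words can be joined into one alternation so the text is scanned once, with a function as the replacement so each match is masked to its own length. The name censor_once and the longest-first ordering are my own assumptions, not part of the code above.

```python
import re


def censor_once(text, banned):
    # Hypothetical single-pass variant: one alternation pattern for all words.
    if not banned:
        return text  # nothing to censor
    # Longest words first, so a longer banned word wins over its own prefix.
    words = sorted(banned, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(w) for w in words), re.IGNORECASE)
    # The replacement is a function: each match is starred to its own length.
    return pattern.sub(lambda m: '*' * len(m.group(0)), text)


print(censor_once('Hello World, hElLo IDIOT', ['hello', 'idiot']))
# ***** World, ***** *****
```

The loop version is easier to read; this one mainly avoids rescanning the text once per banned word.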
#4
When I wrote my post yesterday I was pretty tired. Maybe an explanation is in order.

The "IGNORECASE" flag is fairly self-explanatory: it makes the pattern match regardless of case. The "escape" function escapes any special regex characters in the pattern "i". I would maybe use better names, but that is mostly a matter of personal taste.
You can use other flags and combine them with "|" (bitwise OR). Matching is Unicode safe unless you specify a locale or ASCII flag. Best to consult https://docs.python.org/.
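A short sketch of both points, with 'c++' chosen as an assumed example of a banned "word" that contains regex metacharacters:

```python
import re

# re.escape backslash-escapes regex metacharacters, so 'c++' is matched
# literally instead of being read as a (here invalid) pattern.
escaped = re.escape('c++')
print(escaped)  # c\+\+

# Flags are combined with the bitwise OR operator.
flags = re.IGNORECASE | re.ASCII
pattern = re.compile(escaped, flags)
print(pattern.sub('***', 'I like C++ and c++'))  # I like *** and ***
```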

I tested it this morning:
>>> import re
>>> def censor2(text, banned):
...     for i in banned:
...         replace = re.compile(re.escape(i), re.IGNORECASE)
...         substitute = '*'*len(i)
...         text = replace.sub(substitute, text)
...     return text
>>> print(censor2('Hello World, hElLo IDIOT, HeLlO', ['hello', 'idiot']))
***** World, ***** *****, *****
>>> 
#5
(Jan-15-2021, 08:20 AM)Serafim Wrote: When I wrote my post yesterday I was pretty tired. Maybe an explanation is in place.

Thank you for taking the time to reply!! Code doesn't come easy to me, so it's great to learn from things like this. I really appreciate it, thank you!
:)
#6
That's what we are here for. Glad I could be of help.