Finding how many times a substring is in a string using the re module

ranbarr · May-19-2021, 12:34 PM

Hi all,
I'm learning the re module and I'm trying to use it to find how many times the words "i" and "i'm" appear in the song Party Rock Anthem.
I converted the song to lowercase and wrote this:

i_in_new = (re.findall(r"\bi\b|\bi\'m\\b" ,readable_new))

But my output only counts "i" and not "i'm". I think the issue is that it treats the ' as a string delimiter, but I don't know how to change that.
I'd appreciate any help!

DPaul · (This post was last modified: May-19-2021, 03:29 PM by DPaul.)

If you use the built-in str.count method (no import needed), it's a piece of cake.

One caveat: you have to decide whether the 'i' in "i'm" also counts as an "i" or not.
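A minimal sketch of the count approach, for what it's worth. The text below is a made-up stand-in for the lowercased lyrics, not the real song file. Keep in mind that str.count looks for substrings, so "i" is also counted inside other words.

```python
# Hypothetical sample text standing in for the lowercased song lyrics.
lyrics = "party rock is in the house tonight i'm and i'm"

# str.count returns the number of non-overlapping occurrences of a
# substring anywhere in the string, including inside other words.
count_i = lyrics.count("i")
count_im = lyrics.count("i'm")

print(count_i)   # 5: is, in, tonight, i'm, i'm
print(count_im)  # 2
```

If the 'i' in "i'm" should not count as an "i", subtracting count_im from count_i is one way to handle it.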

Of course, if you want to use the re module, a few things are unclear.
What counts as the word "i"? ("i", "i ", "i'll", "i." etc.)
Or is it a capital "I"? That's not easy, since it can be followed by any punctuation mark.

Paul

ranbarr · May-20-2021, 02:07 PM

That was really helpful, thanks!

***snippsat*** · (This post was last modified: May-21-2021, 05:07 PM by snippsat.)

If you want to count every i, you cannot use \b (word boundary), because then the pattern only matches an i that stands alone as a word.
Just use i.

>>> import re 
>>> 
>>> s = "Tight jeans, tattoo,cause I'm rock and roll"
>>> re.findall(r'i', s.lower())
['i', 'i']
>>> len(re.findall(r'i', s.lower()))
2
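To make the difference concrete, here is a small sketch on the same line that compares the plain letter pattern with the word-boundary version. The variable names are only for illustration.

```python
import re

s = "Tight jeans, tattoo,cause I'm rock and roll".lower()

# Every letter "i", wherever it appears (inside "tight" and "i'm").
letters = re.findall(r"i", s)

# Only an "i" that stands as its own word. The apostrophe is not a word
# character, so \b also matches between "i" and "'" in "i'm".
words = re.findall(r"\bi\b", s)

print(len(letters))  # 2
print(len(words))    # 1
```

So with \b the i in "tight" is skipped, but the i in "i'm" still matches.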

So maybe the task says something about overlapping occurrences.
For example, should this count as 5 (each i counted once) or 7 (the i in each i'm counted a second time)?

>>> s = "Party rock is in the house tonight I'm and i'm"
>>> len(re.findall(r"i|\bi'm\b", s.lower()))
5
>>> 
>>> len(re.findall("i", s.lower())) + len(re.findall(r"\bi'm\b", s.lower()))
7
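If the goal is to tally the words "i" and "i'm" separately, the order of the alternatives matters. Two things likely go wrong in the pattern from the first post: the \\b at the end of a raw string means a literal backslash followed by b, and \bi\b is tried first, so it always grabs the i of "i'm". A sketch with the longer alternative first, using a made-up test string:

```python
import re
from collections import Counter

# Hypothetical sample text, already lowercased.
text = "i'm sure i said i'm in"

# Short alternative first: \bi\b wins at every "i'm", so the second
# branch never gets a chance to match.
wrong = re.findall(r"\bi\b|\bi'm\b", text)

# Longer alternative first: "i'm" is matched whole, and \bi\b only
# picks up the remaining standalone "i".
right = re.findall(r"\bi'm\b|\bi\b", text)

tally = Counter(right)

print(wrong)         # ['i', 'i', 'i']
print(right)         # ["i'm", 'i', "i'm"]
print(tally["i"])    # 1
print(tally["i'm"])  # 2
```

The i in "said" and "in" is ignored in both cases because of the word boundaries.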

**nilamo** · May-21-2021, 06:14 PM

Instead of str.lower(), you can also use the IGNORECASE regex flag, which avoids creating a lowercased copy of the string and may be faster for larger strings.

>>> import re
>>> regex = re.compile("i", re.IGNORECASE)
>>> regex.findall("If I had a nickle for every time I used the 'if I had a nickle' line, I'd be rich.")
['I', 'I', 'i', 'i', 'I', 'i', 'I', 'i', 'i', 'I', 'i']
>>> len(regex.findall("If I had a nickle for every time I used the 'if I had a nickle' line, I'd be rich."))
11
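The flag also works with a whole-word pattern. A small sketch, assuming a made-up sample string, that checks both approaches find the same words; note that with the flag the matches keep their original case, so they are lowercased here only for the comparison.

```python
import re

# Hypothetical sample text with mixed case.
text = "I'm sure I said i'm in"

# Longer alternative first so "I'm" is matched as a whole word.
pattern = re.compile(r"\bi'm\b|\bi\b", re.IGNORECASE)

# Matches come back in their original case: ["I'm", 'I', "i'm"].
matches = pattern.findall(text)

# Same pattern without the flag, applied to a lowercased copy.
with_lower = re.findall(r"\bi'm\b|\bi\b", text.lower())

print(matches)
print([m.lower() for m in matches] == with_lower)  # True
```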
