Python Forum
Finding how many times substring is in a string using re module
#1
Hi all,
I'm learning the re module now, and I'm trying to use it to find how many times the words "i" and "i'm" appear in the song Party Rock Anthem.
I converted the lyrics to lowercase and wrote this:
i_in_new = (re.findall(r"\bi\b|\bi\'m\\b" ,readable_new))
but my output only counts "i" and not "i'm". I think the issue is that it treats the ' as a string quote, but I don't know how to change that.
I'd appreciate any help!
Reply
#2
If you use the string method count(), it's a piece of cake; no import is needed for that.
One small caveat: decide whether the 'i' in "i'm" does or does not count as an "i".

Of course, if you want to use the re module, a few things are unclear.
What counts as the "i" word? ("i", "i ", "i'll", "i.", etc.)
Or is it a capital "I"? That is not easy, since it can be followed by any punctuation mark.
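
To make the caveat concrete, here is a minimal sketch using str.count. The function name substring_counts and the sample sentence are made up for illustration. Note that str.count counts substrings, not words, so every letter i inside other words is included.

```python
def substring_counts(text):
    """Count "i'm" and the letter "i" as plain substrings (case-insensitive)."""
    lowered = text.lower()
    im = lowered.count("i'm")   # occurrences of the substring "i'm"
    all_i = lowered.count("i")  # every letter i, including those inside words
    return {
        "i'm": im,
        "i (any)": all_i,
        # Subtract the i that belongs to each "i'm" if it should not count twice.
        "i (excluding i'm)": all_i - im,
    }


print(substring_counts("I'm in it"))
```

Whether the third number is the one you want depends on how the task defines the word "i".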

Paul
Reply
#3
It was really helpful, thanks!
Reply
#4
To count every i, you cannot use \b (word boundary), because it matches only a standalone i.
Just use i.
>>> import re 
>>> 
>>> s = "Tight jeans, tattoo,cause I'm rock and roll"
>>> re.findall(r'i', s.lower())
['i', 'i']
>>> len(re.findall(r'i', s.lower()))
2
So maybe the task says something about overlapping occurrences.
For example, should this count as 5 or 7?
>>> s = "Party rock is in the house tonight I'm and i'm"
>>> len(re.findall(r"i|\bi'm\b", s.lower()))
5
>>> 
>>> len(re.findall("i", s.lower())) + len(re.findall(r"\bi'm\b", s.lower()))
7
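
For the original question (counting the whole words "i" and "i'm" separately), here is a minimal sketch that is not from the thread. It assumes "i'm" should be counted on its own and not also as an "i"; the name count_i_words and the sample lyric are made up for illustration. Two details matter. In a raw string, \\b means a literal backslash followed by the letter b, so the second alternative in post #1 can never match "i'm". Also, alternatives are tried left to right, so with \bi\b listed first the i of "i'm" is consumed before the longer alternative is tried, because the apostrophe is a non-word character.

```python
import re
from collections import Counter

# Longer alternative first, single backslashes inside a raw string.
I_WORDS = re.compile(r"\bi'm\b|\bi\b")


def count_i_words(text):
    """Count the whole words "i'm" and "i" separately, ignoring case."""
    return Counter(I_WORDS.findall(text.lower()))


lyrics = "Party rock is in the house tonight I'm and i'm, and I say hi"
counts = count_i_words(lyrics)
print(counts["i'm"], counts["i"])

# The pattern from post #1 only ever finds the bare "i":
print(re.findall(r"\bi\b|\bi\'m\\b", "i'm"))
```

With this pattern the i in "i'll" or "i'd" still counts as a standalone "i"; add those forms as earlier alternatives if they should be counted separately.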
Reply
#5
Instead of str.lower(), you can also use the IGNORECASE regex flag, which would likely be faster for larger strings.

>>> import re
>>> regex = re.compile("i", re.IGNORECASE)
>>> regex.findall("If I had a nickle for every time I used the 'if I had a nickle' line, I'd be rich.")
['I', 'I', 'i', 'i', 'I', 'i', 'I', 'i', 'i', 'I', 'i']
>>> len(regex.findall("If I had a nickle for every time I used the 'if I had a nickle' line, I'd be rich."))
11
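
Whether the flag is actually faster is worth measuring on your own data. Here is a rough sketch with timeit that first checks both approaches give the same count; the helper names and the sample text are made up for illustration, and the timings will vary by machine and Python version.

```python
import re
import timeit

# Hypothetical sample text, repeated to make it reasonably large.
text = "If I had a nickel, I'd be rich." * 1000

IGNORECASE_I = re.compile(r"i", re.IGNORECASE)


def count_with_lower(s):
    # Lowercase a copy of the string, then match a plain "i".
    return len(re.findall(r"i", s.lower()))


def count_with_flag(s):
    # Leave the string alone and let the regex ignore case.
    return len(IGNORECASE_I.findall(s))


# Both approaches should agree on this ASCII text.
assert count_with_lower(text) == count_with_flag(text)

lower_time = timeit.timeit(lambda: count_with_lower(text), number=20)
flag_time = timeit.timeit(lambda: count_with_flag(text), number=20)
print(f"lower(): {lower_time:.4f}s  IGNORECASE: {flag_time:.4f}s")
```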
Reply

