Python Forum
counting characters in an object
#1
I have a string (str, bytes, or bytearray) and an object holding one or more characters (str, bytes, bytearray, set, frozenset, list, or tuple). I want to know how many characters at the beginning of the string would get a True result from the in operator against that object, stepping through the string until it reaches the first character that would get False. As this is hard to explain, here is some code:
def runlen(s,o):
    # count the leading items of s that are in o
    for n in range(len(s)):
        if s[n] not in o:
            return n
    # every item of s is in o
    return len(s)
What I would like to know is whether there is a way to call something that does this loop internally, so it would run faster. Maybe re can do this somehow. Once I have the position n, I will be using s[:n] but not s[n:], so something that just gives me the prefix would do the job.
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
Yes, you could do a regex. If your string were supercalifragilistic and your set of OK characters were lrepscau, then the first failure should be the "i" in position number 8.

>>> import re
>>> s = "supercalifragilistic"
>>> chars = "lrepscau"
>>> re.match(fr"[{chars}]*", s)
<_sre.SRE_Match object; span=(0, 8), match='supercal'>
The match spans (0, 8), so the character at index 8 is the first one that did not match. Or you could use the inverse character class and find the first match:

>>> re.search(fr"[^{chars}]", s)
<_sre.SRE_Match object; span=(8, 9), match='i'>
You can ask for the span() of any successful match.

>>> re.search(fr"[^{chars}]", s).span()
(8, 9)
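Since only the prefix s[:n] is wanted, the match object from the first pattern can supply it directly. A minimal sketch reusing s and chars from above; the re.escape call is an addition here (an assumption about what the set might contain), so that characters such as ] or ^ stay literal inside the character class:

```python
import re

s = "supercalifragilistic"
chars = "lrepscau"

# re.escape keeps characters like "]", "^", "-" or "\" literal inside [...]
pattern = re.compile(f"[{re.escape(chars)}]*")

m = pattern.match(s)   # with "*" this always matches, possibly with an empty span
n = m.end()            # length of the leading run
prefix = m.group()     # same as s[:n]
```

This assumes chars is not empty, because an empty character class is not a valid pattern.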
#3
Can you code it as a function that replaces the one I posted (i.e. with no literals)?
#4
There are no literals. I just used "chars" in place of your "o".

You'd need to add in some conditional to handle the cases when the match doesn't succeed, and you might need to convert your object with characters to a string. But after that, the match is good.

def runlen(s, o):
    return re.search(fr"[^{o}]", s).span()[0]
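The conditional and the conversion mentioned above might look something like this. It is only a sketch for the str case: it assumes o is a str or a collection of one-character strings, and the re.escape call and the empty-o check are additions that were not in the function above.

```python
import re

def runlen(s, o):
    """Length of the leading run of s whose characters are all in o (str only)."""
    chars = "".join(o)   # accepts a str, or a set/list/tuple of 1-char strings
    if not chars:
        return 0         # "[^]" is not a valid pattern, so handle an empty o here
    # re.escape keeps "]", "^", "-" and "\" literal inside the character class
    m = re.search(f"[^{re.escape(chars)}]", s)
    # No match means every character of s is in o
    return m.start() if m else len(s)
```

With the earlier example, runlen("supercalifragilistic", "lrepscau") gives 8, and a string made up entirely of allowed characters gives its full length instead of raising an AttributeError on None.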
#5
Sometimes the caller is working in bytes or bytearray and passes those in. At least the return value does not need to match the input type, since it is always an int. There could also be cases where the whole string is in the (class defined by the) object, which will often be a set (and may hold ints instead of length-1 bytes).
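For the bytes case, one thing to keep in mind is that iterating over bytes or bytearray yields ints, so 97 in b"a" is True while 97 in {b"a"} is False. The sketch below normalizes the object to byte values first and then swaps in bytes.lstrip, which nobody in this thread proposed, to do the scan internally. The names as_int_set and runlen_bytes are hypothetical, and whether this is actually faster would need to be timed.

```python
def as_int_set(o):
    """Hypothetical helper: normalize o to a frozenset of byte values (ints)."""
    out = set()
    for item in o:
        if isinstance(item, int):
            out.add(item)             # iterating bytes/bytearray yields ints
        else:
            out.update(bytes(item))   # a bytes/bytearray item adds its byte values
    return frozenset(out)

def runlen_bytes(s, o):
    """Leading run length for a bytes/bytearray s; always returns an int."""
    allowed = bytes(as_int_set(o))
    # lstrip removes every leading byte found in `allowed`; the amount
    # removed is the run length. An empty `allowed` removes nothing.
    return len(s) - len(s.lstrip(allowed))
```

When every byte of s is in the class, lstrip leaves an empty result and the function returns len(s), so no special case is needed for that.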
#6
You can use the __contains__ dunder method
from itertools import takewhile
from more_itertools import ilen

def runlen(s, o):
    return ilen(takewhile(o.__contains__, s))
I don't know if it will run faster however.
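If installing more_itertools is not an option, the count can be taken with the standard library alone. A minimal sketch of the same idea, where sum(1 for ...) stands in for ilen:

```python
from itertools import takewhile

def runlen(s, o):
    # takewhile stops at the first item of s that is not in o;
    # sum(1 for ...) counts the items it yielded before stopping
    return sum(1 for _ in takewhile(o.__contains__, s))
```

This works for str with a set of characters and also for bytes with a bytes object, since bytes.__contains__ accepts the ints that iterating over bytes produces. The timeit module would be the way to compare it against the plain loop.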
#7
Tell me more about __contains__. Is it checking multiple characters?

(May-31-2020, 03:30 AM)bowlofred Wrote: There are no literals. I just used "chars" in place of your "o".

But the literals leave me guessing what is what. Use another variable name if you wish. Just have the prototype after "def" with names for the string and the object a character can be "in", and use those variable names in the code.
#8
Skaperen Wrote: Tell me more about __contains__. Is it checking multiple characters?
There is not much to say about it. o.__contains__(x) has the same value as x in o.
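A small sketch to illustrate that equivalence, and that each call tests a single item. The Logged class is hypothetical and exists only to record what the in operator passes to __contains__:

```python
o = set("lrepscau")

# Both spellings give the same answer for each single character
for ch in "supercali":
    assert o.__contains__(ch) == (ch in o)

class Logged:
    """Hypothetical container that records what `in` passes to __contains__."""
    def __init__(self, items):
        self.items = set(items)
        self.seen = []

    def __contains__(self, x):
        self.seen.append(x)
        return x in self.items

box = Logged("lrepscau")
results = [ch in box for ch in "sui"]
# box.seen is now ['s', 'u', 'i']: one call per character tested
```

One caveat: when o is itself a str, in is a substring test, so "up" in "super" is also True. With takewhile that does not matter, because each item taken from the string is a single character.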
#9
Why would anyone use __contains__ if in gives the same value?
#10
Because I can pass o.__contains__ around as a function, which I cannot do with the in keyword. For example, I can write takewhile(o.__contains__, s).
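As a sketch of what that buys, the bound method can be stored in a name and handed to takewhile, and joining the result gives the prefix directly instead of its length. The lambda version is shown only for comparison; the variable names are illustrative.

```python
from itertools import takewhile

s = "supercalifragilistic"
o = set("lrepscau")

check = o.__contains__   # bound method: check(x) is the same test as x in o
prefix = "".join(takewhile(check, s))

# Equivalent, with `in` wrapped in a lambda so it can be passed as a callable
prefix_lambda = "".join(takewhile(lambda ch: ch in o, s))
```

For a bytes string, bytes(takewhile(check, s)) would play the role of "".join, since the items are ints.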