Python Forum
counting characters in an object - Printable Version


counting characters in an object - Skaperen - May-31-2020

I have a string (str, bytes, or bytearray) and an object holding one or more characters (str, bytes, bytearray, set, frozenset, list, or tuple). Stepping through the string from the start, some number of leading characters give True when tested against that object with the in operator, up to the first character that gives False. I want the length of that leading run. As this is hard to explain, here is some code:
def runlen(s,o):
    for n in range(len(s)):
        if s[n] in o:
            continue
        return n
    return len(s)  # every character of s is in o
What I would like to know is whether there is a way to call something that does this loop internally, so it runs faster. Maybe re can do this somehow. Once I have the position n, I will use s[:n] but not s[n:], so something that just gives me the prefix would do the job.


RE: counting characters in an object - bowlofred - May-31-2020

Yes, you could do a regex. If your string were supercalifragilistic and your set of OK characters were lrepscau, then the first failure should be the "i" in position number 8.

>>> import re
>>> s = "supercalifragilistic"
>>> chars = "lrepscau"
>>> re.match(fr"[{chars}]*", s)
<_sre.SRE_Match object; span=(0, 8), match='supercal'>
It matched from (0, 8), so position #8 did not match. Or you could do the inverse character class and find the first match:

>>> re.search(fr"[^{chars}]", s)
<_sre.SRE_Match object; span=(8, 9), match='i'>
You can ask for the span() of any successful match.

>>> re.search(fr"[^{chars}]", s).span()
(8, 9)
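
One caveat with interpolating chars straight into a character class: characters such as ], ^, - and \ have special meaning there. A minimal sketch that escapes them first, assuming s and chars are both str (the function name leading_run_end is hypothetical, not from the thread):

```python
import re

def leading_run_end(s, chars):
    """Return the index just past the leading run of s made of characters in chars."""
    if not chars:
        # "[]*" is not a valid pattern, and an empty set matches nothing anyway.
        return 0
    # re.escape makes ], ^, - and \ literal inside the character class.
    pattern = "[" + re.escape(chars) + "]*"
    # The trailing * lets the pattern match the empty string, so re.match never returns None here.
    return re.match(pattern, s).end()

print(leading_run_end("supercalifragilistic", "lrepscau"))
```

Because the pattern can match zero characters, this form needs no special case for a string whose first character is already outside the set.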



RE: counting characters in an object - Skaperen - May-31-2020

Can you code it as a function replacing the one I posted (e.g. with no literals)?


RE: counting characters in an object - bowlofred - May-31-2020

There are no literals. I just used "chars" in place of your "o".

You'd need to add in some conditional to handle the cases when the match doesn't succeed, and you might need to convert your object with characters to a string. But after that, the match is good.

def runlen(s, o):
    return re.search(fr"[^{o}]", s).span()[0]
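
A fuller sketch of the same idea, with the two adjustments mentioned above filled in: a conditional for the case where nothing fails to match, and a conversion of the object into a character class. It rests on assumptions not confirmed in the thread: that o is any iterable of single characters for a str input, and an iterable of ints or length-1 bytes for a bytes or bytearray input.

```python
import re

def runlen(s, o):
    """Length of the leading run of s whose items are all in o (regex-based sketch)."""
    if not o:
        return 0  # nothing is allowed, so the run is empty
    if isinstance(s, str):
        # Escape each character so ], ^, - and \ are literal inside the class.
        pattern = "[^" + "".join(re.escape(c) for c in o) + "]"
    else:
        # bytes/bytearray input: o may hold ints or length-1 bytes objects.
        items = (bytes([c]) if isinstance(c, int) else bytes(c) for c in o)
        pattern = b"[^" + b"".join(re.escape(b) for b in items) + b"]"
    m = re.search(pattern, s)
    # No match means every item of s is in o.
    return len(s) if m is None else m.start()

print(runlen("supercalifragilistic", "lrepscau"))
print(runlen(b"abc!", {97, 98, 99}))
```

Whether this is faster than the plain loop depends on the input sizes, since the pattern is rebuilt on every call; timing it on real data is the only way to know.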



RE: counting characters in an object - Skaperen - May-31-2020

Sometimes the caller is working in bytes or bytearray and passes those in. At least the return type does not have to match the input type, since it is always an int. There could also be cases where the whole string is in the (class defined by the) object, which will often be a set (and may hold ints instead of length-1 bytes).


RE: counting characters in an object - Gribouillis - Jun-01-2020

You can use the __contains__ dunder method:
from itertools import takewhile
from more_itertools import ilen

def runlen(s, o):
    return ilen(takewhile(o.__contains__, s))
I don't know if it will run faster however.
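
For readers without more_itertools installed, the same count can be sketched with the standard library only, using sum over a generator in place of ilen. The example also shows how this approach behaves with bytes: iterating over bytes yields ints, so a set of ints (or another bytes object) works as o, while a set of length-1 bytes objects does not.

```python
from itertools import takewhile

def runlen(s, o):
    # Count the leading items of s for which "item in o" is true.
    return sum(1 for _ in takewhile(o.__contains__, s))

print(runlen("supercalifragilistic", set("lrepscau")))
print(runlen(b"abc!", b"abc"))                 # items are ints; "97 in b'abc'" is True
print(runlen(b"abc!", {b"a", b"b", b"c"}))     # ints never equal length-1 bytes
```

Unlike the original loop, this returns len(s) when every item of s is in o, because takewhile simply runs to the end.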


RE: counting characters in an object - Skaperen - Jun-01-2020

Tell me more about __contains__. Is it checking multiple characters?

(May-31-2020, 03:30 AM)bowlofred Wrote: There are no literals. I just used "chars" in place of your "o".

But the literals leave me guessing what is what. Use another variable name if you wish. Just have the prototype after "def" with names for the string and the object a character can be "in", and use those variable names in the code.


RE: counting characters in an object - Gribouillis - Jun-02-2020

Skaperen Wrote: tell me more about __contains__. is it checking multiple characters?
There is not much to say about it. o.__contains__(x) has the same value as x in o.
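
A small illustration of that equivalence, using a hypothetical class Vowels (not from the thread) that defines __contains__ itself. The in operator calls this method with one item at a time, so it checks a single character per call:

```python
class Vowels:
    """Hypothetical container: membership is decided by __contains__."""
    def __contains__(self, ch):
        return ch in "aeiou"

v = Vowels()
# Each pair below is the same test written two ways.
results = ["a" in v, v.__contains__("a"), "z" in v, v.__contains__("z")]
print(results)
```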


RE: counting characters in an object - Skaperen - Jun-08-2020

Why would anyone use __contains__ if in gives the same value?


RE: counting characters in an object - Gribouillis - Jun-08-2020

Because I can pass o.__contains__ around as a function object, which I cannot do with the in keyword. For example, I can write takewhile(o.__contains__, s).
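
To make the point concrete, here is a short sketch of three ways to hand takewhile a callable that performs the in test. The lambda and operator.contains variants are alternatives added for comparison, not something proposed in the thread. Joining the result also yields the prefix s[:n] directly, which is what the original question was after.

```python
from functools import partial
from itertools import takewhile
from operator import contains

s = "supercalifragilistic"
o = set("lrepscau")

# Bound method: o.__contains__(ch) is the same test as "ch in o".
via_bound_method = "".join(takewhile(o.__contains__, s))
# A lambda wraps the in operator in a function.
via_lambda = "".join(takewhile(lambda ch: ch in o, s))
# operator.contains(a, b) evaluates "b in a", so fix a = o with partial.
via_operator = "".join(takewhile(partial(contains, o), s))

print(via_bound_method, via_lambda, via_operator)
```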