Python Forum

Full Version: Behavior of statistics.mean
Hello.

Let's build a co-recursive class: unfoldr, which is popular among functional programming folks. It follows the definition used in the Haskell programming language and is implemented as a Python iterator.

class unfoldr:
    def __init__(self, f, seed):
        self.f = f
        self.seed = seed
    def __iter__(self):
        return self
    def __next__(self):
        match self.f(self.seed):
            case a, b:
                self.seed = b
                return a
            case None:
                raise StopIteration
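As a quick illustration of how unfoldr unfolds a seed, here is a small sketch that needs no keyboard input. The class is restated with if/else instead of match only so that the block runs on its own (also on Python versions before 3.10); countdown is a hypothetical step function made up for this example.

```python
class unfoldr:
    """Same behaviour as the class above, written without match."""
    def __init__(self, f, seed):
        self.f = f
        self.seed = seed
    def __iter__(self):
        return self
    def __next__(self):
        step = self.f(self.seed)
        if step is None:
            raise StopIteration
        value, self.seed = step
        return value

def countdown(n):
    # Hypothetical step function: emit n, continue with n - 1, stop at 0.
    return None if n == 0 else (n, n - 1)

result = list(unfoldr(countdown, 5))
print(result)  # [5, 4, 3, 2, 1]
```

Each call to the step function returns either a (value, next seed) pair or None to end the sequence.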
Now, let's solve the problem below.

Quote: You enter numbers one at a time. Entering 0 stops the input. Write a program that prints the sum of the numbers you entered.

A solution using the unfoldr iterator looks something like this.

def bar():
    def foo(x):
        i = int(input())
        return None if i == 0 else (i, x + 1)
    print(sum(unfoldr(foo, 0)))
This code runs without any problems.

>>> bar()
1
2
3
4
5
6
7
8
9
10
0
55
Next, let's try a similar problem.

Quote: You enter numbers one at a time. Entering 0 stops the input. Write a program that prints the average of the numbers you entered.

The logic is almost the same as in the code above, except that we use the statistics.mean function, because writing our own average is dull.

def buzz():
    from statistics import mean
    def foo(x):
        i = int(input())
        return None if i == 0 else (i, x + 1)
    print(mean(unfoldr(foo, 0)))
Here comes the problem. Even though buzz has the same logic as bar, buzz needs 0 to be entered twice before it stops reading input and returns its result.

>>> buzz()
1
2
3
4
5
6
7
8
9
10
0 # the first 0
0 # the second 0 -> only now does buzz finish and print the result
5.5
I do not understand why this odd behavior occurs.
Could anybody explain what is happening?

Thanks, regards.
Oops! This looks like a bug in Python.

Here is the code again.

# unfoldr: the co-recursive iterator
class unfoldr:
    def __init__(self, f, seed):
        self.f = f
        self.seed = seed
    def __iter__(self):
        return self
    def __next__(self):
        match self.f(self.seed):
            case a, b:
                self.seed = b
                return a
            case None:
                raise StopIteration
Now, let's see the code showing the weird behavior again.

# Q: You enter numbers one at a time. Entering 0 stops the input. Write a program that prints the average of the numbers you entered.
def buzz():
    from statistics import mean
    def foo(x):
        i = int(input())
        return None if i == 0 else (i, x + 1)
    print(mean(unfoldr(foo, 0)))
My Xubuntu machine has Python 3.10, 3.11, and 3.12 installed, so here is a comparison.

# On Python 3.12.1
>>> buzz()
1
2
3
4
5
6
7
8
9
10
0 # the first 0
0 # the second 0 -> only now does buzz finish and print the result
5.5
# On Python 3.11.7 
>>> buzz()
1
2
3
4
5
6
7
8
9
10
0 # the first 0
0 # the second 0 -> only now does buzz finish and print the result
5.5
# On Python 3.10.12
>>> buzz()
1
2
3
4
5
6
7
8
9
10
0 # This stops input as I expected
5.5
This means that some change introduced in Python 3.11 causes this strange behavior.
(Jan-29-2024, 09:18 AM) cametan wrote: This means that some change introduced in Python 3.11 causes this strange behavior.

Could this be the cause of the new behavior?
What's New in Python 3.11 wrote: The statistics functions mean(), variance() and stdev() now consume iterators in one pass rather than converting them to a list first. This is twice as fast and can save substantial memory. (Contributed by Raymond Hettinger in gh-90415.)
See What's new in Python 3.11
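A possible explanation, offered as a sketch rather than a definitive diagnosis: the iterator protocol in the Python documentation says that once __next__() raises StopIteration it must keep doing so on later calls, and unfoldr as written does not remember that it has finished. As far as I can tell from the CPython source, the one-pass statistics.mean walks its data through itertools.groupby, which asks the underlying iterator for one more item after a group ends; treat that reading of the internals as an assumption. The snippet below leaves statistics out entirely and only shows the groupby effect. make_step is a hypothetical stand-in for input() that replays prepared values and records each one it hands out, and unfoldr is restated with if/else instead of match so the block runs on its own.

```python
from itertools import groupby

class unfoldr:
    """Same behaviour as the thread's class, written without match."""
    def __init__(self, f, seed):
        self.f = f
        self.seed = seed
    def __iter__(self):
        return self
    def __next__(self):
        step = self.f(self.seed)
        if step is None:
            raise StopIteration
        value, self.seed = step
        return value

def make_step(values):
    """Hypothetical replacement for input(): replay values, log each one used."""
    source = iter(values)
    calls = []
    def step(count):
        i = next(source)
        calls.append(i)
        return None if i == 0 else (i, count + 1)
    return step, calls

# list() stops at the first StopIteration, so only one 0 is consumed.
step, calls = make_step([1, 2, 3, 0, 0])
list(unfoldr(step, 0))
calls_list = list(calls)

# groupby() calls __next__ once more after the inner group has ended,
# so the step function runs again and the second 0 is consumed too.
step, calls = make_step([1, 2, 3, 0, 0])
for _key, group in groupby(unfoldr(step, 0), type):
    for _item in group:
        pass
calls_groupby = list(calls)

print(calls_list)     # [1, 2, 3, 0]
print(calls_groupby)  # [1, 2, 3, 0, 0]
```

If that reading is right, the extra prompt comes from unfoldr calling its step function again after it has already signalled the end, not from mean itself.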
Thanks. That could be the cause.
This seems to be a sort of pitfall. It might be safer to convert the result of unfoldr to a list first.

def baz():
    from statistics import mean
    def foo(x):
        i = int(input())
        return None if i == 0 else (i, x + 1)
    print(mean(list(unfoldr(foo, 0)))) # convert unfoldr's iterator to a list before passing it to statistics.mean
Anyway, thanks for your advice!
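Besides the list() conversion, another option might be to write unfoldr as a generator function. A finished generator keeps raising StopIteration without running its body again, so the step function should not be called a second time no matter how the consumer iterates. This is only a sketch under that assumption, not a drop-in replacement verified against every consumer; make_step is again a hypothetical stand-in for input() that replays prepared values.

```python
from statistics import mean

def unfoldr(f, seed):
    """Generator variant: once f returns None, the generator is finished for good."""
    while (pair := f(seed)) is not None:
        value, seed = pair
        yield value

def make_step(values):
    """Hypothetical replacement for input(): replay values, log each one used."""
    source = iter(values)
    calls = []
    def step(count):
        i = next(source)
        calls.append(i)
        return None if i == 0 else (i, count + 1)
    return step, calls

# Ten numbers followed by two zeros; only the first zero should be needed.
step, calls = make_step([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0, 0])
result = mean(unfoldr(step, 0))

print(result)          # 5.5
print(calls.count(0))  # 1
```

With this variant, buzz could pass unfoldr(foo, 0) to mean directly and a single 0 should end the input.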
In real-world applications, you could use itertools.takewhile.
In CPython the itertools module is implemented in C, so its iterators are usually faster than equivalent pure-Python loops.

from itertools import takewhile


def sum_user_input():

    def ask():
        while True:
            user_input = input("Enter a number: ")

            try:
                number = int(user_input)
            except ValueError:
                print(user_input, "is not a number")
                continue

            yield number

    return sum(takewhile(lambda x: x != 0, ask()))


result = sum_user_input()
print(result)
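The same takewhile pattern should also work for the average problem from earlier in the thread, because takewhile stops asking its source once the predicate has failed. Here is a small sketch; replay is a hypothetical stand-in for ask() that yields prepared values instead of calling input(), and log records what was actually consumed.

```python
from itertools import takewhile
from statistics import mean

def replay(values, log):
    # Hypothetical replacement for ask(): yield prepared values and log each one.
    for value in values:
        log.append(value)
        yield value

log = []
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0, 0]
average = mean(takewhile(lambda x: x != 0, replay(numbers, log)))

print(average)       # 5.5
print(log.count(0))  # 1
```

Only the first 0 is read from the source, so a real ask() generator would prompt for 0 just once.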