Python Forum
Behavior of statistics.mean
#1
Hello.

Let's build a corecursive class, unfoldr, which is popular among functional programmers. It follows the definition used in Haskell and is implemented as a Python iterator.

class unfoldr:
    def __init__(self, f, seed):
        self.f = f
        self.seed = seed
    def __iter__(self):
        return self
    def __next__(self):
        match self.f(self.seed):
            case a, b:
                self.seed = b
                return a
            case None:
                raise StopIteration
Now, let's solve the problem below.

Quote: Read numbers from input. Entering 0 stops the input. Write a program that prints the sum of the numbers entered.

A solution using the unfoldr iterator looks like this:

def bar():
    def foo(x):
        i = int(input())
        return None if i == 0 else (i, x + 1)
    print(sum(unfoldr(foo, 0)))
This code runs without any problems.

>>> bar()
1
2
3
4
5
6
7
8
9
10
0
55
Next, try solving a similar problem.

Quote: Read numbers from input. Entering 0 stops the input. Write a program that prints the average of the numbers entered.

The logic is almost the same as in the code above, except that we use the statistics.mean function because writing our own average is tedious.

def buzz():
    from statistics import mean
    def foo(x):
        i = int(input())
        return None if i == 0 else (i, x + 1)
    print(mean(unfoldr(foo, 0)))
Here comes the problem. Even though buzz has the same logic as bar, buzz needs 0 to be entered twice before it stops reading input and returns its result.

>>> buzz()
1
2
3
4
5
6
7
8
9
10
0 # the first 0
0 # the second 0 -> this makes buzz finish and return the result
5.5
I do not understand why this odd behavior occurs.
Could anybody explain what is happening?

Thanks, regards.
#2
Oops! This looks like it might be a Python bug.

Here is the code again.

# unfoldr: the co-recursive iterator
class unfoldr:
    def __init__(self, f, seed):
        self.f = f
        self.seed = seed
    def __iter__(self):
        return self
    def __next__(self):
        match self.f(self.seed):
            case a, b:
                self.seed = b
                return a
            case None:
                raise StopIteration
Now, here is the code that shows the odd behavior.

# Q: Read numbers from input. Entering 0 stops the input. Write a program that prints the average of the numbers entered.
def buzz():
    from statistics import mean
    def foo(x):
        i = int(input())
        return None if i == 0 else (i, x + 1)
    print(mean(unfoldr(foo, 0)))
My Xubuntu machine has Python 3.10, 3.11, and 3.12 installed, so here is a comparison.

# On Python 3.12.1
>>> buzz()
1
2
3
4
5
6
7
8
9
10
0 # the first 0
0 # the second 0 -> this makes buzz finish and return the result
5.5
# On Python 3.11.7 
>>> buzz()
1
2
3
4
5
6
7
8
9
10
0 # the first 0
0 # the second 0 -> this makes buzz finish and return the result
5.5
# On Python 3.10.12
>>> buzz()
1
2
3
4
5
6
7
8
9
10
0 # this stops the input, as I expected
5.5
This means that some change introduced in Python 3.11 causes the strange behavior.
#3
(Jan-29-2024, 09:18 AM)cametan Wrote: This means that some change introduced in Python 3.11 causes the strange behavior.

Could this be the cause of the new behavior?
What's new in Python 3.11 Wrote:The statistics functions mean(), variance() and stdev() now consume iterators in one pass rather than converting them to a list first. This is twice as fast and can save substantial memory. (Contributed by Raymond Hettinger in gh-90415.)
See What's new in Python 3.11
« We can solve any problem by introducing an extra level of indirection »
#4
(Jan-29-2024, 09:33 AM)Gribouillis Wrote:
(Jan-29-2024, 09:18 AM)cametan Wrote: This means that some change introduced in Python 3.11 causes the strange behavior.

Could this be the cause of the new behavior?
What's new in Python 3.11 Wrote:The statistics functions mean(), variance() and stdev() now consume iterators in one pass rather than converting them to a list first. This is twice as fast and can save substantial memory. (Contributed by Raymond Hettinger in gh-90415.)
See What's new in Python 3.11

Thanks. That could be the cause.
This seems to be a pitfall. It is probably safer to convert the result of unfoldr to a list first.

def baz():
    from statistics import mean
    def foo(x):
        i = int(input())
        return None if i == 0 else (i, x + 1)
    print(mean(list(unfoldr(foo, 0)))) # convert unfoldr's iterator to a list before passing it to statistics.mean
Anyway, thanks for your advice!
#5
In real-world applications, you could use itertools.takewhile.
In CPython the itertools module is implemented in C, so it is usually faster than an equivalent loop written in Python.

from itertools import takewhile


def sum_user_input():

    def ask():
        while True:
            user_input = input("Enter a number: ")

            try:
                number = int(user_input)
            except ValueError:
                print(user_input, "is not a number")
                continue

            yield number

    return sum(takewhile(lambda x: x != 0, ask()))


result = sum_user_input()
print(result)
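
The same pattern should carry over to the averaging problem from the first post. This is a minimal sketch, assuming that takewhile does not ask its source for more items once the predicate has failed, so a single 0 is enough even when statistics.mean consumes the iterator in one pass. The read parameter and the fake_read helper are hypothetical additions that let the example run without keyboard input; pass nothing to fall back to input().

```python
from itertools import takewhile
from statistics import mean


def numbers(read=input):
    """Yield integers forever, asking again when the text is not a number."""
    while True:
        text = read()
        try:
            number = int(text)
        except ValueError:
            print(text, "is not a number")
            continue
        yield number


def mean_user_input(read=input):
    # takewhile ends the stream at the first 0; mean consumes what came before it.
    return mean(takewhile(lambda x: x != 0, numbers(read)))


# Scripted replacement for input(): "x" is rejected, "99" should never be read.
scripted = iter(["1", "2", "x", "3", "0", "99"])
consumed = []

def fake_read():
    text = next(scripted)
    consumed.append(text)
    return text

result = mean_user_input(fake_read)
print(result)
print(consumed)
```

As with mean on any empty input, entering 0 first raises statistics.StatisticsError, so a real program would want to handle that case.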
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!