Calculating Average with Error Handling

Calculating Average with Error Handling - mikasa - May-03-2024

I'm trying to write a Python program to calculate the average of a list of numbers, but I'm getting an error message: "TypeError: unsupported operand type(s) for +: 'int' and 'str'".

numbers = [1, 2, 3, "four", 5, "seven", "nine"]

def calculate_average(numbers):
  total = 0
  count = 0
  for num in numbers:
    try:
      total += int(num)  # Attempt to convert string to integer
      count += 1
    except ValueError:  # Handle conversion errors
      print(f"Error: Could not convert '{num}' to a number.")
  if count > 0:
    average = total / count
    return average
  else:
    return None  # Return None if no valid numbers found

average = calculate_average(numbers)

if average is not None:  # an average of 0 is valid, so test for None explicitly
  print(f"The average of the numerical values is: {average}")
else:
  print("No valid numbers found in the list.")

Can someone help me find what's wrong and suggest a solution?

RE: Calculating Average with Error Handling - sawtooth500 - May-03-2024

So are you just trying to do this manually as an exercise? If so that's fine but if you just need the average... you are working way too hard.

Just convert your entire list to ints, turn it into a numpy array, and use numpy's built-in function to find the mean.

RE: Calculating Average with Error Handling - snippsat - May-05-2024

(May-03-2024, 08:46 AM) mikasa wrote: but I'm getting an error message: "TypeError: unsupported operand type(s) for +: 'int' and 'str'".

You should not get that error message. This is what I get when I run your code with, e.g., Python 3.12.

Output:
Error: Could not convert 'four' to a number.
Error: Could not convert 'seven' to a number.
Error: Could not convert 'nine' to a number.
The average of the numerical values is: 2.75

So it works as it should, with no TypeError.
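
A side note on where the TypeError in the original post might come from. The posted code wraps every value in int(), so the strings can only trigger ValueError there. One guess (an assumption, since no earlier version of the code is shown in this thread) is that the list was at some point summed directly, for example with sum(). A minimal sketch of that hypothetical variant, where naive_average is a made-up name:

```python
numbers = [1, 2, 3, "four", 5, "seven", "nine"]

def naive_average(values):
    # Hypothetical earlier version: no int() conversion and no try/except,
    # so sum() ends up adding an int and a str.
    return sum(values) / len(values)

try:
    naive_average(numbers)
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")
```

If the message printed here matches the one in the question, the error most likely came from a version of the code that added the raw list items, not from the version posted above.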

To clean it up a little:

def calculate_average(numbers):
    total = 0
    count = 0
    for num in numbers:
        try:
            total += int(num)
            count += 1
        except ValueError:
            print(f"Error: Could not convert '{num}' to a number.")
    if count > 0:
        average = total / count
        return average

if __name__ == "__main__":
    numbers = [1, 2, 3, "four", 5, "seven", "nine"]
    average = calculate_average(numbers)
    if average is not None:  # calculate_average returns None when nothing converts
        print(f"The average of the numerical values is: {average}")
    else:
        print("No valid numbers found in the list.")

Output:
Error: Could not convert 'four' to a number.
Error: Could not convert 'seven' to a number.
Error: Could not convert 'nine' to a number.
The average of the numerical values is: 2.75
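
One possible variation on the function above, offered as a sketch and not as the way it must be done. It assumes you also want to accept floats and numeric strings such as "2.5", and that entries like None should be skipped instead of crashing. float(None) raises TypeError, not ValueError, so both are caught. The result is compared against None so that an average of 0 is still reported.

```python
def calculate_average(values):
    """Return the mean of the entries that convert to float, or None."""
    total = 0.0
    count = 0
    for item in values:
        try:
            total += float(item)
            count += 1
        except (ValueError, TypeError):
            # ValueError: text such as "four"; TypeError: objects such as None
            print(f"Skipping {item!r}: not a number.")
    return total / count if count else None


if __name__ == "__main__":
    average = calculate_average([1, 2, 3, "four", 5, None, "2.5"])
    if average is not None:
        print(f"The average of the numerical values is: {average}")
    else:
        print("No valid numbers found in the list.")
```

Note that float() also accepts strings such as "nan" and "inf", so filter those out separately if they can appear in your data.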

RE: Calculating Average with Error Handling - paul18fr - May-06-2024

I cannot avoid a list comprehension here; maybe there's a simpler (and faster) way?
In the following, the pattern has been duplicated 1 million times, and it took approximately 4 seconds.

import numpy as np
import time

numbers = [1, 2, 3, "four", 5, "seven", "nine"]
# numbers = ["four", "seven", "nine"]
numbers = 1_000_000 * numbers

t0 = time.time()

M = [isinstance(i, int) for i in numbers]
M = np.asarray(M)
index = np.where(M == True)

if np.prod(np.shape(index)) == 0:
    print("No valid numbers found in the list.")
else:
    M = np.asarray(numbers)
    M = M[index].astype(int)
    Average = np.mean(M)
    print(f"The average of the numerical values is: {Average}")
    
t1 = time.time()
print(f"Duration = {(t1 - t0)}")

Output:
The average of the numerical values is: 2.75
Duration = 3.5027401447296143

RE: Calculating Average with Error Handling - paul18fr - May-06-2024

Here below is a more general way if both integers and floats exist in the list, but I'm wondering: is there a better way to directly combine the 2 boolean lists (to avoid stacking indexes)?

Output:
[ True  True  True False  True False False False]
[False False False False False False False  True]

combination:
[ True  True  True False  True False False  True]

import numpy as np
import time

numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
# numbers = ["four", "seven", "nine"]
numbers = 1_000_000 * numbers

t0 = time.time()

M_int   = np.asarray([isinstance(i, int) for i in numbers])
M_float = np.asarray([isinstance(i, float) for i in numbers])

index_int = np.where(M_int == True)
index_float = np.where(M_float == True)
index = np.hstack((index_int, index_float))

if np.prod(np.shape(index)) == 0:
    print("No valid numbers found in the list.")
else:
    M = np.asarray(numbers)
    M = M[index].astype(float)
    Average = np.mean(M)
    print(f"The average of the numerical values is: {Average}")
    
t1 = time.time()
print(f"Duration = {(t1 - t0)}")
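
On the question of combining the two boolean lists directly: what is being asked for is an element-wise OR of the two masks. The sketch below shows the idea in plain Python so it runs without any library; mask_int, mask_float, combined and selected are names chosen only for this illustration. With NumPy boolean arrays, the same element-wise OR can be written as M_int | M_float (or np.logical_or(M_int, M_float)), which gives one mask and avoids stacking the index arrays.

```python
numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]

# One boolean mask per type, as in the post above.
mask_int = [isinstance(i, int) for i in numbers]
mask_float = [isinstance(i, float) for i in numbers]

# Element-wise OR: True wherever either mask is True.
combined = [a or b for a, b in zip(mask_int, mask_float)]
print(combined)

# Keep only the entries flagged by the combined mask.
selected = [value for value, keep in zip(numbers, combined) if keep]
if selected:
    print(f"The average of the numerical values is: {sum(selected) / len(selected)}")
else:
    print("No valid numbers found in the list.")
```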

RE: Calculating Average with Error Handling - snippsat - May-06-2024

(May-06-2024, 04:17 AM) paul18fr wrote: I cannot avoid a list comprehension here; maybe there's a simpler (and faster) way?
In the following, the pattern has been duplicated 1 million times, and it took approximately 4 seconds.

Just to mention that the first code works without a list comprehension.
You are making this more complicated than it needs to be.
You can just do it like this:

import numpy as np

numbers = [1, 2, 3, "four", 5, "seven", "nine"]
filtered_numbers = [i for i in numbers if isinstance(i, int)]
average = np.mean(filtered_numbers)
print(average)

Output:
2.75

Also, numpy may be overkill for a task like this; it is faster on large datasets (e.g. matrices), where it can do vectorized calculations.
To measure small pieces of code like this, use timeit.
An example:

import timeit
import numpy as np
from statistics import mean

def num_py():
    numbers = [1, 2, 3, "four", 5, "seven", "nine"] * 100
    filtered_numbers = [i for i in numbers if isinstance(i, int)]
    average = np.mean(filtered_numbers)
    #print(average)

def plain():
    # No libraries
    numbers = [1, 2, 3, "four", 5, "seven", "nine"] * 100
    filtered_numbers = [i for i in numbers if isinstance(i, int)]
    average = sum(filtered_numbers) / len(filtered_numbers)
    #print(average)

def stat():
    numbers = [1, 2, 3, "four", 5, "seven", "nine"] * 100
    filtered_numbers = [i for i in numbers if isinstance(i, int)]
    average = mean(filtered_numbers)
    #print(average)

if __name__ == '__main__':
    lst = ['num_py', 'plain', 'stat']
    for func in lst:
        time_used = timeit.Timer(f"{func}()", f'from __main__ import {func}').timeit(number=100000)
        print(f'{func} --> {time_used:.2f}')

Output:
num_py --> 11.04
plain --> 5.05
stat --> 18.80

This runs each function 100000 times and gives back the total time used, in seconds.
So the plain code is faster here, even with the list made 100 times bigger, so this task is not best suited to numpy.
That said, all three work fine for this task, since it is simple and most of the work is done in the list comprehension.
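
If a per-call figure is wanted, one option is to divide the total by the number of runs. The sketch below is only one way to do it: it uses timeit.repeat and takes the minimum over a few repeats, which tends to be less affected by other activity on the machine. The run counts are arbitrary, the input list is built once outside the timed functions (an assumption about what should be measured), and statistics.fmean stands in for the mean function so that only the standard library is needed.

```python
import timeit
from statistics import fmean

# Built once, so only the filtering and averaging are timed.
DATA = [1, 2, 3, "four", 5, "seven", "nine"] * 100

def plain():
    filtered = [i for i in DATA if isinstance(i, int)]
    return sum(filtered) / len(filtered)

def stat():
    filtered = [i for i in DATA if isinstance(i, int)]
    return fmean(filtered)

RUNS = 200     # calls per measurement (arbitrary)
REPEATS = 3    # number of measurements (arbitrary)

per_call_seconds = {}
for func in (plain, stat):
    totals = timeit.repeat(func, number=RUNS, repeat=REPEATS)
    per_call_seconds[func.__name__] = min(totals) / RUNS
    print(f"{func.__name__} --> {per_call_seconds[func.__name__] * 1e6:.1f} us per call")
```

The numbers depend on the machine and Python version, so treat them as relative comparisons only.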

RE: Calculating Average with Error Handling - paul18fr - May-07-2024

@snippsat: I figured out that simply using [i for i in numbers if isinstance(i, int)] is much more efficient and smarter: I bow down.

import time
import numpy as np

n = 1_000_000
numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
# numbers = ["four", "seven", "nine"]
numbers = numbers*n
 
t0 = time.time()

filtered_numbers1 = np.hstack(( np.asarray([i for i in numbers if isinstance(i, int)]), 
                               np.asarray([i for i in numbers if isinstance(i, float)]) ))

if np.prod(np.shape(filtered_numbers1)) == 0:
    print("No valid numbers found in the list.")
else:
    Average = np.mean(filtered_numbers1)
    print(f"The average of the numerical values is: {Average}")
     
t1 = time.time()
print(f"Duration#1 = {(t1 - t0)}")


filtered_numbers2 =  [i for i in numbers if isinstance(i, int)] + \
                     [i for i in numbers if isinstance(i, float)]

if np.prod(np.shape(filtered_numbers2)) == 0:
    print("No valid numbers found in the list.")
else:
    average = sum(filtered_numbers2) / len(filtered_numbers2)
    print(f"The average of the numerical values is: {average}")  # use this block's own result
     
t2 = time.time()
print(f"Duration#2 = {(t2 - t1)}")

RE: Calculating Average with Error Handling - snippsat - May-07-2024

You can check for int and float in one isinstance call.

import numpy as np

numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
filtered_numbers = [i for i in numbers if isinstance(i, (int, float))]
average = np.mean(filtered_numbers)
print(average)

Output:
2.666
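
One caveat with this check, shown as a small sketch: bool is a subclass of int, so isinstance(True, (int, float)) is True, and True would be counted as 1. If booleans can appear in the data, they can be excluded explicitly. The list below adds a True value purely for illustration, and using numbers.Real here is an optional choice that also admits types such as fractions.Fraction.

```python
from numbers import Real

values = [1, 2, 3, "four", 5, True, 2.33]

# Same check as above: True slips through because bool subclasses int.
loose = [v for v in values if isinstance(v, (int, float))]

# Stricter check: any real number type, but not bool.
strict = [v for v in values if isinstance(v, Real) and not isinstance(v, bool)]

print(len(loose), len(strict))
print(sum(strict) / len(strict))
```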

Also, as mentioned, use timeit if you want to measure.
Here, just put all the code in a string, run it 1000000 times, and get back the total time used in seconds.

import timeit

mycode = '''
import numpy as np

numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
filtered_numbers = [i for i in numbers if isinstance(i, (int, float))]
average = np.mean(filtered_numbers)
#print(average)
'''

print(timeit.timeit(stmt=mycode, number=1000000))

Output:
8.478885899996385
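
A possible refinement, given as a sketch: timeit.timeit accepts a setup argument whose code runs once and is not included in the measurement, so the import and the list creation can be moved there and only the filtering and averaging are timed. To keep this runnable without third-party packages, statistics.fmean is used here in place of np.mean (a substitution for illustration, not what the code above measures), and the run count is reduced.

```python
import timeit

# Runs once before timing starts; not included in the result.
setup_code = '''
from statistics import fmean
numbers = [1, 2, 3, "four", 5, "seven", "nine", 2.33]
'''

# Only this part is executed and timed on every run.
stmt_code = '''
filtered_numbers = [i for i in numbers if isinstance(i, (int, float))]
average = fmean(filtered_numbers)
'''

runs = 10_000
total_seconds = timeit.timeit(stmt=stmt_code, setup=setup_code, number=runs)
print(f"total: {total_seconds:.4f} s, per run: {total_seconds / runs:.2e} s")
```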