Python Forum

Full Version: Using pandas library
Hi,

I am a teacher, and we are using Python and pandas data structures. I am having trouble with this code:

import pandas as pd
numbers = pd.DataFrame([[3],[8],[4],[9],[14],[17]])
numbers.index = ["Odd", "Even", "Even", "Odd", "Even", "Odd"]
# check if 0 or 1
for i in numbers:
    i = numbers % 2
    print(i)
    if i == 0:
        numbers[1] = ["Y"]
    else:
        numbers[1] = ["N"]
i is printed correctly, but when I run the if statement and try to add a new column of data, I get this error:

Quote:
Traceback (most recent call last):
  File "C:/Users/ramit/PycharmProjects/grade10A1/lesson.py", line 15, in <module>
    if i == 0:
  File "C:\Users\ramit\PycharmProjects\grade10A1\venv\lib\site-packages\pandas\core\generic.py", line 1326, in __nonzero__
    raise ValueError(
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Please help,
Thanks.
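You're overwriting the for loop variable i within the loop, and you never use i as an index into the DataFrame. After i = numbers % 2, i is a whole DataFrame, so i == 0 is a DataFrame of booleans, and if cannot reduce that to a single True or False.
Also, when using pandas, looping through rows/columns is rarely the best solution.
In fact, the pandas documentation even has a red warning about this.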
Pandas Wrote:Iterating through pandas objects is generally slow.
In many cases, iterating manually over the rows is not needed and can be avoided (using) a vectorized solution,
many operations can be performed using built-in methods or NumPy functions, (boolean) indexing.
So, as an example with your code, you could use np.where:
import pandas as pd
import numpy as np

numbers = pd.DataFrame([[3],[8],[4],[9],[14],[17]], columns=['Numbers'])
numbers['Odd/Even'] = np.where(numbers['Numbers'] % 2, 'Odd', 'Even')
Output:
>>> numbers
   Numbers Odd/Even
0        3      Odd
1        8     Even
2        4     Even
3        9      Odd
4       14     Even
5       17      Odd
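To see where the original loop goes wrong, here is a minimal sketch reusing the data from the question. It assumes the default column label 0 (no columns= argument). The names labels, is_even, message and flags are only illustrative. The loop at the end is one way to fill the new column row by row; it works, but it is slow on large data.

```python
import pandas as pd

numbers = pd.DataFrame([[3], [8], [4], [9], [14], [17]])

# Iterating over a DataFrame yields its column labels, not its rows.
labels = list(numbers)  # only the default column label, 0

# The comparison is element-wise: one boolean per row.
is_even = numbers[0] % 2 == 0

# `if` needs a single bool; a DataFrame of booleans raises ValueError.
try:
    if numbers % 2 == 0:
        pass
except ValueError as exc:
    message = str(exc)

# A row-by-row loop that does work: collect one flag per value,
# then assign the whole list as a new column.
flags = []
for value in numbers[0]:
    flags.append("Y" if value % 2 == 0 else "N")
numbers[1] = flags
```

Collecting the flags in a list and assigning them once avoids writing a single-element list into the column on every pass, which is what numbers[1] = ["Y"] attempted.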
(Oct-25-2020, 09:52 AM)snippsat Wrote: Also when using Pandas is looping through rows/columns rarely the best solution. [...]

Thanks for answering, but we didn't give the students NumPy. Are there other ways?
My understanding is that pandas is built on NumPy (based on a Google search and checking a number of sites), so if you gave them pandas, you also gave them NumPy implicitly. Try it out.
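A quick way to check this, as a sketch: NumPy is a required dependency of pandas, so the import below should succeed wherever pandas is installed. The names values and backed_by_numpy are only illustrative.

```python
import pandas as pd
import numpy as np  # installed alongside pandas as one of its dependencies

numbers = pd.DataFrame([[3], [8], [4], [9], [14], [17]], columns=["Numbers"])

# Ask pandas for the values of the column as a NumPy array.
values = numbers["Numbers"].to_numpy()
backed_by_numpy = isinstance(values, np.ndarray)
```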
I figured out something that does not use NumPy:
import pandas as pd
 
numbers = pd.DataFrame([[3],[8],[4],[9],[14],[17]], columns=['Numbers'])
numbers['Odd/Even'] = numbers['Numbers'].map(lambda x: x % 2)
numbers
Output:
   Numbers  Odd/Even
0        3         1
1        8         0
2        4         0
3        9         1
4       14         0
5       17         1
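The column above holds the remainders 1 and 0 rather than labels. If text labels are wanted without importing NumPy, one possible variation (a sketch, not the only way) is to pass a dictionary to Series.map:

```python
import pandas as pd

numbers = pd.DataFrame([[3], [8], [4], [9], [14], [17]], columns=["Numbers"])

# The remainder is 0 for even and 1 for odd; map each remainder to its label.
numbers["Odd/Even"] = (numbers["Numbers"] % 2).map({0: "Even", 1: "Odd"})
```

The % operator is applied to the whole column at once, so no explicit loop or lambda is needed.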