Python Forum

Error not supported between instances of str and int
Hello everyone. I am new to Python and have only recently started using it, and I am having trouble with my script. Everything runs smoothly until it reaches the line that creates sub1.

import pandas
import numpy

healthdata = pandas.read_csv("addhealth_pds.csv", low_memory=False)

print(len(healthdata)) #rows
print(len(healthdata.columns)) #columns

# convert_objects was removed in pandas 1.0; pandas.to_numeric with errors="coerce" does the same job
healthdata["H1MO3"] = pandas.to_numeric(healthdata["H1MO3"], errors="coerce")
healthdata["H1PA1"] = pandas.to_numeric(healthdata["H1PA1"], errors="coerce")

print("Counts for question 3 from Motivations to Engage in Risky Behaviors: If you had sexual intercourse, afterward, you would feel guilty.")
c1c = healthdata["H1MO3"].value_counts(sort=False)
print(c1c)
print("Percentages for question 3 from Motivations to Engage in Risky Behaviors: If you had sexual intercourse, afterward, you would feel guilty.")
p1c = healthdata["H1MO3"].value_counts(sort=False, normalize=True)
print(p1c)
# Mother
print("Counts for question 1 from Parents’ Attitudes: How would [your mom] feel about your having sex at this time in your life?")
c2c = healthdata["H1PA1"].value_counts(sort=False)
print(c2c)
print("Percentages for question 1 from Parents’ Attitudes: How would [your mom] feel about your having sex at this time in your life?")
p2c = healthdata["H1PA1"].value_counts(sort=False, normalize=True)
print(p2c)
sub1=healthdata[healthdata["age"]>=15]
This is the error I get:
    sub1=healthdata[healthdata["age"]>=15]

  File "C:\Users\Steph\Anaconda3\lib\site-packages\pandas\core\ops.py", line 1253, in wrapper
    res = na_op(values, other)

  File "C:\Users\Steph\Anaconda3\lib\site-packages\pandas\core\ops.py", line 1140, in na_op
    result = _comp_method_OBJECT_ARRAY(op, x, y)

  File "C:\Users\Steph\Anaconda3\lib\site-packages\pandas\core\ops.py", line 1119, in _comp_method_OBJECT_ARRAY
    result = libops.scalar_compare(x, y, op)

  File "pandas\_libs\ops.pyx", line 98, in pandas._libs.ops.scalar_compare

TypeError: '>=' not supported between instances of 'str' and 'int'
Can anyone help me out? It would be greatly appreciated.
Hello,
the error message states quite clearly that the problem is a comparison between two values that can't be compared, in this case a string and an integer. The line where (I believe) it happens is:

sub1=healthdata[healthdata["age"]>=15]
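In that case it means healthdata["age"] must contain strings. A simple workaround is to convert the column to numbers (if its strings "represent" numbers, that is). Calling int(healthdata["age"]) will not work, because int() expects a single value rather than a whole column; use pandas.to_numeric(healthdata["age"], errors="coerce") instead, which turns entries that cannot be parsed into NaN.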
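A minimal sketch of that conversion, in case it helps. The small DataFrame below is a made-up stand-in for the "age" column of addhealth_pds.csv (the real values are not shown in this thread), and the "unknown" entry is only an assumption about why the column might have been read as strings:

```python
import pandas

# Hypothetical stand-in for the "age" column of addhealth_pds.csv;
# a single non-numeric entry is enough for the column to hold strings.
healthdata = pandas.DataFrame({"age": ["14", "15", "unknown", "17"]})

# Inspect the column type before converting (it is not a numeric dtype here).
print(healthdata["age"].dtype)

# Convert the whole column; entries that cannot be parsed become NaN.
healthdata["age"] = pandas.to_numeric(healthdata["age"], errors="coerce")

# The comparison now works. NaN >= 15 is False, so unparseable rows are dropped.
sub1 = healthdata[healthdata["age"] >= 15]
print(sub1)
```

If the column is known to contain only clean whole numbers, healthdata["age"].astype(int) should also work, but it raises an error on the first entry it cannot parse.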
In addition to what @j.crater explained: you can specify the data type of each column by supplying the optional dtype or converters argument to pandas.read_csv.
https://pandas.pydata.org/pandas-docs/st...d_csv.html
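A rough sketch of both options, assuming made-up CSV text in place of addhealth_pds.csv (the helper name parse_age and the sample rows are hypothetical). dtype fits a column that is already clean, while converters lets you decide what happens to entries that do not parse:

```python
import io
import pandas

# Hypothetical CSV text standing in for addhealth_pds.csv.
clean_text = "age,H1MO3\n14,1\n15,2\n17,3\n"
messy_text = "age,H1MO3\n14,1\nunknown,2\n17,3\n"

# Option 1: dtype declares the column type up front.
# This raises an error if any entry in the column is not a valid integer.
typed = pandas.read_csv(io.StringIO(clean_text), dtype={"age": "int64"})

# Option 2: converters runs a function on each raw string of the column.
def parse_age(raw):
    """Return the age as a float, or NaN when the text is not a number."""
    try:
        return float(raw)
    except ValueError:
        return float("nan")

converted = pandas.read_csv(io.StringIO(messy_text), converters={"age": parse_age})

# Either way the comparison from the original script no longer fails.
sub1 = converted[converted["age"] >= 15]
print(sub1)
```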
Thank you both!