Python Forum

Full Version: TypeError: unsupported operand type(s) for -: 'str' and 'str'
Hi,
I am new to Python and am trying to perform this operation on my data set:

df['dif_and_sq']=(df['imdbRating']).sub((df['imdbVotes']))
I am getting this error:

Error:
TypeError: unsupported operand type(s) for -: 'str' and 'str'
I even tried converting the columns to numeric, but they are not converted and each remains a Series of strings.

If I try it like this:

df['dif_and_sq']=int(df['imdbRating']).sub(int(df['imdbVotes']))
the error is:

Error:
TypeError: cannot convert the series to <class 'int'>
How do I resolve this error? Here is my full code:
import os
import pandas as pd
import numpy as np
import matplotlib as mlt
import csv


os.chdir(r"F:\Data Science\python program")  # raw string so the backslashes are not read as escapes


df=pd.read_csv("IMDB_data.csv",encoding='cp1252',skiprows=[2])


IMDB_Genre= df['Genre'].value_counts().to_frame().T


df.sort_values("Genre", axis = 0, ascending = True, 
                 inplace = True, na_position ='last') 


df['imdbRating']=pd.to_numeric(df['imdbRating'], errors='ignore',downcast='integer')
df['imdbVotes']=pd.to_numeric(df['imdbVotes'], errors='ignore',downcast='integer')


df['imdbVotes']=pd.to_numeric(df['imdbVotes'], errors='ignore',downcast='integer')
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-9-b8c9307b8a6b> in <module>
----> 1 df['dif_and_sq']=int(df['imdbRating']).sub(int(df['imdbVotes']))

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py in wrapper(self)
91 return converter(self.iloc[0])
92 raise TypeError("cannot convert the series to "
---> 93 "{0}".format(str(converter)))
94
95 wrapper.__name__ = "__{name}__".format(name=converter.__name__)

TypeError: cannot convert the series to <class 'int'>

df['dif_and_sq']=df['imdbRating'].values-df['imdbVotes'].values

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-16-00740a4575a9> in <module>
----> 1 df['dif_and_sq']=df['imdbRating'].values-df['imdbVotes'].values

TypeError: unsupported operand type(s) for -: 'str' and 'str'
You are ignoring errors, so pd.to_numeric probably does not convert the values to numeric and just returns the input unchanged. Check your data.
Also, when you read from CSV, you can supply the dtype argument so that the strings from the file are converted to the proper data type.
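One way to check the data is sketched below. It assumes made-up rows in place of the real IMDB_data.csv (the file is not shown in this thread), so the values "N/A" and "1,200" are only guesses at what might be in the columns. With errors='coerce', anything that cannot be parsed becomes NaN, which makes the offending values easy to list.

```python
import pandas as pd

# Hypothetical stand-in for the IMDB data; these rows are invented for illustration.
df = pd.DataFrame({
    "imdbRating": ["8.4", "7.9", "N/A"],
    "imdbVotes": ["1,200", "950", "431"],
})

# A non-numeric dtype here (object in most pandas versions) means the
# columns hold strings, which is why '-' fails with 'str' and 'str'.
print(df.dtypes)

# errors="coerce" replaces every unparseable value with NaN instead of
# silently handing back the original strings.
rating = pd.to_numeric(df["imdbRating"], errors="coerce")
votes = pd.to_numeric(df["imdbVotes"], errors="coerce")

# List the original values that could not be parsed.
bad_rating = df.loc[rating.isna(), "imdbRating"].tolist()
bad_votes = df.loc[votes.isna(), "imdbVotes"].tolist()
print(bad_rating)
print(bad_votes)
```

Once the bad values are known, they can be cleaned or dropped before doing the subtraction.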
Thanks Buran,
But my data set also contains string columns, so it throws an error:
[python]

df=pd.read_csv("IMDB_data.csv",encoding='cp1252',skiprows=[2],dtype='float64')

[\python]

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens()

TypeError: Cannot cast array from dtype('O') to dtype('float64') according to the rule 'safe'

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
<ipython-input-90-449a88fabf75> in <module>
----> 1 df=pd.read_csv("IMDB_data.csv",encoding='cp1252',skiprows=[2],dtype='float64')

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
700 skip_blank_lines=skip_blank_lines)
701
--> 702 return _read(filepath_or_buffer, kwds)
703
704 parser_f.__name__ = name

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
433
434 try:
--> 435 data = parser.read(nrows)
436 finally:
437 parser.close()

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
1137 def read(self, nrows=None):
1138 nrows = _validate_integer('nrows', nrows)
-> 1139 ret = self._engine.read(nrows)
1140
1141 # May alter columns / col_dict

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
1993 def read(self, nrows=None):
1994 try:
-> 1995 data = self._reader.read(nrows)
1996 except StopIteration:
1997 if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_column_data()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens()

ValueError: could not convert string to float: "Despite his tarnished reputation after the events of The Dark Knight, in which he took the rap for Dent's crimes, Batman feels compelled to intervene to assist the city and its police force which is struggling to cope with Bane's plans to destroy the city."
Please, when you use BBCode tags, use a forward slash, not a backslash, for the closing tag.
Also, when you post errors, use the error tag.

Check the documentation for the correct format of dtype:

Quote: dtype : Type name or dict of column -> type, optional

Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32, 'c': 'Int64'} Use str or object together with suitable na_values settings to preserve and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.

Obviously, you are not able to convert every str to float: the ValueError shows pandas trying to cast a plot description. Either specify a data type that fits all columns (which obviously doesn't work in your case) or pass a dict with the column name and data type for each column you want to convert.
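A minimal sketch of the dict form, assuming a tiny in-memory CSV instead of the real file. The column names imdbRating and imdbVotes come from this thread; Title, Plot and all the rows are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV contents; only the two numeric column names are from the thread.
csv_text = """Title,Plot,imdbRating,imdbVotes
Movie A,"A long plot description, with commas.",8.4,1200
Movie B,Another plot.,7.9,950
"""

# Only the listed columns are cast; Title and Plot are left as text.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"imdbRating": "float64", "imdbVotes": "int64"},
)

# With numeric columns the subtraction from the first post works.
df["dif_and_sq"] = df["imdbRating"].sub(df["imdbVotes"])
print(df[["imdbRating", "imdbVotes", "dif_and_sq"]])
```

If one of the listed columns contains a value that cannot be cast, read_csv raises an error for that column, which at least points at where the bad data is.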

As for your original code, remove errors='ignore' in order to see what is going on and why the respective columns cannot be converted to float.
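A rough sketch of what that looks like, again on invented values rather than the real data. Without errors='ignore', pd.to_numeric uses its default errors='raise', so the conversion fails loudly instead of returning the strings untouched. The votes Series and the "N/A" entry are assumptions for the example.

```python
import pandas as pd

# Hypothetical column with one value that cannot be parsed.
ratings = pd.Series(["8.4", "7.9", "N/A"], name="imdbRating")
votes = pd.Series([1200, 950, 431], name="imdbVotes")

try:
    # Default errors="raise": the conversion stops at the first bad value.
    pd.to_numeric(ratings)
    message = ""
except ValueError as exc:
    message = str(exc)

# The exception text names the value that could not be converted.
print(message)

# After deciding how to treat bad values (here: turn them into NaN),
# the subtraction works and the bad row simply yields NaN.
clean = pd.to_numeric(ratings, errors="coerce")
diff = clean.sub(votes)
print(diff)
```

Whether NaN is the right treatment for the bad rows depends on the data; dropping or correcting them are other options.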