Python Forum
Convert dataframe string column to numeric in Python - Printable Version




Convert dataframe string column to numeric in Python - darpInd - Mar-14-2020

Hello,
I loaded some sample data from a URL into a dataframe and then split it into named columns. When I tried to run some calculations, I realised that the columns 'Dage' and 'Cat_Ind' are strings, not numbers. How can I convert them to numeric so that I can continue with the analysis?

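import pandas as pd

# Read each line of the file as a single string column
df2 = pd.read_csv("http://users.stat.ufl.edu/~winner/data/agedeath.dat", header=None)
df2.columns = ["col1"]
print(df2)
# Split on whitespace into three columns (the values are still strings)
df3 = df2.col1.str.split(expand=True)
df3.columns = ["Cat", "Dage", "Cat_Ind"]
print(df3)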
The resulting dataframe is:
Output:
      Cat Dage Cat_Ind
0    aris   21       1
1    aris   21       2
2    aris   21       3
3    aris   21       4
4    aris   21       5
...   ...  ...     ...
6181 sovr   95    1436
6182 sovr   95    1437
6183 sovr   97    1438
6184 sovr  100    1439
6185 sovr  101    1440

[6186 rows x 3 columns]
Here is the problem. When I run the code below:
df3['Dage'].min()
I get this output:
Output:
'100'
I suspect this is because the 'Dage' column holds strings rather than integers.
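That suspicion can be checked on a small sample. The sketch below is only an illustration: the values in `ages` are made up to stand in for the 'Dage' column after `str.split`, and `pd.to_numeric` is one possible way to convert them (`astype(int)` would be another). Strings are compared character by character, so '100' sorts before '21'.

```python
import pandas as pd

# Hypothetical sample standing in for the 'Dage' column after str.split;
# every value is a string, as in df3.
ages = pd.Series(["21", "95", "97", "100", "101"], name="Dage")

# min() on strings compares character by character, so "100" < "21".
string_min = ages.min()

# Convert the strings to numbers, then take the minimum again.
numeric = pd.to_numeric(ages)
numeric_min = numeric.min()

print(repr(string_min))   # smallest value under string ordering
print(numeric_min)        # smallest value under numeric ordering
```

Applied to the dataframe above, the same idea would presumably look like df3['Dage'] = pd.to_numeric(df3['Dage']), and likewise for 'Cat_Ind'.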

Q. How can I convert this column to integer values? Please help.


RE: Convert dataframe string column to numeric in Python - ndc85430 - Mar-14-2020

Did you even look at the docs? read_csv takes a parameter dtype that lets you specify the types of the columns: https://pandas.pydata.org/docs/user_guide/io.html#io-read-csv-table.
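A minimal sketch of that approach, under some assumptions: the file is whitespace-delimited with three fields per line (as the printed output suggests), and a few made-up rows in an in-memory buffer stand in for the real URL. The column names are taken from the original post; `sep`, `names` and `dtype` are existing `read_csv` parameters.

```python
import io
import pandas as pd

# Hypothetical rows standing in for the contents of agedeath.dat.
sample = io.StringIO(
    "aris 21 1\n"
    "aris 21 2\n"
    "sovr 100 1439\n"
    "sovr 101 1440\n"
)

# Split on whitespace while reading and set the column types up front,
# so no separate str.split or conversion step is needed.
df = pd.read_csv(
    sample,
    sep=r"\s+",
    header=None,
    names=["Cat", "Dage", "Cat_Ind"],
    dtype={"Cat": str, "Dage": int, "Cat_Ind": int},
)

print(df.dtypes)
print(df["Dage"].min())
```

With the real data, the buffer would be replaced by the URL from the first post. If any line does not have exactly three fields, or a value is not a valid integer, `read_csv` will raise an error instead of converting.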