Python Forum
Pandas| iterrows | csv.replace - Printable Version

Pandas| iterrows | csv.replace - BeerLover - May-19-2017

Hello forum,

I wanted to ask you to check what is wrong with this piece of code:

import pandas as pd
import requests
import io

csv = requests.get('https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data').content
csv = csv.replace(' ', '')  ### !!!!
df = pd.read_csv(io.StringIO(csv.decode('utf-8')), header=None)
df.columns = ['age', 'workclass', 'fnlwgt', 'education', 'education-num', 'marital-status', 'occupation', 'relationship',
              'race', 'sex', 'capital-gain', 'capital-loss', 'hours-per-week', 'native-country', 'result']
df = df.drop(['fnlwgt','education-num', 'capital-gain', 'capital-loss','result'], axis=1)

# I want to create a df with all divorced people, so I type
Divorced = df[df['marital-status']=="Divorced"]
print(Divorced)  ### yields an empty df

for index, row in df.iterrows():
    print(row['age'], row['education'])

I am using Anaconda's Spyder with Python 3.6 and have had no serious issues with the package or Spyder before. I would really appreciate it if someone could help me with these questions:

1) The csv = csv.replace(' ', '') command doesn't work and gives the following error:
runfile('G:...')
Traceback (most recent call last):

  File "<ipython-input-27-fbfcdec3f55d>", line 1, in <module>
    runfile('G:....')

  File "C:\Users\HP\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)

  File "C:\Users\HP\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "G:/Study/Masters/UvA Data Science/final exam draft.py", line 7, in <module>
    csv = csv.replace(' ', '')

TypeError: a bytes-like object is required, not 'str'

2) When the csv.replace line is simply commented out, boolean indexing does not work for some reason! The code that is supposed to give me a df of divorced people returns an empty df!

Any help or suggestions appreciated! Thank you for your time.

Cheers,
Max


RE: Pandas| iterrows | csv.replace - buran - May-19-2017

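From the requests docs: "You can also access the response body as bytes, for non-text requests". That is what .content returns, so your variable holds bytes, and bytes.replace() only accepts bytes arguments.

You should use .text instead, which returns the body as a str.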
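
A minimal sketch of the difference, assuming a hard-coded bytes value stands in for requests.get(...).content so it runs without a network connection. The sample rows and the variable names (raw_bytes, raw_text and so on) are made up for illustration and are not from the original post:

```python
# Stand-in for requests.get(url).content: bytes in a ", "-separated layout.
# The rows are invented sample values, not real data.
raw_bytes = b"39, State-gov, Divorced\n50, Private, Married-civ-spouse\n"

# Calling bytes.replace with str arguments reproduces the TypeError.
try:
    raw_bytes.replace(' ', '')
except TypeError as exc:
    error_message = str(exc)

# Option 1: stay in bytes and pass bytes literals.
cleaned_bytes = raw_bytes.replace(b' ', b'')

# Option 2: work with a str (decoding here plays the role of response.text),
# then pass ordinary str arguments.
raw_text = raw_bytes.decode('utf-8')
cleaned_text = raw_text.replace(' ', '')

print(error_message)
print(cleaned_text)
```

Either option removes the error. With .text there is also no need for the later .decode('utf-8') call, since the value is already a str.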

Also, I would not use csv as a variable name. It is a standard library module, and if you need to import and use it in the future, you will have to refactor the code to avoid a conflict.
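
On question 2, a likely cause of the empty df is that the file separates fields with a comma followed by a space, so the parsed value is ' Divorced' (with a leading space) rather than 'Divorced'. The sketch below assumes that layout and uses a few invented rows and a shortened column list instead of the real download. skipinitialspace is an existing pd.read_csv parameter:

```python
import io

import pandas as pd

# Invented rows in a ", "-separated layout (assumption about the file format).
sample = (
    "39, State-gov, Divorced\n"
    "50, Private, Married-civ-spouse\n"
    "38, Private, Divorced\n"
)
columns = ['age', 'workclass', 'marital-status']

# Default parsing keeps the space after each comma, so nothing equals 'Divorced'.
df_raw = pd.read_csv(io.StringIO(sample), header=None, names=columns)
no_match = df_raw[df_raw['marital-status'] == 'Divorced']

# skipinitialspace=True drops whitespace that directly follows a delimiter.
df = pd.read_csv(io.StringIO(sample), header=None, names=columns,
                 skipinitialspace=True)
divorced = df[df['marital-status'] == 'Divorced']

print(len(no_match), len(divorced))
```

Unlike replacing every space in the raw text, this only strips the space after each comma and leaves any spaces inside a value untouched.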