Python Forum
copy of a slice from a DataFrame
#1
Hello Phyton Fans,
I just started a Coursera course to learn more about phyton.
When I used the Jupyter notebook provided by Coursera, the code below worked without error; however,
in my local Anaconda environment I get this error:

--------------
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: ....
 app.launch_new_instance()
--------------

def answer_three():
        atleastonegold = df[(df['Gold.1']+df['Gold']) >1] #df.where(df['Gold']+df['Gold.1']>1)
        atleastonegold['summerminuswinter']= atleastonegold['Gold']-atleastonegold['Gold.1'] #=ADD a new column
        return atleastonegold #atleastonegold['biggestdifference'].idxmax()
answer_three()
I am only trying to create a new column.
#2
(Feb-18-2017, 04:06 PM)metalray Wrote:
def answer_three():
        atleastonegold = df #snip

What's a df, and why isn't your error that it's undefined? Because... it's not defined.
Also, python. Like the snake. Not phyton, like... a misspelt photon. Or, another way to remember it, the file extension is ".py", not ".ph".
#3
Hi nilamo,
Thanks for your reply. df is a pandas data frame.

import pandas as pd

df = pd.read_csv('olympics.csv', index_col=0, skiprows=1)

for col in df.columns:
    if col[:2]=='01':
        df.rename(columns={col:'Gold'+col[4:]}, inplace=True)
    if col[:2]=='02':
        df.rename(columns={col:'Silver'+col[4:]}, inplace=True)
    if col[:2]=='03':
        df.rename(columns={col:'Bronze'+col[4:]}, inplace=True)
    if col[:1]=='№':
        df.rename(columns={col:'#'+col[1:]}, inplace=True)

names_ids = df.index.str.split(r'\s\(') # split the index by ' (' (raw string, so the backslashes reach the regex unchanged)

df.index = names_ids.str[0] # the [0] element is the country name (new index) 
df['ID'] = names_ids.str[1].str[:3] # the [1] element is the abbreviation or ID (take first 3 characters from that)

df = df.drop('Totals')
df.head()
A red box appears in the Jupyter notebook (I imagine that means error)
and it reads:
--------------------------------------------------------------------------------------------------
C:\Anaconda3\lib\site-packages\ipykernel\__main__.py:3: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/sta...ersus-copy
  app.launch_new_instance()
--------------------------------------------------------------------------------------------------

The error results from the third row of this code:

def answer_three():
        atleastonegold = df[(df['Gold.1']+df['Gold']) >1] #df.where(df['Gold']+df['Gold.1']>1)#=INCLUDES NaN
        atleastonegold['summerminuswinter']= atleastonegold['Gold']-atleastonegold['Gold.1'] #=ADD a new columns
        return atleastonegold #atleastonegold['biggestdifference'].idxmax()
answer_three()
I am trying to add a new column. I will try using loc.atleastonegold['Gold']-loc.atleastonegold['Gold.1'] and see if that works.

When I try
atleastonegold.loc['Gold']-atleastonegold.loc['Gold.1']
a KeyError appears: "KeyError: 'the label [Gold] is not in the [index]'"
#4
I don't really know pandas (...at all), but I did a little googling, and it looks like adding a column to an existing dataframe is a little tricky (as sometimes you have a copy/view, and sometimes you have the frame itself).  Have you tried using assign?

Maybe...
df = df.assign(summerminuswinter = lambda row: row['Gold'] - row['Gold.1'])
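A minimal sketch of that idea, using a tiny made-up frame instead of olympics.csv (the values and the A/B/C index are invented for illustration). As far as I can tell, the callable passed to assign receives the whole DataFrame rather than a single row, and assign returns a new DataFrame instead of modifying the filtered one in place:

```python
import pandas as pd

# Made-up stand-in for the olympics data (illustrative values only)
df = pd.DataFrame(
    {"Gold": [5, 0, 10], "Gold.1": [1, 0, 12]},
    index=["A", "B", "C"],
)

# Filter first, then let assign() build a new DataFrame with the extra column.
# The lambda is called with the filtered DataFrame as its argument.
atleastonegold = df[(df["Gold"] + df["Gold.1"]) > 1].assign(
    summerminuswinter=lambda d: d["Gold"] - d["Gold.1"]
)

print(atleastonegold)
```

The original df is left without the new column, which may or may not be what you want.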
#5
It's not an error, just a warning: pandas can't tell whether you really know what you are doing (working on a copy vs. working on a view).

In this case it looks like a false positive, so you can ignore it or even suppress this kind of warning with
pandas.options.mode.chained_assignment = None
But perhaps the best choice is to make it clear that you want to work with a copy, by calling .copy():
atleastonegold = df[(df['Gold.1']+df['Gold']) >1].copy()
atleastonegold['summerminuswinter']= atleastonegold['Gold']-atleastonegold['Gold.1']
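Here is a self-contained sketch of the same approach on a tiny made-up frame (the values and the A/B/C index are invented, not taken from olympics.csv). It also shows why the .loc attempt in post #3 raised a KeyError: with a single label, .loc looks the label up in the row index, so selecting a column needs the form .loc[:, 'Gold'].

```python
import pandas as pd

# Made-up stand-in for the olympics data (illustrative values only)
df = pd.DataFrame(
    {"Gold": [5, 0, 10], "Gold.1": [1, 0, 12]},
    index=["A", "B", "C"],
)

# An explicit copy makes the filtered frame independent of df,
# so adding a column to it is unambiguous.
atleastonegold = df[(df["Gold.1"] + df["Gold"]) > 1].copy()
atleastonegold["summerminuswinter"] = atleastonegold["Gold"] - atleastonegold["Gold.1"]

# .loc takes the row indexer first, then the column indexer.
gold_column = atleastonegold.loc[:, "Gold"]

# A single label is treated as a row label; 'Gold' is not in the index.
try:
    atleastonegold.loc["Gold"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(atleastonegold)
print(raised_key_error)
```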