Python Forum

Why is first argument sometimes rows and sometimes columns?
I can do this:

# creating a column with all null values in the new data frame
new["Null Column"] = None
In this example, the first argument in brackets calls for a column.

I can also do this:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
...                   index=['cobra', 'viper', 'sidewinder'],
...                   columns=['max_speed', 'shield'])

>>> df.loc['viper']
max_speed    4
shield       5
Name: viper, dtype: int64
In this example, the first argument in brackets references a row.

Is it accurate to say, in more general terms, that .loc and .iloc (and maybe others?) expect (row, column), whereas plain bracket subsetting/slicing expects (column, row)?

If so, then is there a reason why it's not more consistent?
You probably read the documentation (this example is taken from it) but somehow missed the concept of a 'label', i.e. row and column names (see the first two lines of the documentation for pd.DataFrame.loc):

Quote: Access a group of rows and columns by label(s) or a boolean array.

.loc[] is primarily label based, but may also be used with a boolean array.
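
To illustrate the "boolean array" part of that quote, here is a minimal sketch that rebuilds the df from the question. The names fast, selected and shields are only my own choices for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Boolean mask with one True/False per row (hypothetical name "fast")
fast = df['max_speed'] > 3

# .loc accepts the mask in the row position and keeps rows where it is True
selected = df.loc[fast]
print(selected)

# The mask can be combined with a column label in the second position
shields = df.loc[fast, 'shield']
print(shields)
```

The mask goes in the first (row) position, exactly where a row label would go.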

The arguments to .loc are positional: row labels come first, column labels second. This means that you can omit the column labels:

In [18]: df.loc['viper']       # no column labels specified, all are selected implicitly
Out[18]:
max_speed    4
shield       5
Name: viper, dtype: int64

In [19]: df.loc['viper', :]    # all column labels selected explicitly with :
Out[19]:
max_speed    4
shield       5
Name: viper, dtype: int64
But you cannot omit the row labels:

In [20]: df.loc[:, 'shield']   # all rows are explicitly selected
Out[20]:
cobra         2
viper         5
sidewinder    8
Name: shield, dtype: int64

In [21]: df.loc['shield']      # no row label 'shield' (first positional argument)
/...../
KeyError: 'shield'
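
As for the original question, a short sketch contrasting plain brackets with .loc and .iloc may help. It rebuilds the df from the question. The variable names are just for illustration, and the try/except blocks are only there so the script runs to the end.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Plain brackets with a single label look the label up among the columns
shield_col = df['shield']

# .loc takes (row, column); a single argument is treated as a row label
viper_row = df.loc['viper']

# .iloc uses the same (row, column) order, but with integer positions
viper_shield = df.iloc[1, 1]

# Plain brackets with a slice select rows, not columns
first_two = df[0:2]

# A column label in the row position of .loc raises KeyError
try:
    df.loc['shield']
    col_label_as_row = True
except KeyError:
    col_label_as_row = False

# A row label in plain brackets raises KeyError, because it is not a column
try:
    df['viper']
    row_label_as_col = True
except KeyError:
    row_label_as_col = False

print(shield_col, viper_row, viper_shield, first_two, sep='\n\n')
```

So plain brackets are not really "(column, row)": they take a single key, which is looked up among the columns when it is a label and applied to the rows when it is a slice. .loc and .iloc are the ones that consistently take (row, column).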