Python Forum
How to filter data using pandas.DataFrame.loc



How to filter data using pandas.DataFrame.loc - pawlo392 - May-27-2019

I do not know why I am getting this error.
Error:
File "C:\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2726, in _getitem_array
    indexer = self.loc._convert_to_indexer(key, axis=1)
File "C:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1327, in _convert_to_indexer
    .format(mask=objarr[mask]))
KeyError: "Index(['1950', '1960', '1970', '1980', '1990', '2000', '2010'], dtype='object') not in index"
My code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
plt.rcParams["figure.figsize"] = (14,5)
df_k = pd.read_csv("smiertelnosc_polska_kobiety.csv", index_col="Wiek")
df_m = pd.read_csv("smiertelnosc_polska_mezczyzni.csv", index_col="Wiek")
years = df_k.columns.str.strip('Rok')
df_k.columns=years.astype(int)
plt.style.use('ggplot')
df_k.plot(kind='bar')
plt.ylabel('Probability death-women')
df_k = df_k.loc[df_k[years] == 2010]


years = df_m.columns.str.strip('Rok')
df_m.columns=years.astype(int)
plt.style.use('ggplot')
df_m.plot(kind='bar')
plt.ylabel('probability death -men')
df_m = df_m.loc[df_m[years] == 2010]
I want to create a graph for only 2010.


RE: How to filter data using pandas.DataFrame.loc - michalmonday - May-27-2019

Notice that 2010 is the name of a column. With pandas it is very easy to get all values from a specific column:
df['column name']

So you can do something like:
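import pandas as pd
import matplotlib.pyplot as plt

# "Wiek" (age) becomes the index; every other column holds one year.
df_k = pd.read_csv("smiertelnosc_polska_kobiety.csv", index_col="Wiek")
# Assumption: the headers carry 'Rok' text as in the question; strip it and keep the labels as strings.
df_k.columns = df_k.columns.str.strip('Rok')

plt.rcParams["figure.figsize"] = (14,5)
plt.style.use('ggplot')
plt.ylabel('Probability death-women')
# Plot only the 2010 column against age.
plt.plot(df_k['2010'])
plt.show()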
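
As for the KeyError in the first post, it most likely comes from a label type mismatch: the columns are cast to int, but years still holds the string labels, so df_k[years] looks for '1950', '1960', ... among integer columns. Here is a minimal sketch of that behaviour. The small DataFrame and its "Rok2000"/"Rok2010" headers are made up for illustration, since the real CSV layout is not shown in the thread.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: ages ("Wiek") as the index, one column per year.
df = pd.DataFrame(
    {"Rok2000": [0.010, 0.020], "Rok2010": [0.008, 0.015]},
    index=pd.Index([0, 1], name="Wiek"),
)

years = df.columns.str.strip("Rok")  # Index of strings: '2000', '2010'
df.columns = years.astype(int)       # column labels are now integers

# The string labels no longer match the integer columns, so this raises KeyError.
try:
    df[years]
    raised = False
except KeyError:
    raised = True

# Selecting with the integer label works and returns that one column as a Series.
only_2010 = df[2010]
```

So if you keep the astype(int) step, select the column with df_k[2010]; if you skip it, use df_k['2010']. Either way no boolean mask with .loc is needed, because 2010 is a column label and not a value stored in the rows.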