Python Forum

Full Version: key error
The following code is not working and I believe it should.

q = df.loc[df2['machine_status']==0]['time_period']
p = df.loc[df2['machine_status']==1]['time_period'][:q.shape[0]]

pq = np.sum(p * np.log(p/q))
qp = np.sum(q * np.log(q/p))
print('KL(P || Q) : %.3f' % pq)
print('KL(Q || P) : %.3f' % qp)
It gives me the following error:

Error:
KeyError                                  Traceback (most recent call last)
File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\core\indexes\base.py:3621, in Index.get_loc(self, key, method, tolerance)
   3620 try:
-> 3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\_libs\index.pyx:136, in pandas._libs.index.IndexEngine.get_loc()

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\_libs\index.pyx:163, in pandas._libs.index.IndexEngine.get_loc()

File pandas\_libs\hashtable_class_helper.pxi:5198, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas\_libs\hashtable_class_helper.pxi:5206, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'time_period'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Input In [50], in <cell line: 1>()
----> 1 q = df.loc[df2['machine_status']==0]['time_period']
      2 p = df.loc[df2['machine_status']==1]['time_period'][:q.shape[0]]
      4 pq = np.sum(p * np.log(p/q))

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\core\frame.py:3505, in DataFrame.__getitem__(self, key)
   3503 if self.columns.nlevels > 1:
   3504     return self._getitem_multilevel(key)
-> 3505 indexer = self.columns.get_loc(key)
   3506 if is_integer(indexer):
   3507     indexer = [indexer]

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\core\indexes\base.py:3623, in Index.get_loc(self, key, method, tolerance)
   3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:
-> 3623     raise KeyError(key) from err
   3624 except TypeError:
   3625     # If we have a listlike key, _check_indexing_error will raise
   3626     #  InvalidIndexError. Otherwise we fall through and re-raise
   3627     #  the TypeError.
   3628     self._check_indexing_error(key)

KeyError: 'time_period'
My guess is that time_period is not an array of numbers, so the Kullback-Leibler divergence cannot be computed.

It cannot compare anything but numbers. For now, my only question is: is that the correct explanation for this error?

I do know what a key error is. Please explain.

I am going to show the last columns so anyone can see that this is the error.

I decided to attach a screenshot; it is better and easier to show the error that way.

Respectfully,

LZ
"time_period" is not a column in df. This is easy to test by printing "df" and looking at the columns. "time_period" will not be there. I don't know if this is because there is no time_period column, or if the string doesn't match ("time_period" does not match "time period" or "time_period ").
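To make that check concrete, here is a minimal sketch. The DataFrame below is a made-up stand-in for the poster's df (the real data is not shown in the thread), and the trailing space in the column name is an assumption chosen to illustrate one of the possible mismatches. Note that the KeyError is raised during the column lookup, before any arithmetic runs, so it says nothing about whether the values are numeric.

```python
import pandas as pd

# Hypothetical stand-in for the poster's df.
# One column name deliberately carries a trailing space.
df = pd.DataFrame({
    "machine_status": [0, 1, 0, 1],
    "time_period ": [1.5, 2.0, 0.5, 3.0],
})

# repr() puts quotes around each name, so stray whitespace becomes visible.
print([repr(c) for c in df.columns])

# Exact-match membership test: this is what df['time_period'] relies on.
has_column = "time_period" in df.columns

# Strip surrounding whitespace from every column name, then test again.
df.columns = df.columns.str.strip()
has_column_after_strip = "time_period" in df.columns

print(has_column, has_column_after_strip)
```

If the name is missing even after stripping whitespace, the column may simply live in a different DataFrame (the original code mixes df and df2), which is worth checking the same way.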

As far as I can tell, df.loc[df2['machine_status']==0]['time_period'] and df[df2['machine_status']==0]['time_period'] produce the same result.
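A small sketch of that comparison, using an invented DataFrame rather than the poster's data. It assumes the boolean mask is built from the same DataFrame that is being filtered; the original code builds the mask from df2 and applies it to df, which only works if the two share an index. The last form, df.loc[mask, 'time_period'], selects rows and column in one step and is the usual way to avoid chained indexing.

```python
import pandas as pd

# Hypothetical data; the column names follow the thread, the values are invented.
df = pd.DataFrame({
    "machine_status": [0, 1, 0, 1],
    "time_period": [1.5, 2.0, 0.5, 3.0],
})

# Boolean mask built from the same DataFrame that is filtered.
mask = df["machine_status"] == 0

chained_loc = df.loc[mask]["time_period"]    # form used in the question
chained_plain = df[mask]["time_period"]      # plain boolean indexing
single_step = df.loc[mask, "time_period"]    # rows and column in one call

# All three selections hold the same values with the same index.
same = chained_loc.equals(chained_plain) and chained_loc.equals(single_step)
print(same)
```

All three forms raise the same KeyError when the column name does not exist, so switching between them does not fix the original problem.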