Python Forum

Link to question on StackOverflow
On Stack Overflow, Yaloa_21 understood what I want to do, but I always get errors.
https://stackoverflow.com/questions/7157...5_71593640

Question about pandas
It's a little complicated. I have this dataframe:

ID           TimeandDate      Date        Time
10   2020-08-07 07:40:09  2022-08-07   07:40:09
10   2020-08-07 08:50:00  2022-08-07   08:50:00
10   2020-08-07 12:40:09  2022-08-07   12:40:09
10   2020-08-08 07:40:09  2022-08-08   07:40:09
10   2020-08-08 17:40:09  2022-08-08   17:40:09
12   2020-08-07 08:03:09  2022-08-07   08:03:09
12   2020-08-07 10:40:09  2022-08-07   10:40:09
12   2020-08-07 14:40:09  2022-08-07   14:40:09
12   2020-08-07 16:40:09  2022-08-07   16:40:09
13   2020-08-07 09:22:45  2022-08-07   09:22:45
13   2020-08-07 17:57:06  2022-08-07   17:57:06

I want to create a new dataframe with two new columns, df["Check-in"] and df["Check-out"]. My data has no indicator showing whether a row is a check-in or a check-out, so I will assume that the first time for each ID is a check-in and the next row is the matching check-out. Also, if a check-in has no check-out after it, that time has to be recorded as the check-out, replacing the previous check-out of the same day.

I tried the code below, but I'm afraid it isn't reliable because it only uses the first and last time of the day. Imagine ID=13 enters at 07:40:09 and checks out at 08:40:09, then returns later that day at 19:20:00 and leaves 10 minutes later at 19:30:00. With that function it will look like he worked for about 12 hours.

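import pandas as pd

group = df.groupby(['ID', 'Date'])
def TimeDifference(df):
    # 'in' is a reserved word, so the first and last times get other names
    first = df['TimeandDate'].min()
    last = df['TimeandDate'].max()
    # last - first, so the difference is positive
    df2 = pd.DataFrame([last - first], columns=['TimeDiff'])
    return df2
group.apply(TimeDifference)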
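
The concern about first and last times can be checked on a small sample. The sketch below is only an illustration: the `sample` frame is hypothetical, built from the ID=13 scenario described in the post, and `paired_duration` is a made-up helper name. It compares the first-to-last span with the sum of the individual in/out pairs.

```python
import pandas as pd

# Hypothetical sample built from the ID=13 scenario described in the post.
sample = pd.DataFrame({
    "ID": [13, 13, 13, 13],
    "TimeandDate": pd.to_datetime([
        "2020-08-07 07:40:09",
        "2020-08-07 08:40:09",
        "2020-08-07 19:20:00",
        "2020-08-07 19:30:00",
    ]),
})
sample["Date"] = sample["TimeandDate"].dt.date

grouped = sample.groupby(["ID", "Date"])["TimeandDate"]

# What min/max measures: the span from the first to the last timestamp.
span = grouped.max() - grouped.min()


def paired_duration(times: pd.Series) -> pd.Timedelta:
    """Sum the durations of rows 0-1, 2-3, ... after sorting (hypothetical helper)."""
    ordered = times.sort_values().reset_index(drop=True)
    check_in = ordered.iloc[0::2].reset_index(drop=True)
    check_out = ordered.iloc[1::2].reset_index(drop=True)
    # Simplification: an unpaired trailing check-in is ignored here.
    n = len(check_out)
    return (check_out - check_in.iloc[:n]).sum()


# Time actually spent between each check-in and its check-out.
inside = grouped.apply(paired_duration)

print(span.iloc[0])    # first-to-last span
print(inside.iloc[0])  # sum of the two visits
```

For this sample the span is 11:49:51 while the two visits add up to 01:10:00, which is the gap the post is worried about.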

Desired Result

Output:
ID         Date   Check-in    Check-out
10   2020-08-07   07:40:09     12:40:09
10   2020-08-08   07:40:09     17:40:09
12   2020-08-07   08:03:09     10:40:09
12   2020-08-07   14:40:09     16:40:09
13   2020-08-07   09:22:45     17:57:06
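
One way to get from the sample rows to this desired result is sketched below. It is not the Stack Overflow answer, just a minimal illustration of the pairing rule as described: sort the times per ID and day, pair them two by two, and let a leftover time replace the previous check-out. The helper name `pair_check_ins` is made up. Two assumptions are worth stating: a group with a single row is treated as a check-in with no check-out (the post does not say what should happen there), and Date is derived from TimeandDate, because the Date column in the sample shows 2022 while TimeandDate and the desired result show 2020.

```python
import pandas as pd

# Sample rows copied from the post (TimeandDate column only).
df = pd.DataFrame({
    "ID": [10, 10, 10, 10, 10, 12, 12, 12, 12, 13, 13],
    "TimeandDate": pd.to_datetime([
        "2020-08-07 07:40:09", "2020-08-07 08:50:00", "2020-08-07 12:40:09",
        "2020-08-08 07:40:09", "2020-08-08 17:40:09",
        "2020-08-07 08:03:09", "2020-08-07 10:40:09",
        "2020-08-07 14:40:09", "2020-08-07 16:40:09",
        "2020-08-07 09:22:45", "2020-08-07 17:57:06",
    ]),
})
# Assumption: Date is taken from TimeandDate rather than the Date column.
df["Date"] = df["TimeandDate"].dt.date


def pair_check_ins(times):
    """Return (check_in, check_out) tuples for one ID on one day (hypothetical helper)."""
    ordered = sorted(times)
    if len(ordered) == 1:
        # Assumption: a lone timestamp is a check-in without a check-out.
        return [(ordered[0], pd.NaT)]
    # Rows 0-1, 2-3, ... form complete pairs.
    pairs = [[ordered[i], ordered[i + 1]] for i in range(0, len(ordered) - 1, 2)]
    if len(ordered) % 2 == 1:
        # Odd count: the leftover time replaces the previous check-out.
        pairs[-1][1] = ordered[-1]
    return [tuple(pair) for pair in pairs]


rows = []
for (id_, date), group in df.groupby(["ID", "Date"]):
    for check_in, check_out in pair_check_ins(group["TimeandDate"]):
        rows.append({
            "ID": id_,
            "Date": date,
            "Check-in": check_in.time(),
            "Check-out": None if pd.isna(check_out) else check_out.time(),
        })

result = pd.DataFrame(rows)
print(result)
```

On the sample rows this produces five rows whose check-in and check-out times match the desired result above. Looping over groups in Python is slow on large frames, so treat it as a readable starting point rather than a tuned solution.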

If you get errors, show the error traceback (within BBcode error tags), complete and unaltered.

(Apr-03-2022, 10:38 PM) Larz60+ wrote: If you get errors, show the error traceback (within BBcode error tags), complete and unaltered.

Hello, when I tried this function from Stack Overflow:

new_col = []
for i in df.ID.unique():
    for d in df.Date.unique():
        p = df.loc[(df.ID==i)&(df.Date==d)]
        suffix = sorted(list(range(1,len(p)))*2)[:len(p)]
        if len(suffix)%2!=0 and len(suffix)>1:
            suffix[-2]=np.nan
            suffix[-1]-=1
        new_col.extend(suffix)

df['new'] = new_col
df.dropna().groupby(['ID','Date','new'], as_index=False).agg({'Time':[min,max]}).drop('new', axis=1, level=0)

I always get this error:

Error:
ValueError: Length of values (2623) does not match length of index (2667)

You're getting a bad value exception, but the error message does not look like a complete message.

(Apr-03-2022, 10:47 PM) Larz60+ wrote: You're getting a bad value exception, but the error message does not look like a complete message.

This is the full error message:

Error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-338-9b9ac9bdf42b> in <module>
     10         new_col.extend(suffix)
     11 
---> 12 df['new'] = new_col

~\anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   3038         else:
   3039             # set column
-> 3040             self._set_item(key, value)
   3041 
   3042     def _setitem_slice(self, key: slice, value):

~\anaconda3\lib\site-packages\pandas\core\frame.py in _set_item(self, key, value)
   3114         """
   3115         self._ensure_valid_index(value)
-> 3116         value = self._sanitize_column(key, value)
   3117         NDFrame._set_item(self, key, value)
   3118 

~\anaconda3\lib\site-packages\pandas\core\frame.py in _sanitize_column(self, key, value, broadcast)
   3762 
   3763             # turn me into an ndarray
-> 3764             value = sanitize_index(value, self.index)
   3765             if not isinstance(value, (np.ndarray, Index)):
   3766                 if isinstance(value, list) and len(value) > 0:

~\anaconda3\lib\site-packages\pandas\core\internals\construction.py in sanitize_index(data, index)
    745     """
    746     if len(data) != len(index):
--> 747         raise ValueError(
    748             "Length of values "
    749             f"({len(data)}) "

ValueError: Length of values (2623) does not match length of index (2667)
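
The traceback says new_col has fewer entries than df has rows. One possible cause can be read from the snippet itself: for a group with exactly one row, range(1, len(p)) is empty, so suffix is an empty list and that row gets no label. The sketch below reproduces only the label logic (the function name `suffix_labels` is made up, and float("nan") stands in for np.nan) and counts how many labels come back for groups of 1 to 5 rows. Whether this explains the 44 missing values (2667 - 2623) cannot be confirmed without the real data.

```python
import pandas as pd


def suffix_labels(n_rows):
    """Reproduce the snippet's label logic for a group of n_rows rows (hypothetical helper)."""
    suffix = sorted(list(range(1, n_rows)) * 2)[:n_rows]
    if len(suffix) % 2 != 0 and len(suffix) > 1:
        suffix[-2] = float("nan")  # stands in for np.nan in the snippet
        suffix[-1] -= 1
    return suffix


# Number of labels produced for groups of 1 to 5 rows.
lengths = {n: len(suffix_labels(n)) for n in range(1, 6)}
print(lengths)  # {1: 0, 2: 2, 3: 3, 4: 4, 5: 5}

# A label that always has one entry per row: position inside the group, halved.
# The toy frame is made up for illustration.
toy = pd.DataFrame({"ID": [1, 1, 1, 2], "Date": ["d1", "d1", "d1", "d1"]})
toy["pair"] = toy.groupby(["ID", "Date"]).cumcount() // 2
print(toy)
```

Only the single-row group loses its label. A second thing to watch is that new_col is built in ID-then-Date loop order, so it lines up with df only if df is already sorted that way. A label derived from groupby(...).cumcount() avoids both issues because it is aligned to the index and has one value per row, though the odd-count rule from the question would still need to be handled separately.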

Not enough information provided. The error is from pandas, but you are not showing your pandas dataframe.

You are trying to use your dataframe prior to its being created (beginning with loop line 2).
You can do this if you move your loop into a function, so long as the function is called after the dataframe has been defined.

(Apr-04-2022, 04:05 PM) Larz60+ wrote: You are trying to use your dataframe prior to its being created (beginning with loop line 2).
You can do this if you move your loop into a function, so long as the function is called after the dataframe has been defined.

Sorry for the late reply. I tried moving it into a function and that didn't work either. Based on what I said and what I have tried, can you tell me if it's possible to get the desired results?

Please show your code attempt.

Smordy

Larz60+

Smordy

Larz60+

Smordy

Larz60+

Larz60+

Smordy

Larz60+