Python Forum
Pandas : How to create an algorithm that helps me improve results and creating new co
#1
Question 
Link to question on StackOverflow
On Stack Overflow, Yaloa_21 understood what I want to do, but I always get errors with the suggested code:
https://stackoverflow.com/questions/7157...5_71593640

Question about pandas
It's a little bit complicated. I have this DataFrame:

ID           TimeandDate      Date        Time
10   2020-08-07 07:40:09  2020-08-07   07:40:09
10   2020-08-07 08:50:00  2020-08-07   08:50:00
10   2020-08-07 12:40:09  2020-08-07   12:40:09
10   2020-08-08 07:40:09  2020-08-08   07:40:09
10   2020-08-08 17:40:09  2020-08-08   17:40:09
12   2020-08-07 08:03:09  2020-08-07   08:03:09
12   2020-08-07 10:40:09  2020-08-07   10:40:09
12   2020-08-07 14:40:09  2020-08-07   14:40:09
12   2020-08-07 16:40:09  2020-08-07   16:40:09
13   2020-08-07 09:22:45  2020-08-07   09:22:45
13   2020-08-07 17:57:06  2020-08-07   17:57:06
I want to create a new DataFrame with two new columns. The first one is df["Check-in"]. As you can see, my data has no indicator showing when an ID checked in, so I will assume that the first time for every ID is a check-in and that the next row is a check-out, which goes into df["Check-out"]. Also, if a check-in doesn't have a check-out time, it has to be registered as the check-out of the previous check-in of the same day, replacing that check-in's earlier check-out.

I tried the following, but I'm afraid it is not accurate because it only looks at the first and the last record of the day. Imagine that ID=13 checks in at 07:40:09 and checks out at 08:40:09, then returns later that day at 19:20:00 and leaves 10 minutes later at 19:30:00. With this function it will show that he worked for almost 12 hours:

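import pandas as pd

group = df.groupby(['ID', 'Date'])

def TimeDifference(df):
    # earliest and latest timestamp of this ID on this date
    first = df['TimeandDate'].min()
    last = df['TimeandDate'].max()
    # elapsed time between the first and the last record
    df2 = pd.DataFrame([last - first], columns=['TimeDiff'])
    return df2

group.apply(TimeDifference)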
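To make the concern concrete, here is a small sketch of the ID=13 scenario described above. The date 2020-08-07 is an assumption chosen only for illustration; the four times come from the example. It compares the first-to-last span with the sum of the two separate visits.

```python
import pandas as pd

# Hypothetical day for ID=13: two separate visits (the date is assumed).
stamps = pd.to_datetime([
    "2020-08-07 07:40:09",  # check-in
    "2020-08-07 08:40:09",  # check-out
    "2020-08-07 19:20:00",  # check-in
    "2020-08-07 19:30:00",  # check-out
])

# First-to-last span, which is what min()/max() per day measures.
span = stamps.max() - stamps.min()

# Sum of the individual check-in/check-out intervals (records 0-1 and 2-3).
worked = (stamps[1] - stamps[0]) + (stamps[3] - stamps[2])

print("first to last:", span)
print("paired visits:", worked)
```

The span comes out at 11 hours 49 minutes 51 seconds, while the two visits add up to only 1 hour 10 minutes, so pairing the records matters.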
Desired Result

Output:
ID   Date         Check-in   Check-out
10   2020-08-07   07:40:09   12:40:09
10   2020-08-08   07:40:09   17:40:09
12   2020-08-07   08:03:09   10:40:09
12   2020-08-07   14:40:09   16:40:09
13   2020-08-07   09:22:45   17:57:06
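One possible way to get this output is sketched below, under the pairing rule stated in the question: within each (ID, Date) group the sorted times alternate check-in, check-out, and an unpaired last record replaces the previous check-out. The helper name pair_times is hypothetical, and the handling of a day with a single record (kept as a check-in with no check-out) is an assumption, since the question does not say what should happen there.

```python
import pandas as pd

# Sample rows copied from the question.
rows = [
    (10, "2020-08-07 07:40:09"), (10, "2020-08-07 08:50:00"),
    (10, "2020-08-07 12:40:09"), (10, "2020-08-08 07:40:09"),
    (10, "2020-08-08 17:40:09"), (12, "2020-08-07 08:03:09"),
    (12, "2020-08-07 10:40:09"), (12, "2020-08-07 14:40:09"),
    (12, "2020-08-07 16:40:09"), (13, "2020-08-07 09:22:45"),
    (13, "2020-08-07 17:57:06"),
]
df = pd.DataFrame(rows, columns=["ID", "TimeandDate"])
df["TimeandDate"] = pd.to_datetime(df["TimeandDate"])
df["Date"] = df["TimeandDate"].dt.strftime("%Y-%m-%d")
df["Time"] = df["TimeandDate"].dt.strftime("%H:%M:%S")


def pair_times(times):
    """Turn one day's times into (check-in, check-out) tuples."""
    times = sorted(times)
    if len(times) == 1:
        # Assumption: a lone record is a check-in without a check-out.
        return [(times[0], None)]
    if len(times) % 2 == 1:
        # Odd count: the last record replaces the previous check-out.
        times = times[:-2] + times[-1:]
    return [(times[i], times[i + 1]) for i in range(0, len(times), 2)]


records = []
for (id_, date), group in df.groupby(["ID", "Date"]):
    for check_in, check_out in pair_times(group["Time"].tolist()):
        records.append((id_, date, check_in, check_out))

result = pd.DataFrame(records, columns=["ID", "Date", "Check-in", "Check-out"])
print(result.to_string(index=False))
```

Iterating over the groups, instead of building a separate label column and assigning it back, avoids having to keep a list aligned with the DataFrame index.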
Reply
#2
If you get errors, show the error traceback (within BBcode error tags), complete and unaltered.
Reply
#3
(Apr-03-2022, 10:38 PM)Larz60+ Wrote: If you get errors, show the error traceback (within BBcode error tags), complete and unaltered.
Hello, when I tried this code from the Stack Overflow answer:
new_col = []
for i in df.ID.unique():
    for d in df.Date.unique():
        p = df.loc[(df.ID==i)&(df.Date==d)]
        suffix = sorted(list(range(1,len(p)))*2)[:len(p)]
        if len(suffix)%2!=0 and len(suffix)>1:
            suffix[-2]=np.nan
            suffix[-1]-=1
        new_col.extend(suffix)

df['new'] = new_col
df.dropna().groupby(['ID','Date','new'], as_index=False).agg({'Time':[min,max]}).drop('new', axis=1, level=0)
I always get this error:
Error:
ValueError: Length of values (2623) does not match length of index (2667)
Reply
#4
You're getting a bad value exception, but the error message does not look like a complete message.
Reply
#5
(Apr-03-2022, 10:47 PM)Larz60+ Wrote: You're getting a bad value exception, but the error message does not look like a complete message.
This is the full error message:
Error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-338-9b9ac9bdf42b> in <module>
     10         new_col.extend(suffix)
     11
---> 12 df['new'] = new_col

~\anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   3038         else:
   3039             # set column
-> 3040             self._set_item(key, value)
   3041
   3042     def _setitem_slice(self, key: slice, value):

~\anaconda3\lib\site-packages\pandas\core\frame.py in _set_item(self, key, value)
   3114         """
   3115         self._ensure_valid_index(value)
-> 3116         value = self._sanitize_column(key, value)
   3117         NDFrame._set_item(self, key, value)
   3118

~\anaconda3\lib\site-packages\pandas\core\frame.py in _sanitize_column(self, key, value, broadcast)
   3762
   3763             # turn me into an ndarray
-> 3764             value = sanitize_index(value, self.index)
   3765             if not isinstance(value, (np.ndarray, Index)):
   3766                 if isinstance(value, list) and len(value) > 0:

~\anaconda3\lib\site-packages\pandas\core\internals\construction.py in sanitize_index(data, index)
    745     """
    746     if len(data) != len(index):
--> 747         raise ValueError(
    748             "Length of values "
    749             f"({len(data)}) "

ValueError: Length of values (2623) does not match length of index (2667)
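The traceback points at the assignment df['new'] = new_col, so new_col ended up shorter than the DataFrame. One possible cause, which cannot be confirmed without the real data, is an (ID, Date) group that contains a single row. The sketch below reproduces only the suffix expression from the loop; the helper name suffix_for is hypothetical.

```python
def suffix_for(n):
    """Reproduce the suffix expression from the loop for a group of n rows."""
    return sorted(list(range(1, n)) * 2)[:n]


# Compare the group size with the number of labels the expression produces.
for n in range(6):
    print(n, len(suffix_for(n)))

# Group sizes for which the label count differs from the row count.
mismatched = [n for n in range(6) if len(suffix_for(n)) != n]
print("mismatched sizes:", mismatched)
```

In this range only n = 1 produces fewer labels than rows, so every single-record day would shorten new_col by one entry. A gap of 44 (2667 minus 2623) would be consistent with 44 such days. Separately, the labels are appended in loop order, so they only line up with the rows if the DataFrame is ordered the same way the loops visit the IDs and dates.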
Reply
#6
Not enough information provided. The error is from pandas, but you are not showing your pandas DataFrame.
Reply
#7
You are trying to use your dataframe prior to its being created (beginning with loop line 2).
You can do this if you move your loop into a function, so long as the function is called after the dataframe has been defined.
Reply
#8
(Apr-04-2022, 04:05 PM)Larz60+ Wrote: You are trying to use your dataframe prior to its being created (beginning with loop line 2).
You can do this if you move your loop into a function, so long as the function is called after the dataframe has been defined.

Sorry for the late reply. I tried moving it into a function and that didn't work either. Based on what I said and what I have tried, can you tell me whether it's possible to get the desired result?
Reply
#9
Please show your code attempt.
Reply

