Python Forum

Do Calculation between Rows based on Column values - Pandas Dataframe
Hello, I have a SharePoint list that I am importing as a pandas DataFrame. The code produces a df that looks something like this:

                                        Task Name              % Complete  ...   Modified By   Version   Identifier
0    Equipment Intelligent Special Network System                    0.00  ...  Dominic Leow     1.0          1
1              Core Switch, 24SFP+8GE, Combo+4SFP                    0.00  ...  Dominic Leow     1.0        1.1
2                                         Level 1                    0.00  ...  Dominic Leow     1.0      1.1.1
3                                         Hacking                    0.30  ...  Dominic Leow     2.0    1.1.1.1
4                                      PVC Piping                    0.20  ...  Dominic Leow     2.0    1.1.1.2
5                                        Trunking                    0.45  ...  Dominic Leow     2.0    1.1.1.3
6                                         Cabling                    0.90  ...  Dominic Leow     2.0    1.1.1.4
7                                         Testing                    0.25  ...  Dominic Leow     2.0    1.1.1.5
8                                     Termination                    0.10  ...  Dominic Leow     2.0    1.1.1.6
9                                         Level 2                    0.00  ...  Dominic Leow     1.0      1.1.2
10                                        Hacking                    0.00  ...  Dominic Leow     1.0    1.1.2.1
11                                     PVC Piping                    0.00  ...  Dominic Leow     1.0    1.1.2.2
12                                       Trunking                    0.00  ...  Dominic Leow     1.0    1.1.2.3
13                                        Cabling                    0.00  ...  Dominic Leow     1.0    1.1.2.4
14                                        Testing                    0.00  ...  Dominic Leow     1.0    1.1.2.5
15                                    Termination                    0.00  ...  Dominic Leow     1.0    1.1.2.6
16                                        Level 3                    0.00  ...  Dominic Leow     1.0      1.1.3
The last column, 'Identifier', encodes each task's level in the hierarchy. For every task whose Identifier has the pattern x.x.x (where x is a digit), I need to calculate the % complete from the rows between it and the next x.x.x Identifier, and then write that total percentage into the % Complete column of the parent x.x.x task.

Below is the code I have come up with so far, but I don't know where to go from here. Please help; my company's field reports depend on this.

from shareplum import Site
from shareplum import Office365
import pandas as pd
import re
pd.set_option('display.max_rows', None)

authcookie = Office365('https://speedmax.sharepoint.com', username='username', password='password').GetCookies()
site = Site('https://speedmax.sharepoint.com/sites/jdtstadium', authcookie=authcookie)
sp_list = site.List('joblist')
data = sp_list.GetListItems('All Tasks', rowlimit=5000)
df = pd.DataFrame(data)

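# Identifier patterns, one per task level. The dots are escaped so they match a
# literal '.', and \d+ also accepts multi-digit segments such as 1.1.1.10.
stringMatch_mainTask = re.compile(r'^\d+$')
stringMatch_bqItem = re.compile(r'^\d+\.\d+$')
stringMatch_level = re.compile(r'^\d+\.\d+\.\d+$')
stringMatch_job = re.compile(r'^\d+\.\d+\.\d+\.\d+$')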

mainTaskdf = df[df['Identifier'].str.contains(stringMatch_mainTask)]
bqItemdf = df[df['Identifier'].str.contains(stringMatch_bqItem)]
leveldf = df[df['Identifier'].str.contains(stringMatch_level)]
jobdf = df[df['Identifier'].str.contains(stringMatch_job)]


# Work in progress: this loop only prints the frame once and then stops.
for index, row in df.iterrows():
    countRows = jobdf.shape[0]
    visualRows = df
    print(df)
    break

# print(jobdf)
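
One possible way to do the roll-up described above, as a minimal sketch rather than a definitive answer. It assumes that "total percentage" means the plain average of the x.x.x.x rows under each x.x.x row, that '% Complete' is numeric, and that 'Identifier' is a string. The small df below is hypothetical sample data copied from the table in the post, not the real SharePoint list.

```python
import pandas as pd

# Hypothetical sample mirroring the table in the post (two relevant columns only).
df = pd.DataFrame({
    "Identifier": ["1", "1.1", "1.1.1", "1.1.1.1", "1.1.1.2", "1.1.1.3",
                   "1.1.1.4", "1.1.1.5", "1.1.1.6", "1.1.2", "1.1.2.1", "1.1.2.2"],
    "% Complete": [0.0, 0.0, 0.0, 0.30, 0.20, 0.45,
                   0.90, 0.25, 0.10, 0.0, 0.0, 0.0],
})

ident = df["Identifier"].astype(str)

# The number of dots gives the depth: 2 dots = level (x.x.x), 3 dots = job (x.x.x.x).
depth = ident.str.count(r"\.")
is_level = depth == 2
is_job = depth == 3

# Parent identifier of each row: everything before the last dot.
parent = ident.str.rsplit(".", n=1).str[0]

# Assumption: a level's % complete is the unweighted mean of its jobs.
job_mean = df.loc[is_job, "% Complete"].groupby(parent[is_job]).mean()

# Look up each level's mean by its own identifier; levels with no jobs keep their value.
rolled = ident.map(job_mean)
df.loc[is_level, "% Complete"] = rolled[is_level].fillna(df.loc[is_level, "% Complete"])

print(df)
```

If the jobs should be weighted (for example by quantity or cost), the `.mean()` call would need to be swapped for a weighted average; the same parent/child grouping could then be repeated one level up to fill the x.x and x rows.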