Filter data based on a value from another dataframe column and create a file using lo

pawanmtm · (This post was last modified: Jul-09-2020, 05:49 PM by pawanmtm.)

Hi All,

I started learning Python a few days ago. With Google's help, I started coding to automate the process described below:

I have a file with some data (n rows and 79 columns) on which I do some processing such as filtering and merging. Once it is processed, I have to create one file for each value of another dataframe column (the column contains unique email IDs). Each file should have 3 Excel sheets (2 data sheets and 1 summary sheet), and the file name should be the value of the column, i.e., the email ID.

Could someone help me proceed further?

Below is the code for your reference.

import pandas as pd
import numpy as np
import datetime as dt
from dateutil.relativedelta import relativedelta
from datetime import date

#read data from excel sheet
df = pd.read_excel(r'C:\Users\ppadyala\Desktop\User Wise Testing\Data.xlsx', sheet_name='Raw Data') 

# assign the list of status values to a variable
array = ["Query", "Approval"]

# filter on multiple values in a column; .copy() avoids SettingWithCopyWarning on the assignments below
df = df.loc[df['Current Major Status'].isin(array)].copy()
df['Submitted for Approval'] = df['Submitted for Approval'].str.replace(' CET', '')

#convert from object to datetime64[ns]
df['Submitted for Approval'] = pd.to_datetime(df['Submitted for Approval']) 

# calculate receipt aging
df['Receipt Aging'] = (df['Today'] - df['Created Date.1']).dt.days 

# replace null with submitted for approval
df['Last Query Raised On.1'] = df['Last Query Raised On.1'].fillna(df['Submitted for Approval']) 

# replace null with created date
df['Last Query Raised On.1'] = df['Last Query Raised On.1'].fillna(df['Created Date.1'])

# calculate exception aging
df['Exception Aging'] = (df['Today'] - df['Last Query Raised On.1']).dt.days

#create a list of conditions
conditions = [
    (df['Exception Aging'] < 6),
    (df['Exception Aging'] < 11),
    (df['Exception Aging'] < 21),
    (df['Exception Aging'] < 31),
    (df['Exception Aging'] >= 31),
]

# create a list of the values we want to assign for each condition
values = ['0-5 Days', '6-10 Days', '11-20 Days', '21-30 Days', '30+ Days']

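# create a new column and use np.select to assign values to it using our lists as arguments
# (a string default keeps the dtype consistent for rows that match no condition, e.g. missing aging)
df['Exception Aging Bucket'] = np.select(conditions, values, default='')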

# read the list of client email IDs
clientlist = pd.read_excel(r'C:\Users\ppadyala\Desktop\User Wise Testing\User List.xlsx')

# do vlookup with client list
df_new = pd.merge(df, clientlist, on = 'Task Owner Full Name', how='left')

# replace null with email id
df_new['Latest External Query Resolver'] = df_new['Latest External Query Resolver'].fillna(df_new['mail'])

# find the missing latest external query resolver
missingLEQR = pd.isnull(df_new['Latest External Query Resolver'])

# Assign the missing latest external query resolver to a variable
blankLEQR = df_new[missingLEQR]

# write the missing ones to excel
blankLEQR.to_excel('blankLEQR.xlsx', index = False)

#remove mail column
df_new = df_new.drop(['mail'], axis = 1)

#remove blank rows from specific column
df_new = df_new.dropna(subset=['Latest External Query Resolver'])

#sort the table based on leqr (sort_values returns a new dataframe, so assign it back)
df_new = df_new.sort_values(by=['Latest External Query Resolver'])

# copy leqr to new df
leqr = df_new['Latest External Query Resolver']

# remove duplicates from leqr
leqr = leqr.drop_duplicates()

Regards,
Pavan

pawanmtm · Jul-15-2020, 06:20 PM

Hi All,

Just wondering if anyone has had any luck with the above requirement.

Regards,
Pavan
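
The remaining step (one workbook per email ID, named after that email ID) could be sketched roughly as below. This is only a sketch under stated assumptions, since the thread does not say what the three sheets should contain: here the two data sheets are assumed to be the "Query" rows and the "Approval" rows, and the summary sheet is assumed to be a row count per status and aging bucket. The helper names build_user_sheets and write_user_workbooks, the sheet names, and the sample rows are all hypothetical; only the column names come from the posted code.

```python
from pathlib import Path

import pandas as pd

KEY = "Latest External Query Resolver"  # column holding the email IDs


def build_user_sheets(df, key=KEY):
    """Return {email: {sheet_name: DataFrame}} for every unique email in df[key]."""
    workbooks = {}
    # groupby yields one sub-dataframe per unique email ID
    for email, rows in df.groupby(key):
        # Assumed data sheets: one per status value
        query = rows[rows["Current Major Status"] == "Query"]
        approval = rows[rows["Current Major Status"] == "Approval"]
        # Assumed summary layout: row counts per status and aging bucket
        summary = (
            rows.groupby(["Current Major Status", "Exception Aging Bucket"])
            .size()
            .reset_index(name="Count")
        )
        workbooks[email] = {"Query": query, "Approval": approval, "Summary": summary}
    return workbooks


def write_user_workbooks(workbooks, out_dir="."):
    """Write one .xlsx file per email, with one sheet per entry."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for email, sheets in workbooks.items():
        # The file name is the email ID itself
        with pd.ExcelWriter(out_dir / f"{email}.xlsx") as writer:
            for sheet_name, frame in sheets.items():
                frame.to_excel(writer, sheet_name=sheet_name, index=False)


# Made-up sample rows standing in for df_new
sample = pd.DataFrame({
    KEY: ["a@example.com", "a@example.com", "b@example.com"],
    "Current Major Status": ["Query", "Approval", "Query"],
    "Exception Aging Bucket": ["0-5 Days", "6-10 Days", "0-5 Days"],
})
workbooks = build_user_sheets(sample)

# Uncomment to write the files (needs an Excel engine such as openpyxl installed):
# write_user_workbooks(workbooks, "output")
```

With the posted code, the call would be build_user_sheets(df_new) followed by write_user_workbooks(...). Because groupby already iterates over the unique email IDs, the separate leqr series with drop_duplicates() would not be needed for this step.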
