Posts: 50
Threads: 2
Joined: Nov 2019
Jan-03-2020, 07:26 PM
(This post was last modified: Jan-03-2020, 07:26 PM by sandeep_ganga.)
try add below line as needed and see if that helps,
df = df.drop(df[df['Aircraft-combined'] == 'D'].index)#remove rows that match letter D in selected dataframe column
df = df.drop(df[df['Aircraft-combined'] != 'D'].index)#if need the rows that match only letter D in selected dataframe column
import pandas as pd
xls = pd.ExcelFile(r'C:\Users\testuser\Downloads\DL\book12.xls')
data = pd.read_excel(xls, sheet_name="Sheet1")
pd.options.display.max_rows = 1000
df = pd.DataFrame(data, columns= ['Venue','A/C','DISPLAY/','Date','BID'])
#df[(df['Venue'].str.contains('[a-zA-Z]') & (df['DISPLAY/'].str.contains('DISPLAY') & df['A/C'].str.contains("DHS|DAK|HS|SPIT")) & (df['A/C'] != 'LHS') & (df['A/C'] != 'LANC'))]
df["Date"].fillna("No Date", inplace = True)
df['A/C'].unique().tolist()
rename_map = {
'DAK': 'D',
'SPIT': 'S',
'LANC': 'L',
'HURRI': 'H',
'PARA': 'P'
}
df['A/C std'] = df['A/C'].replace(rename_map)
print(df['A/C std'].unique().tolist())
selected = df.loc[df['DISPLAY/'] == 'DISPLAY']
#selected = df.loc[df['A/C'].isin(['DS', 'DH', 'DHS', 'SD', 'HSD'])]
groupby_venue_date = selected.groupby(['Venue', 'BID', 'Date', 'DISPLAY/'])
aircraft = groupby_venue_date['A/C std'].apply(''.join).rename('Aircraft-combined')
def sorted_string(s):
return ''.join(sorted(s))
def remove_duplicate_chars(s):
return ''.join(set(s))
#added lines from here
print("****************")
df=pd.DataFrame(aircraft)
#print(df) #original df
print("*****************")
indexNames = df[~df['Aircraft-combined'].str.contains('D',na=False)].index
df.drop(indexNames , inplace=True)
#print(df)#prints all rows having D and removes remaining rows
print("*****************")
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace='SD', value='DS')
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace=['HSD','HDS'], value='DHS')
#print(df)#replaces the SD with DS and HSD/HDS with DHS
print("*********************")
df = df.drop(df[df['Aircraft-combined'] == 'D'].index)#remove value that match letter D
#df = df.drop(df[df['Aircraft-combined'] != 'D'].index)#if need the rows that match only letter D in selected dataframe column
print(df)
print("*********************") Output: python test1.py
[nan, 'L', 'S', 'H', 'LHS', 'D', 'HS', 'P', 'LSSD', 'LSS', 'SS', 'LH', 'DH', 'DHS', 'SSSHH']
****************
*****************
*****************
*********************
Aircraft-combined
Venue BID Date DISPLAY/
ANNAN 606 No Date DISPLAY DS
COTTESMORE 757 No Date DISPLAY DS
EAST FORTUNE 406 No Date DISPLAY DHS
GUERNSEY 582 2008-09-10 00:00:00 DISPLAY DHS
JERSEY 581 2008-09-10 00:00:00 DISPLAY DHS
KEMBLE 316 2008-06-21 00:00:00 DISPLAY DS
No Date DISPLAY DHS
PORTRUSH 76 2008-09-05 00:00:00 DISPLAY DHS
2008-09-06 00:00:00 DISPLAY DHS
ROMSEY 175 No Date DISPLAY DHS
ROUGHAM 551 No Date DISPLAY DL
SANICOLE 69 No Date DISPLAY DHS
SCARBOROUGH 610 2008-05-25 00:00:00 DISPLAY DS
SHAWBURY 188 2008-07-24 00:00:00 DISPLAY DS
SOUTHPORT 584 No Date DISPLAY DHS
SUNDERLAND 333 2008-07-25 00:00:00 DISPLAY DH
No Date DISPLAY SHSD
TARRANT RUSHDEN 132 No Date DISPLAY DHS
WHITBY 125 2008-08-22 00:00:00 DISPLAY DHS
2008-08-23 00:00:00 DISPLAY DHS
WIGTON 144 2008-07-26 00:00:00 DISPLAY DHS
WINDERMERE 138 2008-07-25 00:00:00 DISPLAY DHS
2008-07-26 00:00:00 DISPLAY DHS
*********************
Best Regards,
Sandeep
GANGA SANDEEP KUMAR
Posts: 218
Threads: 27
Joined: May 2018
Jan-05-2020, 01:31 AM
(This post was last modified: Jan-05-2020, 01:37 AM by eddywinch82.)
Wow, thankyou so much Sandeep, for all your efforts, I very much appreciate your help.
How do I get that Data, shown in the Output part, of your last reply. Displayed in the Style and Font, the Data shown in the Photo is ? From the Website Link below ?
I.e. in that Font, with that same Data, rather than in the plane text ?
Is that possible ? i.e. in the proper DataFrame Font ?
Here is a link, to the picture of what I mean :-
https://www.flickr.com/photos/[email protected]/
Also what should be typed to change, all the 2008 part, of the dates that show, in the date column, to 2009 i.e. from
2008-07-24 to 2009-07-24. And all the dates, that show 1999 as the year, also changed to 2009 ? And how could they all be changed, to the order Day-Month-Year i.e. 24-07-2009 ?
I want that done, as the BBMF Schedule, is from 2009 not 2008.
And what should be typed in the Code, to get the Data, to display in the order it was in, in the original Excel File.
I say that because most if not all the Flypast appearances, don't show the date. So when the venues are shown, in alphabetical order, it makes it harder to know if the Flypast, was associated with a relevant display, i.e. when the aircraft were transiting to the display etc. If the Data is in the Original, non alphabetical order for the Venues, the relevant Flypasts should be underneath, I think, I may be wrong though.
To have data shown, for both Displays and Flypasts, I use the following Code :-
selected = df.loc[df['DISPLAY/'].isin(['DISPLAY', 'FLYPAST'])] Update :-
My latest Code is the following :-
import pandas as pd
xls = pd.ExcelFile(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\BBMF Display Schedule 2009.xls')
data = pd.read_excel(xls, sheet_name="Sheet1")
pd.options.display.max_rows = 1000
df = pd.DataFrame(data, columns= ['Venue','A/C','DISPLAY/','Date','BID','Durn'])
#df[(df['Venue'].str.contains('[a-zA-Z]') & (df['DISPLAY/'].str.contains('DISPLAY') & df['A/C'].str.contains("DHS|DAK|HS|SPIT")) & (df['A/C'] != 'LHS') & (df['A/C'] != 'LANC'))]
df["Date"].fillna("No Date", inplace = True)
df['A/C'].unique().tolist()
rename_map = {
'DAK': 'D',
'SPIT': 'S',
'LANC': 'L',
'HURRI': 'H',
'PARA': 'P'
}
df['A/C std'] = df['A/C'].replace(rename_map)
print(df['A/C std'].unique().tolist())
selected = df.loc[df['DISPLAY/'] == 'DISPLAY']
#selected = df.loc[df['DISPLAY/'] == 'FLYPAST']
#selected = df.loc[df['DISPLAY/'].isin(['DISPLAY', 'FLYPAST'])]
#selected = df.loc[df['A/C'].isin(['DS', 'DH', 'DHS', 'SD', 'HSD'])]
groupby_venue_date = selected.groupby(['Venue', 'BID', 'Date', 'DISPLAY/', 'Durn'])
aircraft = groupby_venue_date['A/C std'].apply(''.join).rename('Aircraft-combined')
def sorted_string(s):
return ''.join(sorted(s))
def remove_duplicate_chars(s):
return ''.join(set(s))
#added lines from here
print("****************")
df=pd.DataFrame(aircraft)
#print(df) #original df
print("*****************")
indexNames = df[~df['Aircraft-combined'].str.contains('D',na=False)].index
df.drop(indexNames , inplace=True)
#print(df)#prints all rows having D and removes remaining rows
print("*****************")
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace='SD', value='DS')
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace=['HSD','HDS', 'SHSD'], value='DHS')
#print(df)#replaces the SD with DS and HSD,HDS and SHSD with DHS
print("*********************")
df = df.drop(df[df['Aircraft-combined'] == 'D'].index)#remove value that match letter D
df = df.drop(df[df['Aircraft-combined'] == 'DD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'DL'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LDS'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LSSD'].index)
#df = df.drop(df[df['Aircraft-combined'] != 'D'].index)#if need the rows that match only letter D in selected dataframe column
print(df)
print("*********************")
#pd.DataFrame(aircraft) And I get the following Output :-
Output: Venue BID Date DISPLAY/ Durn
ANNAN 606 No Date DISPLAY 00:05:00 DS
COTTESMORE 757 No Date DISPLAY 00:05:00 DS
EAST FORTUNE 406 No Date DISPLAY 00:10:00 DHS
KEMBLE 316 2008-06-21 00:00:00 DISPLAY 00:15:00 DS
No Date DISPLAY 00:20:00 DHS
PORTRUSH 76 2008-09-05 00:00:00 DISPLAY 00:15:00 DHS
2008-09-06 00:00:00 DISPLAY 00:15:00 DHS
SCARBOROUGH 610 2008-05-25 00:00:00 DISPLAY 00:05:00 DS
SHAWBURY 188 2008-07-24 00:00:00 DISPLAY 00:05:00 DS
SOUTHPORT 584 No Date DISPLAY 00:15:00 DHS
TARRANT RUSHDEN 132 No Date DISPLAY 00:15:00 DHS
WHITBY 125 2008-08-22 00:00:00 DISPLAY 00:15:00 DHS
2008-08-23 00:00:00 DISPLAY 00:15:00 DHS
WIGTON 144 2008-07-26 00:00:00 DISPLAY 00:15:00 DHS
I want to now, have a new additional Column, called In The Same Display Slot. Because of the way, the data is collated, in the Source .xls File. Only the Venues, where the Aircraft are all in the same Display slot, Show here, I think because, when DAK and HS become DHS, as you will understand from the Code, and can see in the Original .XLS document when those values in the Duration Column i.e. Durn don't match i.e. 0:05 and 0:10, i.e. 5 and 10 minutes they don't show in the Output.
Only when both values, are the same do they show. I.e. when DHS are all in the same display slot, although for some reason for the Venue Woodstock, where both values are 0:15 The Row isn't showing, any ideas, why that is the case ?
The Woodstock Row, wasn't showing before either, can't work out why. All other Data Rows that are not showing now, but were before, with the others i.e :-
Output: Venue BID Date DISPLAY/
ANNAN 606 No Date DISPLAY DS
COTTESMORE 757 No Date DISPLAY DS
EAST FORTUNE 406 No Date DISPLAY DHS
GUERNSEY 582 2008-09-10 00:00:00 DISPLAY DHS
JERSEY 581 2008-09-10 00:00:00 DISPLAY DHS
KEMBLE 316 2008-06-21 00:00:00 DISPLAY DS
No Date DISPLAY DHS
PORTRUSH 76 2008-09-05 00:00:00 DISPLAY DHS
2008-09-06 00:00:00 DISPLAY DHS
ROMSEY 175 No Date DISPLAY DHS
SANICOLE 69 No Date DISPLAY DHS
SCARBOROUGH 610 2008-05-25 00:00:00 DISPLAY DS
SHAWBURY 188 2008-07-24 00:00:00 DISPLAY DS
SOUTHPORT 584 No Date DISPLAY DHS
SUNDERLAND 333 2008-07-25 00:00:00 DISPLAY DH
No Date DISPLAY DHS
TARRANT RUSHDEN 132 No Date DISPLAY DHS
WHITBY 125 2008-08-22 00:00:00 DISPLAY DHS
2008-08-23 00:00:00 DISPLAY DHS
WIGTON 144 2008-07-26 00:00:00 DISPLAY DHS
WINDERMERE 138 2008-07-25 00:00:00 DISPLAY DHS
2008-07-26 00:00:00 DISPLAY DHS
Sanicole, Romsey Windermere etc, I would also like those Rows showing, with the new In The Same Display Slot Column, in addition to the ones in the first Output, which are the ones that are in the same display slot. So all those Rows Showing, but in the, "In The Same Display Slot" Column I would like all The Venue Rows, where they are in the same Display Slot saying YES and the ones that are not in the same display slot saying NO. Also I have noticed, that in the Date Column, it says No Date in the DataFrame Output. This is when the Date isn't present, in the 'Date' Column, in Rows Of Data in the .xls File. I.e. the Layout, of the Data in the .xls File being the issue.
Here is a Link to a Picture, of a sample of the .xls File Data. Is there a way, to have the Date populate, the empty Columns in the Rows with Data in, up to the right point in the Dataframe ?
Here is the link, to a Picture of what I mean :-
https://www.flickr.com/photos/[email protected]/
Kembles's DHS appearance on the 20th June, won't show a date in the DataFrame, as that Data is missing from that Row. The Date is higher up, but they have not included the Date for every Row, even though the relevant Data for the Date, is lower down. How could this be done for all Rows, giving the Correct Date ?
Any help would be appreciated
Best Wishes From
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Jan-05-2020, 10:54 PM
(This post was last modified: Jan-05-2020, 11:11 PM by eddywinch82.)
I have tried to attempt, based on a Pandas Video on Youtube, I watched earlier. To achieve the Column, 'Dakota And Fighters In Same Slot' With Either YES or No, referring to part, of what I need help with in the Code, mentioned in my last reply.
Here is the Code I typed :-
df['Dakota And Fighters In The Same Display Slot'] = ((df.DISPLAY/ == 'DISPLAY') & (df.Durn == 5|10|15|20)).map({True:'YES', False:'NO'}) I get an Invalid Syntax Error, when I add that Code, to the rest of the Code, The Error Pointer Arrow, points to the '==' in the :-
((df.DISPLAY/ == 'DISPLAY') part of the code, what is causing that issue ?
The Minutes Format is :-
00:05:00 for 5 Minutes, 00:10:00 for 10 Minutes etc, as
can be seen, in one of the Outputs, in my previous reply.
Could that be an issue too ?
Regards
Eddie
Posts: 50
Threads: 2
Joined: Nov 2019
Jan-06-2020, 05:57 AM
(This post was last modified: Jan-06-2020, 05:57 AM by sandeep_ganga.)
I believe the syntax you given is correct only, but when we firstly had the join, group by and other processings done before assigning to dataframe initially, the column names alignment turned bad and which is why the syntax error. the dataframe thinking column name is only 'Aircraft-combined' and header0 as [Venue ,BID,Date,DISPLAY/,Durn] and remaining as data row values i guess.
Output: #original dataframe head 3 rows
print(df.head(n=3))
Aircraft-combined <<<<<<<<<<<<<<<<
Venue BID Date DISPLAY/ Durn <<<<<<<<<<<<<<
ANNAN 606 No Date DISPLAY 00:05:00 DS
ARDINGLY(SUSSEX) 189 No Date DISPLAY 00:10:00 HS
AUDLEM 706 2008-07-26 00:00:00 DISPLAY 00:05:00 S
I workarounded this way for now, export to csv and read again with column names using read_csv.
And, in regard to compare minutes of Durn column, convert the dataframe['Durn'] column to datetime64 and then compare as needed and revert datatype to object.
import pandas as pd
xls = pd.ExcelFile(r'C:\Users\testuser\Downloads\DL\Book12.xls')
data = pd.read_excel(xls, sheet_name="Sheet1")
pd.options.display.max_rows = 1000
df = pd.DataFrame(data, columns= ['Venue','A/C','DISPLAY/','Date','BID','Durn'])
#df[(df['Venue'].str.contains('[a-zA-Z]') & (df['DISPLAY/'].str.contains('DISPLAY') & df['A/C'].str.contains("DHS|DAK|HS|SPIT")) & (df['A/C'] != 'LHS') & (df['A/C'] != 'LANC'))]
df["Date"].fillna("No Date", inplace = True)
df['A/C'].unique().tolist()
rename_map = {
'DAK': 'D',
'SPIT': 'S',
'LANC': 'L',
'HURRI': 'H',
'PARA': 'P'
}
df['A/C std'] = df['A/C'].replace(rename_map)
print(df['A/C std'].unique().tolist())
print("=======")
selected = df.loc[df['DISPLAY/'] == 'DISPLAY']
#selected = df.loc[df['DISPLAY/'] == 'FLYPAST']
#selected = df.loc[df['DISPLAY/'].isin(['DISPLAY', 'FLYPAST'])]
#selected = df.loc[df['A/C'].isin(['DS', 'DH', 'DHS', 'SD', 'HSD'])]
groupby_venue_date = selected.groupby(['Venue', 'BID', 'Date', 'DISPLAY/', 'Durn'])
aircraft = groupby_venue_date['A/C std'].apply(''.join).rename('Aircraft-combined')
def sorted_string(s):
return ''.join(sorted(s))
def remove_duplicate_chars(s):
return ''.join(set(s))
#####################
#####################
#####################
#####################
df=pd.DataFrame(aircraft)
#print(df) #original df
indexNames = df[~df['Aircraft-combined'].str.contains('D',na=False)].index
df.drop(indexNames , inplace=True)
#print(df)#prints all rows having D and removes remaining rows
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace='SD', value='DS')
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace=['HSD','HDS', 'SHSD'], value='DHS')
#print(df)#replaces the SD with DS and HSD,HDS and SHSD with DHS
df = df.drop(df[df['Aircraft-combined'] == 'D'].index)#remove value that match letter D
df = df.drop(df[df['Aircraft-combined'] == 'DD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'DL'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LDS'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LSSD'].index)
#df = df.drop(df[df['Aircraft-combined'] != 'D'].index)#if need the rows that match only letter D in selected dataframe column
#column names before alignment
print("column names: before change ", list(df.columns.values.tolist()))
print("=======")
print(df)
print("=======")
#to fix column name alignment
df.to_csv(r'C:\Users\testuser\Downloads\DL\tempfile.csv')
df = pd.read_csv(r'C:\Users\testuser\Downloads\DL\tempfile.csv',names=['Venue','BID','Date','DISPLAY/','Durn','Aircraft-combined'], header=0 )
#column names before alignment
print("column names: after change ",list(df.columns.values.tolist()))
print("=======")
print(df)
print("=======")
#convert df['Durn'] column to datetime foramt
df['Durn']= pd.to_datetime(df['Durn'],format='%H:%M:%S').apply(pd.Timestamp)
print("changing datatype of column Durn----->",df['Durn'].dtype)
print("=======")
#compare minutes with extracting minutes from datetime64 column
comparelist=[10,15,20]
df['Dakota And Fighters In The Same Display Slot'] = ((df['DISPLAY/'] == 'DISPLAY') & (df['Durn'].dt.minute.isin(comparelist))).map({True:'YES', False:'NO'})
#print(df)
print("applied... the condition to compare minutes for Durn column")
print("=======")
#return df['Durn'] column datetime64 format to object type
df['Durn'] = pd.to_datetime(df['Durn'], format='%H:%M:%S').dt.time
print("reverted datatype of column Durn back to ----->",df['Durn'].dtype)
print("=======")
print("\n\n*** FINAL RESULTSET ***\n\n")
print(df) #final resultset
print("=======") Output: python test2.py
[nan, 'L', 'S', 'H', 'LHS', 'D', 'HS', 'P', 'LSSD', 'LSS', 'SS', 'LH', 'DH', 'DHS', 'SSSHH']
=======
column names: before change ['Aircraft-combined']
=======
Aircraft-combined
Venue BID Date DISPLAY/ Durn
ANNAN 606 No Date DISPLAY 00:05:00 DS
COTTESMORE 757 No Date DISPLAY 00:05:00 DS
EAST FORTUNE 406 No Date DISPLAY 00:10:00 DHS
KEMBLE 316 2008-06-21 00:00:00 DISPLAY 00:15:00 DS
No Date DISPLAY 00:20:00 DHS
PORTRUSH 76 2008-09-05 00:00:00 DISPLAY 00:15:00 DHS
2008-09-06 00:00:00 DISPLAY 00:15:00 DHS
SCARBOROUGH 610 2008-05-25 00:00:00 DISPLAY 00:05:00 DS
SHAWBURY 188 2008-07-24 00:00:00 DISPLAY 00:05:00 DS
SOUTHPORT 584 No Date DISPLAY 00:15:00 DHS
TARRANT RUSHDEN 132 No Date DISPLAY 00:15:00 DHS
WHITBY 125 2008-08-22 00:00:00 DISPLAY 00:15:00 DHS
2008-08-23 00:00:00 DISPLAY 00:15:00 DHS
WIGTON 144 2008-07-26 00:00:00 DISPLAY 00:15:00 DHS
=======
column names: after change ['Venue', 'BID', 'Date', 'DISPLAY/', 'Durn', 'Aircraft-combined']
=======
Venue BID Date DISPLAY/ Durn Aircraft-combined
0 ANNAN 606 No Date DISPLAY 00:05:00 DS
1 COTTESMORE 757 No Date DISPLAY 00:05:00 DS
2 EAST FORTUNE 406 No Date DISPLAY 00:10:00 DHS
3 KEMBLE 316 2008-06-21 00:00:00 DISPLAY 00:15:00 DS
4 KEMBLE 316 No Date DISPLAY 00:20:00 DHS
5 PORTRUSH 76 2008-09-05 00:00:00 DISPLAY 00:15:00 DHS
6 PORTRUSH 76 2008-09-06 00:00:00 DISPLAY 00:15:00 DHS
7 SCARBOROUGH 610 2008-05-25 00:00:00 DISPLAY 00:05:00 DS
8 SHAWBURY 188 2008-07-24 00:00:00 DISPLAY 00:05:00 DS
9 SOUTHPORT 584 No Date DISPLAY 00:15:00 DHS
10 TARRANT RUSHDEN 132 No Date DISPLAY 00:15:00 DHS
11 WHITBY 125 2008-08-22 00:00:00 DISPLAY 00:15:00 DHS
12 WHITBY 125 2008-08-23 00:00:00 DISPLAY 00:15:00 DHS
13 WIGTON 144 2008-07-26 00:00:00 DISPLAY 00:15:00 DHS
=======
changing datatype of column Durn-----> datetime64[ns]
=======
applied... the condition to compare minutes for Durn column
=======
reverted datatype of column Durn back to -----> object
=======
*** FINAL RESULTSET ***
Venue BID Date DISPLAY/ Durn Aircraft-combined Dakota And Fighters In The Same Display Slot
0 ANNAN 606 No Date DISPLAY 00:05:00 DS NO
1 COTTESMORE 757 No Date DISPLAY 00:05:00 DS NO
2 EAST FORTUNE 406 No Date DISPLAY 00:10:00 DHS YES
3 KEMBLE 316 2008-06-21 00:00:00 DISPLAY 00:15:00 DS YES
4 KEMBLE 316 No Date DISPLAY 00:20:00 DHS YES
5 PORTRUSH 76 2008-09-05 00:00:00 DISPLAY 00:15:00 DHS YES
6 PORTRUSH 76 2008-09-06 00:00:00 DISPLAY 00:15:00 DHS YES
7 SCARBOROUGH 610 2008-05-25 00:00:00 DISPLAY 00:05:00 DS NO
8 SHAWBURY 188 2008-07-24 00:00:00 DISPLAY 00:05:00 DS NO
9 SOUTHPORT 584 No Date DISPLAY 00:15:00 DHS YES
10 TARRANT RUSHDEN 132 No Date DISPLAY 00:15:00 DHS YES
11 WHITBY 125 2008-08-22 00:00:00 DISPLAY 00:15:00 DHS YES
12 WHITBY 125 2008-08-23 00:00:00 DISPLAY 00:15:00 DHS YES
13 WIGTON 144 2008-07-26 00:00:00 DISPLAY 00:15:00 DHS YES
=======
Best Regards,
Sandeep
GANGA SANDEEP KUMAR
Posts: 218
Threads: 27
Joined: May 2018
Hi Sandeep,
Many thanks for your latest reply, to my post.
Could you please read my reply, before my last reply, i.e reply
number 12 on this page. And reply back accordingly, your help has been marvelous, it's very
much appreciated by me.
Regards
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Jan-11-2020, 02:57 PM
(This post was last modified: Jan-11-2020, 02:57 PM by eddywinch82.)
Hi Sandeep,
Based on what you have read, in the 12th Post on this Page, aka mine,
could you help me, sort out what I want to achieve here ?
Have you installed, Jupyter Notebook yet, if you run the Python Codes especially mine, my posts will make more sense. Or could someone else, on this forum help me ?
Any help would be much appreciated.
Best Regards
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Can anyone help me ?
Regards
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Hi Sandeep,
I have sent you a private message, about this Thread, and the other Thread.
Best Regards
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Jan-25-2020, 09:57 PM
(This post was last modified: Jan-25-2020, 09:59 PM by eddywinch82.)
Hi there,
This is the Python Code I have :-
import pandas as pd
xls = pd.ExcelFile(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\BBMF Display Schedule 2009.xls')
data = pd.read_excel(xls, sheet_name="Sheet1")
pd.options.display.max_rows = 1000
pd.options.display.max_columns = 1000
df = pd.DataFrame(data, columns= ['Venue','A/C','DISPLAY/','Date','BID','Durn'])
#df[(df['Venue'].str.contains('[a-zA-Z]') & (df['DISPLAY/'].str.contains('DISPLAY') & df['A/C'].str.contains("DHS|DAK|HS|SPIT")) & (df['A/C'] != 'LHS') & (df['A/C'] != 'LANC'))]
df["Date"].fillna("No Date", inplace = True)
df['A/C'].unique().tolist()
rename_map = {
'DAK': 'D',
'SPIT': 'S',
'LANC': 'L',
'HURRI': 'H',
'PARA': 'P'
}
df['A/C std'] = df['A/C'].replace(rename_map)
print(df['A/C std'].unique().tolist())
print("=======")
selected = df.loc[df['DISPLAY/'] == 'DISPLAY']
#selected = df.loc[df['DISPLAY/'] == 'FLYPAST']
#selected = df.loc[df['DISPLAY/'].isin(['DISPLAY', 'FLYPAST'])]
#selected = df.loc[df['A/C'].isin(['DS', 'DH', 'DHS', 'SD', 'HSD'])]
groupby_venue_date = selected.groupby(['Venue', 'BID', 'Date', 'DISPLAY/', 'Durn'])
aircraft = groupby_venue_date['A/C std'].apply(''.join).rename('Aircraft-combined')
def sorted_string(s):
return ''.join(sorted(s))
def remove_duplicate_chars(s):
return ''.join(set(s))
#####################
#####################
#####################
#####################
df=pd.DataFrame(aircraft)
#print(df) #original df
indexNames = df[~df['Aircraft-combined'].str.contains('D',na=False)].index
df.drop(indexNames , inplace=True)
#print(df)#prints all rows having D and removes remaining rows
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace='SD', value='DS')
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace=['HSD','HDS', 'SHSD'], value='DHS')
#print(df)#replaces the SD with DS and HSD,HDS and SHSD with DHS
df = df.drop(df[df['Aircraft-combined'] == 'D'].index)#remove value that match letter D
df = df.drop(df[df['Aircraft-combined'] == 'DD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'DL'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LDS'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LSSD'].index)
#df = df.drop(df[df['Aircraft-combined'] != 'D'].index)#if need the rows that match only letter D in selected dataframe column
#column names before alignment
#print("column names: before change ", list(df.columns.values.tolist()))
#print("=======")
#print(df)
#print("=======")
#to fix column name alignment
df.to_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv')
df = pd.read_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv',names=['Venue','BID','Date','DISPLAY/','Durn','Aircraft-combined'], header=0 )
#column names before alignment
#print("column names: after change ",list(df.columns.values.tolist()))
#print("=======")
#print(df)
#print("=======")
#convert df['Durn'] column to datetime foramt
df['Durn']= pd.to_datetime(df['Durn'],format='%H:%M:%S').apply(pd.Timestamp)
#print("changing datatype of column Durn----->",df['Durn'].dtype)
#print("=======")
#compare minutes with extracting minutes from datetime64 column
comparelist=[5,10,15,20]
df['Dakota And Fighters In The Same Display Slot'] = ((df['DISPLAY/'] == 'DISPLAY') & (df['Durn'].dt.minute.isin(comparelist))).map({True:'YES', False:'NO'})
#print(df)
#print("applied... the condition to compare minutes for Durn column")
#print("=======")
#return df['Durn'] column datetime64 format to object type
df['Durn'] = pd.to_datetime(df['Durn'], format='%H:%M:%S').dt.time
#print("reverted datatype of column Durn back to ----->",df['Durn'].dtype)
#print("=======")
#print("\n\n*** FINAL RESULTSET ***\n\n")
pd.DataFrame(df)
#print(df) #final resultset
#print("=======") I would also Like the Rows showing, which don't show, due to none matching numerical values, and then have the Word 'NO' In the Dakota And Fighters In The Same Display Slot Column.
When the two Numerical Values, are 5 and 10 in the 'Durn' Column I.e. Display Slot Duration, or two other None matching Values, the Dakota and Fighters, are in different display slots. When they match, i.e. 15 and 15 they are in the same Display Slot. Those are the Venue Rows that show, in the current Output I get.
In the 'Venue' Column in the Excel Document, the Venue Name will occupy two Rows, and the Two Numbers in the 'Durn' Column i.e. :-
Output: Venue Durn
KEMBLE 15
KEMBLE 15
So the Row would show.
In The Case of WINDERMERE :-
Output: Venue Durn
WINDERMERE 5
WINDERMERE 10
The Numerical Values differ, so that Venue and others, with differing
Numerical Values, don't show.
Here is the Full Current Output I get :-
Output: Venue BID Date DISPLAY/ Durn Aircraft-combined Dakota And Fighters In The Same Display Slot
0 ANNAN 606 No Date DISPLAY 00:05:00 DS YES
1 COTTESMORE 757 No Date DISPLAY 00:05:00 DS YES
2 EAST FORTUNE 406 No Date DISPLAY 00:10:00 DHS YES
3 KEMBLE 316 2008-06-21 00:00:00 DISPLAY 00:15:00 DS YES
4 KEMBLE 316 No Date DISPLAY 00:20:00 DHS YES
5 PORTRUSH 76 2008-09-05 00:00:00 DISPLAY 00:15:00 DHS YES
6 PORTRUSH 76 2008-09-06 00:00:00 DISPLAY 00:15:00 DHS YES
7 SCARBOROUGH 610 2008-05-25 00:00:00 DISPLAY 00:05:00 DS YES
8 SHAWBURY 188 2008-07-24 00:00:00 DISPLAY 00:05:00 DS YES
9 SOUTHPORT 584 No Date DISPLAY 00:15:00 DHS YES
10 TARRANT RUSHDEN 132 No Date DISPLAY 00:15:00 DHS YES
11 WHITBY 125 2008-08-22 00:00:00 DISPLAY 00:15:00 DHS YES
12 WHITBY 125 2008-08-23 00:00:00 DISPLAY 00:15:00 DHS YES
13 WIGTON 144 2008-07-26 00:00:00 DISPLAY 00:15:00 DHS YES
Could someone please sort out the Output Data, it's all unaligned ?
There are other Columns of course,
Could someone please tell me, coding I could type, to achieve that ?
Regards
Eddie Winch
P.S.
Here is the Link, to the Excel .XLS File in Question :-
http://web.archive.org/web/2009080423493...6AA506.xls
Posts: 218
Threads: 27
Joined: May 2018
Jan-27-2020, 08:11 PM
(This post was last modified: Jan-27-2020, 08:11 PM by eddywinch82.)
Hi there,
In the Python Code I have replaced :-
df["Date"].fillna("No Date", inplace = True) With :-
df["Date"].fillna(method='ffill', inplace = True) So now, the 'No Dates' showing in the 'Date' Column now have the correct Date, in the Date Format I want. As also, I have added the following two lines to the Code :-
df['Date']= pd.to_datetime(df['Date'],format='%Y-%m-%d')
df['Date']= pd.to_datetime(df['Date']).dt.strftime('%d-%m-%Y') Which were modified, from the lines of Code, you kindly supplied today Sandeep, in another Thread of mine.
The Output I get now is :-
Output: Venue BID Date DISPLAY/ Durn Aircraft-combined \
0 ANNAN 606 04-07-2008 DISPLAY 00:05:00 DS
1 COTTESMORE 757 24-07-2008 DISPLAY 00:05:00 DS
2 EAST FORTUNE 406 25-07-2008 DISPLAY 00:10:00 DHS
3 KEMBLE 316 20-06-2008 DISPLAY 00:20:00 DHS
4 KEMBLE 316 21-06-2008 DISPLAY 00:15:00 DS
5 PORTRUSH 76 05-09-2008 DISPLAY 00:15:00 DHS
6 PORTRUSH 76 06-09-2008 DISPLAY 00:15:00 DHS
7 ROMSEY 175 31-08-2008 DISPLAY 00:15:00 DHS
8 SCARBOROUGH 610 25-05-2008 DISPLAY 00:05:00 DS
9 SHAWBURY 188 24-07-2008 DISPLAY 00:05:00 DS
10 SOUTHPORT 584 26-09-2008 DISPLAY 00:15:00 DHS
11 TARRANT RUSHDEN 132 04-06-2008 DISPLAY 00:15:00 DHS
12 WHITBY 125 22-08-2008 DISPLAY 00:15:00 DHS
13 WHITBY 125 23-08-2008 DISPLAY 00:15:00 DHS
14 WIGTON 144 26-07-2008 DISPLAY 00:15:00 DHS
15 WOODSTOCK 362 13-09-2008 DISPLAY 00:15:00 DHS
Output: Dakota And Fighters In The Same Display Slot
0 YES
1 YES
2 YES
3 YES
4 YES
5 YES
6 YES
7 YES
8 YES
9 YES
10 YES
11 YES
12 YES
13 YES
14 YES
15 YES
How do I change all 2008 in the Date Column to 2009 ? And how do I sort the Data, so that the Rows, go descending, earliest Date to the latest Date, in the 'Date' Column ? And what will I need to change, so that the Rows not showing, due to the non matching values in the 'Durn' Column, do show and have NO showing in the 'Dakota And Fighters In The Same Display Slot' Column, for those Rows ? For the matching values, in comparelist=[5,10,15,20,] if they match they occupy both rows, the same Value, so 5+5 = 10, 10+10 = 20, 15+15 = 30, 20+20 = 40.
Whereas for those that don't match, i.e. 5 in one row and 10 in the other, 5+10 = 15. How would I write Code, to achieve what I am getting at ? If you look at the .xls File document, you will see what I mean, the Link to that .xls File, is in my previous Post, i.e. number 19. Any help would be appreciated by you Sandeep, or someone else.
Regards
Eddie Winch
|