Posts: 218
Threads: 27
Joined: May 2018
Hi there,
After some research on the Internet, finding some useful Python Code for Dates, I now have the following end bit of the Python Code :-
#return df['Durn'] column datetime64 format to object type
df['Durn'] = pd.to_datetime(df['Durn'], format='%H:%M:%S').dt.time
#print("reverted datatype of column Durn back to ----->",df['Durn'].dtype)
#print("=======")
#print("\n\n*** FINAL RESULTSET ***\n\n")
df['Date']= pd.to_datetime(df['Date'],format='%Y-%m-%d')
#df['Date']= pd.to_datetime(df['Date']).dt.strftime('%d-%m-%Y')
##added two lines above to convert date format
df['Date'] = df['Date'].mask(df['Date'].dt.year == 2008,
df['Date'] + pd.offsets.DateOffset(year=2009))
df=df.loc[df.Date.dt.strftime('%m%d').astype(int).argsort()]
df['Date']= pd.to_datetime(df['Date']).dt.strftime('%d-%m-%Y')
pd.DataFrame(df) So that now I have the Dates, in the Format I want, saying the correct Year 2009, instead of 2008.
And sorted by Date descending, Earliest Date to Latest, in the Output I get, when I run the Code.
Now what will I need to add, to the Code, so that the Rows not showing, due to the non matching values in the 'Durn' Column, do show and have NO showing in the 'Dakota And Fighters In The Same Display Slot' Column, for those Rows ? For the matching values, in comparelist=[5,10,15,20,] if they match they occupy both rows, the same Value, so 5+5 = 10, 10+10 = 20, 15+15 = 30, 20+20 = 40.
Whereas for those that don't match, i.e. 5 in one row and 10 in the other, 5+10 = 15. How would I write Code, to achieve what I am getting at ? If you look at the .xls File document, you will see what I mean, the Link to that .xls File, is in my number 19 Post. Any help would be appreciated by you Sandeep, or someone else.
Best Regards
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Jan-30-2020, 11:54 PM
(This post was last modified: Jan-30-2020, 11:54 PM by eddywinch82.)
Can anyone help me ? Is it fairly straightforward, what I am trying to achieve here ?
Regards
Eddie
Posts: 218
Threads: 27
Joined: May 2018
Jan-31-2020, 04:02 PM
(This post was last modified: Jan-31-2020, 04:02 PM by eddywinch82.)
Hi there,
Could a Moderator move, my following Thread, to the Jobs Forum Section, from the Data Science Forum Section ?
I would like help, to finish My Python Code, but nobody has answered, my recent questions.
So I would like to pay someone, to help finish my Code off, if that is okay ? Unless someone, is able and willing to help me ?
Here is the Thread Link :-
https://python-forum.io/Thread-Filtering...cal-Values
Best Regards
Eddie Winch
Posts: 4,784
Threads: 76
Joined: Jan 2018
I moved the thread but I think you should clearly say in the thread that you are ready to pay someone to finish the program.
Posts: 218
Threads: 27
Joined: May 2018
Jan-31-2020, 04:39 PM
(This post was last modified: Jan-31-2020, 05:17 PM by eddywinch82.)
Hi there,
If someone could sort out, what I want to achieve in my Code, by reading all 3 pages, of this Thread ? Could this be sorted out, fairly cheaply, for minimal cost ? I am willing to pay someone, to help finish of my Code.
Here is the full Python Code :-
import pandas as pd
xls = pd.ExcelFile(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\BBMF Display Schedule 2009.xls')
data = pd.read_excel(xls, sheet_name="Sheet1")
pd.options.display.max_rows = 1000
pd.options.display.max_columns = 1000
df = pd.DataFrame(data, columns= ['Venue','A/C','DISPLAY/','Date','BID','Durn'])
#df[(df['Venue'].str.contains('[a-zA-Z]') & (df['DISPLAY/'].str.contains('DISPLAY') & df['A/C'].str.contains("DHS|DAK|HS|SPIT")) & (df['A/C'] != 'LHS') & (df['A/C'] != 'LANC'))]
#df["Date"].fillna("No Date", inplace = True)
df["Date"].fillna(method='ffill', inplace = True)
df['A/C'].unique().tolist()
rename_map = {
'DAK': 'D',
'SPIT': 'S',
'LANC': 'L',
'HURRI': 'H',
'PARA': 'P'
}
df['A/C std'] = df['A/C'].replace(rename_map)
print(df['A/C std'].unique().tolist())
print("=======")
selected = df.loc[df['DISPLAY/'] == 'DISPLAY']
#selected = df.loc[df['DISPLAY/'] == 'FLYPAST']
#selected = df.loc[df['DISPLAY/'].isin(['DISPLAY', 'FLYPAST'])]
#selected = df.loc[df['A/C'].isin(['DS', 'DH', 'DHS', 'SD', 'HSD'])]
groupby_venue_date = selected.groupby(['Venue', 'BID', 'Date', 'DISPLAY/', 'Durn'])
aircraft = groupby_venue_date['A/C std'].apply(''.join).rename('Aircraft-combined')
def sorted_string(s):
return ''.join(sorted(s))
def remove_duplicate_chars(s):
return ''.join(set(s))
#####################
#####################
#####################
#####################
df=pd.DataFrame(aircraft)
#print(df) #original df
indexNames = df[~df['Aircraft-combined'].str.contains('D',na=False)].index
df.drop(indexNames , inplace=True)
#print(df)#prints all rows having D and removes remaining rows
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace='SD', value='DS')
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace=['HSD','HDS', 'SHSD'], value='DHS')
#print(df)#replaces the SD with DS and HSD,HDS and SHSD with DHS
df = df.drop(df[df['Aircraft-combined'] == 'D'].index)#remove value that match letter D
df = df.drop(df[df['Aircraft-combined'] == 'DD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'DL'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LDS'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LSSD'].index)
#df = df.drop(df[df['Aircraft-combined'] != 'D'].index)#if need the rows that match only letter D in selected dataframe column
#column names before alignment
#print("column names: before change ", list(df.columns.values.tolist()))
#print("=======")
#print(df)
#print("=======")
#to fix column name alignment
df.to_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv')
df = pd.read_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv',names=['Venue','BID','Date','DISPLAY/','Durn','Aircraft-combined'], header=0 )
#column names before alignment
#print("column names: after change ",list(df.columns.values.tolist()))
#print("=======")
#print(df)
#print("=======")
#convert df['Durn'] column to datetime foramt
df['Durn']= pd.to_datetime(df['Durn'],format='%H:%M:%S').apply(pd.Timestamp)
#print("changing datatype of column Durn----->",df['Durn'].dtype)
#print("=======")
#compare minutes with extracting minutes from datetime64 column
comparelist=[5,10,15,20,]
df['Dakota And Fighters In The Same Display Slot'] = ((df['DISPLAY/'] == 'DISPLAY') & (df['Durn'].dt.minute.isin(comparelist))).map({True:'YES', False:'NO'})
#print(df)
#print("applied... the condition to compare minutes for Durn column")
#print("=======")
#return df['Durn'] column datetime64 format to object type
df['Durn'] = pd.to_datetime(df['Durn'], format='%H:%M:%S').dt.time
#print("reverted datatype of column Durn back to ----->",df['Durn'].dtype)
#print("=======")
#print("\n\n*** FINAL RESULTSET ***\n\n")
df['Date']= pd.to_datetime(df['Date'],format='%Y-%m-%d')
#df['Date']= pd.to_datetime(df['Date']).dt.strftime('%d-%m-%Y')
##added two lines above to convert date format
df['Date'] = df['Date'].mask(df['Date'].dt.year == 2008,
df['Date'] + pd.offsets.DateOffset(year=2009))
df=df.loc[df.Date.dt.strftime('%m%d').astype(int).argsort()]
df['Date']= pd.to_datetime(df['Date']).dt.strftime('%d-%m-%Y')
pd.DataFrame(df)
#print(df) #final resultset
#print("=======") Also I would like to Backfill, NaN Values For a given Column, i.e. in this case 'Date', and for a Particular Date from that Column.
I have allready Forward Filled for missing Dates, in my DataFrame in Pandas. I am working with an Excel .xls File , in my DataFrame in Pandas, using Jupyter Notebook.
This is the Line Of Code I used :-
df["Date"].fillna(method='ffill', inplace = True) But I would like to narrow that down to a particular Date, this time back filling.
Could someone tell me, what I should type to do that ?
Any help would be appreciated, with this Code.
Best Regards
Eddie Winch
To better explain, The Rows not showing, due to the non matching values in the 'Durn' Column, I also would like showing in the DataFrame Output, and have NO showing in the 'Dakota And Fighters In The Same Display Slot' Column for those rows, for those Rows ? For the matching values, in comparelist=[5,10,15,20,] if they match for a given 'Venue', from the Venue Column, there are two values, in the two cells, in the Excel Document, two matching values, so 5+5 = 10, 10+10 = 20, 15+15 = 30, 20+20 = 40.
Whereas for those that don't match, i.e. 5 in one row and 10 in the other, 5+10 = 15. How would I write Code, to achieve what I am getting at, i.e. to use Maths in Code, to get the necessary DataFrame Output ? Or can Code, be written, saying to also show rows, where the two values arn't matching ? If someone could show both ways, If you look at the .xls File document, you will see what I mean, the Link to that .xls File, is in my number 19 Post.
Any help would be appreciated
Best Regards
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Feb-03-2020, 12:52 PM
(This post was last modified: Feb-03-2020, 12:52 PM by eddywinch82.)
If someone could PM me, using the Messaging System, if you are willing to help me ? Or send me an Email, to [email protected] letting me know how much the Job, could be done for etc ?
Best Regards
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Feb-18-2020, 10:19 AM
(This post was last modified: Feb-18-2020, 10:19 AM by eddywinch82.)
Hi there,
I have the following SQL Lines of Code :-
SELECT * INTO #Temp4
FROM
(
SELECT
tt5.[Venue],
tt5.[BID],
MIN(tt1.[Date]) AS [Date],
tt5.[Display / Flypast],
SUM(tt1.[Duration]) AS [Duration (mins)],
(SELECT DISTINCT '; ' + t5.[Aircraft Combined]
FROM #Temp5 t5
WHERE tt5.[Venue] = t5.[Venue]
AND tt5.[BID] = t5.[BID]
AND tt5.[Display / Flypast] = t5.[Display / Flypast]
FOR XML PATH ('')) [Aircraft Combined],
SUM(tt5.[Dakota and Fighters in Same Slot?]) AS [Dakota and Fighters in Same Slot?] Those lines of SQL Code I posted, are significant, because it deals, with adding of two particular Numerical values together, which I wan't to do in the Python Pandas Code, I have at the moment. I go into more detail about this, in earlier Posts, in this Thread, I run the Code in Jupyter Notebook.
So if someone could tell me what the Python Code, equivalent would be, for those lines of SQL Code. I could change my current Code hopefully, to achieve the Output DataFrame I want.
I really need help with this, would someone be able to help me ?
Regards
Eddie Winch ))
Posts: 8
Threads: 1
Joined: Feb 2020
Feb-23-2020, 01:28 PM
(This post was last modified: Feb-23-2020, 01:28 PM by GodMaster.)
Hi @ eddywinch82
I interested in this thread and made the result.
What I did is: convert origin 'Durn' DateTime to milliseconds -> Sum -> convert to DateTime
Check and let me know if this result is perfect.
Posts: 218
Threads: 27
Joined: May 2018
Feb-24-2020, 08:21 PM
(This post was last modified: Feb-24-2020, 08:21 PM by eddywinch82.)
Hi GodMaster,
Thankyou very much, for your interest in this Thread, and taking the time to attempt, to achieve what I
want, thats very much appreciated.
I have had help by a Python FreeLancer on the Internet, and he has sorted out what I want to achieve. Here below is the Finished Code :-
import pandas as pd
data = pd.read_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\BBMF-2009-Code.csv')
pd.options.display.max_rows = 1000
pd.options.display.max_columns = 1000
df = pd.DataFrame(data, columns= ['Venue','A/C','DISPLAY/','Date','BID','Durn'])
#df[(df['Venue'].str.contains('[a-zA-Z]') & (df['DISPLAY/'].str.contains('DISPLAY') & df['A/C'].str.contains("DHS|DAK|HS|SPIT")) & (df['A/C'] != 'LHS') & (df['A/C'] != 'LANC'))]
#df["Date"].fillna("No Date", inplace = True)
df["Date"].fillna(method='ffill', inplace = True)
df['A/C'].unique().tolist()
rename_map = {
'DAK': 'D',
'SPIT': 'S',
'LANC': 'L',
'HURRI': 'H',
'PARA': 'P'
}
df['A/C std'] = df['A/C'].replace(rename_map)
print(df['A/C std'].unique().tolist())
print("=======")
selected = df.loc[df['DISPLAY/'] == 'DISPLAY']
#selected = df.loc[df['DISPLAY/'] == 'FLYPAST']
#selected = df.loc[df['DISPLAY/'].isin(['DISPLAY', 'FLYPAST'])]
#selected = df.loc[df['A/C'].isin(['DS', 'DH', 'DHS', 'SD', 'HSD'])]
df1=df
groupby_venue_date = selected.groupby(['Venue', 'BID', 'Date', 'DISPLAY/'])
aircraft = groupby_venue_date['A/C std'].apply(''.join).rename('Aircraft-combined')
def sorted_string(s):
return ''.join(sorted(s))
def remove_duplicate_chars(s):
return ''.join(set(s))
#####################
#####################
#####################
#####################
df=pd.DataFrame(aircraft)
#print(df) #original df
indexNames = df[~df['Aircraft-combined'].str.contains('D',na=False)].index
df.drop(indexNames , inplace=True)
#print(df)#prints all rows having D and removes remaining rows
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace='SD', value='DS')
df['Aircraft-combined']=df['Aircraft-combined'].replace(to_replace=['HSD','HDS', 'SHSD'], value='DHS')
#print(df)#replaces the SD with DS and HSD,HDS and SHSD with DHS
df = df.drop(df[df['Aircraft-combined'] == 'D'].index)#remove value that match letter D
df = df.drop(df[df['Aircraft-combined'] == 'DD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'DL'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LD'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LDS'].index)
df = df.drop(df[df['Aircraft-combined'] == 'LSSD'].index)
#df = df.drop(df[df['Aircraft-combined'] != 'D'].index)#if need the rows that match only letter D in selected dataframe column
#column names before alignment
#print("column names: before change ", list(df.columns.values.tolist()))
#print("=======")
#print(df)
#print("=======")
#to fix column name alignment
df.to_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv')
df = pd.read_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv',names=['Venue','BID','Date','DISPLAY/','Aircraft-combined'], header=0 )
#column names before alignment
#print("column names: after change ",list(df.columns.values.tolist()))
#print("=======")
#print(df)
#print("=======")
###My code Starts here
x,y,z=df['Venue'],df['BID'],df['Date']
x=list(x)
y=list(y)
z=list(z)
import numpy as np
h=np.array(y)
np.unique(h)
yy=[]
for jj in y:
yy.append(str(jj))
t= df1['Venue'].isin(x)
df2=df1[t]
df2
g=df2['Date'].isin(z)
df3=df2[g]
df3=pd.DataFrame(df3)
df3
g=df3['BID'].isin(yy)
df4=df3[g]
df4=pd.DataFrame(df4)
df4
column_names = ['Venue','A/C','DISPLAY/','Date','BID','Durn','Aircraft-combined']
dff = pd.DataFrame(columns = column_names)
dff.to_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv')
dff = pd.read_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv',names=['Venue','A/C','DISPLAY/','Date','BID','Durn','Aircraft-combined'], header=0 )
df4
df4.to_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv')
df4 = pd.read_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\tempfile.csv',names=['Venue','A/C','DISPLAY/','Date','BID','Durn','Aircraft-combined'], header=0 )
for x in range(0,len(df4)):
y=x+1
try:
if ((((df4.iloc[x][6]))=='DSH') or (((df4.iloc[x][6]))=='SD') or (((df4.iloc[x][6]))=='DS') or (((df4.iloc[x][6]))=='HSD') or (((df4.iloc[x][6]))=='HDS') or (((df4.iloc[x][6]))=='SHSD') or (((df4.iloc[x][6]))=='DHS')):
dff3 ={'Venue':df4.iloc[x][0],'A/C':df4.iloc[x][1],'DISPLAY/':df4.iloc[x][2],'Date':df4.iloc[x][3],'BID':df4.iloc[x][4],'Durn':df4.iloc[x][5],'Aircraft-combined':df4.iloc[x][6]}
dff = dff.append(dff3,ignore_index=True)
elif ((df4.iloc[x][3]==df4.iloc[y][3]) and (df4.iloc[x][4]==df4.iloc[y][4])):
sidhu=df4.iloc[x][6]+df4.iloc[y][6]
dff1 = {'Venue':df4.iloc[x][0],'A/C':df4.iloc[x][1],'DISPLAY/':df4.iloc[x][2],'Date':df4.iloc[x][3],'BID':df4.iloc[x][4],'Durn':df4.iloc[x][5],'Aircraft-combined':sidhu}
dff2 = {'Venue':df4.iloc[y][0],'A/C':df4.iloc[y][1],'DISPLAY/':df4.iloc[y][2],'Date':df4.iloc[y][3],'BID':df4.iloc[y][4],'Durn':df4.iloc[y][5],'Aircraft-combined':sidhu}
dff = dff.append(dff1,ignore_index=True)
dff = dff.append(dff2,ignore_index=True)
else:
continue
except Exception:
break
dff['Aircraft-combined']=dff['Aircraft-combined'].replace(to_replace='SD', value='DS')
dff['Aircraft-combined']=dff['Aircraft-combined'].replace(to_replace=['HSD','HDS', 'SHSD'], value='DHS')
dff['Dakota And Fighters In The Same Display Slot']='Yes'
for x in range(0,len(dff)):
y=x+1
try:
if ((dff.iloc[x][3]==dff.iloc[y][3]) and (dff.iloc[x][4]==dff.iloc[y][4])):
if dff.iloc[x][5]==dff.iloc[y][5]:
dff['Dakota And Fighters In The Same Display Slot'][x]='Yes'
dff['Dakota And Fighters In The Same Display Slot'][y]='Yes'
else:
dff['Dakota And Fighters In The Same Display Slot'][x]='No'
dff['Dakota And Fighters In The Same Display Slot'][y]='No'
else:
continue
except Exception:
break
dff=dff.drop('A/C',axis=1)
dff=dff.drop_duplicates(subset=['Venue','Date','BID','Aircraft-combined','Dakota And Fighters In The Same Display Slot'], keep='first', inplace=False)
dff=dff.drop_duplicates(subset=None, keep='first', inplace=False)
dff['Durn']= pd.to_datetime(dff['Durn'],format='%M').apply(pd.Timestamp)
dff['Durn'] = pd.to_datetime(dff['Durn'], format='%H:%M:%S').dt.time
dff.at[14,'Aircraft-combined']='DHS'
dff.at[18,'Aircraft-combined']='DHS'
dff.to_csv(r'C:\Users\Edward\Desktop\BBMF Schedules And Master Forum Thread Texts\BBMF-2009-Code.csv')
dff The Code seemed to run initially, but now I get the following TypeError :-
Error: TypeError Traceback (most recent call last)
<ipython-input-58-43d53992dab5> in <module>
17 'PARA': 'P'
18 }
---> 19 df['A/C std'] = df['A/C'].replace(rename_map)
20 print(df['A/C std'].unique().tolist())
21 print("=======")
c:\python37\lib\site-packages\pandas\core\series.py in replace(self, to_replace, value, inplace, limit, regex, method)
3843 return super(Series, self).replace(to_replace=to_replace, value=value,
3844 inplace=inplace, limit=limit,
-> 3845 regex=regex, method=method)
3846
3847 @Appender(generic._shared_docs['shift'] % _shared_doc_kwargs)
c:\python37\lib\site-packages\pandas\core\generic.py in replace(self, to_replace, value, inplace, limit, regex, method)
6496
6497 return self.replace(to_replace, value, inplace=inplace,
-> 6498 limit=limit, regex=regex)
6499 else:
6500
c:\python37\lib\site-packages\pandas\core\series.py in replace(self, to_replace, value, inplace, limit, regex, method)
3843 return super(Series, self).replace(to_replace=to_replace, value=value,
3844 inplace=inplace, limit=limit,
-> 3845 regex=regex, method=method)
3846
3847 @Appender(generic._shared_docs['shift'] % _shared_doc_kwargs)
c:\python37\lib\site-packages\pandas\core\generic.py in replace(self, to_replace, value, inplace, limit, regex, method)
6545 dest_list=value,
6546 inplace=inplace,
-> 6547 regex=regex)
6548
6549 else: # [NA, ''] -> 0
c:\python37\lib\site-packages\pandas\core\internals\managers.py in replace_list(self, src_list, dest_list, inplace, regex)
557 return _compare_or_regex_search(values, s, regex)
558
--> 559 masks = [comp(s, regex) for i, s in enumerate(src_list)]
560
561 result_blocks = []
c:\python37\lib\site-packages\pandas\core\internals\managers.py in <listcomp>(.0)
557 return _compare_or_regex_search(values, s, regex)
558
--> 559 masks = [comp(s, regex) for i, s in enumerate(src_list)]
560
561 result_blocks = []
c:\python37\lib\site-packages\pandas\core\internals\managers.py in comp(s, regex)
555 return _compare_or_regex_search(maybe_convert_objects(values),
556 getattr(s, 'asm8'), regex)
--> 557 return _compare_or_regex_search(values, s, regex)
558
559 masks = [comp(s, regex) for i, s in enumerate(src_list)]
c:\python37\lib\site-packages\pandas\core\internals\managers.py in _compare_or_regex_search(a, b, regex)
1949 raise TypeError(
1950 "Cannot compare types {a!r} and {b!r}".format(a=type_names[0],
-> 1951 b=type_names[1]))
1952 return result
1953
TypeError: Cannot compare types 'ndarray(dtype=float64)' and 'str'
I can't work out, what the issue is, any ideas ?
Regards
Eddie Winch ))
Posts: 8
Threads: 1
Joined: Feb 2020
Hi @ eddywinch82
I'm glad to hear that you already solve an issue.
I reviewed code and I think, above solution is a hit.
I'm willing to help your new type error issue but I can't notice anything on my side, just copied full code.
Send me your import CSV file so that I can test it again on my local env.
|