Python Forum
Obtaining Correct Date In Pandas DataFrame
#11
Thank you so much, Sandeep.

You sorted the problem out for me really well.

I very much appreciate your help.

Could you read post 21 in the following thread of mine and respond there?

https://python-forum.io/Thread-Filtering...ues?page=3

Best regards,

Eddie Winch
#12
I have modified the code from this thread for the BBMF 2005 display schedule, which is split across separate URLs, one for each month. I am trying to get a single DataFrame output for the whole year.

Here is the modified code:

import pandas as pd
import requests
from bs4 import BeautifulSoup
 
 
soup = BeautifulSoup(res.content,'lxml')
table = soup.find_all('table')[0]
df = pd.read_html(str(table))
 
df = df[0]
 
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
  
  
#make df[0] to list
list=[]
for i in df[0]:
    list.append(i)
   
#reverse the list to make split to sublist easier
list.reverse()
   
#split list to sublist using condition len(val)> 2
size = len(list)
idx_list = [idx + 1 for idx, val in
            enumerate(list) if len(val) > 2]
res = [list[i: j] for i, j in
        zip([0] + idx_list, idx_list +
        ([size] if idx_list[-1] != size else []))]
   
#make monthname to numbers and print
for i in res:
    for j in range(len(i)):
        if i[j].upper()=='JUNE':
            i[j]='6'
        elif i[j].upper() =='MAY':
            i[j]='5'
        elif i[j].upper() == 'APRIL':
            i[j]='4'
        elif i[j].upper() =='JANUARY':
            i[j]='1'
        elif i[j].upper() == 'FEBRUARY':
            i[j]='2'
        elif i[j].upper() =='MARCH':
            i[j]='3'
        elif i[j].upper() == 'JULY':
            i[j]='7'       
        elif i[j].upper() =='AUGUST':
            i[j]='8'
        elif i[j].upper() == 'SEPTEMBER':
            i[j]='9'
        elif i[j].upper() =='OCTOBER':
            i[j]='10'
        elif i[j].upper() == 'NOVEMBER':
            i[j]='11'
        elif i[j].upper() =='DECEMBER':
            i[j]='12'      
   
   
#append string and append to new list
finallist=[]
for i in res:
    for j in range(len(i)):
        if j < len(i) - 1:
            #print(f'2005-{i[-1]}-{i[j]}')
            finallist.append(f'2005-{i[-1]}-{i[j]}')
#print(finallist)
finallist.reverse()
   
#print("\n=== ORIGINAL DF ===\n")
#print(df)
   
#convert dataframe to list
listtemp1=df.values.tolist()
   
#replace found below values with 0000_removable
removelist=['LOCATION','LANCASTER','SPITFIRE','HURRICANE','DAKOTA','DATE','JUNE','JANUARY','FEBRUARY','MARCH','MAY','JULY','AUGUST','SEPTEMBER','OCTOBER','NOVEMBER','DECEMBER','APRIL']
for i in listtemp1:
    for j in range(len(i)):
        for place in removelist:
            if str(i[j]).upper()==place:
                i[j]='0000_removable'
            else:
                pass
   
                   
#remove sublists with the replaced values we redirected
dellist=['0000_removable', '0000_removable', '0000_removable', '0000_removable', '0000_removable', '0000_removable']
res = [i for i in listtemp1 if i != dellist]
   
#assign back to dataframe DF3
df3=pd.DataFrame()
df3=pd.DataFrame(res, columns=['Date','LOCATION','LANCASTER','SPITFIRE','HURRICANE','DAKOTA'])
#print("\n=== AFTER REMOVE month and column names from DF, assigned to new as DF3 ===\n")
#print(df3)
   
   
#now assign that sorted date list to dataframe DF3
idx = 0
df3.insert(loc=idx, column='DATE', value=finallist)
pd.options.display.max_rows = 500
 
df["DATE"].fillna(method='ffill', inplace = True)
 
display = df3[(df3['Location'].str.contains('- Display')) & (df3['Dakota'].str.contains('D')) & (df3['Spitfire'].str.contains('S', na=True)) & (df3['Lancaster'] != 'L')] 
display
 
display['DATE']= pd.to_datetime(display['DATE'],format='%Y-%m-%d')
display['DATE']= pd.to_datetime(display['DATE']).dt.strftime('%d-%m-%Y')
##added two lines above to convert date format
 
display.drop('Lancaster', axis=1, inplace=True)
display.dropna(subset=['Spitfire', 'Hurricane'], how='all')
 
#df[(df['Location'].str.contains('- Display'))
 
#df[(df['Dakota'].str.contains('D'))
 
#(df['Dakota'].str.contains('D'))
 
#(df['Spitfire'] == 'SSS')
I am trying to get a DataFrame output for the whole of 2005 from all of the URLs in the code.

But I get the following traceback when I run the code in Jupyter Notebook:

Error:
TypeError                                 Traceback (most recent call last)
<ipython-input-1-ae00b7540e28> in <module>
     31 size = len(list)
     32 idx_list = [idx + 1 for idx, val in
---> 33             enumerate(list) if len(val) > 2]
     34 res = [list[i: j] for i, j in
     35         zip([0] + idx_list, idx_list +

<ipython-input-1-ae00b7540e28> in <listcomp>(.0)
     31 size = len(list)
     32 idx_list = [idx + 1 for idx, val in
---> 33             enumerate(list) if len(val) > 2]
     34 res = [list[i: j] for i, j in
     35         zip([0] + idx_list, idx_list +

TypeError: object of type 'float' has no len()
I can't work out what is causing the error. Any ideas?

Any help would be appreciated.

Best regards,

Eddie Winch
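
A possible reading of the traceback in post #12, offered as a sketch rather than a confirmed diagnosis: `TypeError: object of type 'float' has no len()` is what `len()` raises when it is handed a float, and pandas stores missing cells as the float NaN. If one of the monthly tables has an empty cell in its first column, the loop over `df[0]` would copy that NaN into the list, and `len(val)` would then fail. The example below uses a small made-up list (the values are hypothetical, not taken from the BBMF pages, and `split_points` and `is_missing` are illustrative helper names) to reproduce the error and show one way around it.

```python
import math

# Hypothetical stand-in for the list built from df[0]; float("nan") plays
# the role of an empty table cell, which pandas stores as the float NaN.
values = ["JUNE", "1", float("nan"), "12", "JULY", "3"]


def split_points(items):
    """Return idx + 1 for every entry longer than two characters,
    mirroring the idx_list comprehension in the posted code."""
    return [idx + 1 for idx, val in enumerate(items) if len(val) > 2]


def is_missing(item):
    """True for the float NaN that marks a missing cell."""
    return isinstance(item, float) and math.isnan(item)


# Reproduce the failure: NaN is a float, and floats have no len().
try:
    split_points(values)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# One possible fix: skip missing cells and convert the rest to str
# before measuring their length.
cleaned = [str(item) for item in values if not is_missing(item)]
idx_list = split_points(cleaned)

print(error_message)  # object of type 'float' has no len()
print(cleaned)        # ['JUNE', '1', '12', 'JULY', '3']
print(idx_list)       # [1, 4]
```

In pandas itself, `df[0].dropna().astype(str).tolist()` should give the same cleaned list. Whether dropping the empty cells is appropriate depends on what those rows contain, since removing entries here changes how many dates are later inserted into the DataFrame.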
#13
Can anyone help me?

I would really appreciate someone's help.

Regards,

Eddie Winch
#14
I have looked on the Internet for similar threads on other forums, but I can't find a solution to the issue I am having.

Could someone help me out here, if that is okay?

Regards,

Eddie Winch
#15
Hi bitasiavi,

I am sorry, I didn't get your PM. Could you send it again?

I have also sent you a PM. Could you look at it and get back to me?

Regards,

Eddie Winch