I'm trying to create the world's most basic RSS reader - I just want to know the number (len) of entries that were created TODAY. I don't care if anything older was modified today, just entries created.
Should I even be using feedparser to do this? It seems I could just findAll <pubDate> tags with BeautifulSoup and match them against today's date, but everyone insists I should be using feedparser. Note: I'm still new at this, so it's hours of struggling either way.
Here's where I'm at - this shows that I have 50 entries:
import feedparser
dgtw = feedparser.parse('https://investorshub.advfn.com/boards/rss.aspx?board_id=22658')
print(len(dgtw['entries']))
This just shows the published date of the first entry:
print(dgtw.entries[0].published)
I just want to find all published dates that match today's date and get back how many there are (a len number).
I don't see anything in the docs about this specifically:
https://pythonhosted.org/feedparser/date-parsing.html
(Jun-11-2019, 11:40 PM) Biks wrote: It seems I could just findAll <pubDate> tags with BeautifulSoup and match them against today's date
You could do that.
That said, the task works fine with feedparser too.
Just set range() to a number, e.g. 10, and loop.
>>> for n in range(10):
...     dgtw['entries'][n]['published']
...
'Tue, 11 Jun 2019 16:07:41 GMT'
'Tue, 11 Jun 2019 15:03:52 GMT'
'Tue, 11 Jun 2019 14:43:21 GMT'
'Tue, 11 Jun 2019 13:58:31 GMT'
'Tue, 11 Jun 2019 12:24:25 GMT'
'Tue, 11 Jun 2019 03:13:45 GMT'
'Mon, 10 Jun 2019 21:56:47 GMT'
'Mon, 10 Jun 2019 19:58:24 GMT'
'Mon, 10 Jun 2019 19:38:59 GMT'
'Mon, 10 Jun 2019 14:07:38 GMT'
So you can see that there are 6 entries from Jun 11.
Then you can just do a quick hack.
>>> from datetime import datetime
>>>
>>> today = datetime.today().day
>>> today
12
>>> today_match = f', {today}'
>>> today_match
', 12'
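Once you have ', 12' you can just use the in operator to check whether it occurs in the published date string. Below, today_match is set to ', 11' by hand so that it matches the listing above.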
>>> today_match = ', 11'
>>> count = 0
>>> for n in range(10):
...     p_date = dgtw['entries'][n]['published']
...     if today_match in p_date:
...         count += 1
...
>>> count
6
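The same substring check can be tried without fetching the feed, using the ten date strings from the listing above. This is only a minimal sketch: the published list and the count_matching helper are made up for illustration and are not part of feedparser.

```python
# The ten 'published' strings copied from the listing above
published = [
    'Tue, 11 Jun 2019 16:07:41 GMT',
    'Tue, 11 Jun 2019 15:03:52 GMT',
    'Tue, 11 Jun 2019 14:43:21 GMT',
    'Tue, 11 Jun 2019 13:58:31 GMT',
    'Tue, 11 Jun 2019 12:24:25 GMT',
    'Tue, 11 Jun 2019 03:13:45 GMT',
    'Mon, 10 Jun 2019 21:56:47 GMT',
    'Mon, 10 Jun 2019 19:58:24 GMT',
    'Mon, 10 Jun 2019 19:38:59 GMT',
    'Mon, 10 Jun 2019 14:07:38 GMT',
]


def count_matching(dates, day):
    """Count the date strings that contain ', <day>' (hypothetical helper)."""
    today_match = f', {day}'
    # True counts as 1 and False as 0, so sum() gives the number of matches
    return sum(today_match in p_date for p_date in dates)


print(count_matching(published, 11))  # 6
print(count_matching(published, 10))  # 4
```

Looping over the list itself, rather than over range(10), means the code does not need to know in advance how many strings there are.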
OK, my grasp of Python code is tenuous. :) I'm having a hard time building the final code from your examples. (sorry)
How do I even see the column of dates and times from:
for n in range(10):
    dgtw['entries'][n]['published']
If I print(dgtw), I see all entries.
I noticed you have today_match listed 3 times. Do I need all three?
today_match = f', {today}'
today_match
', 12'
today_match = ', 11'
What's the final output supposed to look like?
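Here it is put together. Unlike the interactive prompt, a script only shows a value when you print() it. You also only need one today_match line: the other two in my example were the prompt echoing the value and a manual override to match the sample listing.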
import feedparser
from datetime import datetime
dgtw = feedparser.parse('https://investorshub.advfn.com/boards/rss.aspx?board_id=22658')
today = datetime.today().day
today_match = f', {today}'
count = 0
for n in range(50):
    p_date = dgtw['entries'][n]['published']
    if today_match in p_date:
        count += 1
print(f'published entries {datetime.today().ctime()},til now is <{count}>')
Output:
published entries Wed Jun 12 17:50:04 2019,til now is <2>
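The 50 in range(50) is the number of entries the feed returns, which is more than will ever be published in one day, so the count will be correct.
Don't set it higher than the number of entries in the feed, though, or dgtw['entries'][n] will raise an IndexError.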
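One caveat with matching on a string like ', 12': it only looks at the day number, so it would also match the 12th of another month if such an entry were still in the feed, and a single-digit day such as ', 1' would match the 10th through the 19th as well. A possible alternative is to parse each published string into a real date and compare whole dates. The sketch below uses email.utils.parsedate_to_datetime from the standard library, which reads dates in the format shown earlier. The count_published_on helper and the sample list are invented for illustration. (feedparser also provides an already-parsed published_parsed value on each entry, which could be used instead.)

```python
from datetime import date
from email.utils import parsedate_to_datetime


def count_published_on(published_strings, day):
    """Count date strings whose calendar date equals `day` (hypothetical helper)."""
    count = 0
    for text in published_strings:
        # .date() drops the time of day and keeps year, month and day
        if parsedate_to_datetime(text).date() == day:
            count += 1
    return count


sample = [
    'Tue, 11 Jun 2019 16:07:41 GMT',
    'Tue, 11 Jun 2019 03:13:45 GMT',
    'Mon, 10 Jun 2019 21:56:47 GMT',
    'Thu, 11 Jul 2019 09:00:00 GMT',  # made-up entry: same day number, different month
]

print(count_published_on(sample, date(2019, 6, 11)))  # 2
```

With the real feed this might be called as count_published_on([e['published'] for e in dgtw['entries']], date.today()). Note that date.today() is the local date while the feed's timestamps are in GMT, so the two can disagree around midnight.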
Hey this is great! Thanks! The last thing I did was toss the final number into a variable:
total = (f'{count}')
print(total)
I'm tossing that number into a Google sheet. (I managed to figure out how to do that by myself) :P
Thanks again!
# One option might be to filter the entries by the system date
from datetime import datetime, timezone
import feedparser
# Current date in UTC, because the feed's timestamps are in GMT
today = datetime.now(timezone.utc).date()
count = 0
dgtw = feedparser.parse('https://investorshub.advfn.com/boards/rss.aspx?board_id=22658')
for entry in dgtw['entries']:
    # published_parsed is the published date as a time tuple in UTC
    published = entry.get('published_parsed')
    if published and datetime(*published[:6]).date() == today:
        count += 1
print(count)