Find today's RSS entries with feedparser - Printable Version
+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html)
+--- Thread: Find today's RSS entries with feedparser (/thread-19056.html)

Find today's RSS entries with feedparser - Biks - Jun-11-2019

I'm trying to create the world's most basic RSS reader. I just want to know the number (len) of entries that were created TODAY. I don't care if anything older was modified today, just entries created today. Should I even be using feedparser to do this? It seems I could just findAll <pubDate> tags with BeautifulSoup and match them against today's date, but everyone insists I should be using feedparser. Note: I'm still new at this, so it's hours of struggling either way.

Here's where I'm at. This shows that I have 50 entries:

    import feedparser
    dgtw = feedparser.parse('https://investorshub.advfn.com/boards/rss.aspx?board_id=22658')
    print(len(dgtw['entries']))

This just shows the published date of the first entry:

    print(dgtw.entries[0].published)

I just want to find all published dates that match today's date and get back a count (len). I don't see anything in the docs about this specifically: https://pythonhosted.org/feedparser/date-parsing.html

RE: Find today's RSS entries with feedparser - snippsat - Jun-12-2019

(Jun-11-2019, 11:40 PM)Biks Wrote: It seems I could just findAll <pubDate> tags with BeautifulSoup and match them against today's date

You could do that. The task also works fine with feedparser. Just set range() to a number, e.g. 10, and loop.

    >>> for n in range(10):
    ...     dgtw['entries'][n]['published']
    ...
    'Tue, 11 Jun 2019 16:07:41 GMT'
    'Tue, 11 Jun 2019 15:03:52 GMT'
    'Tue, 11 Jun 2019 14:43:21 GMT'
    'Tue, 11 Jun 2019 13:58:31 GMT'
    'Tue, 11 Jun 2019 12:24:25 GMT'
    'Tue, 11 Jun 2019 03:13:45 GMT'
    'Mon, 10 Jun 2019 21:56:47 GMT'
    'Mon, 10 Jun 2019 19:58:24 GMT'
    'Mon, 10 Jun 2019 19:38:59 GMT'
    'Mon, 10 Jun 2019 14:07:38 GMT'

So we can see that there are 6 entries dated the 11th. Then we can just do a quick hack.

    >>> from datetime import datetime
    >>>
    >>> today = datetime.today().day
    >>> today
    12
    >>> today_match = f', {today}'
    >>> today_match
    ', 12'

Once we have ', 12', we can just use in to check whether it appears in the published date string.

    >>> today_match = ', 11'
    >>> count = 0
    >>> for n in range(10):
    ...     p_date = dgtw['entries'][n]['published']
    ...     if today_match in p_date:
    ...         count += 1
    ...
    >>> count
    6

RE: Find today's RSS entries with feedparser - Biks - Jun-12-2019

OK, my grasp of Python code is tenuous. :) I'm having a hard time building the final code from your examples. (sorry) How do I even see the column of dates and times for:

    for n in range(10):
        dgtw['entries'][n]['published']

If I print(dgtw), I see all entries. I noticed you have today_match listed 3 times. Do I need all three?

    today_match = f', {today}'
    today_match ', 12'
    today_match = ', 11'

What's the final output supposed to look like?

RE: Find today's RSS entries with feedparser - snippsat - Jun-12-2019

Here it is put together.

    import feedparser
    from datetime import datetime

    dgtw = feedparser.parse('https://investorshub.advfn.com/boards/rss.aspx?board_id=22658')
    # Day of the month as a number, e.g. 12
    today = datetime.today().day
    # Substring to look for in strings like 'Tue, 11 Jun 2019 16:07:41 GMT'
    today_match = f', {today}'
    count = 0
    for n in range(50):
        p_date = dgtw['entries'][n]['published']
        if today_match in p_date:
            count += 1

    print(f'published entries {datetime.today().ctime()},til now is <{count}>')

You can set the number to 50 as here, which is how many entries the feed returns and more than will ever be published in a day, so the count will be correct. A number higher than the feed's entry count would raise an IndexError.

RE: Find today's RSS entries with feedparser - Biks - Jun-12-2019

Hey this is great! Thanks! The last thing I did was toss the final number into a variable:

    total = (f'{count}')

I'm tossing that number into a Google sheet. (I managed to figure out how to do that by myself) :P Thanks again!
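
A note on the substring match used in the replies above: a pattern like ', 12' only checks the day of the month, so it would also match the 12th of any other month still in the feed, and a single-digit day such as ', 1' would also match ', 11'. Below is a minimal sketch of a stricter variant that parses each string and compares full calendar dates. It assumes the published strings are RFC 822 style dates like the ones printed earlier in the thread; count_published_on is a hypothetical helper name, not part of feedparser, and the sample list is copied from the output shown above.

```python
from datetime import date, timezone
from email.utils import parsedate_to_datetime


def count_published_on(published_strings, day):
    """Count RFC 822 date strings whose UTC calendar date equals `day`."""
    count = 0
    for text in published_strings:
        try:
            stamp = parsedate_to_datetime(text)
        except (TypeError, ValueError):
            continue  # skip strings that cannot be parsed as dates
        if stamp.tzinfo is None:
            # Assumption: treat a date without a zone as UTC
            stamp = stamp.replace(tzinfo=timezone.utc)
        if stamp.astimezone(timezone.utc).date() == day:
            count += 1
    return count


# Sample data: the ten published strings shown earlier in the thread
sample = [
    'Tue, 11 Jun 2019 16:07:41 GMT',
    'Tue, 11 Jun 2019 15:03:52 GMT',
    'Tue, 11 Jun 2019 14:43:21 GMT',
    'Tue, 11 Jun 2019 13:58:31 GMT',
    'Tue, 11 Jun 2019 12:24:25 GMT',
    'Tue, 11 Jun 2019 03:13:45 GMT',
    'Mon, 10 Jun 2019 21:56:47 GMT',
    'Mon, 10 Jun 2019 19:58:24 GMT',
    'Mon, 10 Jun 2019 19:38:59 GMT',
    'Mon, 10 Jun 2019 14:07:38 GMT',
]

print(count_published_on(sample, date(2019, 6, 11)))  # prints 6
```

With a live feed, something like count_published_on((e['published'] for e in dgtw['entries']), datetime.now(timezone.utc).date()) should give today's count in GMT (with datetime imported from the datetime module). Iterating over the entries directly also avoids hard-coding the number 50.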

RE: Find today's RSS entries with feedparser - abhishek_k_a - Jun-12-2019

    # One option is to filter the entries by the current date
    from datetime import datetime, timezone
    import feedparser

    # Today's date in UTC, to match feedparser's UTC published_parsed
    today = datetime.now(timezone.utc).date()
    count = 0
    dgtw = feedparser.parse('https://investorshub.advfn.com/boards/rss.aspx?board_id=22658')
    for entry in dgtw['entries']:
        # published_parsed is a time.struct_time in UTC; it is missing
        # when feedparser could not parse the entry's date
        parsed = entry.get('published_parsed')
        if parsed and datetime(*parsed[:6]).date() == today:
            count += 1
    print(count)
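
One thing to watch when filtering against the system date: the feed's timestamps are in GMT, while "today" on the local machine may be a different calendar day for entries published near midnight. Here is a small sketch of converting a UTC struct_time (the shape feedparser uses for published_parsed) into a date in a chosen zone. published_local_date is a hypothetical helper name, and the fixed UTC-4 offset is only an illustrative assumption, not the forum poster's actual timezone.

```python
import calendar
import time
from datetime import datetime, timedelta, timezone


def published_local_date(parsed, tz=None):
    """Convert a UTC struct_time to a calendar date in `tz`.

    When `tz` is None, the system's local timezone is used.
    """
    seconds = calendar.timegm(parsed)  # interpret the struct_time as UTC
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return stamp.astimezone(tz).date()


# Stand-in for an entry's published_parsed value (03:13:45 GMT on 11 Jun 2019)
parsed = time.strptime('2019-06-11 03:13:45', '%Y-%m-%d %H:%M:%S')

# Hypothetical fixed offset used only for illustration
utc_minus_4 = timezone(timedelta(hours=-4))

print(published_local_date(parsed, timezone.utc))  # prints 2019-06-11
print(published_local_date(parsed, utc_minus_4))   # prints 2019-06-10
```

Whether to count by GMT date or by local date depends on what "created today" should mean for the reader; the sketch only shows that the two can differ for the same entry.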