Python Forum
Scraping Websites to post on Telegram
#1
I am new to Python but understand the basics. I am trying to write a script that goes through a list of websites, news articles, or RSS feeds and looks for keywords. When it finds any keyword from a list I provide, it should send a message with a link to the article to a group I created on Telegram.

Ultimately I am trying to stay up to date with new articles about 8K gaming. It's something that interests me, so I am searching for PlayStation, Xbox, and PC topics that also have to do with 8K, gaming, VR, or TVs. I am putting together a list of websites that tend to cover this type of tech so I don't have to check daily for something that might only come up once or twice a month if I'm lucky.

I have found a lot of really useful guides that have given me some good info, but I just don't know how to put it all together.
  • I know that I need to use BeautifulSoup to parse the HTML from the feeds.
  • I created the bot with BotFather on Telegram and have the API token.
  • I assume I will need a list of all the RSS feeds/websites I want to pull information from, so I can loop over each of them.
  • Inside that first loop, it would also need to go through the keywords I want to search for on each page, checking them before moving on to the next feed in the list. If it finds any of the keywords, I want it to skip the rest so I don't get the same link four times. I assume I could leave the loop once a variable becomes true?
  • So far I have been able to find the information on a single page, but I have not been able to do it across multiple pages or get a link to the article.
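
The steps in the list above could be put together roughly like this. It is only a sketch under several assumptions: it handles plain RSS 2.0 feeds only (`<item>` elements with `<title>`, `<description>`, and `<link>`), it swaps in the standard library's `xml.etree.ElementTree` for BeautifulSoup so that nothing needs installing, and every function name here is made up for illustration. The one external piece is the Telegram Bot API `sendMessage` method, which takes `chat_id` and `text`. Matching is a plain substring test, so a short keyword like "vr" will also hit longer words that contain it.

```python
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def matching_links(rss_xml, keywords):
    """Return links of RSS items whose title or description mentions a keyword."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        summary = item.findtext("description", default="")
        text = f"{title} {summary}".lower()
        # any() stops at the first keyword that matches, so an article is
        # added at most once no matter how many keywords it contains.
        if any(keyword.lower() in text for keyword in keywords):
            link = item.findtext("link", default="").strip()
            if link:
                links.append(link)
    return links


def build_send_message_request(token, chat_id, text):
    """Build the URL and form body for the Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    body = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return url, body


def send_message(token, chat_id, text):
    """POST the message to Telegram and return the decoded JSON reply."""
    url, body = build_send_message_request(token, chat_id, text)
    with urllib.request.urlopen(url, data=body, timeout=10) as response:
        return json.load(response)


def fetch(url):
    """Download a feed and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def run(feed_urls, keywords, token, chat_id):
    """Outer loop over feeds; send one message per matching article."""
    for feed_url in feed_urls:
        for link in matching_links(fetch(feed_url), keywords):
            send_message(token, chat_id, link)
```

Calling `run()` with your own list of feed URLs, your keyword list, the token from BotFather, and the group's chat id would do one pass over every feed. Using `any()` takes the place of the "leave the loop once a variable becomes true" idea, since it stops checking keywords as soon as one matches.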

I posted this previously but somehow missed this part of the forum, so maybe posting here will help a little more.
This is about as far as I have gotten, and I have more than enough desire to learn how to do this if you know of any place that can show me more. It's kind of an odd request, so googling a how-to was a little hard. I had to take it piece by piece.

Thank you.
#2
A simple way to get unique elements is to use a set. Put your links in a list, wrap the list in set(), and the duplicates are removed.
>>> set(['link','link','link','link2','link2'])
{'link', 'link2'}
Feed-parsing libraries already exist (feedparser, for example), so you do not have to parse the RSS yourself.
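
One thing to keep in mind is that a set does not keep the order the links were found in. If that order matters, a small helper along these lines might do. It is only a sketch, and `unique_in_order` is an illustrative name, not something from a library.

```python
def unique_in_order(links):
    """Drop duplicate links while keeping the first occurrence of each."""
    seen = set()
    result = []
    for link in links:
        if link not in seen:
            seen.add(link)
            result.append(link)
    return result


print(unique_in_order(['link', 'link', 'link', 'link2', 'link2']))
```

The set is still doing the duplicate check here; the list only remembers the order.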