Python Forum
Looking for direction - Scrapy, crawler, Beautiful Soup
#1
Hello All,

I am looking for recommendations on a good starting point for building my first Python scraper script.
I am not even sure whether I should use Scrapy, Beautiful Soup, or something else.
I have watched a few dozen tutorials on Scrapy, Beautiful Soup, and others, but none showed what I am really looking for.

I am a big fan of travel, all-inclusive deals, and visiting many countries.

I follow many suppliers, but I always need to go back every day, sometimes every hour, to refresh the search.

The idea is to create a Python script to automate this search and push the data into a spreadsheet.

The main issue I am facing is the multiple options on each supplier's site, drop-downs for example:
From, To, Destination, Date, Duration (length), Hotels, Star rating... Then I press Search.

I was able to create a script for a simple search, where there is only one option and a single URL covers the search, and push the results to a .csv file. This case is different because of the drop-down options and the multiple criteria in the selection.

https://voyagesarabais.com is where I normally look, but there are other URLs where we have to select multiple options before getting the prices and the list of locations/resorts.

Any recommendations on a module or package to use for this mini-project?

Thanks,
Sylvain
#2
You need a web crawler and a scraper.

The web crawler (an indexer) looks at all sites, or a filtered list of sites, to determine whether they are suitable for scraping, based on the subject 'Travel'.
The scraper then extracts the data you want from those pages.
Both can live in the same module, but it makes sense to split them into two separate modules.

Scrapy (https://scrapy.org/) is a common web crawler/scraper. I haven't used it, as I prefer to write my own, but it is popular.

Here's a tutorial to help with Scrapy: https://www.datacamp.com/community/tutor...apy-python

You can write your own as well.
Find packages here: https://pypi.org/search/?q=%22web+crawler%22&o=
For blogs, google 'web crawler python 3'.
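
If you do write your own, the drop-down problem from the first post can often be reduced to generating one search URL per combination of options. Here is a minimal sketch using only the standard library. It assumes the site accepts the search criteria as GET query parameters, which you would need to confirm in your browser's developer tools; the endpoint, parameter names, and option values below are all made up for illustration.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the real search URL of the supplier
BASE_URL = "https://example.com/search"

# Each key stands for one drop-down, each list for the options to try.
# Names and values are invented placeholders, not taken from any real site.
OPTIONS = {
    "from": ["YUL", "YQB"],
    "to": ["CUN", "PUJ"],
    "duration": ["7"],
    "stars": ["4", "5"],
}


def build_search_urls(base_url, options):
    """Return one URL per combination of the drop-down options."""
    keys = list(options)
    urls = []
    for combo in product(*(options[key] for key in keys)):
        query = urlencode(dict(zip(keys, combo)))
        urls.append(f"{base_url}?{query}")
    return urls


urls = build_search_urls(BASE_URL, OPTIONS)
print(len(urls), "searches to run")
print(urls[0])
```

Each URL can then be fetched and scraped the same way as a single-URL search. If the site builds its results with JavaScript or a POST request instead, this approach will not work as-is and a browser automation tool may be needed.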

Web scraper - I suggest the tutorials on this forum:
web scraping part 1
web scraping part 2
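
For the scraping and spreadsheet side, here is a rough sketch of pulling hotel names and prices out of a results page and writing them as CSV. It uses only the standard library so it runs anywhere; Beautiful Soup would make the parsing shorter. The HTML and the class names "hotel" and "price" are invented for the example, so the real page will need its own selectors.

```python
import csv
import io
from html.parser import HTMLParser

# Made-up markup standing in for one page of search results
SAMPLE_HTML = """
<div class="deal"><span class="hotel">Hotel A</span><span class="price">1299</span></div>
<div class="deal"><span class="hotel">Hotel B</span><span class="price">1450</span></div>
"""


class DealParser(HTMLParser):
    """Collect the text of <span class="hotel"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("hotel", "price"):
            self._field = cls
            if cls == "hotel":
                self.rows.append({})  # a hotel span starts a new row

    def handle_data(self, data):
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def deals_to_csv(html, out):
    """Parse the HTML and write one CSV row per deal to the file-like `out`."""
    parser = DealParser()
    parser.feed(html)
    writer = csv.DictWriter(out, fieldnames=["hotel", "price"])
    writer.writeheader()
    writer.writerows(parser.rows)
    return parser.rows


# StringIO keeps the example self-contained; for a real file use
# open("deals.csv", "w", newline="") instead.
buffer = io.StringIO()
rows = deals_to_csv(SAMPLE_HTML, buffer)
print(buffer.getvalue())
```

The resulting .csv file opens directly in a spreadsheet program.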
#3
Thanks for the quick reply, Larz60+!

I will definitely look into it.

Thanks for your directions.
Regards,
Sylvain

