Python Forum
Page impossible to scrape? :O
#1
Hi guys,

I have found a page where I can't scrape the shortest flight:

https://www.skyscanner.net/transport/fli...ref=home#/

How come? Is it possible to block a site from being scraped?

import requests
from bs4 import BeautifulSoup

page = 'https://www.skyscanner.net/transport/flights/mpm/tyoa/191008/191015/?adults=1&children=0&adultsv2=1&childrenv2=&infants=0&cabinclass=economy&rtn=1&preferdirects=false&outboundaltsenabled=false&inboundaltsenabled=false&ref=home#/'
r = requests.get(page)
content = r.text
soup = BeautifulSoup(content, 'html.parser')
# Look for the flight "ticket" cards by their full class string
test = soup.find_all(class_='BpkTicket_bpk-ticket__paper__2gPSe BpkTicket_bpk-ticket__main__J31fH BpkTicket_bpk-ticket__main--padded__WIbjx BpkTicket_bpk-ticket__main--horizontal__2MgwA BpkTicket_bpk-ticket__paper--with-notches__19yQc')
print(test)
I guess flight search sites load their results with some kind of data refresh, so the results are not visible to requests? Am I right? In that case I would need some sleep/wait function, right?
#2
If they use JavaScript, you may need to use Selenium.
Check our tutorial - https://python-forum.io/Thread-Web-scraping-part-2
Look for the section "God dammit JavaScript, why do i not get all content" and the one after it.
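
To make the Selenium suggestion concrete, here is a minimal sketch, not a tested recipe for this particular site. It assumes Selenium 4 and a local Chrome install. The helper names (ticket_selector, fetch_ticket_texts), the class prefix and the 30 second timeout are made up for illustration. Instead of a fixed sleep it uses an explicit wait that returns as soon as the JavaScript has rendered matching elements. It also matches on part of the class name, because the hashed suffixes such as __J31fH are likely to change between site builds.

```python
def ticket_selector(class_prefix):
    """Build a CSS selector matching any element whose class attribute
    contains class_prefix (hypothetical helper, not part of Selenium)."""
    return f'[class*="{class_prefix}"]'


def fetch_ticket_texts(url, class_prefix="BpkTicket_bpk-ticket__main", timeout=30.0):
    """Open url in a real browser, wait for the ticket cards, return their text.

    Returns an empty list if nothing matching shows up within `timeout`
    seconds, for example because the site served a bot check instead.
    """
    # Imported here so the selector helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome is available locally
    try:
        driver.get(url)
        try:
            # Explicit wait: polls the page until at least one element matches.
            tickets = WebDriverWait(driver, timeout).until(
                EC.presence_of_all_elements_located(
                    (By.CSS_SELECTOR, ticket_selector(class_prefix))
                )
            )
        except TimeoutException:
            return []
        return [ticket.text for ticket in tickets]
    finally:
        # Always close the browser, even if something above raised.
        driver.quit()
```

You would call it as fetch_ticket_texts(page) with the URL from post #1. An empty list means the wait timed out, so either the selector no longer matches or the results never rendered.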
#3
Thanks buran,

as always, helpful! :)

Now it's time to learn captcha solving! :D