Python Forum
No data when using scrapy to get data
#1
Hi all, I have managed to write my first Scrapy spider, but when I run it I get no data from the site. There are no errors. I think I know the issue: the page needs to finish loading before my code runs, but I am not sure how to do this.

Here is my code:

# -*- coding: utf-8 -*-
import scrapy
from ..items import SydneycheckItem


class SydneyflightcheckSpider(scrapy.Spider):
    name = 'sydneyfc'

    start_urls = [
        'https://www.sydneyairport.com.au/flights/?query=&flightType=departure&terminalType=domestic&date=2019-11-10&sortColumn=scheduled_time&ascending=true&showAll=true'
    ]

    def parse(self, response):
        items = SydneycheckItem()

        # Each selector returns a list of every matching text node on the page.
        destinationname = response.css('.destination-name::text').extract()
        airlinename = response.css('.with-image').css('::text').extract()
        # airlinelogo = response.css('.img:attr(src)').extract()
        flightnumber = response.css('.flight-numbers').css('::text').extract()
        scheduled = response.css('.large-scheduled-time').css('::text').extract()
        estimated = response.css('.estimated-time').css('::text').extract()
        status = response.css('.status-container').css('::text').extract()

        # Keys must match the field names declared in SydneycheckItem exactly
        # (no stray whitespace); an unknown key raises KeyError.
        items['destination_name'] = destinationname
        items['airlinename'] = airlinename
        # items['airlinelogo'] = airlinelogo
        items['flightnumber'] = flightnumber
        items['scheduled'] = scheduled
        items['estimated'] = estimated
        items['status'] = status

        yield items
#2
Looks like the data is loaded from an API using JavaScript, so it is not in the HTML that Scrapy downloads.
The easiest way to get it yourself would be to request it from the API directly, which will give you the data as JSON.

The API url is https://www.sydneyairport.com.au/_a/flights<query_parameters>
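As a rough sketch of how the API URL could be derived from the page URL in the first post: the helper below only prefixes the path with `/_a` and leaves the query string untouched. The function name `to_api_url` is made up for illustration, and whether the API wants the trailing slash after `flights` is an assumption worth checking in the browser's network tab.

```python
from urllib.parse import urlsplit, urlunsplit

# Page URL taken from the spider in the first post.
PAGE_URL = (
    "https://www.sydneyairport.com.au/flights/"
    "?query=&flightType=departure&terminalType=domestic&date=2019-11-10"
    "&sortColumn=scheduled_time&ascending=true&showAll=true"
)


def to_api_url(page_url: str) -> str:
    """Hypothetical helper: prefix the path with /_a, keep the query as-is."""
    parts = urlsplit(page_url)
    # Assumption: the rest of the path (including the trailing slash) is unchanged.
    return urlunsplit(parts._replace(path="/_a" + parts.path))


print(to_api_url(PAGE_URL))
```

Opening the printed URL in a browser is a quick way to confirm that it returns JSON rather than HTML.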
#3
(Nov-11-2019, 06:57 AM)stranac Wrote: Looks like the data is loaded from an API using javascript. The easiest way to get it yourself would be requesting it from the API directly, which will give you the data as json. The API url is https://www.sydneyairport.com.au/_a/flights


OK, great. How would I change my code to do this? I'm still really new at this.
#4
First change your start url to fetch the data from the API (just add the _a part to your url).
Then use the json module to load the data from response.text.

At this point you'll have a dict and you can just choose what parts you want to keep.