Python Forum

Merging Selenium and Scrapy
I'm working on a school project where I have to scrape a lot of pages from one URL, moving on to the next page once I have extracted all the data from the current one (if that makes sense). I am using Selenium to navigate to the page and log in, but I want to do the scraping with Scrapy and combine the two. My Selenium code runs without errors, but the Scrapy part isn't scraping the right URL, so it just ends up with status codes 302 and 200. Any help would be appreciated!

Shown below is my scrapy code.

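import scrapy

# Assumption: Z1Item lives in the project's items.py; adjust the path to your project.
from ..items import Z1Item


class Z2Spider(scrapy.Spider):
    name = 'Z2'
    page_number = 2
    allowed_domains = ['xx']
    start_urls = ['xx']

    def start_requests(self):
        # current_url is not defined in this snippet; presumably it is set by
        # the Selenium code (not shown), e.g. from driver.current_url after login.
        yield scrapy.Request(url=current_url, callback=self.parse)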

    def parse(self, response):
        items = Z1Item()

        deal_number_var = response.css(".mclbEl a::text").extract()
        # An ID selector starts with "#", not ".#"; extract() needs parentheses to be called.
        deal_type_var = response.css(
            "#ContentContainer1_ctl00_Content_ListCtrl1_LB1_VDTBL .mclbEl:nth-child(9)::text"
        ).extract()

        items['deal_number_var'] = deal_number_var
        items['deal_type_var'] = deal_type_var

        yield items

        # Build the relative URL of the next page and follow it until the last page.
        next_page = '' + str(self.page_number) + '/'
        if self.page_number < 82233:
            self.page_number += 1
            yield response.follow(next_page, callback=self.parse)
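
One possible cause of the 302 is that Scrapy opens its own HTTP session and never sees the login that Selenium performed, so the site redirects it (often to a login page, which then returns 200). A minimal sketch of one way to carry the login over, assuming the site tracks the session with cookies: take the list returned by Selenium's driver.get_cookies() and turn it into a plain dict that can be passed to scrapy.Request through its cookies argument. The helper name and the sample cookie values below are hypothetical and only illustrate the shape of the data.

```python
def selenium_cookies_to_dict(selenium_cookies):
    """Convert Selenium's cookie list into a plain {name: value} dict.

    driver.get_cookies() returns a list of dicts that each carry at least
    'name' and 'value'; scrapy.Request accepts such a dict via cookies=.
    """
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


# Hypothetical sample in the shape returned by driver.get_cookies().
sample_cookies = [
    {"name": "sessionid", "value": "abc123", "domain": "xx", "path": "/"},
    {"name": "csrftoken", "value": "xyz789", "domain": "xx", "path": "/"},
]

cookie_dict = selenium_cookies_to_dict(sample_cookies)
print(cookie_dict)
```

In the spider, the result would then go into the first request, for example scrapy.Request(url=current_url, cookies=cookie_dict, callback=self.parse). Whether cookies alone are enough depends on the site; some also check headers such as the User-Agent.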