Python Forum
Merging selenium and scrapy
I'm working on a school project where I have to scrape a lot of pages under one URL, moving on to the next page once I have extracted all the data from the current one (if that makes sense). I am using Selenium to navigate to the site and log in. However, I want to do the scraping with Scrapy and combine it with the Selenium part. My Selenium code runs without errors, but the Scrapy part isn't requesting the right URL, so it just ends up with status codes 302 and 200. Any help would be appreciated!

Shown below is my scrapy code.

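import scrapy

# Assumption: Z1Item is defined in the project's items.py (default Scrapy layout).
from ..items import Z1Item


class Z2Spider(scrapy.Spider):
    name = 'Z2'
    page_number = 2
    allowed_domains = ['xx']
    start_urls = ['xx']

    def start_requests(self):
        # current_url was never defined inside the spider. Assumption: it is passed
        # in as a spider argument (-a current_url=...), e.g. the URL Selenium ended
        # on after logging in; otherwise fall back to start_urls.
        url = getattr(self, 'current_url', None) or self.start_urls[0]
        yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        items = Z1Item()

        deal_number_var = response.css(".mclbEl a::text").extract()
        # An ID selector starts with "#", not ".#"; the selector must also stay on
        # one line, and extract() has to be called, not just referenced.
        deal_type_var = response.css(
            "#ContentContainer1_ctl00_Content_ListCtrl1_LB1_VDTBL "
            ".mclbEl:nth-child(9)::text"
        ).extract()

        items['deal_number_var'] = deal_number_var
        items['deal_type_var'] = deal_type_var

        yield items

        # The URL prefix is blank in the post; it is left as-is here.
        next_page = '' + str(self.page_number) + '/'
        if self.page_number < 82233:
            self.page_number += 1
            yield response.follow(next_page, callback=self.parse)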
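
A 302 right after a Selenium login can mean that Scrapy is being redirected back to the login page, because it does not share the browser's session. One way to carry the session over is sketched below. It assumes the site keeps the login in cookies, which the post does not confirm. `selenium_cookies_to_dict` is a hypothetical helper name; the input format is the list of dicts that Selenium's `driver.get_cookies()` returns.

```python
def selenium_cookies_to_dict(selenium_cookies):
    """Turn Selenium's cookie list into a plain name -> value dict.

    Selenium's driver.get_cookies() returns a list of dicts, each with at
    least the keys 'name' and 'value'. Scrapy's Request accepts a simple
    dict for its `cookies` argument, so only those two keys are kept.
    """
    return {cookie['name']: cookie['value'] for cookie in selenium_cookies}


# Example with made-up cookie data (not taken from the real site).
example_cookies = [
    {'name': 'sessionid', 'value': 'abc123', 'domain': 'example.invalid'},
    {'name': 'csrftoken', 'value': 'def456'},
]
cookie_dict = selenium_cookies_to_dict(example_cookies)
```

In the spider, the result could then be passed along with the first request, for example `scrapy.Request(url=url, cookies=cookie_dict, callback=self.parse)`, with the cookies taken from `driver.get_cookies()` after the login step. Whether this removes the 302 depends on how the site handles sessions.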