Python Forum
Python - Scrapy - CSS selector
#1
Hello everyone! I was experimenting with Scrapy and worked through some examples, but the CSS selectors for Car_Manufacturer, Manufacturer_Model and Model_Edition return empty lists for some reason.

Here is a quick test:
# -*- coding: utf-8 -*-
import scrapy

class Mybot4Spider(scrapy.Spider):
    name = 'MyBot4'
    start_urls = ['https://www.mytoutou.gr/manufacturers/ford/344/1480/']

    def parse(self, response):
        for content in response.css('div.mtt-uil-clbc'):
            form = response.css('div.FormContainer')
            yield {
                'title': content.css('a::text').extract(),
                'Link': content.css('a::attr(href)').extract(),
                'H1': response.css('div.mtt-uil-category-products > h1::text').extract(),
                'Car_Manufacturer': form.css('span.ui-selectmenu-text').extract(),
                'Manufacturer_Model': form.css('span.ui-selectmenu-text').extract(),
                'Model_Edition': form.css('span.ui-selectmenu-text').extract(),
                'CurrentURL': response.url,
            }
P.S. I saw that the form uses JavaScript to show the current model, so I'm thinking of splitting the URL and looking up the value for each part.

Here is the quick CSS selector:
'Manufacturer_Model' : response.css('option[value="3444"]::text').extract()
I have over 20k links to crawl, and this is not the only one to crawl, so I was thinking I could split the URLs to get the values.
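A minimal sketch of the URL-splitting idea, using only the standard library. The helper name `split_listing_url` is made up for illustration, and what each numeric segment means (manufacturer, model, edition) is an assumption that would have to be checked against the site.

```python
from urllib.parse import urlsplit


def split_listing_url(url):
    """Return the non-empty path segments of a URL, in order."""
    path = urlsplit(url).path
    return [segment for segment in path.split('/') if segment]


segments = split_listing_url('https://www.mytoutou.gr/manufacturers/ford/344/1480/')
# Keep only the purely numeric segments; their meaning is an assumption.
numeric_ids = [segment for segment in segments if segment.isdigit()]

print(segments)     # ['manufacturers', 'ford', '344', '1480']
print(numeric_ids)  # ['344', '1480']
```

Each id could then be plugged into a selector such as `response.css(f'option[value="{some_id}"]::text').get()`, provided the option values on the page really do match the numbers in the URL.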

Or, if you have a smarter idea for reading the values that JavaScript fills in, that would be great! :D
#2
It looks like the classes you're trying to use are created by JavaScript, so they don't exist in the HTML that Scrapy downloads, but the data itself is available in the source.
One possibility is to find the selected option, e.g.:
'Car_Manufacturer': form.css('#car-manuf option[selected]::text').get(),