Python Forum
Python - Scrapy - CSS selector
#1
Hello everyone! I was experimenting with Scrapy and worked through some examples, but my CSS selectors for Car_Manufacturer, Manufacturer_Model and Model_Edition are returning empty lists for some reason.



here is a quick test:
# -*- coding: utf-8 -*-
import scrapy

class Mybot4Spider(scrapy.Spider):
    name = 'MyBot4'
    start_urls = ['https://www.mytoutou.gr/manufacturers/ford/344/1480/']

    def parse(self, response):
        for content in response.css('div.mtt-uil-clbc'):
            form = response.css('div.FormContainer')
            yield {
                'title': content.css('a::text').extract(),
                'Link': content.css('a::attr(href)').extract(),
                'H1': response.css('div.mtt-uil-category-products > h1::text').extract(),
                'Car_Manufacturer': form.css('span.ui-selectmenu-text').extract(),
                'Manufacturer_Model': form.css('span.ui-selectmenu-text').extract(),
                'Model_Edition': form.css('span.ui-selectmenu-text').extract(),
                'CurrentURL': response.url,
            }
P.S. I noticed that the form relies on JavaScript to show the current model, so I'm thinking of splitting the URL and getting the value for each URL.

here is the quick css:
'Manufacturer_Model' : response.css('option[value="3444"]::text').extract()
I have over 20k links to crawl, and this is not the only one to crawl, so I was wondering whether I could split the URLs to get the values.

Or if you have a smarter idea for reading the JavaScript, that would be great! :D
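For reference, the URL-splitting idea could be sketched roughly like this, using only the standard library. The helper name `path_segments` is made up for illustration, and which segment corresponds to the manufacturer, model or edition is an assumption that would need checking against the site.

```python
from urllib.parse import urlsplit


def path_segments(url):
    """Return the non-empty path segments of a URL (hypothetical helper)."""
    return [segment for segment in urlsplit(url).path.split('/') if segment]


url = 'https://www.mytoutou.gr/manufacturers/ford/344/1480/'
segments = path_segments(url)
print(segments)  # ['manufacturers', 'ford', '344', '1480']

# Assumption: the meaning of each segment is not confirmed by the site,
# so the raw segments are kept as-is rather than labelled.
```

Inside the spider, the same helper could be called with `response.url`, and the resulting segments used to build a selector such as the `option[value="..."]` one above.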
#2
It looks like the classes you're trying to use are created by JavaScript, so they don't exist in the HTML that Scrapy downloads, but the data itself is available in the page source.
One possibility you have is finding the selected option, e.g.:
'Car_Manufacturer': form.css('#car-manuf option[selected]::text').get(),