Python Forum
Python - Scrapy - CSS selector
#1
Hello everyone! I was experimenting with Scrapy and worked through some examples, but the CSS selectors for Car_Manufacturer, Manufacturer_Model, and Model_Edition return empty lists for some reason.

Here is a quick test:
# -*- coding: utf-8 -*-
import scrapy

class Mybot4Spider(scrapy.Spider):
    name = 'MyBot4'
    start_urls = ['https://www.mytoutou.gr/manufacturers/ford/344/1480/']

    def parse(self, response):
        for content in response.css('div.mtt-uil-clbc'):
            form = response.css('div.FormContainer')
            yield {
                'title': content.css('a::text').extract(),
                'Link': content.css('a::attr(href)').extract(),
                'H1': response.css('div.mtt-uil-category-products > h1::text').extract(),
                'Car_Manufacturer': form.css('span.ui-selectmenu-text').extract(),
                'Manufacturer_Model': form.css('span.ui-selectmenu-text').extract(),
                'Model_Edition': form.css('span.ui-selectmenu-text').extract(),
                'CurrentURL': response.url,
            }
P.S. I noticed that the form uses JavaScript to show the current model, so I'm thinking of splitting the URL and looking up the value for each part.

Here is the quick CSS:
'Manufacturer_Model' : response.css('option[value="3444"]::text').extract()
I have over 20k links to crawl, and this is not the only site I need to crawl, so I was wondering whether I can split the URLs to get the values.

Or if you have a smarter idea for reading the JavaScript, that would be great! :D
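
A minimal sketch of the URL-splitting idea, using only the standard library. The helper name `path_segments` is hypothetical, and whether the numeric segments actually correspond to `option` values in the form is an assumption that has not been checked against the site:

```python
from urllib.parse import urlsplit


def path_segments(url):
    """Return the non-empty path segments of a URL, in order."""
    return [seg for seg in urlsplit(url).path.split('/') if seg]


# The start URL from the spider above.
segments = path_segments('https://www.mytoutou.gr/manufacturers/ford/344/1480/')
print(segments)  # ['manufacturers', 'ford', '344', '1480']

# Assumption: a numeric segment might be usable in a selector such as
# 'option[value="<segment>"]::text'; this mapping is a guess, not confirmed.
numeric_ids = [seg for seg in segments if seg.isdigit()]
```

Inside `parse`, the same helper could be called with `response.url`.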
#2
It looks like the classes you're trying to use are created by JavaScript, so they don't exist in the HTML that Scrapy downloads, but the data itself is available in the page source.
One possibility you have is finding the selected option, e.g.:
'Car_Manufacturer': form.css('#car-manuf option[selected]::text').get(),