Nov-07-2018, 01:57 PM
Hello everyone! I was experimenting with Scrapy and worked through some examples, but my CSS selectors for Car_Manufacturer, Manufacturer_Model, and Model_Edition return empty brackets for some reason.
Here is a quick test:
# -*- coding: utf-8 -*-
import scrapy


class Mybot4Spider(scrapy.Spider):
    name = 'MyBot4'
    start_urls = ['https://www.mytoutou.gr/manufacturers/ford/344/1480/']

    def parse(self, response):
        for content in response.css('div.mtt-uil-clbc'):
            form = response.css('div.FormContainer')
            yield {
                'title': content.css('a::text').extract(),
                'Link': content.css('a::attr(href)').extract(),
                'H1': response.css('div.mtt-uil-category-products > h1::text').extract(),
                'Car_Manufacturer': form.css('span.ui-selectmenu-text').extract(),
                'Manufacturer_Model': form.css('span.ui-selectmenu-text').extract(),
                'Model_Edition': form.css('span.ui-selectmenu-text').extract(),
                'CurrentURL': response.url
            }

P.S. I saw that the form relies on JavaScript to show the current model, so I'm thinking of splitting the URL and getting the value for each URL.
Here is the quick CSS:
'Manufacturer_Model': response.css('option[value="3444"]::text').extract()

I have over 20k links to crawl, and this is not the only one to crawl, so I was wondering whether I can split the URLs to get the value.
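A rough sketch of the URL-splitting idea, using only the standard library. The helper names `split_listing_url` and `option_selector` are made up for illustration, and it is only an assumption that the numeric parts of the URL are the same ids used as `<option>` values in the form; that would need to be checked against the page source.

```python
from urllib.parse import urlparse


def split_listing_url(url):
    """Return the non-empty path segments of a listing URL (hypothetical helper)."""
    path = urlparse(url).path
    return [segment for segment in path.split('/') if segment]


def option_selector(value):
    """Build a CSS selector for the <option> whose value attribute equals `value`."""
    return 'option[value="{}"]::text'.format(value)


url = 'https://www.mytoutou.gr/manufacturers/ford/344/1480/'
segments = split_listing_url(url)

# Assumption: the purely numeric segments are the ids used as <option> values.
numeric_ids = [segment for segment in segments if segment.isdigit()]

# One selector per id, ready to pass to response.css(...) inside parse().
selectors = [option_selector(numeric_id) for numeric_id in numeric_ids]
```

Inside `parse`, each selector could then be tried with `response.css(selector).extract()` against `response.url`. If the `<option>` elements are themselves filled in by JavaScript, this would still come back empty.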
Or, if you have a smarter idea for reading the JavaScript, that would be great! :D