Sep-30-2019, 05:18 PM
Hello,
I downloaded Scrapy and went through the tutorials, but I am still trying to understand the selector
aspect of scraping. So I thought I would scrape a different quotes web page:
https://www.keepinspiring.me/famous-quotes/
I created a new project and spider:
# -*- coding: utf-8 -*-
import scrapy


class InspiderSpider(scrapy.Spider):
    name = 'inspider'
    allowed_domains = ['https://www.keepinspiring.me/famous-quotes/']
    start_urls = ['https://www.keepinspiring.me/famous-quotes//']

    def parse(self, response):
        for quotes in response.css('div.author-quotes'):
            yield {
                'text': quotes.css('span.text::text').extract_first(),
                'author': quotes.css('span.quote-author-name::text').extract_first()
            }

I can extract the authors, but I have had no luck with the quote text.
Output:
{"text": null, "author": "-Dr. Suess"},
{"text": null, "author": "-Marilyn Monroe"},
{"text": null, "author": null},
{"text": null, "author": "-Stephen King"},
{"text": null, "author": "-Mark Caine"},
{"text": null, "author": "-Helen Keller"},
.....
When I examine the quote element and copy its XPath, I get:
//*[@id="entry-4812"]/div/div[1]/div[6]/text()
Any help appreciated,
Joe
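
The copied XPath ends in div[6]/text(), which suggests the quote may be a bare text node sitting directly inside the div rather than inside a span with class "text". The sketch below illustrates that idea with the standard library only. The markup in SAMPLE is hypothetical, an assumption about the page structure and not the real page, and the helper names extract_quotes and has_text_span are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (an assumption inferred from the copied XPath);
# the real page may be structured differently.
SAMPLE = """
<div class="entry">
  <div class="author-quotes">Example quote one. <span class="quote-author-name">-Author A</span></div>
  <div class="author-quotes">Example quote two. <span class="quote-author-name">-Author B</span></div>
</div>
""".strip()


def has_text_span(markup):
    """Return True if any quote div contains a span with class 'text'."""
    root = ET.fromstring(markup)
    return root.find(".//div[@class='author-quotes']/span[@class='text']") is not None


def extract_quotes(markup):
    """Collect the div's own text node as the quote, plus the author span."""
    root = ET.fromstring(markup)
    results = []
    for div in root.findall(".//div[@class='author-quotes']"):
        # The quote is the text that comes before the first child element.
        text = (div.text or "").strip()
        author_el = div.find("span[@class='quote-author-name']")
        author = author_el.text if author_el is not None else None
        results.append({"text": text, "author": author})
    return results


items = extract_quotes(SAMPLE)
for item in items:
    print(item)
print("span.text present:", has_text_span(SAMPLE))
```

If the real page matches this assumed structure, the equivalent change in the spider would presumably be to read the div's own text node, for example with quotes.xpath('./text()').extract_first(), instead of looking for span.text.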