Python Forum
[Scrapy] web scrape help
#2
The selector span.text::text tries to select the text of a span with the class text.
No such element exists; the text is placed directly in the div.

A CSS selector that works here is simply ::text.
This technically selects all the text nodes inside the div (including the author), but .extract_first() returns only the first one, which is the text you are after.

An alternative is an XPath such as ./text(), which selects only the text nodes that are direct children of the div.

A couple of non-selector-related notes:
  • Your allowed_domains is being ignored since it contains full URLs instead of domains (the attribute is optional, so your code still works)
  • You should use .get() instead of .extract_first(); it has been the recommended API for a while now
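For the allowed_domains note, this is a rough sketch of how full URLs can be reduced to the bare domains that allowed_domains expects. It uses only the standard library; the to_domain helper and the example URLs are hypothetical and not part of Scrapy or of the original spider.

```python
from urllib.parse import urlparse


def to_domain(entry: str) -> str:
    """Return the bare host name for an entry that may be a full URL."""
    # Entries without a scheme get a leading "//" so urlparse treats
    # the first component as the network location.
    parsed = urlparse(entry if "://" in entry else f"//{entry}")
    return parsed.hostname or entry


# Hypothetical entries as they might appear in a spider.
entries = ["http://quotes.example.com/page/1/", "example.org"]

# This list is the shape allowed_domains should have: domains only.
allowed_domains = [to_domain(e) for e in entries]
print(allowed_domains)
```

In practice it is simpler to write the domains by hand in the spider; the helper is just a way to see what the difference looks like.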


Messages In This Thread
[Scrapy] web scrape help - by joe_momma - Sep-30-2019, 05:18 PM
RE: [Scrapy] web scrape help - by stranac - Sep-30-2019, 08:21 PM
RE: [Scrapy] web scrape help - by joe_momma - Oct-01-2019, 12:44 AM
