[Scrapy] web scrape help

***stranac*** · Sep-30-2019, 08:21 PM

span.text::text tries to select the text of a span with the class text.
No such element exists; the text is placed directly in the div.

A CSS selector that would work here is simply ::text.
This technically selects all the text nodes inside the div (including the author), but .extract_first() returns only the first one, which is the text you are after.

An alternative is an XPath such as ./text(), which selects only the text nodes that are direct children of the div.

A couple of non-selector-related notes:

Your allowed_domains is being ignored because it contains full URLs instead of domains. (The attribute is optional, so your code still works.)
You should use .get() instead of .extract_first(); it has been the recommended API for a while now.
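
For the allowed_domains note, here is one way to derive bare domains from full URLs using only the standard library. The URL below is a placeholder, not the one from your spider:

```python
from urllib.parse import urlparse

# Placeholder start URL, assumed for illustration only.
start_urls = ["https://quotes.example.com/page/1/"]

# allowed_domains expects bare host names, without scheme or path.
allowed_domains = sorted({urlparse(url).hostname for url in start_urls})

print(allowed_domains)
```

You can of course just write the domain by hand; the point is that the scheme and path must not be part of it.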
