Python Forum

Extract Anchor Text (Scrapy)
I tried the search box but didn't find a related post, and Googling it didn't turn up an answer either.

How can I extract only the anchor text in a given hyperlink?

Quote: e.g. <a href='mydomain.com'>my anchor text</a>

Quote: example:
<div class="blog_next_page">
<a class="next_page" href="mydomain.com/page/2">my anchor text</a>
</div>

I opened the page with
scrapy shell 'website url'
and, using
response.css('div.blog_next_page > a::attr(href)').extract_first()
I can extract the link, but how can I get "my anchor text"?

Many thanks for the help!
Try:
response.css('div.blog_next_page > a::text').extract_first()
Scrapy Selectors docs
(Jul-21-2018, 06:26 AM) buran wrote: try
response.css('div.blog_next_page > a::text').extract_first()
Scrapy Selectors docs

It works!

I was messing around with putting 'text' inside attr(), or writing a::text(), geez...

So does '::text' on its own just pick up the string inside the a-tag?

Thanks again!