Python Forum

I tried the search box but didn't find a related post, and Googling didn't turn up an answer either.

How can I extract only the anchor text in a given hyperlink?

Quote:I.E. <a href='mydomain.com'>my anchor text</a>

Quote:example:
<div class="blog_next_page">
<a class="next_page" href="mydomain.com/page/2">my anchor text</a>
</div>

I loaded the page with

scrapy shell 'website url'

Using

response.css('div.blog_next_page > a::attr(href)').extract_first()

I can extract the link, but how can I get "my anchor text"?

Many thanks for the help!

Try

response.css('div.blog_next_page > a::text').extract_first()

Scrapy Selectors docs

(Jul-21-2018, 06:26 AM) buran wrote:
try
response.css('div.blog_next_page > a::text').extract_first()
Scrapy Selectors docs

It works!

I was messing around with putting 'text' inside attr(), or writing a::text(), geez...

So is 'text' alone just part of the selector string, or does it sniff out a string at the a tag?

Thanks again!

soothsayerpg

buran

soothsayerpg