Python Forum
Extract Anchor Text (Scrapy) - Printable Version


Extract Anchor Text (Scrapy) - soothsayerpg - Jul-21-2018

I tried the search box but didn't find any related post. I also tried Googling it but didn't find an answer.

How can I extract only the anchor text in a given hyperlink?

Quote: e.g. <a href='mydomain.com'>my anchor text</a>

Quote: example:
<div class="blog_next_page">
<a class="next_page" href="mydomain.com/page/2">my anchor text</a>
</div>

I opened the page with
scrapy shell 'website url'
Using
response.css('div.blog_next_page > a::attr(href)').extract_first()
I can extract the link, but how can I get "my anchor text"?

Many thanks for the help!


RE: Extract Anchor Text (Scrapy) - buran - Jul-21-2018

try
response.css('div.blog_next_page > a::text').extract_first()
Scrapy Selectors docs


RE: Extract Anchor Text (Scrapy) - soothsayerpg - Jul-21-2018

(Jul-21-2018, 06:26 AM)buran Wrote: try
response.css('div.blog_next_page > a::text').extract_first()
Scrapy Selectors docs

It works!

I was messing around with putting 'text' inside attr(), or writing a::text(), geez...

So does 'text' on its own just mean the string inside the tag, or does it sniff out a string at the a-tag?

Thanks again!
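
On the follow-up question: as far as the Scrapy Selectors docs describe it, `::text` is a pseudo-element that selects the text nodes sitting directly inside the matched element, so `a::text` picks up the text between `<a ...>` and `</a>` but not text nested in child tags. The sketch below is not Scrapy's implementation. It uses only the standard library to illustrate the idea, and the class name `DirectTextCollector` and the sample markup (with an extra `<b>` tag) are assumptions made up for the illustration.

```python
from html.parser import HTMLParser

# Tags without a closing tag; skipped so the open-element stack stays balanced.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class DirectTextCollector(HTMLParser):
    """Collect text nodes that are direct children of `div.blog_next_page > a`."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open elements as (tag, set_of_classes)
        self.texts = []  # text nodes found directly inside the target <a>

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        self.stack.append((tag, classes))

    def handle_endtag(self, tag):
        # Assumes well-formed markup: only pop when the tag matches the top.
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()

    def handle_data(self, data):
        if len(self.stack) < 2:
            return
        (parent_tag, parent_classes), (tag, _) = self.stack[-2], self.stack[-1]
        # Keep the text only when the innermost open element is the <a>
        # whose parent is <div class="blog_next_page">.
        if tag == "a" and parent_tag == "div" and "blog_next_page" in parent_classes:
            self.texts.append(data)


html = (
    '<div class="blog_next_page">'
    '<a class="next_page" href="mydomain.com/page/2">my <b>anchor</b> text</a>'
    '</div>'
)

parser = DirectTextCollector()
parser.feed(html)
parser.close()
print(parser.texts)  # ['my ', ' text']
```

The word inside `<b>` is left out because it belongs to the child element, not to the `<a>` itself. In Scrapy, writing the selector with a space (`a ::text`) selects the text of descendants as well.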