Python Forum
How to get the href value of a specific word in the html code - Printable Version




How to get the href value of a specific word in the html code - julio2000 - Mar-05-2020

I want to write code that gets the href=... of a specific product name in the HTML. Part of the HTML looks like this:

<a class="name-link" href="/shop/pants/xz7ypjoam/i4lsmo5e7">Cargo Pant</a>

There are more parts in the HTML that look like this, because there are multiple products on the website (Supreme). I have the name of the product -> Cargo Pant. Does anyone know how I can get the href value from this specific line instead of another line in the HTML that also includes an href= ? This is my code, which prints all the href links on the page. But I just want the specific link of the product with Tupac Hologram Tee as the name.
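from requests_html import HTMLSession

url = 'https://www.supremenewyork.com/shop/all/pants'
session = HTMLSession()
r = session.get(url)
# r.html.links is a set containing every link found on the page
word = r.html.links
print(word)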
This is my code so far. It prints all the href values on the page, but I just want one specific value. The URL of the page is:
https://www.supremenewyork.com/shop/all/pants


RE: How to get the href value of a specific word in the html code - snippsat - Mar-05-2020

As an example, here is how to get the Cargo Pant link.
I will use a CSS selector, which can target a specific tag.
import requests
from bs4 import BeautifulSoup

url = 'https://www.supremenewyork.com/shop/all/pants'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
cargo_pants = soup.select_one('#container > article:nth-child(4) > div > a')
Usage for getting href and src.
Output:
>>> cargo_pants
<a href="/shop/pants/xz7ypjoam/yd2ny9jom" style="height:150px;"><img alt="Zhlkcvj 5dw" height="150" src="//assets.supremenewyork.com/187255/vi/zhlKcVj_5dw.jpg" width="150"/></a>
>>> cargo_pants.get('href')
'/shop/pants/xz7ypjoam/yd2ny9jom'
>>> cargo_pants.img.get('src')
'//assets.supremenewyork.com/187255/vi/zhlKcVj_5dw.jpg'
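The href and src values in that output are relative, so they still need to be combined with the page URL before they can be opened or requested. A minimal sketch using the standard library's urljoin; the variable names here are only illustrative and not part of the original reply:

```python
from urllib.parse import urljoin

# Page the links were scraped from (URL taken from the thread).
page_url = "https://www.supremenewyork.com/shop/all/pants"

# Values as returned by cargo_pants.get('href') and cargo_pants.img.get('src').
href = "/shop/pants/xz7ypjoam/yd2ny9jom"
src = "//assets.supremenewyork.com/187255/vi/zhlKcVj_5dw.jpg"

# urljoin resolves root-relative paths and protocol-relative URLs
# against the page URL.
product_url = urljoin(page_url, href)
image_url = urljoin(page_url, src)

print(product_url)
print(image_url)
```

The same call works for any other relative link taken from the page.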
Note that you are using requests-html, which you should mention, as it is not at all common knowledge.
I would recommend learning the more standard way, as I have shown here.
To do the same with requests-html you must read its documentation; it supports both CSS selectors and XPath selectors, so it should work there too.
As I mentioned in your other post, the lack of updates and responses in the issue tracker is a concern for this library.


RE: How to get the href value of a specific word in the html code - julio2000 - Mar-05-2020

Yeah, I understand those lines. But the thing is that on a Supreme drop day, when they release limited items which sell out fast, I don't know where on the website the product I want to buy will be placed (nor its CSS selector). That's why I want to get the href value (product link) that belongs to the wanted product name, which is (in this example) Cargo Pant. So I want to write code that finds the href value from the wanted name in the HTML. But I don't really know how to :/
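
One possible approach, sketched here under some assumptions: instead of selecting by position, loop over every a tag with class name-link (the class shown in the first post) and compare its text with the wanted name. The helper find_product_href and the inline SAMPLE_HTML are hypothetical; only the Cargo Pant anchor comes from the thread, the second product is made up for illustration. On the live site the string passed in would be response.content from the requests example above.

```python
from bs4 import BeautifulSoup

# Inline stand-in for the real page. Only the first anchor is from the
# thread; the second product is invented for illustration.
SAMPLE_HTML = """
<div id="container">
  <article><div>
    <a class="name-link" href="/shop/pants/xz7ypjoam/i4lsmo5e7">Cargo Pant</a>
  </div></article>
  <article><div>
    <a class="name-link" href="/shop/pants/example/abc123">Work Pant</a>
  </div></article>
</div>
"""


def find_product_href(html, product_name):
    """Return the href of the first a.name-link whose text matches product_name.

    The comparison ignores case and surrounding whitespace.
    Returns None when no link matches.
    """
    soup = BeautifulSoup(html, "html.parser")
    wanted = product_name.strip().lower()
    for link in soup.find_all("a", class_="name-link"):
        if link.get_text(strip=True).lower() == wanted:
            return link.get("href")
    return None


print(find_product_href(SAMPLE_HTML, "Cargo Pant"))
```

If the same name appears more than once on the page, this sketch returns only the first match; collecting all matches into a list would be a small change to the loop.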