Python Forum
Extracting links from website with selenium bs4 and python


Extracting links from website with selenium bs4 and python - M1ck0 - Jul-20-2019

Okay so.

The heading might make it seem like this question has already been asked, but I had no luck finding an answer for it.

I need help making a link-extracting program with Python.

Actually, it works. It finds all link elements on a webpage, takes their href="" values and puts them in an array. Then it exports the array to a CSV file, which is what I want.

But I can't get a hold of one thing.

The website is dynamic, so I am using Selenium WebDriver to get the JavaScript-rendered result.

The code for the program is pretty simple. I open the website with the webdriver and then get its content. Then I get all links with

results = driver.find_elements_by_tag_name('a')
Then I loop through the results with a for loop and get each href with

result.get_attribute("href")
I store the results in an array and then print them out.
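
The steps described above could be sketched roughly as follows. This is a minimal illustration, not the poster's actual program: the function names `collect_hrefs` and `write_hrefs_csv` are made up for this example, and `driver` is assumed to be a Selenium WebDriver that has already loaded the page. The generic `find_elements("tag name", "a")` call is used in place of `find_elements_by_tag_name('a')`; `"tag name"` is the locator string behind Selenium's `By.TAG_NAME`.

```python
import csv


def collect_hrefs(driver):
    """Return the href of every <a> element the driver currently sees.

    `driver` is assumed to be a Selenium WebDriver with a page loaded.
    """
    # "tag name" is the locator string behind selenium's By.TAG_NAME.
    anchors = driver.find_elements("tag name", "a")
    hrefs = []
    for anchor in anchors:
        href = anchor.get_attribute("href")
        if href:  # skip anchors that have no href attribute
            hrefs.append(href)
    return hrefs


def write_hrefs_csv(hrefs, fileobj):
    """Write a header row and then one href per row to an open text file."""
    writer = csv.writer(fileobj)
    writer.writerow(["href"])
    for href in hrefs:
        writer.writerow([href])
```

With a real driver this would be used as `hrefs = collect_hrefs(driver)` after `driver.get(url)`, followed by `write_hrefs_csv(hrefs, f)` on a file opened with `open("links.csv", "w", newline="")`. The file name is only a placeholder.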

But the problem is that I can't get the names of the links.

<a href="https://www.google.com">This leads to Google</a>
Is there any way to get the 'This leads to Google' string?

I need it for every link that is stored in the array.

Thank you for your time

UPDATE

It seems it only gets the names of dynamic links. I just noticed this, and it is really strange. For hard-coded links it returns an empty string. For dynamic links it returns the name.

Okay so. The answer was to use
get_attribute("textContent")
It returns a string with the link's name.
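
A possible explanation, offered as an assumption since the post does not say which call returned the empty string: Selenium's `.text` property returns only text that is actually rendered on the page, so links that are hidden come back as an empty string, while the `textContent` DOM property contains the text regardless of visibility. A small sketch of collecting href and name pairs with that fallback is shown here. The helper names `link_name` and `collect_links` are hypothetical, and `driver` is assumed to be a Selenium WebDriver with the page already loaded.

```python
def link_name(element):
    """Return the text of a link element, visible or not."""
    visible = element.text  # empty string when the element is not displayed
    if visible and visible.strip():
        return visible.strip()
    # Fall back to the DOM textContent property, as in the post above.
    raw = element.get_attribute("textContent") or ""
    return raw.strip()


def collect_links(driver):
    """Return a list of (href, name) tuples for every <a> element."""
    rows = []
    # "tag name" is the locator string behind selenium's By.TAG_NAME.
    for anchor in driver.find_elements("tag name", "a"):
        rows.append((anchor.get_attribute("href"), link_name(anchor)))
    return rows
```

Each tuple can then be written as one CSV row, so the name ends up next to its link.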


RE: Extracting links from website with selenium bs4 and python - Larz60+ - Jul-20-2019

What is the URL?
Have you seen these?
Web scraping 1 & 2:
https://python-forum.io/Thread-Web-Scraping-part-1
https://python-forum.io/Thread-Web-scraping-part-2