The heading might make it seem like this question has already been asked, but I had no luck finding an answer for it.
I need help making a link-extracting program with Python.
Actually, it works. It finds all <a> elements on a webpage, takes their href values, and puts them in an array. Then it exports the array to a CSV file, which is what I want.
But there is one thing I can't figure out.
The website is dynamic, so I am using Selenium WebDriver to get the JavaScript-rendered result.
The code for the program is pretty simple. I open the website with WebDriver and then get its content. Then I get all links with:
```python
results = driver.find_elements_by_tag_name('a')
```
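Then I loop through the results with a for loop and get each href with:
```python
result.get_attribute("href")
```
I store the results in an array and then print them out.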
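The steps described so far (collect the hrefs, store them, export them to CSV) can be sketched roughly as below. This is only a minimal sketch: the helper names `collect_hrefs` and `write_hrefs_csv` are made up for illustration and are not from my actual program, and the functions only assume that each element has a `get_attribute` method, as Selenium's WebElement does.

```python
import csv


def collect_hrefs(elements):
    """Return the href of each element, skipping anchors without one."""
    hrefs = []
    for element in elements:
        # get_attribute() returns None when the attribute is absent
        href = element.get_attribute("href")
        if href:
            hrefs.append(href)
    return hrefs


def write_hrefs_csv(hrefs, fileobj):
    """Write a header row and then one href per row to an open text file."""
    writer = csv.writer(fileobj)
    writer.writerow(["href"])
    for href in hrefs:
        writer.writerow([href])
```

With a real driver this would be called as `collect_hrefs(driver.find_elements_by_tag_name('a'))`. Newer Selenium 4 releases no longer provide that method, and the equivalent call there is `driver.find_elements(By.TAG_NAME, 'a')` with `from selenium.webdriver.common.by import By`. When writing to disk, open the file with `newline=""` as the `csv` module documentation recommends.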
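But the problem is that I can't get the names of the links.
Is there any way to get the 'This leads to Google' string, that is, the text shown for the link?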
I need it for every link that is stored in the array.
Thank you for your time.
UPDATE
As it seems, it only gets the names of dynamic links. I just noticed this, and it is really strange. For hard-coded items it returns an empty string. For dynamic links it returns the name.
Okay, so the answer was using:
```python
get_attribute("textContent")
```
It returns a string with the name.
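For anyone landing here later, a small sketch of how the `textContent` lookup might be combined with the href lookup. The function names `link_name` and `extract_links` are hypothetical. My guess (an assumption, not something I verified on the page) is that the empty strings came from elements that were not visible, because Selenium's `.text` property returns only rendered, visible text, while `textContent` is read from the DOM regardless of visibility.

```python
def link_name(element):
    """Return the text of an <a> element, whether or not it is visible."""
    # textContent is a DOM property; Selenium's get_attribute() can read it.
    text = element.get_attribute("textContent")
    # Collapse whitespace runs left over from indentation in the HTML source.
    return " ".join(text.split()) if text else ""


def extract_links(elements):
    """Pair each element's href with its link name."""
    return [(el.get_attribute("href"), link_name(el)) for el in elements]
```

Each `(href, name)` tuple can then be written as one CSV row with `csv.writer(...).writerow(...)`.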