Python Forum
get link and link text from table - Printable Version


get link and link text from table - metulburr - Jun-12-2019

I am using Selenium and trying to get the links and link text only from within the table, not from the entire page.

The table HTML is the following:
<table class="enrolled-courses-gridview-mobile visible-xs gridview-move-up" cellspacing="0" cellpadding="0" id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile" style="border-width:0px;width:100%;border-collapse:collapse;">
		<thead>
			<tr class="disable-onbeforeunload">
				<th scope="col" abbr="Course Details"><a href="javascript:__doPostBack('ctl00$ctl00$ctl00$MainContentPlaceholder$RightColumnPlaceholder$RightColumnPlaceHolder$AssignedTrainingGridview$EnrolledCoursesGridviewMobile','Sort$CourseTitle')"></a></th>
			</tr>
		</thead><tbody>
			<tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_0"><a href="/Learning/CourseViewer.aspx?id=479561544" class="courseGridTitle">Introduction to Corporate Compliance Programs</a><br><em><span class="courses-grid-subcontent"> 1.5 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_0">Due 7/30/2019</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_0" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=479561544">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_1"><a href="/Learning/CourseViewer.aspx?id=497633331" class="courseGridTitle">Workplace Emergencies and Natural Disasters: An Overview</a><br><em><span class="courses-grid-subcontent"> 1 hour</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_1">Due 8/30/2019</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_1" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=497633331">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_2"><a href="/Learning/CourseViewer.aspx?id=497623435" class="courseGridTitle">Carefirst Ergonomics</a><br><em><span class="courses-grid-subcontent"> 0 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_2">Due 10/31/2019</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_2" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=497623435">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_3"><a href="/Learning/CourseViewer.aspx?id=524653979" class="courseGridTitle">Fire Safety</a><br><em><span class="courses-grid-subcontent"> 0.5 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_3">Due 10/31/2019</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_3" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=524653979">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_4"><a href="/Learning/CourseViewer.aspx?id=532872917" class="courseGridTitle">Workplace Safety: The Basics</a><br><em><span class="courses-grid-subcontent"> 0.25 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_4">Due 10/31/2019</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_4" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=532872917">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_5"><a href="/Learning/CourseViewer.aspx?id=532872916" class="courseGridTitle">Policy Manual</a><br><em><span class="courses-grid-subcontent"> 0 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_5">Due 11/30/2019</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_5" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=532872916">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_6"><a href="/Learning/CourseViewer.aspx?id=607247461" class="courseGridTitle">CareFIrst Security System</a><br><em><span class="courses-grid-subcontent"> 0 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_6">Due 12/31/2019</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_6" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=607247461">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_7"><a href="/Learning/CourseViewer.aspx?id=607050831" class="courseGridTitle">From Touchy to Touching: Straight Talk About the Dying Process</a><br><em><span class="courses-grid-subcontent"> 1 hour</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_7">Due 12/31/2019</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_7" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=607050831">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_8"><a href="/Learning/CourseViewer.aspx?id=580034730" class="courseGridTitle">HIPAA: The Basics</a><br><em><span class="courses-grid-subcontent"> 0.5 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_8">Due 2/28/2020</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_8" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=580034730">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_9"><a href="/Learning/CourseViewer.aspx?id=579512670" class="courseGridTitle">NYS HIV info</a><br><em><span class="courses-grid-subcontent"> 0 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_9">Due 2/28/2020</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_9" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=579512670">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_10"><a href="/Learning/CourseViewer.aspx?id=603091009" class="courseGridTitle">Corporate Compliance: The Basics</a><br><em><span class="courses-grid-subcontent"> 0.5 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_10">Due 4/30/2020</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_10" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=603091009">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_11"><a href="/Learning/CourseViewer.aspx?id=603353500" class="courseGridTitle">Sexual Harassment for Employees</a><br><em><span class="courses-grid-subcontent"> 0.5 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_11">Due 4/30/2020</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_11" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=603353500">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_12"><a href="/Learning/CourseViewer.aspx?id=607727757" class="courseGridTitle">Core Values</a><br><em><span class="courses-grid-subcontent"> 0 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_12">Due 5/31/2020</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_12" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=607727757">Take Now</a></p>
            </td>
			</tr><tr class="gray_row" style="background-color:White;">
				<td align="center" valign="middle">
                <p><span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_TitleLabel_13"><a href="/Learning/CourseViewer.aspx?id=607761986" class="courseGridTitle">Ethics for Hospice and Palliative Care Services</a><br><em><span class="courses-grid-subcontent"> 1.25 hours</span></em></span></p>
                <p>
                    
                    <span id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_RequiredByDateLabel_13">Due 5/31/2020</span>
                </p>
                <p><a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_13" class="btn btn-med lime module-button" href="/Learning/CourseViewer.aspx?id=607761986">Take Now</a></p>
            </td>
			</tr>
		</tbody><tfoot>

		</tfoot>
	</table>
I have tried this
        for a in self.browser.find_elements_by_xpath('.//a'):
            print(a.get_attribute('href'))
but this gets all the links on the page.


        for a in table_id:
            print(a.get_attribute('href'))
I tried the loop above, but I get:
Error:
  File "selenium_start.py", line 63, in __init__
    self.get_trainings()
  File "selenium_start.py", line 86, in get_trainings
    for a in table_id:
TypeError: 'WebElement' object is not iterable
I also tried the following, but I get None returned for every element:
        id_ = 'MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridview'
        table_id = self.browser.find_element(By.ID, id_)
        rows = table_id.find_elements(By.TAG_NAME, "tr")

        for row in rows:        
            elements = row.find_elements(By.TAG_NAME, "td") 
            for section in elements:
                print(section.get_attribute('href'))
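
A note on the attempts above: find_element (singular) returns one WebElement, which is why looping over table_id raised the TypeError, and get_attribute('href') returns None on the <td> cells because the href attribute lives on the <a> elements inside them. Below is a minimal sketch of scoping the search to the table, assuming the title links can be selected by the courseGridTitle class seen in the HTML above. The helper name table_links is hypothetical, and the locator strategy is written as the plain string "css selector" (the value of Selenium's By.CSS_SELECTOR) so that the helper needs no imports.

```python
CSS = "css selector"  # same string as selenium.webdriver.common.by.By.CSS_SELECTOR


def table_links(table, selector="a.courseGridTitle"):
    """Return (text, href) pairs for the links found inside `table` only.

    `table` is expected to be a Selenium WebElement; calling find_elements
    on it restricts the search to that element's descendants.
    """
    pairs = []
    for a in table.find_elements(CSS, selector):
        # .text is empty for elements that are not displayed, so fall back
        # to the raw textContent in that case.
        text = a.text or a.get_attribute("textContent") or ""
        pairs.append((text.strip(), a.get_attribute("href")))
    return pairs
```

With the table located as in the post (table_id = self.browser.find_element(By.ID, id_)), table_links(table_id) would give one pair per course title. Note that the id_ used in the post ends in EnrolledCoursesGridview, while the HTML shown has an id ending in EnrolledCoursesGridviewMobile, so the two may not refer to the same table.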



RE: get link and link text from table - snippsat - Jun-13-2019

You must write a more specific XPath; e.g. for finding the table it would be:
//table[@class='enrolled-courses-gridview-mobile visible-xs gridview-move-up']
This will contain all the data in the table; you can then do a new search within it, e.g. for .//a/@href (relative to the table).
Or you can try to do it in one XPath search:
from lxml import etree

html = '''\
<body>
  <table class="enrolled-courses-gridview-mobile visible-xs gridview-move-up" cellspacing="0" cellpadding="0">
    <p><a href="https://python-forum.io/">Visit Python forum</a></p>
  </table>
</body>'''

tree = etree.fromstring(html)
forum_link = tree.xpath("//table[@class='enrolled-courses-gridview-mobile visible-xs gridview-move-up']//a/@href")
print(forum_link[0])
Output:
https://python-forum.io/
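
Since the question also asks for the link text, the same single-XPath idea can select the <a> elements themselves rather than only their @href values. Here is a small sketch that uses the standard library's xml.etree.ElementTree in place of lxml. Its XPath subset is enough for this query, but it needs well-formed markup, so a real page may need an HTML parser instead. The snippet is a simplified two-row version of the table above, plus one made-up link outside the table to show the scoping.

```python
import xml.etree.ElementTree as ET

html = '''\
<body>
  <a href="/outside">Link outside the table</a>
  <table class="enrolled-courses-gridview-mobile visible-xs gridview-move-up">
    <tr><td><a href="/Learning/CourseViewer.aspx?id=524653979" class="courseGridTitle">Fire Safety</a></td></tr>
    <tr><td><a href="/Learning/CourseViewer.aspx?id=532872916" class="courseGridTitle">Policy Manual</a></td></tr>
  </table>
</body>'''

root = ET.fromstring(html)
# Select <a> elements below the matching table only; the outside link is skipped.
anchors = root.findall(
    ".//table[@class='enrolled-courses-gridview-mobile visible-xs gridview-move-up']//a"
)
links = [(a.text, a.get("href")) for a in anchors]
for text, href in links:
    print(text, href)
```

With lxml the equivalent would be tree.xpath("//table[@class='...']//a"), followed by reading a.text and a.get("href") on each result.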



RE: get link and link text from table - metulburr - Jun-13-2019

I copied the XPath from the table and got:
//*[@id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridview"]
But it didn't match the table.


In what circumstances does this fail
btn = driver.find_element_by_xpath('//*[@id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridview_ModuleButton_0"]')
btn.click()
but this works?
self.browser.execute_script("document.getElementsByClassName('btn btn-med lime-no-shadow module-button')[0].click()")
Quote:
<a id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridview_ModuleButton_0" class="btn btn-med lime-no-shadow module-button" href="/Learning/CourseViewer.aspx?id=479561544">Take Now</a>

It appears that on this site, only executing JavaScript works.
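
The thread does not establish why the native click fails here. One common cause in general, offered only as a possibility, is that the targeted element is not displayed or enabled at the moment click() is called, whereas a JavaScript click is dispatched regardless of visibility. Below is a hedged sketch of a helper that tries the native click first and falls back to JavaScript. The name click_or_js_click is hypothetical, and driver and element stand for a Selenium WebDriver and WebElement.

```python
def click_or_js_click(driver, element):
    """Click `element` natively when possible, otherwise via JavaScript.

    Returns "native" or "javascript" so the caller can log which path ran.
    """
    if element.is_displayed() and element.is_enabled():
        element.click()
        return "native"
    # arguments[0] is how execute_script exposes the element passed after the script.
    driver.execute_script("arguments[0].click();", element)
    return "javascript"
```

If this reports "javascript" for the Take Now button, the element was hidden or disabled at that moment, which would point to a visibility or timing issue rather than a problem with the XPath.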


RE: get link and link text from table - snippsat - Jun-13-2019

Quote: btn.click()
XPath returns a list, so:
btn[0].click()
If I do a local test with the table in Selenium, clicking the first Take Now link works:
time.sleep(3)
take_now = browser.find_elements_by_xpath('//*[@id="MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_0"]')
take_now[0].click()

A test with a CSS selector on the second Take Now link also works:
take_now1 = browser.find_elements_by_css_selector('#MainContentPlaceholder_RightColumnPlaceholder_RightColumnPlaceHolder_AssignedTrainingGridview_EnrolledCoursesGridviewMobile_ModuleButton_1')
take_now1[0].click()



RE: get link and link text from table - metulburr - Jun-13-2019

(Jun-13-2019, 09:13 AM)snippsat Wrote: XPath returns a list, so:
Are you saying this specific XPath returns a list? I don't remember having to handle a list when using XPaths in other places.


RE: get link and link text from table - snippsat - Jun-13-2019

(Jun-13-2019, 04:55 PM)metulburr Wrote: Are you saying this specific XPath returns a list? I don't remember having to handle a list when using XPaths in other places.
I think it has been returning a list for a while now. I think I remember some scenarios where it was not like this, but I cannot remember when or if it was different.

If you look at the lxml code I posted, it has the same list return: print(forum_link[0]).
The same goes for my CSS selector test: take_now1[0].click().
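
For reference, in Selenium's Python bindings the return type depends on the method rather than on the XPath. find_element (singular) returns one WebElement, or raises NoSuchElementException when nothing matches, while find_elements (plural) returns a list that may be empty. That is why take_now[0] works after find_elements_by_xpath, and why looping over the result of find_element raised the TypeError in the first post. Below is a small sketch of a helper built on the plural form. first_match is a hypothetical name, and "xpath" is the string value of By.XPATH.

```python
XPATH = "xpath"  # same string as selenium.webdriver.common.by.By.XPATH


def first_match(context, xpath):
    """Return the first element matching `xpath` under `context`, or None.

    `context` may be a WebDriver or a WebElement; find_elements always
    returns a list, so an empty result is handled without an exception.
    """
    matches = context.find_elements(XPATH, xpath)
    return matches[0] if matches else None
```

In Selenium 4 the find_element_by_* and find_elements_by_* helpers used in this thread were deprecated and later removed in favor of find_element(By.XPATH, ...) and find_elements(By.XPATH, ...).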