select all the span text with same attribute
Python Forum (https://python-forum.io), Forum: Web Scraping & Web Development, Thread: /thread-28645.html

select all the span text with same attribute - JennyYang - Jul-28-2020

After finding all the spans in the object, I get the results shown below. I would like to pick out the different values using:

    rating_chart[31].find('span',attrs={'class':"_3fVK8yi6"}).getText()

but it only shows the first answer: 1961. How do I get all the numbers: 1961, 437, 116, 40? Thank you so much.

results:

    '<span class="_3fVK8yi6">1,961</span>, <span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span></span></span>, <span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span></span>, <span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span>, <span class="_3fVK8yi6">437</span>, <span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span></span></span>, <span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span></span>, <span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span>, <span class="_3fVK8yi6">116</span>, <span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span></span></span>, <span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span></span>, <span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span>, <span class="_3fVK8yi6">40</span>,...'

RE: select all the span text with same attribute - Martinelli - Jul-28-2020

JennyYang
Hi there! What do you think about using the page content with the re.search or re.findall function?
Almost like this:

    import re
    import urllib.request

    # Download the page and decode the bytes to str (assuming the page is UTF-8 encoded)
    var_page_content = urllib.request.urlopen("https://urlthatyouneed.com/").read().decode("utf-8")
    # re.search takes the pattern first, then the text; it returns a match object or None
    result = re.search(r'1,961', var_page_content)  # the page shows this value as "1,961"
    print(result.group() if result else None)

Otherwise, use:

    import re
    import urllib.request

    # Download the page and decode the bytes to str (assuming the page is UTF-8 encoded)
    var_page_content = urllib.request.urlopen("https://urlthatyouneed.com/").read().decode("utf-8")
    result = re.findall(r'[0-9]+', var_page_content)  # returns every run of digits as a list of strings
    print(result)

Hope this helps!
Regards, Martinelli

RE: select all the span text with same attribute - snippsat - Jul-28-2020

Martinelli, regex is not the best choice when it comes to HTML/XML. There is a reason why parsers exist, if you want a funny read.

    from bs4 import BeautifulSoup

    html = '''\
    <span class="_3fVK8yi6">1,961</span>,
    <span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span></span></span>,
    <span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span></span>,
    <span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span>,
    <span class="_3fVK8yi6">437</span>,
    <span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span></span></span>,
    <span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span></span>,
    <span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span>,
    <span class="_3fVK8yi6">116</span>,
    <span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span></span></span>,
    <span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span></span>,
    <span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span>,
    <span class="_3fVK8yi6">40</span>'''

    soup = BeautifulSoup(html, 'lxml')

Usage test.

    >>> all_3fV = soup.find_all(class_="_3fVK8yi6")
    >>> all_3fV
    [<span class="_3fVK8yi6">1,961</span>, <span class="_3fVK8yi6">437</span>, <span class="_3fVK8yi6">116</span>, <span class="_3fVK8yi6">40</span>]
    >>>
    >>> [tag.text for tag in all_3fV]
    ['1,961', '437', '116', '40']
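
For context, find() returns only the first matching tag, which is why the original call showed a single value, while find_all() returns every match. The question asked for the numbers 1961, 437, 116, 40, whereas the tag texts in the usage test are strings such as '1,961'. A minimal sketch of the conversion, assuming the comma is a thousands separator; the helper name to_int is hypothetical and not part of the thread:

```python
def to_int(text: str) -> int:
    """Convert a count such as '1,961' to an int.

    Assumes ',' is only ever a thousands separator (an assumption,
    not something the thread confirms for every locale).
    """
    return int(text.replace(",", "").strip())


# The strings produced by the list comprehension in the usage test
texts = ['1,961', '437', '116', '40']

counts = [to_int(t) for t in texts]
print(counts)
```

With BeautifulSoup the same helper could be applied directly, for example [to_int(tag.text) for tag in all_3fV].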