Python Forum
select all the span text with same attribute - Printable Version


select all the span text with same attribute - JennyYang - Jul-28-2020

After finding all the spans in the object, I get the results shown below.
I would like to pick out the different values using:
rating_chart[31].find('span',attrs={'class':"_3fVK8yi6"}).getText()

but it only returns the first value: 1961. How do I get all the numbers: 1961, 437, 116, 40?

Thank you so much.

results:
'<span class="_3fVK8yi6">1,961</span>,
<span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span></span></span>,
<span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span></span>,
<span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span>,
<span class="_3fVK8yi6">437</span>,
<span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span></span></span>,
<span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span></span>,
<span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span>,
<span class="_3fVK8yi6">116</span>,
<span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span></span></span>,
<span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span></span>,
<span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span>,
<span class="_3fVK8yi6">40</span>,...'


RE: select all the span text with same attribute - Martinelli - Jul-28-2020

JennyYang

Hi there! What do you think about using the page content with the re.search or re.findall function?

Something like this:

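import re
import urllib.request
# read() returns bytes, so decode to text first (UTF-8 is an assumption about the page)
var_page_content = urllib.request.urlopen("https://urlthatyouneed.com/").read().decode("utf-8")
result = re.search(r'1961', var_page_content)  # pattern comes first; returns a match object or None
print(result.group() if result else None)  # the matched text '1961', or None if not found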
Otherwise, use:

import re
import urllib.request
# read() returns bytes, so decode to text first (UTF-8 is an assumption about the page)
var_page_content = urllib.request.urlopen("https://urlthatyouneed.com/").read().decode("utf-8")
result = re.findall(r'[0-9]+', var_page_content)  # returns every run of digits as a list of strings
print(result)
Hope it can help you!

Regards,
Martinelli


RE: select all the span text with same attribute - snippsat - Jul-28-2020

Martinelli, regex is not the best choice when it comes to HTML/XML.
There is a reason why parsers exist.
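To make the problem concrete, here is a small sketch of what a bare digit pattern picks up from markup like the one in the question. The `sample` string is a shortened copy of the posted spans, and the variable names are only for illustration.

```python
import re

# Shortened copy of the spans from the question (three of them are enough).
sample = (
    '<span class="_3fVK8yi6">1,961</span>'
    '<span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span>'
    '<span class="_3fVK8yi6">437</span>'
)

# Every run of digits anywhere in the markup, not just the visible counts.
digit_runs = re.findall(r'[0-9]+', sample)
print(digit_runs)

# The pattern also matches digits inside class names and style values,
# and the comma splits '1,961' into two separate matches.
print('1961' in digit_runs)  # False
print('961' in digit_runs)   # True
```

So the values you want are buried among digits from class names and widths, which is why selecting the tags first is the safer route.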
from bs4 import BeautifulSoup

html = '''\
<span class="_3fVK8yi6">1,961</span>,
<span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span></span></span>,
<span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span></span>,
<span class="_3EekUCk7 _2-7CMSau" style="width:16.9445521519969%"></span>,
<span class="_3fVK8yi6">437</span>,
<span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span></span></span>,
<span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span></span>,
<span class="_3EekUCk7 _2-7CMSau" style="width:4.497867390461419%"></span>,
<span class="_3fVK8yi6">116</span>,
<span class="_2DzayJ9y"><span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span></span></span>,
<span class="RSnM6YUj"><span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span></span>,
<span class="_3EekUCk7 _2-7CMSau" style="width:1.5509887553315238%"></span>,
<span class="_3fVK8yi6">40</span>'''

soup = BeautifulSoup(html, 'lxml')
Usage test:
>>> all_3fV = soup.find_all(class_="_3fVK8yi6")
>>> all_3fV
[<span class="_3fVK8yi6">1,961</span>,
 <span class="_3fVK8yi6">437</span>,
 <span class="_3fVK8yi6">116</span>,
 <span class="_3fVK8yi6">40</span>]
>>> 
>>> [tag.text for tag in all_3fV]
['1,961', '437', '116', '40']
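If you need actual numbers rather than strings, a minimal sketch could look like this. The helper name `to_int` is made up for this example, and it assumes the comma is only ever a thousands separator, as in the sample data.

```python
def to_int(text):
    """Convert a count such as '1,961' to the integer 1961."""
    # Assumption: ',' only appears as a thousands separator.
    return int(text.replace(',', '').strip())


# The list of strings from the usage test above.
texts = ['1,961', '437', '116', '40']

counts = [to_int(t) for t in texts]
print(counts)  # [1961, 437, 116, 40]
```

If the site formats numbers differently for other locales (for example '1.961'), the replace step would need adjusting.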