Python Forum

Full Version: 'soup.findAll()' help - Want to retrieve multiple attribute values
Hello:

I am extremely new to Python and am working on developing basic web crawlers. One data point I am interested in retrieving from a particular website is 'engagement'; however, when I inspect the webpage element, the HTML shows that the engagement number can carry different attribute values depending on how much the article is being engaged with. For example, the element is 'span' and the attribute is 'class', but there are multiple possible values for it ('hot', 'warm').

Here is my (partial) code:

for link in soup.findAll('span', {'class': 'warm' 'hot'}):
    views = link.string
    print(views)
If I choose just one value, I do get some printout. However, I want to get the engagement data from every article I crawl.

Therefore, my question is this: How do I incorporate multiple attribute values into a single soup.findAll so that I don't clutter up my print out?

Thanks for the help!

I have also tried the following:

for link in soup.findAll('span', {'class': ['warm','hot']}):
    views = link.string
    print(views)
This doesn't produce any errors, but it doesn't print anything out.

Your second code is the one that should work.
Quote:soup.findAll('span', {'class': ['warm','hot']}):
Work your way backwards. Instead of printing out link.string, see what link actually contains. If link is empty, then work your way backwards even more.
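
To make that debugging step concrete, here is a minimal sketch. The HTML is made up for illustration (the real page's markup isn't shown in this thread), and it uses find_all, the current Beautiful Soup 4 spelling of findAll. It shows two things that may or may not be what is happening on your page: the first attempt ('warm' 'hot') searches for the single class 'warmhot' because Python joins adjacent string literals, and link.string is None whenever the span contains more than one child node.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page's structure is an assumption.
html = """
<div class="article"><span class="warm">120</span></div>
<div class="article"><span class="hot"><b>9,500</b> views</span></div>
<div class="article"><span class="cold">3</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Adjacent string literals are concatenated, so {'class': 'warm' 'hot'}
# really means {'class': 'warmhot'}, which matches nothing here.
concatenated = 'warm' 'hot'
print(concatenated)

# A list of values matches a span carrying ANY of the listed classes.
matches = soup.find_all('span', {'class': ['warm', 'hot']})
print(len(matches))

for link in matches:
    print(repr(link))          # the whole tag, not just its text
    print(repr(link.string))   # None if the tag has more than one child
    print(link.get_text(" ", strip=True))  # all nested text, joined by spaces
```

If len(matches) is 0 on the real page, the spans are probably not in the HTML you downloaded at all, so the next thing to print is soup itself.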

However, is there an outer HTML tag that this is nested in, which you could use to grab each span element?
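
If there is such an outer tag, one way to use it is to loop over the containers and look for the span inside each one. This is only a sketch: the div with class 'article' and the h2 title are assumptions made up for the example, so substitute whatever the real page uses.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: each article sits in its own <div class="article">.
html = """
<div class="article"><h2>First</h2><span class="warm">120</span></div>
<div class="article"><h2>Second</h2><span class="hot">9500</span></div>
<div class="article"><h2>Third</h2></div>
"""

soup = BeautifulSoup(html, "html.parser")

results = []
for article in soup.find_all("div", class_="article"):
    title = article.find("h2").get_text(strip=True)
    # Search only inside this article; accept either class value.
    span = article.find("span", class_=["warm", "hot"])
    # find() returns None when the article has no matching span.
    engagement = span.get_text(strip=True) if span is not None else None
    results.append((title, engagement))

print(results)
```

Going through the container keeps each engagement number tied to its article, and an article with no 'warm' or 'hot' span shows up as None instead of silently disappearing from the output.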