Extract something when you have multiple tags
Python Forum (https://python-forum.io), Web Scraping & Web Development, thread /thread-34475.html
Extract something when you have multiple tags - knight2000 - Aug-03-2021

Hey all,

I am practicing web scraping and I've come across a scenario where I'm a little stuck. First, here's a snapshot of the code (which works up to this point):

    from bs4 import BeautifulSoup
    import requests
    import pandas as pd

    url = ('mytesturl')
    page = requests.get(url)
    soup = BeautifulSoup(page.text, 'html.parser')
    voteup = (soup.find('span', {'class': 'nvb voteup'}))
    print(voteup)

This gives me the following result:

    <span class="nvb voteup"><i class="fa fa-plus"></i><span>1</span></span>

So what I'm trying to do is navigate through this result and extract the "1" towards the end, between the 'span' tags. I've tried looking through some of the BeautifulSoup documentation and also tried searching online (although I'm not sure what to search for in Google, and I may have missed it in the BS documentation too).

Would anyone be able to enlighten me on how to navigate through the current results and then extract the 1 between the span tags, please?

Thanks a lot.

RE: Extract something when you have multiple tags - Larz60+ - Aug-03-2021

After:

    voteup = (soup.find('span', {'class': 'nvb voteup'}))

add:

    splist = voteup.find_all('span')

You can then extract your data from splist.

RE: Extract something when you have multiple tags - knight2000 - Aug-04-2021

(Aug-03-2021, 09:10 PM)Larz60+ Wrote: Add after:

Hi Larz60+,

Thank you very much for your assistance. I tried and failed at so many variations; I could have sworn I also tried the find_all approach and it didn't work for me. Clearly I didn't execute it properly, and now it looks so simple! I appreciate your time in helping me out. Have a great day.

RE: Extract something when you have multiple tags - snippsat - Aug-04-2021

(Aug-03-2021, 11:51 AM)knight2000 Wrote: Would anyone be able to enlighten me on how to navigate through the current results to then go on to extract the 1 between the span tags please?
    from bs4 import BeautifulSoup

    html = '''\
    <span class="nvb voteup"><i class="fa fa-plus"></i><span>1</span></span>'''

    # The lxml parser wraps the fragment in <html><body>, which is why
    # the selectors below can start at body.
    soup = BeautifulSoup(html, 'lxml')

Usage with a CSS selector:

    >>> soup.select_one('body > span > span')
    <span>1</span>
    >>> soup.select_one('body > span > span').text
    '1'

If there are more tags, use select(); it returns a list of tags that you can, for example, loop over.

    html = '''\
    <span class="nvb voteup"><i class="fa fa-plus"></i><span>1</span></span>
    <span class="nvb voteup"><i class="fa fa-plus"></i><span>2</span></span>
    <span class="nvb voteup"><i class="fa fa-plus"></i><span>3</span></span>'''

    # Re-parse so that soup reflects the new html
    soup = BeautifulSoup(html, 'lxml')

Usage:

    >>> tag = soup.select('body > span > span')
    >>> tag
    [<span>1</span>, <span>2</span>, <span>3</span>]
    >>> for span in tag:
    ...     print(span.text)
    ...
    1
    2
    3
    >>> [span.text for span in tag]
    ['1', '2', '3']
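For comparison, here is a minimal sketch that runs both suggestions from the thread against the same markup, this time with 'html.parser' as in the original post. The sample HTML is an assumption standing in for the real page, and the names first_vote and votes are made up for illustration. One thing to watch: html.parser does not add the <html>/<body> wrappers that lxml does, so a selector starting at body would probably match nothing here; anchoring on the class avoids depending on the parser.

```python
from bs4 import BeautifulSoup

# Sample markup (an assumption; the real page may differ).
html = '''\
<span class="nvb voteup"><i class="fa fa-plus"></i><span>1</span></span>
<span class="nvb voteup"><i class="fa fa-plus"></i><span>2</span></span>'''

# html.parser keeps the fragment as-is: no <html> or <body> is added.
soup = BeautifulSoup(html, 'html.parser')

# Approach 1: find() the first outer tag, then find_all() its nested spans.
voteup = soup.find('span', class_='voteup')
splist = voteup.find_all('span') if voteup else []
first_vote = splist[0].get_text(strip=True) if splist else None

# Approach 2: a CSS selector anchored on the class instead of on body.
votes = [span.get_text(strip=True)
         for span in soup.select('span.nvb.voteup > span')]

print(first_vote)
print(votes)
```

Checking for None (or an empty list) before indexing is worth keeping in real scraping code, since find() returns None when the tag is missing from the page.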