Python Forum
Unable to gather data using beautifulscoup() [Output shows blank file]
#2
soup.find() stops at the first match, and the page has 6 elements with class="container-fluid", so the first one is not necessarily the one that holds the table.
Find a tag that is more specific and contains the table.
Example:
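from bs4 import BeautifulSoup
import requests

url = 'http://www.vgchartz.com/gamedb/?page=&results=1000&name=&platform=&minSales=0.01&publisher=&genre=&sort=GL'
# Download the page; .content is the raw bytes, which BeautifulSoup decodes itself
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'lxml')
# Narrow the search to the div with a unique id that contains the table
chart = soup.find("div", id="generalBody")
# Collect every table row inside that div
tr_tag = chart.find_all('tr')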
Test:
>>> tr_tag[4]
<tr style="background-image:url(../imgs/chartBar_alt_large.gif); height:70px">
<td>2</td>
<td>
<div id="photo3">
<a href="/games/game.php?id=6455&amp;region=All">
<div style="height:60px; width:60px; overflow:hidden;"> <img alt="Boxart Missing" border="0" src="/games/boxart/8972270ccc.jpg" width="60"/>
</div>
</a>
</div>
</td> <td style="font-size:12pt;"> <a href="http://www.vgchartz.com/game/6455/super-mario-bros/?region=All">Super Mario Bros.    </a> </td>
<td>
<center>
<img alt="NES" src="/images/consoles/NES_b.png"/>
</center>
</td> <td width="100">Nintendo  </td> <td align="center">N/A  </td> <td align="center">10.0  </td> <td align="center">N/A  </td> <td align="center">40.24m</td> <td align="center" width="75">18th Oct 85  </td> <td align="center" width="75">N/A</td></tr>

>>> tr_tag[4].find_all('a')[1].text
... 
'Super Mario Bros.    '

>>> td = tr_tag[4].find_all('td', align="center")
>>> for item in td:
...     item.text
...     
'N/A  '
'10.0  '
'N/A  '
'40.24m'
'18th Oct 85  '
'N/A'
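To see the difference between find() and find_all() without hitting the site, here is a small offline sketch. The HTML string is a made-up, simplified stand-in for the page (only the id "generalBody", the class "container-fluid" and the Super Mario Bros. row come from the post above; "Example Game" and its value are placeholders), and it uses the built-in 'html.parser' instead of 'lxml' so nothing extra needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; not the real page source.
html = """
<div class="container-fluid"><p>navigation</p></div>
<div class="container-fluid">
  <div id="generalBody">
    <table>
      <tr><th>Pos</th><th>Game</th><th>Sales</th></tr>
      <tr><td>1</td><td><a href="/game/1">Example Game    </a></td><td align="center">1.00m</td></tr>
      <tr><td>2</td><td><a href="/game/2">Super Mario Bros.    </a></td><td align="center">40.24m</td></tr>
    </table>
  </div>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# find() returns only the first match, which here has no table rows at all.
first_container = soup.find('div', class_='container-fluid')
first_rows = first_container.find_all('tr')

# find_all() shows that there is more than one such container.
containers = soup.find_all('div', class_='container-fluid')

# Searching by the unique id lands on the div that holds the table.
chart = soup.find('div', id='generalBody')

rows = []
for tr in chart.find_all('tr'):
    link = tr.find('a')
    if link is None:
        # Header row (or any row without a game link): skip it.
        continue
    values = [td.get_text(strip=True) for td in tr.find_all('td', align='center')]
    rows.append({'name': link.get_text(strip=True), 'values': values})

print(len(containers), first_rows)
for row in rows:
    print(row)
```

get_text(strip=True) removes the trailing spaces seen in the output above, which is handy before writing the values to a file.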
Look at Web-Scraping part-1;
as you will see there, urllib is never used, always Requests.


Messages In This Thread
RE: Unable to gather data using beautifulscoup() [Output shows blank file] - by snippsat - Apr-13-2018, 04:35 PM
