Scrap table from webpage - Printable Version

Scrap table from webpage - Luis_liverpool - Jul-26-2022

Hi All,

I wonder if it is possible to scrape the table from the following web page:

My desired table

I saw a few examples and it looks very easy, like the example below, but for the page I want it does not work :(

import pandas as pd
df = pd.read_html('https://fastestlaps.com/tracks/le-mans-bugatti')  # returns a list of DataFrames

Can anyone help me and briefly explain why some pages are friendlier to scrape than others? :)

RE: Scrap table from webpage - snippsat - Jul-26-2022

(Jul-26-2022, 03:54 PM)Luis_liverpool Wrote: Can anyone help me and briefly explain why some pages are friendlier to scrape than others? :)

It's because the whole page is generated by JavaScript, so you need another tool like Selenium.
But when I take a look, it is not so easy, as the whole body HTML is generated in go, so you cannot just parse out the table; you have to do some cleaning up.
If you look at the Network tab you can find the JSON response (URL). This is an easier way, because then you can use only Requests.
Example:

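import requests

# JSON endpoint found in the DevTools Network tab
response = requests.get('http://node.gurustats.usermd.net:60519/pgee2022')
response.raise_for_status()  # fail early on an HTTP error
json_data = response.json()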

>>> json_data['data'][0]
{'BIEGI': 64,
 'BON': 3,
 'D': 0,
 'DOM': 2.735,
 'DYSTBILANS': 22,
 'DYSTMINUS': 7,
 'DYSTPLUS': 29,
 'ELO': 1514,
 'KLUB': 'Gorzów',
 'Kolumna1': 0.201960784,
 'MSC': 1,
 'P0': 1,
 'P1': 5,
 'P2': 13,
 'P3': 45,
 'PKT': 166,
 'SDY': 0.536585366,
 'SREDNIA': 2.641,
 'SST': 0.674603175,
 'STARTBILANS': 44,
 'STARTMINUS': 41,
 'STARTPLUS': 85,
 'T': 0,
 'TORA': 2.6,
 'TORB': 2.714,
 'TORC': 2.412,
 'TORD': 2.833,
 'U': 0,
 'W': 0,
 'WYJAZD': 2.533,
 'ZAWODNIK': 'Bartosz Zmarzlik',
 'ZW': 0.703125,
 '_id': '62df0c7344fcf4caaa61790a',
 'id': 95,
 'mecze': 13}

>>> json_data['data'][1]
{'BIEGI': 29,
 'BON': 4,
 'D': 0,
 'DOM': 2.5,
 'DYSTBILANS': 1,
 'DYSTMINUS': 4,
 'DYSTPLUS': 5,
 'ELO': 1212,
 'KLUB': 'Grudziądz',
 'Kolumna1': 0.136363636,
 'MSC': 3,
 'P0': 1,
 'P1': 5,
 'P2': 7,
 'P3': 16,
 'PKT': 67,
 'SDY': 0.071428571,
 'SREDNIA': 2.448,
 'SST': 0.75,
 'STARTBILANS': 28,
 'STARTMINUS': 14,
 'STARTPLUS': 42,
 'T': 0,
 'TORA': 2.714,
 'TORB': 2.583,
 'TORC': 2.667,
 'TORD': 1.857,
 'U': 0,
 'W': 1,
 'WYJAZD': 2.364,
 'ZAWODNIK': 'Nicki Pedersen',
 'ZW': 0.551724138,
 '_id': '62df0c7344fcf4caaa61790b',
 'id': 110,
 'mecze': 5}
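
Since the goal was a table, the list under 'data' can be turned into a pandas DataFrame. This is only a minimal sketch: it assumes the response keeps the shape shown above (a list of flat dicts), and the helper name records_to_table and the trimmed sample_records are made up for illustration so the snippet runs without a network call.

```python
import pandas as pd

# Two trimmed records in the shape shown above; in a real run you would
# pass json_data['data'] from the Requests example instead.
sample_records = [
    {'MSC': 1, 'ZAWODNIK': 'Bartosz Zmarzlik', 'KLUB': 'Gorzów', 'PKT': 166, 'SREDNIA': 2.641},
    {'MSC': 3, 'ZAWODNIK': 'Nicki Pedersen', 'KLUB': 'Grudziądz', 'PKT': 67, 'SREDNIA': 2.448},
]


def records_to_table(records, columns=None):
    """Build a DataFrame from a list of dicts, optionally keeping only some columns."""
    df = pd.DataFrame(records)
    if columns is not None:
        df = df[columns]  # select and order the columns of interest
    return df


table = records_to_table(sample_records, columns=['MSC', 'ZAWODNIK', 'KLUB', 'PKT', 'SREDNIA'])
print(table)
```

With the live response this would be records_to_table(json_data['data']), and the columns argument can be left out to keep every field.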

RE: Scrap table from webpage - Luis_liverpool - Jul-26-2022

Wow, looks very good! I have one more question: how did you find this link? Where did you get all the information needed to build it? It would be nice to know :)

RE: Scrap table from webpage - snippsat - Jul-26-2022

(Jul-26-2022, 05:40 PM)Luis_liverpool Wrote: Where did you get all the information needed to build it? It would be nice to know :)

DevTools is useful when you inspect a web page and want to figure out what's going on.
The URL can be found in the Network tab. What I usually use most when scraping is the Elements tab, where you can look at the HTML/CSS and get the correct CSS or XPath selector generated for the chosen tag.
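
A quick way to tell in advance whether a page needs the Network tab approach is to check if the raw HTML the server sends contains any table tag at all. The sketch below uses only the standard library. The TableCounter class, the count_tables helper, and both sample pages are hypothetical, made up to show the idea.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Count <table> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case
        if tag == 'table':
            self.tables += 1


def count_tables(html):
    """Return how many <table> tags appear in the given HTML string."""
    parser = TableCounter()
    parser.feed(html)
    return parser.tables


# A page that ships its table in the HTML, and one that only ships a script
static_page = '<html><body><table><tr><td>1</td></tr></table></body></html>'
js_page = '<html><body><div id="app"></div><script src="main.js"></script></body></html>'

print(count_tables(static_page))  # 1
print(count_tables(js_page))      # 0
```

For a real page you could pass requests.get(url).text to count_tables. A result of 0 suggests the table is built by JavaScript after loading, so pd.read_html on the raw HTML will probably find nothing, and looking for a JSON response in the Network tab is the better route.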

RE: Scrap table from webpage - Luis_liverpool - Jul-26-2022

Wow! It's another thing I need to investigate more deeply, because I don't know anything about that ;) Nevertheless, thanks again for your help, and I hope to see you in my next posts, which should appear in the future ;)

RE: Scrap table from webpage - sharmajaafar - Aug-04-2022

You can check the similar elements feature of clicknium.