Python Forum
Python using BS scraper
#1
Hello, please can someone point me in the right direction?
I have been dabbling in and learning Python with Beautiful Soup and web scraping.
I would like to create a program that can extract multiple web links, e.g.:
<a href="/car_racing/" class="win">1pm</a>
<a href="/car_racing2/" class="win">2pm</a>
<a href="/car_racing3/" class="win">3pm</a>
<a href="/car_racing4/" class="win">4pm</a>

store these in either JSON or CSV or ?? (please advise the best storage to use),
then add the main link to each of these (www.carracing.com/car_racing/profile/), open each link and extract another set of links:

<a href="/profile/" class="win">red</a>
<a href="/profile1/" class="win">blue</a>
<a href="/profile2/" class="win">green</a>
<a href="/profile3/" class="win">white</a>

again store these,
then open each link and extract the data per car driver name, car type, car engine, car make, etc.

then present the data in a readable format.
#2
Look at web-scraping part-1, part-2
from bs4 import BeautifulSoup

html = '''\
<a href="/car_racing/" class="win">1pm</a>
<a href="/car_racing2/" class="win">2pm</a>
<a href="/car_racing3/" class="win">3pm</a>
<a href="/car_racing4/" class="win">4pm</a>'''

soup = BeautifulSoup(html, 'lxml')
Usage:
>>> all_a = soup.find_all('a', class_="win")
>>> all_a
[<a class="win" href="/car_racing/">1pm</a>,
 <a class="win" href="/car_racing2/">2pm</a>,
 <a class="win" href="/car_racing3/">3pm</a>,
 <a class="win" href="/car_racing4/">4pm</a>]

>>> for tag in all_a:
...     print(tag.text)
...     
1pm
2pm
3pm
4pm
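
The question asks for the links themselves rather than the link text, so here is a rough sketch of one way to pull each href, join it to the main address and keep the result as CSV. The base address is only guessed from the question, and the names BASE_URL, extract_links and write_links_csv are made up for this example. It uses the built-in 'html.parser' so nothing beyond bs4 needs installing; 'lxml' as above should work the same way.

```python
import csv
import io
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumption: base address guessed from the question; change it to the real site.
BASE_URL = "https://www.carracing.com/"

html = '''\
<a href="/car_racing/" class="win">1pm</a>
<a href="/car_racing2/" class="win">2pm</a>
<a href="/car_racing3/" class="win">3pm</a>
<a href="/car_racing4/" class="win">4pm</a>'''


def extract_links(html, base_url=BASE_URL):
    """Return one dict per <a class="win"> tag with its text and absolute URL."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tag in soup.find_all("a", class_="win"):
        href = tag.get("href")  # None if the tag has no href attribute
        if href:
            rows.append({
                "label": tag.get_text(strip=True),
                "url": urljoin(base_url, href),  # relative path -> full URL
            })
    return rows


def write_links_csv(rows, fileobj):
    """Write the link rows as CSV to any open text file object."""
    writer = csv.DictWriter(fileobj, fieldnames=["label", "url"])
    writer.writeheader()
    writer.writerows(rows)


links = extract_links(html)

# Written to an in-memory buffer here; with a real file use
# open("links.csv", "w", newline="") instead.
buffer = io.StringIO()
write_links_csv(links, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The same extract_links call could be reused on each downloaded page to collect the profile links in the second step. CSV is probably enough for a flat list like this; JSON may suit better once the per-driver data becomes nested.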