Python using BS scraper - Python Forum (https://python-forum.io), Web Scraping & Web Development (/thread-24300.html)
Python using BS scraper - paulfearn100 - Feb-07-2020

Hello, please can someone point me in the right direction? I have been dabbling and learning Python with Beautiful Soup and web scraping. I would like to create a program that can extract multiple web links, e.g.

    <a href="/car_racing/" class="win">1pm</a>
    <a href="/car_racing2/" class="win">2pm</a>
    <a href="/car_racing3/" class="win">3pm</a>
    <a href="/car_racing4/" class="win">4pm</a>

store these in either JSON or CSV or ?? (please advise the best storage to use), then add the main link onto each of them (www.carracing.com/car_racing/profile/), open each link and extract another set of links:

    <a href="/profile/" class="win">red</a>
    <a href="/profile1/" class="win">blue</a>
    <a href="/profile2/" class="win">green</a>
    <a href="/profile3/" class="win">white</a>

Again store these, then open each link and extract the data per car (driver name, car type, car engine, car make, etc.), then present the data in a readable format.

RE: Python using BS scraper - snippsat - Feb-07-2020

Look at web-scraping part-1, part-2

    from bs4 import BeautifulSoup

    html = '''\
    <a href="/car_racing/" class="win">1pm</a>
    <a href="/car_racing2/" class="win">2pm</a>
    <a href="/car_racing3/" class="win">3pm</a>
    <a href="/car_racing4/" class="win">4pm</a>'''

    soup = BeautifulSoup(html, 'lxml')

Usage:

    >>> all_a = soup.find_all('a', class_="win")
    >>> all_a
    [<a class="win" href="/car_racing/">1pm</a>, <a class="win" href="/car_racing2/">2pm</a>, <a class="win" href="/car_racing3/">3pm</a>, <a class="win" href="/car_racing4/">4pm</a>]
    >>> for tag in all_a:
    ...     print(tag.text)
    ...
    1pm
    2pm
    3pm
    4pm
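
To go one step further toward what the question asks (pull out the href of each link, add the main site address, and store the result), something like the following might work. It is only a sketch under a few assumptions: the base URL `https://www.carracing.com/` is guessed from the address in the question (the `https://` scheme is added here), the helper names `extract_links` and `links_to_csv` are made up for illustration, and the built-in `html.parser` is used instead of `lxml` so no extra install is needed.

```python
import csv
import io
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumption: site root guessed from the question; the scheme is added here.
BASE_URL = "https://www.carracing.com/"

html = '''\
<a href="/car_racing/" class="win">1pm</a>
<a href="/car_racing2/" class="win">2pm</a>
<a href="/car_racing3/" class="win">3pm</a>
<a href="/car_racing4/" class="win">4pm</a>'''


def extract_links(html, base_url):
    """Return (label, absolute_url) pairs for every <a class="win"> tag."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        # urljoin turns the site-relative href into a full URL.
        (tag.get_text(strip=True), urljoin(base_url, tag["href"]))
        for tag in soup.find_all("a", class_="win", href=True)
    ]


def links_to_csv(links):
    """Serialise the pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "url"])
    writer.writerows(links)
    return buffer.getvalue()


links = extract_links(html, BASE_URL)
print(links_to_csv(links))
```

CSV is probably enough for flat label/URL pairs like these; JSON may suit the later per-car data better if it ends up nested. To write to disk, the same `csv.writer` can be given a file opened with `open("links.csv", "w", newline="")` instead of the `StringIO` buffer. Each stored URL could then be fetched (for example with `requests`) and passed through `extract_links` again to collect the profile links.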