Scraping a Website (HELP)

LearnPython2 · May-08-2020, 10:09 AM

Hi, I need to extract all the links on a website.

I want to build a crawler similar to Screaming Frog, but in Python.

This is my code:

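import urllib.request

from bs4 import BeautifulSoup

# Download the page and decode the response bytes to text
data = urllib.request.urlopen('https://consultarsimit.co').read().decode()

# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(data, 'html.parser')
tags = soup('a')
for tag in tags:
    print(tag.get('href'))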

How can I save the links in a database and query them with multi-threading?

Thanks!

**Larz60+** · (This post was last modified: May-08-2020, 03:20 PM by Larz60+.)

Use requests rather than urllib.request (it needs to be installed first: pip install requests).
Follow snippsat's tutorials here:
web scraping part 1
web scraping part 2

Then, to get a list of all links:

linklist = soup.find_all('a')
for link in linklist:
    print(f"{link.get('href')}")
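
For the database and multi-threading part of the question, here is a rough sketch of one possible approach, not a finished crawler. It uses only the standard library (html.parser in place of BeautifulSoup, sqlite3 as the database, concurrent.futures for the threads) so that it runs without extra installs. The names LinkParser, extract_links, fetch and crawl, the links table layout and the links.db filename are all made up for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag (stand-in for soup.find_all('a'))."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip <a> tags that have no href
                self.hrefs.append(href)


def extract_links(html, base_url):
    """Return absolute URLs for all links found in html."""
    parser = LinkParser()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


def fetch(url):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode(errors="replace")


def crawl(urls, db_path="links.db", fetcher=fetch, workers=4):
    """Fetch urls in parallel threads and store every link found."""

    def job(url):
        return url, extract_links(fetcher(url), url)

    # Worker threads only download and parse. The database is written
    # from the calling thread, because a sqlite3 connection is not
    # shared across threads by default.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(job, urls))

    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links ("
        "page TEXT, href TEXT, UNIQUE(page, href))"
    )
    for page, links in results:
        # INSERT OR IGNORE keeps a repeated link from being stored twice
        conn.executemany(
            "INSERT OR IGNORE INTO links VALUES (?, ?)",
            [(page, link) for link in links],
        )
    conn.commit()
    return conn
```

Calling crawl(['https://consultarsimit.co']) would return an open connection, and conn.execute("SELECT href FROM links").fetchall() would then list the stored links. To crawl deeper, the stored links could be passed back into crawl() as the next batch of URLs, ideally with a filter that keeps the crawl on the same domain.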
