Scraping a Website (HELP) - Python Forum (https://python-forum.io)
Scraping a Website (HELP) - LearnPython2 - May-08-2020

Hi,

I need to extract all the links on a website. I want to crawl a site the way Screaming Frog does, but with Python. This is my code:

    import urllib.request
    from bs4 import BeautifulSoup

    # Download the page and decode the response bytes to text
    data = urllib.request.urlopen('https://consultarsimit.co').read().decode()

    # Name the parser explicitly; otherwise bs4 warns and picks one itself
    soup = BeautifulSoup(data, 'html.parser')

    # soup('a') is shorthand for soup.find_all('a')
    tags = soup('a')
    for tag in tags:
        print(tag.get('href'))

How can I save the links in a database and query them with multi-threading?

Thanks!

RE: Scraping a Website (HELP) - Larz60+ - May-08-2020

Use requests rather than urllib.request (it needs to be installed: pip install requests).

Follow snippsat's tutorials here: web scraping part 1, web scraping part 2.

Then, to get a list of all links:

    linklist = soup.find_all('a')
    for link in linklist:
        print(link.get('href'))
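
The reply above does not cover the database and multi-threading part of the question, so here is a rough sketch of one way it could be done. It assumes that "query them with multi-threading" means fetching several pages in parallel, and it picks SQLite (the standard-library sqlite3 module) as the database, which the thread never names. The function names fetch_links, save_links and crawl, the links table and its columns, and the 10-second timeout are all made up for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin


def fetch_links(url):
    """Download one page and return the absolute URLs of its <a> tags."""
    # Third-party imports are kept local so the storage code below
    # can be used (and tested) without requests or bs4 installed.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # href=True skips <a> tags that have no href attribute
    return [urljoin(url, a["href"]) for a in soup.find_all("a", href=True)]


def save_links(conn, page_url, links):
    """Insert (page, href) pairs, ignoring pairs that are already stored."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO links (page, href) VALUES (?, ?)",
            [(page_url, href) for href in links],
        )


def crawl(urls, conn, fetch=fetch_links, workers=4):
    """Fetch pages in worker threads; write to SQLite from the calling thread."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links "
        "(page TEXT, href TEXT, UNIQUE(page, href))"
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as urls
        for page_url, links in zip(urls, pool.map(fetch, urls)):
            save_links(conn, page_url, links)
```

To try it, something like crawl(['https://consultarsimit.co'], sqlite3.connect('links.db')) should fill the table, and SELECT DISTINCT href FROM links then lists what was found. Only the downloads run in threads; all database writes stay in the calling thread, because a sqlite3 connection by default may only be used from the thread that created it. This handles a fixed list of pages, so a full Screaming Frog style crawl would still need to feed newly found links back in as further pages to fetch.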