hi new at python , trying to get urls from website


snippsat (Administrators), Feb-24-2018, 05:12 PM

Larz60+'s code works for me, though I have not tested it with other links.

Start simple; here is the basic setup.
Error handling and testing can be added afterwards, or left out (as many do in web scraping).

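from bs4 import BeautifulSoup
import requests

url = 'https://www.python.org/'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
# 'a[href]' matches only <a> tags that have an href attribute,
# so link['href'] is never None.
for link in soup.select('a[href]'):
    href = link['href']
    # Keep absolute links only (http:// or https://).
    if href.startswith('http'):
        print(href)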

Output:
https://docs.python.org
https://pypi.python.org/
http://plus.google.com/+Python
http://www.facebook.com/pythonlang?fref=ts
http://twitter.com/ThePSF
http://brochure.getpython.info/
... etc.

This gets the links by using a CSS selector; soup.find_all('a') would work as well.
The filter keeps only the links that start with http.
I have a tutorial here, part-2.
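
For comparison, here is a rough sketch of the find_all variant mentioned above. It is not from the tutorial: the sample HTML, the function name absolute_links and the use of urljoin are my own assumptions for illustration. It parses an inline string with the built-in 'html.parser' so it runs without a network call or lxml, and it resolves relative links against a base URL instead of dropping them.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical sample page, inlined so the sketch runs offline.
SAMPLE_HTML = """
<a href="https://docs.python.org">Docs</a>
<a href="/about/">About</a>
<a name="top">No href here</a>
<a href="http://twitter.com/ThePSF">Twitter</a>
"""


def absolute_links(html, base_url):
    """Return every href in html as an absolute URL, in document order."""
    soup = BeautifulSoup(html, 'html.parser')
    links = []
    # href=True skips <a> tags that have no href attribute at all.
    for tag in soup.find_all('a', href=True):
        # urljoin leaves absolute URLs alone and resolves relative ones.
        links.append(urljoin(base_url, tag['href']))
    return links


links = absolute_links(SAMPLE_HTML, 'https://www.python.org/')
# Same filter as in the post: keep only links that start with http.
http_only = [link for link in links if link.startswith('http')]
print(http_only)
```

With a plain startswith('http') check on the raw href, the /about/ link would have been skipped; whether you want relative links resolved like this depends on what you plan to do with the URLs.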

