Python Forum
hi new at python , trying to get urls from website
#6
Larz60+'s code works for me, though I have not tested it with other links.

Start simple; here is the basic setup.
You can then add error handling and testing, or not (many people skip it in web scraping).
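from bs4 import BeautifulSoup
import requests

url = 'https://www.python.org/'
# Download the page; .content is the raw bytes of the response body
url_get = requests.get(url)
# Parse with the lxml parser (needs the lxml package installed)
soup = BeautifulSoup(url_get.content, 'lxml')
# Select every <a> tag, then print only the absolute http(s) links
for link in soup.select('a'):
    href = link.get('href')
    # .get() returns None when a tag has no href, so check before startswith()
    if href and href.startswith('http'):
        print(href)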
Output:
https://docs.python.org
https://pypi.python.org/
http://plus.google.com/+Python
http://www.facebook.com/pythonlang?fref=ts
http://twitter.com/ThePSF
http://brochure.getpython.info/
... etc.
So this gets the links by using a CSS selector; soup.find_all('a') would have worked as well.
The filter keeps only the links that start with http.
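
To make the select / find_all comparison concrete, here is a small sketch that runs both on the same markup and applies the http filter. The HTML string and the helper name absolute_links are made up for illustration, and it uses the built-in 'html.parser' so it runs without lxml installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, standing in for a downloaded page
html = """
<a href="https://docs.python.org">Docs</a>
<a href="/downloads/">Downloads</a>
<a name="top">No href</a>
<a href="http://twitter.com/ThePSF">Twitter</a>
"""

soup = BeautifulSoup(html, 'html.parser')


def absolute_links(tags):
    """Keep only href values that start with http (illustrative helper)."""
    links = []
    for tag in tags:
        href = tag.get('href')
        # Skip tags without an href and relative links such as /downloads/
        if href and href.startswith('http'):
            links.append(href)
    return links


via_select = absolute_links(soup.select('a'))      # CSS selector
via_find_all = absolute_links(soup.find_all('a'))  # tag name search
print(via_select)
```

Both calls return the same tags in document order here, so which one to use for a plain tag lookup is mostly a matter of taste.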
I have a tutorial here, part-2
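
As for the error handling mentioned earlier, one possible shape is sketched below. The function name fetch_html and the 10 second timeout are my own choices for illustration, not something from the code above.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as bytes, or None if the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        # Raise requests.HTTPError for 4xx/5xx status codes
        response.raise_for_status()
    except requests.RequestException as error:
        # RequestException is the base class for connection errors,
        # timeouts, invalid URLs and the HTTPError raised above
        print(f'Request failed: {error}')
        return None
    return response.content
```

Calling fetch_html('https://www.python.org/') would then give either the bytes to pass to BeautifulSoup or None, so the parsing code can skip a page that failed to download.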


Messages In This Thread
RE: hi new at python , trying to get urls from website - by snippsat - Feb-24-2018, 05:12 PM

