Python Forum
webscraping - failing to extract specific text from data.gov
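from lxml import html
import requests

# The '#sec-organization_type' fragment is never sent to the server; it is kept here as posted.
response = requests.get('https://catalog.data.gov/dataset#sec-organization_type')
doc = html.fromstring(response.text)
# cssselect() needs the separate cssselect package and returns a list of matching elements.
link = doc.cssselect('div.new-results')
print(link[0].text_content().strip())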

Or, using BeautifulSoup with lxml as the parser:

import requests
from bs4 import BeautifulSoup

response = requests.get('https://catalog.data.gov/dataset#sec-organization_type')
soup = BeautifulSoup(response.text, 'lxml')
# find() returns the first matching tag, or None if there is no match.
div = soup.find('div', {'class':'new-results'})
print(div.text.strip())
Or, using BeautifulSoup's select() with a CSS selector:

import requests
from bs4 import BeautifulSoup

response = requests.get('https://catalog.data.gov/dataset#sec-organization_type')
soup = BeautifulSoup(response.text, 'lxml')
# select() takes a CSS selector and returns a list of matching tags.
div = soup.select('div.new-results')
print(div[0].text.strip())
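
All three snippets above assume the page contains a div with class new-results and will raise an error if it does not. As a rough sketch only, the same extraction can be tried with just the standard library's html.parser, returning None when the div is missing. This swaps in a different technique from the lxml and BeautifulSoup calls above; the class name NewResultsParser, the function extract_new_results and the sample markup are all made up for illustration and are not taken from the real data.gov page.

```python
from html.parser import HTMLParser


class NewResultsParser(HTMLParser):
    """Collect the text inside the first <div class="new-results">."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # nesting depth of <div> tags while inside the target div
        self.found = False   # True once the target div has been opened
        self.done = False    # True once the target div has been closed
        self.chunks = []     # text fragments seen inside the target div

    def handle_starttag(self, tag, attrs):
        if tag != 'div' or self.done:
            return
        if self.depth:
            self.depth += 1  # a nested div inside the target
        elif 'new-results' in (dict(attrs).get('class') or '').split():
            self.depth = 1   # entering the target div
            self.found = True

    def handle_endtag(self, tag):
        if tag == 'div' and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_new_results(html_text):
    """Return the stripped text of div.new-results, or None if it is absent."""
    parser = NewResultsParser()
    parser.feed(html_text)
    parser.close()
    if not parser.found:
        return None
    return ''.join(parser.chunks).strip()


# Hypothetical sample markup, not the real data.gov page.
sample = '<div class="new-results"> <span>3</span> datasets found </div>'
print(extract_new_results(sample))
```

To use it on the live page, pass response.text from any of the requests.get() calls above to extract_new_results() and check for None before printing.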


Messages In This Thread
RE: webscraping - failing to extract specific text from data.gov - by buran - May-18-2018, 08:38 AM
