webscraping - failing to extract specific text from data.gov

buran

May-18-2018, 08:38 AM

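import requests
from lxml import html  # doc.cssselect() also needs the separate cssselect package

response = requests.get('https://catalog.data.gov/dataset#sec-organization_type')
response.raise_for_status()  # stop early on an HTTP error
doc = html.fromstring(response.text)
# cssselect() returns a list of matching elements
matches = doc.cssselect('div.new-results')
print(matches[0].text_content().strip())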

Or, using BeautifulSoup with lxml as the parser:

import requests
from bs4 import BeautifulSoup

response = requests.get('https://catalog.data.gov/dataset#sec-organization_type')
soup = BeautifulSoup(response.text, 'lxml')
# find() returns the first matching tag, or None if nothing matches
div = soup.find('div', {'class': 'new-results'})
print(div.text.strip())

Or with a CSS selector via select():

import requests
from bs4 import BeautifulSoup

response = requests.get('https://catalog.data.gov/dataset#sec-organization_type')
soup = BeautifulSoup(response.text, 'lxml')
# select() returns a list of every tag matching the CSS selector
divs = soup.select('div.new-results')
print(divs[0].text.strip())
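
If neither lxml nor BeautifulSoup is installed, roughly the same extraction can be sketched with only the standard library's html.parser. This is a swapped-in alternative, not one of the approaches above: the names DivTextExtractor and extract_div_text are made up for illustration, and it assumes the page still has a div with the class new-results. Unlike the snippets above, it returns None when the div is missing instead of raising an error.

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect the text inside the first <div> carrying a given class.

    Hypothetical helper for illustration; not part of lxml or bs4.
    """

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0       # <div> nesting depth while inside the target div
        self.found = False   # True once the target div has been opened
        self.done = False    # True once the target div has been closed
        self.chunks = []     # text fragments collected inside the target div

    def handle_starttag(self, tag, attrs):
        if self.done or tag != 'div':
            return
        if self.depth:
            # A nested <div> inside the target: track it so that its
            # closing tag is not mistaken for the end of the target div.
            self.depth += 1
        else:
            classes = (dict(attrs).get('class') or '').split()
            if self.css_class in classes:
                self.depth = 1
                self.found = True

    def handle_endtag(self, tag):
        if tag == 'div' and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_div_text(html_text, css_class='new-results'):
    """Return the whitespace-normalised text of the div, or None if absent."""
    parser = DivTextExtractor(css_class)
    parser.feed(html_text)
    parser.close()
    if not parser.found:
        return None
    return ' '.join(''.join(parser.chunks).split())
```

With the response object from the snippets above, this would be called as extract_div_text(response.text); check for None before printing the result.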

Messages In This Thread

webscraping - failing to extract specific text from data.gov - by rontar - May-18-2018, 08:26 AM

RE: webscraping - failing to extract specific text from data.gov - by buran - May-18-2018, 08:38 AM

RE: webscraping - failing to extract specific text from data.gov - by rontar - May-19-2018, 08:01 AM
