How to web scrape this?

***snippsat*** · (This post was last modified: May-28-2021, 12:57 AM by snippsat.)

(May-27-2021, 10:24 PM)Pedroski55 Wrote: So I thought, "I'll webscrape it and save the text!", just as practice.

But, there is no .html or .php just:

Do you see .html or .php often? It's not common to have them in a URL.
On the web, filename extensions do not matter:
the web server maps each URL to whatever content it serves (often an .html file behind the scenes), and the browser also talks to a name server (DNS) to translate the server name into an IP address.
Read more about this.
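As a rough illustration of why the extension does not matter, here is a minimal sketch using only the standard library. The helper names `url_extension` and `looks_like_html` are made up for this example, and the `Content-Type` value passed in is a typical header value chosen for illustration, not one captured from the site.

```python
from posixpath import splitext
from urllib.parse import urlsplit


def url_extension(url):
    """Return the filename extension in the URL path, or '' if there is none."""
    # Only the path part matters; drop a trailing slash before splitting.
    path = urlsplit(url).path.rstrip('/')
    return splitext(path)[1]


def looks_like_html(content_type):
    """Check a Content-Type header value such as 'text/html; charset=UTF-8'."""
    # The media type comes before any ';' parameters (charset and so on).
    media_type = content_type.split(';')[0].strip().lower()
    return media_type == 'text/html'


url = 'https://www.geeksforgeeks.org/difference-between-propositional-logic-and-predicate-logic/'
print(repr(url_extension(url)))                               # no extension in this URL
print(repr(url_extension('https://example.com/index.html')))  # extension present
print(looks_like_html('text/html; charset=UTF-8'))            # illustrative header value
```

With Requests, the real header is available as `response.headers['Content-Type']`, so the same check can be applied to an actual response.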

So scraping it works the same way, as it's just a normal URL.

import requests
from bs4 import BeautifulSoup

url = 'https://www.geeksforgeeks.org/difference-between-propositional-logic-and-predicate-logic/'
response = requests.get(url)
# The 'lxml' parser needs the lxml package installed
soup = BeautifulSoup(response.content, 'lxml')
# Page title
print(soup.select_one('div.title').text)
# First list item in the article body (selector copied from the page, tied to this post id)
print(soup.select_one('#post-564612 > div.text > ol:nth-child(4) > li:nth-child(1)').text)

Output:
Difference between Propositional Logic and Predicate Logic
If x is real, then x2 > 0
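Long selectors like the one above break easily when the site changes its markup, and `select_one` returns `None` when nothing matches, so calling `.text` on the result then raises an `AttributeError`. A slightly more defensive sketch could look like the following. The helper names `fetch_html` and `select_text` are hypothetical, the built-in `html.parser` is swapped in for `lxml` to avoid the extra dependency, and the sample HTML is a made-up fragment, not the real page markup.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and raise an exception on HTTP errors (4xx/5xx)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def select_text(html, css_selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.select_one(css_selector)
    return tag.get_text(strip=True) if tag is not None else None


# Offline demo on a made-up HTML fragment (no network access needed).
sample_html = '<div class="title"> Example title </div><ol><li>First item</li></ol>'
print(select_text(sample_html, 'div.title'))
print(select_text(sample_html, 'ol > li:nth-child(1)'))
print(select_text(sample_html, 'div.missing'))
```

For the real page it would be used as `select_text(fetch_html(url), 'div.title')`, assuming the site still uses that class name.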
