Python Forum
Please help me condense this requests snippet, and answer a basic question
#1
I'm having a hard time deciphering the requests module documentation, so is there a way to do this in one statement:

# assume I've imported requests and defined variable url with a string value representing
# a valid url

raw_object = requests.get(url)
extracted_text = raw_object.text
Also, when I use dir() on a requests.Response object, I get a list of attributes and methods. Does that list vary depending on the data retrieved from the webpage accessed by requests.get(), or on the operating system Python is running on, provided that the version of Python is the same?
#2
The object returned by get() is an instance of a class (requests.Response) with methods and attributes.

This is why you have to use the .text or .content attribute to get the source of the page.

One line:

page = requests.get('http://python-forum.io').content
.content is the response body as bytes, while .text is the same body decoded to a Unicode string. The other methods and attributes you see when you call dir() on the response object have quite descriptive names, so you get the picture.
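To make the bytes versus Unicode difference concrete, here is a minimal sketch. It assumes a hand-built Response so it runs without a network connection; setting the private _content attribute is only a testing shortcut and not something requests.get() callers normally do.

```python
import requests

# Build a Response by hand so the example runs offline. Assumption: filling
# the private _content attribute stands in for what requests.get() would
# normally read from the network.
response = requests.Response()
response.status_code = 200
response._content = "caf\u00e9".encode("utf-8")  # raw body as bytes
response.encoding = "utf-8"  # normally taken from the Content-Type header

raw = response.content  # bytes, exactly as received
text = response.text    # str, decoded using response.encoding

print(type(raw).__name__, len(raw))    # byte count
print(type(text).__name__, len(text))  # character count
```

The two lengths differ because the accented character takes two bytes in utf-8 but counts as one character once decoded.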
#3
(Jan-28-2018, 08:43 AM)league55 Wrote: Does that list vary depending on the data retrieved from the webpage accessed by requests.get()
No, the list is always the same; Requests is built and distributed as a package wheel.
The list does not change based on the OS or the external source.
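A quick way to check this yourself, as a rough sketch: compare dir() on two responses that hold different data. The two Response objects below are hand-built offline stand-ins (the private _content attribute is set directly, which is an assumption made only for illustration), and public_names is a hypothetical helper, not part of Requests.

```python
import requests


def public_names(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


# Two hand-built responses with different bodies. Assumption: these stand in
# for what requests.get() would return from two different pages.
html_response = requests.Response()
html_response._content = b"<html><title>Hello</title></html>"

json_response = requests.Response()
json_response._content = b'{"answer": 42}'

same_names = public_names(html_response) == public_names(json_response)
print(same_names)  # True
print(public_names(html_response))
```

The names come from the Response class itself, so they follow the installed version of Requests rather than the page content or the OS.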

A typical example.
Requests usually gets the encoding that the web site uses right.
So sending .content rather than .text to BS is okay (no need to decode twice), as BS will detect that it's utf-8 through its use of Unicode, Dammit.
from bs4 import BeautifulSoup
import requests

url = 'https://www.python.org/'
url_get = requests.get(url)
print(url_get.encoding)
soup = BeautifulSoup(url_get.content, 'lxml')
print(soup.select('head > title')[0].text)
Output:
utf-8
Welcome to Python.org
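The encoding detection on the BS side can be seen without any network access. This is a small sketch under stated assumptions: the HTML is a made-up inline document, and the stdlib 'html.parser' is used instead of 'lxml' so nothing extra needs to be installed.

```python
from bs4 import BeautifulSoup

# A made-up page, encoded to bytes the way response.content would be.
html_bytes = (
    '<html><head><meta charset="utf-8">'
    "<title>Caf\u00e9 menu</title></head><body></body></html>"
).encode("utf-8")

# Bytes in: BeautifulSoup has to work out the encoding itself.
soup_from_bytes = BeautifulSoup(html_bytes, "html.parser")

# str in: already decoded, so there is nothing left to detect.
soup_from_text = BeautifulSoup(html_bytes.decode("utf-8"), "html.parser")

print(soup_from_bytes.original_encoding)  # encoding BeautifulSoup settled on
print(soup_from_text.original_encoding)   # None
print(soup_from_bytes.title.string == soup_from_text.title.string)  # True
```

Either input gives the same parsed title here; passing bytes simply lets BeautifulSoup do the decoding once instead of Requests doing it first.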