Python Forum

Full Version: Requests module get() incomplete download
With requests.get(), the downloaded file is only 3/4 the size of the file I get by opening the page in Firefox and doing a "Save As". It also lacks a data item that is visible when viewing the page on screen.
What exactly do you expect, without showing any code?
Your reproach is justified. I understand nothing about Web communications, and do not know where to start. For what it is worth, I took a sample case. I downloaded a page with Requests using this program:
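#!/usr/bin/python
import requests

url = 'https://www.wsj.com/market-data/quotes/IBM/financials/annual/balance-sheet'
r = requests.get(url, allow_redirects=True)
r.raise_for_status()  # stop here if the server returned an HTTP error status
# Write the raw response bytes; the with-block closes the file afterwards.
with open('ibmrqst.html', 'wb') as f:
    f.write(r.content)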
Then I opened the page in Firefox, right-clicked on it, and did "Save Page As..." twice: once with "Web Page, complete" and once with "Web Page, HTML only". This produced the following:
Requests download:
-rw-r--r-- 1 boba boba 710985 Feb 17 11:19 ibmrqst.html
Firefox download complete:
-rw-rw-r-- 1 boba boba 527413 Feb 17 11:21 ibmffx.html
drwxr-xr-x 2 boba boba 4096 Feb 17 11:21 ibmffx_files
Firefox download HTML only:
-rw-rw-r-- 1 boba boba 414414 Feb 17 11:23 ibmffxhonly.html
I do not know whether any significant values on the page are missing or not; it is just that the sizes of the files are different, even though they are all HTML. I had hoped there was a generally known explanation for such cases.
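One way to find out whether a particular value is missing, rather than comparing file sizes alone, is to search each saved file for it. The following is only a minimal sketch: the helper name report and the search string 'Total Assets' are placeholders I made up for illustration, and the file names are the ones from the listing above. Substitute a value you can actually see on the page in the browser.

```python
import os


def report(paths, needle):
    """Return {path: (size_in_bytes, needle_found)} for each saved file."""
    results = {}
    for path in paths:
        with open(path, 'rb') as f:
            data = f.read()
        # Compare as bytes so no guess about the file's text encoding is needed
        # (the needle itself is assumed to be encodable as UTF-8).
        results[path] = (len(data), needle.encode('utf-8') in data)
    return results


if __name__ == '__main__':
    # File names from the listing above; files that do not exist are skipped.
    candidates = ['ibmrqst.html', 'ibmffx.html', 'ibmffxhonly.html']
    files = [p for p in candidates if os.path.exists(p)]
    # 'Total Assets' is a hypothetical placeholder, not a value taken from the page.
    for path, (size, found) in report(files, 'Total Assets').items():
        print(f'{path}: {size} bytes, found={found}')
```

If the value turns up in the Firefox copies but not in the Requests copy, the size difference reflects content that is really missing and not just different markup.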