python selenium downloading embedded pdf

damian0612 · (This post was last modified: Feb-23-2021, 09:11 PM by damian0612.)

I've navigated to a page to download a pdf that is a report showing information I've asked for. However, I can't seem to download it because of the way the information is being displayed. When I inspect the pdf, here is what I see:

<embed id="plugin" type="application/x-google-chrome-pdf" src="https://uk.ixl.com/analytics/students-quickview-pdf" stream-url="chrome-extension://mhjfbmdgcfjbbpaeojofohoefgiehjai/0129a32b-174c-4e06-b6e5-0f332f853591" headers="cache-control: no-cache
cf-cache-status: DYNAMIC
cf-ray: 6263e4d4bdf940f6-LHR
cf-request-id: 08724d58f1000040f6860eb000000001
content-disposition: inline;filename=&quot;IXL-Students-Quickview_2021-02-23.pdf&quot;
content-language: en-GB
content-type: application/pdf
date: Tue, 23 Feb 2021 21:03:30 GMT
expect-ct: max-age=604800, report-uri=&quot;https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct&quot;
link: <https://uk.ixl.com/analytics/students-quickview-pdf>; rel=&quot;canonical&quot;
server: cloudflare
strict-transport-security: max-age=31536000
vary: User-Agent
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-ixl-trace: P1EwUHZMdG 9PaXFvPWJr RmVvMzZjY2 RlZXF4eGVv ZExrTEtPUF pzNndKSTAy YlQzYWFLOV Vzb0VqLzNk akVRaVdzOU 1VenZ4Tmd6 cGVFT21uZn J5ZVpmdnlE Vm4wNkxqdD hRb1g2cWNo bmRZK2xNbm tMazAyNHlp V0JIdnpjWk ttUHpXYi9G YTlkMjdDN2 ZicnF2cldT WitBVStUK3 ZSM2kyRkxo dVUxM256Ni 9EUHhPai9x dHVCUHpneW RrNUpNazJM Z1JSdXpjaU JaU2wwdklL RXF0eGxRaj UyNmMxbVJh bWd5MVZ0Kz VqVGthUHh4 bzU0dTBNc0 JCOU9iV0Jo NGJEZFZQdj FCSzY0ZHU5 dHpiV1g5TD hEMD0=
x-xss-protection: 1; mode=block
" background-color="0xFF525659" top-toolbar-height="0" javascript="allow" full-frame="" pdf-viewer-update-enabled="">

So when I try to download the pdf using urllib.request.urlretrieve(pdfUrl, pdfname) (with pdfUrl set to the src in the code above), I get the error urllib.error.HTTPError: HTTP Error 403: Forbidden. When I load the src URL directly, you can tell from the link that it is a generic page used for all reports, and it just gives me an error. So it seems as if the data is being streamed to the pdf, if I'm reading it right where it says stream-url="chrome-extension://mhjfbmdgcfjbbpaeojofohoefgiehjai/d5f1b753-2bf4-468c-8c35-be6f1629ecc8". Any ideas how I can download this pdf? I might be missing something, but as a home user I'm not that up to speed with python. Thanks.
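One possible explanation for the 403, which the post does not confirm, is that urllib.request.urlretrieve sends neither the login cookies held by the Selenium browser nor a browser-like User-Agent (the response headers above include vary: User-Agent). Below is a minimal sketch of repeating the request with both copied over from the driver. The helper names cookie_header, build_request and download_pdf are made up for illustration, and driver is assumed to be an already logged-in Selenium WebDriver.

```python
import urllib.request


def cookie_header(selenium_cookies):
    """Build a Cookie header value from the list of dicts
    that driver.get_cookies() returns."""
    return "; ".join(f"{c['name']}={c['value']}" for c in selenium_cookies)


def build_request(url, selenium_cookies, user_agent):
    """Create a urllib request that carries the browser's cookies and User-Agent."""
    headers = {
        "Cookie": cookie_header(selenium_cookies),
        "User-Agent": user_agent,
    }
    return urllib.request.Request(url, headers=headers)


def download_pdf(driver, pdf_url, pdf_name):
    """Fetch pdf_url using the Selenium session's identity and save it to pdf_name.

    Assumption: the server only needs the session cookies and a browser
    User-Agent. If the report is generated some other way (for example
    from a form submission), this will still fail.
    """
    user_agent = driver.execute_script("return navigator.userAgent")
    request = build_request(pdf_url, driver.get_cookies(), user_agent)
    with urllib.request.urlopen(request) as response, open(pdf_name, "wb") as out:
        out.write(response.read())
```

After navigating to the report with Selenium, this would be called as download_pdf(driver, pdfUrl, pdfname). If the server still refuses, another route worth trying is to start Chrome with the plugins.always_open_pdf_externally preference set to True, so that Chrome saves the PDF to its download folder instead of showing it in the embedded viewer.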

