Python Forum
python selenium downloading embedded pdf
#1
I've navigated to a page to download a pdf that is a report showing information I've asked for. However, I can't seem to download it because of the way the information is being displayed. When I inspect the pdf, here is what I see:
<embed id="plugin" type="application/x-google-chrome-pdf" src="https://uk.ixl.com/analytics/students-quickview-pdf" stream-url="chrome-extension://mhjfbmdgcfjbbpaeojofohoefgiehjai/0129a32b-174c-4e06-b6e5-0f332f853591" headers="cache-control: no-cache
cf-cache-status: DYNAMIC
cf-ray: 6263e4d4bdf940f6-LHR
cf-request-id: 08724d58f1000040f6860eb000000001
content-disposition: inline;filename=&quot;IXL-Students-Quickview_2021-02-23.pdf&quot;
content-language: en-GB
content-type: application/pdf
date: Tue, 23 Feb 2021 21:03:30 GMT
expect-ct: max-age=604800, report-uri=&quot;https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct&quot;
link: <https://uk.ixl.com/analytics/students-quickview-pdf>; rel=&quot;canonical&quot;
server: cloudflare
strict-transport-security: max-age=31536000
vary: User-Agent
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-ixl-trace: P1EwUHZMdG 9PaXFvPWJr RmVvMzZjY2 RlZXF4eGVv ZExrTEtPUF pzNndKSTAy YlQzYWFLOV Vzb0VqLzNk akVRaVdzOU 1VenZ4Tmd6 cGVFT21uZn J5ZVpmdnlE Vm4wNkxqdD hRb1g2cWNo bmRZK2xNbm tMazAyNHlp V0JIdnpjWk ttUHpXYi9G YTlkMjdDN2 ZicnF2cldT WitBVStUK3 ZSM2kyRkxo dVUxM256Ni 9EUHhPai9x dHVCUHpneW RrNUpNazJM Z1JSdXpjaU JaU2wwdklL RXF0eGxRaj UyNmMxbVJh bWd5MVZ0Kz VqVGthUHh4 bzU0dTBNc0 JCOU9iV0Jo NGJEZFZQdj FCSzY0ZHU5 dHpiV1g5TD hEMD0=
x-xss-protection: 1; mode=block
" background-color="0xFF525659" top-toolbar-height="0" javascript="allow" full-frame="" pdf-viewer-update-enabled="">
So when I try to download the PDF using urllib.request.urlretrieve(pdfUrl, pdfname) (with pdfUrl set to the src in the code above), I get the error urllib.error.HTTPError: HTTP Error 403: Forbidden. When I load the source page directly, you can tell from the link that it is a generic page for all reports, and it actually just gives you an error. So it seems as if the data is being streamed to the PDF, if I'm reading it right where it says stream-url="chrome-extension://mhjfbmdgcfjbbpaeojofohoefgiehjai/d5f1b753-2bf4-468c-8c35-be6f1629ecc8". Any ideas how I can download this page? I might be missing something, but as a home user I'm not all that up to speed with Python. Thanks.
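One thing that might be worth trying, as a rough sketch only: urllib.request.urlretrieve starts a fresh connection that knows nothing about the logged-in Selenium browser, so the 403 could simply be the site (or Cloudflare in front of it) rejecting a request with no session cookies and the default Python user agent. That cause is an assumption, not something confirmed by the headers above. The helper names build_pdf_request and download_pdf below are made up for illustration; driver.get_cookies() is the Selenium call that returns the browser's cookies as a list of dicts.

```python
import shutil
import urllib.request


def build_pdf_request(pdf_url, selenium_cookies, user_agent):
    """Build a urllib Request that reuses the browser's session cookies.

    selenium_cookies is the list of dicts returned by driver.get_cookies();
    only the 'name' and 'value' keys are used here.
    """
    cookie_header = "; ".join(
        f"{c['name']}={c['value']}" for c in selenium_cookies
    )
    return urllib.request.Request(
        pdf_url,
        headers={"Cookie": cookie_header, "User-Agent": user_agent},
    )


def download_pdf(request, pdf_name):
    """Stream the response body to pdf_name on disk."""
    with urllib.request.urlopen(request) as response, open(pdf_name, "wb") as out:
        shutil.copyfileobj(response, out)


# Hypothetical usage with an already logged-in Selenium driver, run right
# after the report has been displayed in the browser:
# cookies = driver.get_cookies()
# agent = driver.execute_script("return navigator.userAgent")
# req = build_pdf_request(pdfUrl, cookies, agent)
# download_pdf(req, pdfname)
```

If the server builds the report from state stored in the session, the request probably has to be made while that session is still valid. If the URL still returns an error page even with the cookies attached, this approach will not help and the PDF would have to be saved through the browser itself.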


