Python Forum
Selenium opening pdf in new window - Printable Version

Selenium opening pdf in new window - test - Sep-13-2018

Hello!
I've been trying to use Selenium because I can't figure out requests (kindly help me with requests here).

I am facing the following problem with Selenium:
When trying to download PDF files from multiple links on the same page, it opens the PDF in a different window and throws the following error:
Error:
Message: The element reference of <a href="javascript:__doPostBack('dtgReports$ctl03$ctl01','')"> is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed
I have tried the following with no success:
profile = webdriver.FirefoxProfile()
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", '/home/bunni/Desktop/pat_files')
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")
profile.set_preference("browser.helperApps.neverAsk.openFile", "")
profile.set_preference("pdfjs.disables", True)
profile.set_preference("plugin.scan.Acrobat", "99.0")
profile.set_preference("plugin.scan.plid.all", False)
profile.set_preference("browser.download.manager.useWindow", True)
profile.set_preference("plugin.disable_full_page_plugin_for_types", "application/pdf")

browser = webdriver.Firefox(firefox_profile=profile)
Can someone please help me out here?
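
One thing worth checking in the preferences above: the key "pdfjs.disables" looks like a typo for "pdfjs.disabled", the preference that switches off Firefox's built-in PDF viewer. A misspelled key raises no error, it just has no effect, so the viewer would still open the file. Below is a minimal sketch of a corrected subset of the settings. The helper name apply_pdf_download_prefs is made up for illustration, and whether this alone fixes the download on this particular site is an assumption.

```python
# Preferences commonly used to make Firefox save PDFs without prompting.
# Assumption: "pdfjs.disabled" is the intended key (the post has "pdfjs.disables").
PDF_DOWNLOAD_PREFS = {
    "browser.download.folderList": 2,  # 2 = use the custom directory set below
    "browser.download.manager.showWhenStarting": False,
    "browser.helperApps.neverAsk.saveToDisk": "application/pdf",
    "pdfjs.disabled": True,  # turn off the built-in PDF viewer
}


def apply_pdf_download_prefs(profile, download_dir):
    """Set each preference on any object exposing set_preference(key, value)."""
    profile.set_preference("browser.download.dir", download_dir)
    for key, value in PDF_DOWNLOAD_PREFS.items():
        profile.set_preference(key, value)
    return profile


# Usage with the Selenium 3 API from the post (needs selenium and geckodriver):
# from selenium import webdriver
# profile = apply_pdf_download_prefs(webdriver.FirefoxProfile(),
#                                    '/home/bunni/Desktop/pat_files')
# browser = webdriver.Firefox(firefox_profile=profile)
```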


RE: Selenium opening pdf in new window - Larz60+ - Sep-13-2018

Quote:You will have to send it via browser action chaining only, as Selenium opens an anonymous session every time. So, send the click command like this.

from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

ActionChains(driver).send_keys(Keys.COMMAND, "t").perform()

from: https://stackoverflow.com/questions/31023672/selenium-open-firefox-links-in-new-tabs-not-new-window-python


RE: Selenium opening pdf in new window - test - Sep-13-2018

Hello Larz60+, thank you for your time.
The problem I am facing is that the file I need is behind a link of this type:
[html]
<a href="javascript:__doPostBack('xyzz$ctl02$ctl01','')" style="color:Blue;">View</a>
[/html]
Opening it in a new tab doesn't seem to work.
Also, set_preference() doesn't seem to work every time. I understand that some preferences are frozen. Is there any way to unfreeze them and set them?
I have tried the following with no success:
profile.DEFAULT_PREFERENCES['frozen']['browser.link.open_newwindow'] = 2
Kindly advise.

I can use a partial workaround of this type:
1) Click the link
2) Let it open in a new window
3) Save it
4) Close the window and go back to the page with links

For the above, I have tried the following code:
for link in browser.find_elements_by_link_text('View'):
	curr_handle = browser.current_window_handle
	link.click()
	new_handle = browser.window_handles[-1]
	browser.switch_to_window(new_handle)
	# need to save this file
	browser.close()  # this call throws an error if I try the following next
	browser.switch_to_window(curr_handle)
As setting preferences is not working for some reason, I am looking for other ways to save this file. Is there a way to do this that is analogous to writing the response to a file in binary mode when downloading with the requests module?
Any help is welcome :)
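
On the requests question: a link of the form javascript:__doPostBack(target, argument) is the standard ASP.NET WebForms postback. It fills the hidden fields __EVENTTARGET and __EVENTARGUMENT and submits the page's form together with its other hidden inputs (such as __VIEWSTATE). A rough sketch of rebuilding that form data from the page HTML follows. The names HiddenInputParser and build_postback_payload are made up for illustration, and whether this site returns the PDF directly in the response to such a POST is an assumption that would need checking.

```python
import re
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_postback_payload(page_html, href):
    """Build the form data that clicking a __doPostBack link would submit."""
    match = re.search(r"__doPostBack\('([^']*)','([^']*)'\)", href)
    if match is None:
        raise ValueError("not a __doPostBack link: %r" % href)
    parser = HiddenInputParser()
    parser.feed(page_html)
    payload = dict(parser.fields)
    payload["__EVENTTARGET"] = match.group(1)
    payload["__EVENTARGUMENT"] = match.group(2)
    return payload


# Possible use with requests (untested against this site; page_url is the
# address of the page that contains the links):
# session = requests.Session()
# for cookie in browser.get_cookies():
#     session.cookies.set(cookie["name"], cookie["value"])
# payload = build_postback_payload(browser.page_source, link.get_attribute("href"))
# response = session.post(page_url, data=payload)
# with open("report.pdf", "wb") as f:
#     f.write(response.content)
```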


RE: Selenium opening pdf in new window - test - Sep-14-2018

(Sep-13-2018, 02:08 PM)test Wrote: For the above, I have tried the following code:
for link in browser.find_elements_by_link_text('View'):
	curr_handle = browser.current_window_handle
	link.click()
	new_handle = browser.window_handles[-1]
	browser.switch_to_window(new_handle)
	# need to save this file
	browser.close()  # this call throws an error if I try the following next
	browser.switch_to_window(curr_handle)

Apparently this is not working because the page might have been refreshed by the time the new window was closed. A suggested workaround is to rebuild the link list with find_elements_by_link_text('linktext') inside the loop.
I'm still having trouble downloading the PDF, though. I would like to write it to disk as binary instead of having the browser throw dialog boxes at me. Kindly advise :)
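
The workaround of rebuilding the link list inside the loop could look roughly like the sketch below. It looks the links up again on every pass and picks them by index, so no element reference survives a page refresh. The function name visit_each_link and the handle_window callback are made up for illustration. It also assumes the new window is already open when click() returns, which a real script may need to wait for.

```python
def visit_each_link(browser, link_text, handle_window):
    """Click every link with the given text, re-finding the links each time.

    browser is a Selenium WebDriver; handle_window is a hypothetical callback
    that is called while the newly opened window has focus.
    """
    main_handle = browser.current_window_handle
    index = 0
    while True:
        # Look the links up again: references from before the refresh are stale.
        # "link text" is the value of selenium's By.LINK_TEXT.
        links = browser.find_elements("link text", link_text)
        if index >= len(links):
            break
        links[index].click()
        # Visit and close every window other than the main one.
        for handle in list(browser.window_handles):
            if handle != main_handle:
                browser.switch_to.window(handle)
                handle_window(browser)
                browser.close()
        browser.switch_to.window(main_handle)
        index += 1
    return index
```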

Hello! I'm looking for a solution to this:
Quote:I'm still having trouble downloading the PDF, though. I would like to write it to disk as binary instead of having the browser throw dialog boxes at me. Kindly advise :)
I found the following here
Quote:It may not be the most elegant solution, but what worked for me, eventually, was to simply write each byte, one by one, like this:
f = open('report.xls', 'wb')
for uchar in driver.page_source:
    f.write(bytearray([ord(uchar)]))
f.close()
This produced a working Excel file, which I could then open in libreoffice et al.

But when I try to write the google.com homepage using this code, it produces the following error:
Error:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
ValueError: byte must be in range(0, 256)
Kindly advise :)
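
The ValueError comes from the ord() trick itself: page_source is a str of Unicode characters, and any character with a code point above 255 cannot be stored in a single byte. A small illustration with a made-up string (the function name write_char_by_char is only for this example):

```python
def write_char_by_char(text):
    """Mimic the quoted loop: one byte per character via ord()."""
    out = bytearray()
    for ch in text:
        out += bytearray([ord(ch)])  # ValueError if ord(ch) > 255
    return bytes(out)


ascii_only = write_char_by_char("abc")  # fine, every code point is below 256

try:
    write_char_by_char("price: \u20ac5")  # the euro sign is code point 8364
    failed = False
except ValueError:
    failed = True

# Encoding the text is the usual way to turn a str into bytes.
encoded = "price: \u20ac5".encode("utf-8")
```

Encoding works for text such as HTML, but it does not turn page_source into the original binary file.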


RE: Selenium opening pdf in new window - test - Sep-15-2018

Okay, so the type of browser.page_source is str.
I have tried the following to convert it into binary and write it as a PDF file, but with no success:
from selenium import webdriver

browser = webdriver.Firefox()
page = browser.get('http://learningstorm.org/wp-content/uploads/2016/02/RMEOJLET-1.pdf')
f = open('try.pdf', 'wb')
a = map(bin, bytearray(browser.page_source, 'utf16'))
for i in a:
    f.write(i)
f.close()
There are no errors while writing the file, but the PDF doesn't open. The error message is:
Error:
File type STL 3D model (binary) (model/x.stl-binary) is not supported
I'm stuck here. Any help is welcome :)

EDIT:
Passing the map object to list() and trying to iterate over each byte returns the following error:
Error:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: a bytes-like object is required, not 'str'
What am I doing wrong?
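
A likely cause of the TypeError: bin() returns a str such as '0b1010', not a byte, so the items produced by map(bin, ...) cannot be written to a file opened in 'wb' mode. The bytearray is already binary data and can be written as it is. A small illustration, using io.BytesIO as a stand-in for the opened file and a made-up string:

```python
import io

data = bytearray("caf\u00e9", "utf-16")  # already a sequence of byte values
as_text = [bin(b) for b in data]         # each item is a str such as '0b11111111'

buffer = io.BytesIO()                    # stands in for open('try.pdf', 'wb')
try:
    buffer.write(as_text[0])             # writing a str to a binary file fails
    wrote_text = True
except TypeError:
    wrote_text = False

buffer.write(bytes(data))                # bytes can be written directly
```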

Update:
Viewing the page_source of this PDF link, it looks like an HTML page and not a PDF file!
What exactly is happening here?
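
A probable explanation: Firefox shows PDFs in its built-in viewer (pdf.js), which is itself an HTML page, so page_source returns the viewer's markup and not the file's bytes. To get the real bytes, the URL has to be fetched outside the rendered page. Below is a minimal sketch using only the standard library. The names looks_like_pdf and download_pdf are made up for illustration, and it assumes the URL is reachable without the browser's cookies or login.

```python
import urllib.request


def looks_like_pdf(data):
    """PDF files begin with the ASCII marker %PDF-."""
    return data[:5] == b"%PDF-"


def download_pdf(url, path):
    """Fetch the raw bytes of url and save them only if they look like a PDF."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    if not looks_like_pdf(data):
        raise ValueError("response does not look like a PDF")
    with open(path, "wb") as handle:
        handle.write(data)
    return len(data)
```

The same check can be applied to any file written earlier in this thread: if it starts with <html or similar instead of %PDF-, it is the viewer page and not the document.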


RE: Selenium opening pdf in new window - test - Sep-15-2018

Here I found the following code:
page_src = br.page_source.encode("utf-8")  # support unicode characters
f = open('page.html', 'wb')  # binary mode, because encode() returns bytes
f.write(page_src)
f.close()
but the encoding seems to be different from what a PDF file needs. I get a 'file type not supported' error when I try to view the downloaded file.
I think I'm doing something wrong... downloading a file shouldn't be so difficult.
Any help is welcome.


RE: Selenium opening pdf in new window - test - Sep-15-2018

There seems to be a bug in the Firefox driver that keeps user preferences from being updated. I tried the Chrome driver, and that works.
Hope this helps someone else too!
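
The post does not say which Chrome settings were used. As an assumption, a setup along the following lines is a common way to make Chrome save PDFs to a folder without opening its viewer. The helper name chrome_pdf_download_prefs and the download path are made up for illustration.

```python
def chrome_pdf_download_prefs(download_dir):
    """Chrome preferences for saving PDFs to download_dir without prompting."""
    return {
        "download.default_directory": download_dir,
        "download.prompt_for_download": False,
        # Download PDFs instead of opening them in Chrome's built-in viewer.
        "plugins.always_open_pdf_externally": True,
    }


# Usage (needs selenium and chromedriver; the path is a placeholder):
# from selenium import webdriver
# options = webdriver.ChromeOptions()
# options.add_experimental_option("prefs", chrome_pdf_download_prefs("/path/to/downloads"))
# browser = webdriver.Chrome(options=options)
```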