Python Forum
Python SSL web page scraping
#1
I'm using Python 2.7 with BeautifulSoup to scrape web pages, but I keep running into SSL protocol errors that don't make much sense to me. This only happens on the one website I actually need to scrape: https://edd.telstra.com/telstra


The code I'm using for basic testing:
#! /usr/bin/python

from urllib import urlopen
from BeautifulSoup import BeautifulSoup
import re

# Copy all of the content from the provided web page
webpage = urlopen("https://edd.telstra.com/telstra/").read()
And I get the following error (running on Ubuntu 12.10):

Traceback (most recent call last):
  File "e.py", line 8, in <module>
    webpage = urlopen("https://edd.telstra.com/telstra/").read()
  File "/usr/lib/python2.7/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 436, in open_https
    h.endheaders(data)
  File "/usr/lib/python2.7/httplib.py", line 958, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 818, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 780, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 1165, in connect
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
  File "/usr/lib/python2.7/ssl.py", line 381, in wrap_socket
    ciphers=ciphers)
  File "/usr/lib/python2.7/ssl.py", line 143, in __init__
    self.do_handshake()
  File "/usr/lib/python2.7/ssl.py", line 305, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] [Errno 1] _ssl.c:504: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
Could someone tell me whether there is a parameter I need to specify to get this page to download in Python? The problem seems to be specific to this page: the code above (plus lots of other code I tried) works fine on the other HTTPS/SSL pages I tested.

Thanks for any help!
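
One thing that may be worth trying, offered as a sketch rather than a confirmed fix: handshake failures like this are sometimes caused by the client and server disagreeing on the TLS protocol version, and pinning the version through an explicit SSL context can help narrow that down. The example below assumes Python 3.7 or later (not the Python 2.7 used above), where `urllib.request.urlopen` accepts a `context` argument. `build_context` and `fetch` are hypothetical helper names, and none of this has been verified against the site in question.

```python
import ssl
from urllib.request import urlopen


def build_context(min_version=ssl.TLSVersion.TLSv1_2, max_version=None):
    """Return an SSL context limited to an explicit TLS version range.

    Hypothetical helper: the version range to use is an assumption and
    may need adjusting for the server being contacted.
    """
    # create_default_context() keeps certificate and hostname checks enabled.
    context = ssl.create_default_context()
    context.minimum_version = min_version
    if max_version is not None:
        context.maximum_version = max_version
    return context


def fetch(url, context, timeout=10):
    """Download a page with the given SSL context and return the raw bytes."""
    with urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

To try it, build a context pinned to a single version, for example `ctx = build_context(ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_2)`, and then call `fetch("https://edd.telstra.com/telstra/", ctx)`. If the handshake still fails, the cause is probably something other than version negotiation.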


Messages In This Thread
Python SSL web page scraping - by Vadanane - Jan-13-2023, 09:34 AM
RE: Python SSL web page scraping - by snippsat - Jan-13-2023, 04:11 PM
