Python Forum
Unable to scrape more than one URL with this code
#1
Solved!

Hi, I am trying to scrape a directory, but my code only scrapes the first URL and throws an error for the URLs that follow. I've checked my URLs and all of them are okay. I've also swapped the order of the URLs, and the result is the same: the first URL is scraped fine and then there is an error. Can someone please have a look at my code and see where I am going wrong?

[Deleted code for privacy]
#2
What is the error it is giving you?
#3
(Nov-18-2020, 11:56 AM)Anarab Wrote: What is the error it is giving you?

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
    conn = connection.create_connection(
  File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
    raise err
  File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1010, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 950, in send
    self.connect()
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect
    conn = self._new_conn()
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 171, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f82d011f4c0>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "working.py", line 37, in <module>
    driver.get(url)
  File "/home/sam/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
    self.execute(Command.GET, {'url': url})
  File "/home/sam/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 319, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/home/sam/.local/lib/python3.8/site-packages/selenium/webdriver/remote/remote_connection.py", line 374, in execute
    return self._request(command_info[0], url, body=data)
  File "/home/sam/.local/lib/python3.8/site-packages/selenium/webdriver/remote/remote_connection.py", line 397, in _request
    resp = self._conn.request(method, url, body=body, headers=headers)
  File "/usr/lib/python3.8/site-packages/urllib3/request.py", line 79, in request
    return self.request_encode_body(
  File "/usr/lib/python3.8/site-packages/urllib3/request.py", line 171, in request_encode_body
    return self.urlopen(method, url, **extra_kw)
  File "/usr/lib/python3.8/site-packages/urllib3/poolmanager.py", line 336, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 754, in urlopen
    return self.urlopen(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 754, in urlopen
    return self.urlopen(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 754, in urlopen
    return self.urlopen(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.8/site-packages/urllib3/util/retry.py", line 439, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=58379): Max retries exceeded with url: /session/75bb5752c87e15845e4617c66fc615db/url (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f82d011f4c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
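
The MaxRetryError points at 127.0.0.1:58379, the local WebDriver server rather than the site being scraped, so by the second driver.get(url) the browser session is already gone. With the original code removed this is only a guess, but the usual cause is calling driver.quit() or driver.close() inside the loop. Below is a minimal sketch of the pattern that avoids it, with placeholder URLs and a placeholder selector: one driver is reused for every URL and quit exactly once at the end.

from selenium import webdriver
from selenium.webdriver.common.by import By

# Placeholder URLs -- the real directory URLs were removed from the post.
urls = [
    "https://example.com/directory/page-1",
    "https://example.com/directory/page-2",
]

driver = webdriver.Firefox()  # or webdriver.Chrome()
try:
    for url in urls:
        driver.get(url)  # reuse the same browser session for every URL
        # Placeholder selector; swap in whatever elements you actually scrape.
        for entry in driver.find_elements(By.CSS_SELECTOR, ".listing"):
            print(entry.text)
finally:
    driver.quit()  # quit exactly once, after the last URL

If the driver is quit (or the browser crashes) partway through the run, every later driver.get() fails with exactly this "Connection refused" against the local port, because the Selenium client is still trying to talk to a WebDriver process that is no longer listening.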