Download file from Internet

Hello all,

I have a script I've been using for some time that was created by someone else (full script below). I use it to connect to a computer on my network and download a file to my local machine. Now I need to modify the script to connect to an external website, but I get the following error: "urllib2.URLError: <urlopen error [Errno 10061] No connection could be made because the target machine actively refused it>".
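
For reference, errno 10061 means the TCP connection itself was refused before any HTTP traffic was exchanged. A quick socket-level check (just a sketch; the IP below is a placeholder for the real external address) shows whether port 8834 is reachable at all from my machine, independent of the script:

# Minimal reachability check for the Nessus port (placeholder host).
import socket

host = "203.0.113.10"   # placeholder for the external IP address
port = 8834

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(5)
try:
    s.connect((host, port))
    print "Port %d on %s is reachable" % (port, host)
except socket.error as e:
    # Errno 10061 here would mean the refusal happens at the TCP level,
    # not inside the download script itself.
    print "Connection failed: %s" % e
finally:
    s.close()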

I do go through a proxy.pac and there are firewalls involved; however, all traffic from my computer to the destination machine is unfettered. The script is launched from a .bat file that supplies the https address (which points to port 8834) along with the username and password. The .bat file contains "C:\Python27\python.exe C:\PythonEC\script.py -u https://<external IP address>:8834/# -l <username> -p <password> -f csv Prodscan0 -o Prodscan0" (items in <> are placeholders for the real values).

As far as I can tell, there could be two issues: 1) the firewalls or the proxy are getting in the way, or 2) the script has to be changed because it is no longer connecting to an internal IP address. If it is a firewall/proxy issue, is there a way to get the script to work by adding code to bypass the proxy or define any NAT hops? If the issue is that it is going to an external host, what can I change to make it work? I've spent a lot of time on this and am not making any progress, so I truly appreciate any help you can offer.
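
One thing I have been looking at is urllib2's ProxyHandler, which can force requests through an explicit proxy or bypass the system proxy settings entirely. Below is a minimal sketch (Python 2.7.9+); the proxy host/port and the external IP are placeholders, and the TLS context mirrors the one the script already builds:

# Sketch: direct urllib2 through an explicit proxy (or around it).
import ssl
import urllib2

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# Option 1: send HTTPS requests through a known proxy (placeholder address).
proxy = urllib2.ProxyHandler({'https': 'http://proxy.example.com:8080'})
# Option 2: bypass any system-configured proxy entirely.
# proxy = urllib2.ProxyHandler({})

opener = urllib2.build_opener(proxy, urllib2.HTTPSHandler(context=ctx))

# Use the opener directly: passing context= to urllib2.urlopen() makes
# urllib2 build a fresh opener and ignore the proxy handler above.
f = opener.open("https://203.0.113.10:8834/")
print f.read()

If something like this connects where the unmodified script fails, the script's urllib2.urlopen(request, context=ctx) calls could be switched to opener.open(request) so every request goes through the same proxy-aware opener.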

#!/usr/bin/env python
# by Konrads Smelkovs <[email protected]>
# Cool contributions by sash
# Licence - CC-BY, else do whatever you want with this

import urllib2
import json
import time
import sys
import argparse
import ssl
import os

try:
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
except AttributeError as e:
    print "You need Python 2.7.9 or later"
    print str(e)
    sys.exit(-1)

SLEEP = 2
CHOICES = "csv nessus html".split(" ")
DEF_URL = "https://localhost:8834"
parser = argparse.ArgumentParser(description='Download Nessus results in bulk')
parser.add_argument('--sleep', type=int, default=SLEEP,
                    help='poll/sleep timeout')
parser.add_argument('--url', '-u', type=str, default=DEF_URL,
                    help="url to nessus instance, default {}".format(DEF_URL))
parser.add_argument('-l', '--login', type=str, required=True,
                    help='Nessus login')
parser.add_argument('-p', '--password', type=str, required=True,
                    help='Nessus password')
parser.add_argument('-f', '--format', type=str, default="csv",
                    choices=CHOICES,
                    help='Format of Nessus output, defaults to csv')
parser.add_argument('--debug', action='store_true',
                    help='Enable debugging output')
parser.add_argument('-o', '--output', type=str,
                    help='Output directory')
parser.add_argument('scanfolder', metavar='FOLDER', type=str, nargs=1,
                    help='Folder from which to download')
args = parser.parse_args()

if args.output:
    OUTPUTDIR = args.output
else:
    OUTPUTDIR = os.getcwd()

if args.sleep:
    SLEEP = args.sleep

data = json.dumps({'username': args.login, 'password': args.password})
request = urllib2.Request(args.url + "/session", data,
                          {'Content-Type': 'application/json; charset=UTF-8'})
# opener.open(request,context=ctx)
f = urllib2.urlopen(request, context=ctx)
token = json.loads(f.read())['token']
if args.debug:
    print "[D] Logged on, token is %s" % token

request = urllib2.Request(args.url + "/folders",
                          headers={'X-Cookie': 'token=' + str(token)})
f = urllib2.urlopen(request, context=ctx)
folders = json.loads(f.read())
# print folders
# print args.scanfolder[0]
folderid = filter(lambda y: y['name'] == args.scanfolder[0],
                  folders['folders'])[0]['id']

scans_by_folder = urllib2.Request(
    args.url + "/scans?folder_id=%i" % folderid, headers={'X-Cookie': 'token=' + str(token)})
f = urllib2.urlopen(scans_by_folder, context=ctx)
scans = json.loads(f.read())["scans"]
if scans is None:
    print "[WW] There are no scan results in the folder ``{}''".format(args.scanfolder[0])
    sys.exit(-1)
if args.debug:
    print "[D] Got %i scans in folder %i" % (len(scans), folderid)

for s in scans:
    if args.debug:
        print "[D] Exporting %s" % s['name']

    if args.format == "html":
        values = {'report': s["id"],
                  "chapters": "compliance;compliance_exec;vuln_by_host;vuln_by_plugin;vuln_hosts_summary",
                  "format": args.format
                  }
        data = json.dumps(values)

    else:

        data = json.dumps({'format': args.format})

    print data
    request = urllib2.Request(args.url + "/scans/%i/export" % s["id"], data,
                              {'Content-Type': 'application/json',
                               'X-Cookie': 'token=' + str(token)})
    f = urllib2.urlopen(request, context=ctx)
    fileref = json.loads(f.read())["file"]
    if args.debug:
        print "[D] Got export file reference %s" % fileref
    attempt = 0
    while True:
        attempt += 1
        if args.debug:
            print "[D] Reqesting scan status for fileref %s, attempt %i" % (fileref, attempt)
        status_for_file = urllib2.Request(args.url + "/scans/%s/export/%s/status" % (
            s["id"], fileref), headers={'X-Cookie': 'token=' + str(token)})
        f = urllib2.urlopen(status_for_file, context=ctx)
        status = json.loads(f.read())["status"]
        if status == "ready":
            download = urllib2.Request(args.url + "/scans/%s/export/%s/download?token=%s" % (s["id"], fileref, token),
                                       headers={'X-Cookie': 'token=' + str(token)})
            f = urllib2.urlopen(download, context=ctx)
            print "[**] Downloaded report for %s" % s["name"]
            with open(os.path.join(OUTPUTDIR, "{}.{}".format(s["name"], args.format)), "wb") as rep:
                rep.write(f.read())
            break
        else:
            if args.debug:
                print "[D] Sleeping for %i seconds..." % SLEEP
            time.sleep(args.sleep)
There are some excellent tutorials on this forum for web scraping.
See: Web-Scraping part-1
and Web-Scraping-part-2