Python Forum
I wan't to Download all .zip Files From A Website (Project AI)
Hi Guys, I combined some code I found on the Internet for web-scraping ZIP files with your code, DeadEye. Here is the combined code:

import sys
import getpass
import hashlib
import requests
 
 
BASE_URL = 'https://www.flightsim.com/'
LOGIN_PAGE = 'vbfs/login.php'  # NOTE: assumed vBulletin login path -- adjust if the site uses a different one
 
 
def do_login(credentials):
    session = requests.Session()
    session.get(BASE_URL)
    req = session.post(BASE_URL + LOGIN_PAGE, params={'do': 'login'}, data=credentials)
    if req.status_code != 200:
        print('Login not successful')
        sys.exit(1)
    # session is now logged in
    return session
 
 
def get_credentials():
    username = input('Username: ')
    password = getpass.getpass()
    password_md5 = hashlib.md5(password.encode()).hexdigest()
    return {
        'cookieuser': 1,
        'do': 'login',
        's': '',
        'securitytoken': 'guest',
        'vb_login_md5_password': password_md5,
        'vb_login_md5_password_utf': password_md5,
        'vb_login_password': '',
        'vb_login_password_hint': 'Password',
        'vb_login_username': username,
        }
 
 
credentials = get_credentials()
session = do_login(credentials)
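Since the session returned above is already logged in, it could also be reused for the zip downloads later in the script, instead of fetching each link with a separate unauthenticated request. A minimal sketch, assuming direct .zip URLs; `zip_name` and `download_zip` are hypothetical helpers, not part of the original code:

```python
import os


def zip_name(link):
    """Derive a local file name from a direct .zip URL (hypothetical helper)."""
    return link.rstrip('/').rsplit('/', 1)[-1]


def download_zip(session, link, dest_dir):
    """Stream one zip through the logged-in requests session to disk."""
    path = os.path.join(dest_dir, zip_name(link))
    resp = session.get(link, stream=True)
    resp.raise_for_status()
    # Zip files are binary, so write bytes, in chunks to keep memory flat
    with open(path, 'wb') as f:
        for chunk in resp.iter_content(chunk_size=8192):
            f.write(chunk)
    return path
```

This only works for links that end in the real file name; the fslib.php-style query URLs on the site would need the name taken from the response headers instead.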

import os
from urllib.request import urlopen
from urllib.error import URLError
from bs4 import BeautifulSoup


#Create a new directory to put the files into
#Get the current working directory and create a new directory in it named test
cwd = os.getcwd()
newdir = os.path.join(cwd, "test")
print("The current working directory is " + cwd)
os.mkdir(newdir, 0o777)
print("Created new directory " + newdir)
newfile = open('zipfiles.txt', 'w')
print(newfile)


print("Running script.. ")
#Set variable for the base url to be concatenated onto relative links
url = "https://www.flightsim.com"
page = urlopen('https://www.flightsim.com/vbfs/fslib.php?do=search&fsec=62').read()

#File extension to be looked for.
extension = ".zip"

#Use BeautifulSoup to clean up the page
soup = BeautifulSoup(page, 'html.parser')
soup.prettify()

#Find all the links on the page that end in .zip
for anchor in soup.find_all('a', href=True):
    link = url + anchor['href']
    if link.endswith(extension):
        newfile.write(link + '\n')
newfile.close()

#Read what is saved in zipfiles.txt and output it to the user
#This is done to create persistent data
newfile = open('zipfiles.txt', 'r')
for line in newfile:
    print(line.strip())
newfile.close()

#Read through the lines in the text file and download the zip files.
#Handle exceptions and print exceptions to the console
with open('zipfiles.txt', 'r') as urls:
    for line in urls:
        line = line.strip()
        if line:
            try:
                ziplink = line
                #Remove the first 48 characters of the url to get the name of the file
                zipfile = line[48:]
                #Remove the last 4 characters to drop the .zip
                zipfile2 = zipfile[:-4]
                print("Trying to reach " + ziplink)
                response = urlopen(ziplink)
            except URLError as e:
                if hasattr(e, 'reason'):
                    print('We failed to reach a server.')
                    print('Reason: ', e.reason)
                    continue
                elif hasattr(e, 'code'):
                    print("The server couldn't fulfill the request.")
                    print('Error code: ', e.code)
                    continue
            else:
                zipcontent = response.read()
                completeName = os.path.join(newdir, zipfile2 + ".zip")
                #Zip files are binary, so open the output file in 'wb' mode
                with open(completeName, 'wb') as f:
                    print("downloading.. " + zipfile)
                    f.write(zipcontent)
print("Script completed")


But I get the following traceback. The code runs OK initially, allowing me to type my username, but after I hit Enter I get this error:

Error:
Traceback (most recent call last):
  File "C:\Users\Edward\Desktop\Python 2.79\Web Scraping Code For .ZIP Files 3.py", line 38, in <module>
    credentials = get_credentials()
  File "C:\Users\Edward\Desktop\Python 2.79\Web Scraping Code For .ZIP Files 3.py", line 22, in get_credentials
    username = input('Username: ')
  File "<string>", line 1, in <module>
NameError: name '......' is not defined
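For context on that NameError: under Python 2, input() evaluates whatever is typed as a Python expression, so entering a bare username makes the interpreter look up a variable of that name; raw_input() returns the text as a string, and Python 3 renamed raw_input() to input(). A minimal version-safe prompt:

```python
import sys

# On Python 2, input() behaves like eval(raw_input()): typing a bare name
# at the prompt looks up a variable of that name and raises NameError.
# raw_input() returns the raw string; Python 3 renamed it to input().
if sys.version_info[0] < 3:
    prompt = raw_input  # noqa: F821 -- only defined on Python 2
else:
    prompt = input

# username = prompt('Username: ')
```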
Any ideas where I am going wrong?

Eddie


Messages In This Thread
RE: I wan't to Download all .zip Files From A Website (Project AI) - by eddywinch82 - Aug-26-2018, 08:10 PM

