Python Forum

I Want To Download Many Files Of The Same File Extension With Either Wget Or Python
Thank you so much snippsat, your code worked! Yes, I am a complete novice when it comes to Python programming, but I appreciate your efforts. Just wondering: what could I add to the code so that it shows the files/progress as they are downloading in the Python shell?
(May-19-2018, 05:01 PM)eddywinch82 Wrote: what could I add to the code so that it shows the files/progress as they are downloading in the Python shell?
You can use tqdm.
from bs4 import BeautifulSoup
import requests
from tqdm import tqdm

url = 'http://web.archive.org/web/20070611232047/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'lxml')
b_tag = soup.find_all('b')
# tqdm wraps the loop, so one progress bar advances as each file is fetched
for a in tqdm(b_tag):
    link = a.find('a')['href']
    f_name = link.split('id=')[-1]
    with open(f_name, 'wb') as f:
        f.write(requests.get(link).content)
Running from the command line (here cmder):
[Screenshot: tqdm progress bar advancing in cmder]
Running from the Python shell, you may get a file-by-file display instead.
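If you would rather see a live progress bar for each individual file, one common pattern (a sketch on my part, not from the post above) is to stream the response and advance the bar chunk by chunk, using the Content-Length header when the server sends one:

import requests
from tqdm import tqdm

def download(link, f_name):
    # stream=True fetches the body incrementally instead of all at once
    r = requests.get(link, stream=True)
    total = int(r.headers.get('content-length', 0))
    # total=0 gives a bar with no known end; tqdm still counts bytes written
    with open(f_name, 'wb') as f, tqdm(total=total, unit='B', unit_scale=True, desc=f_name) as bar:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)
            bar.update(len(chunk))

download(link, f_name) could then replace the two download lines inside the loop above.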
Thank you snippsat,

I would like to do the same kind of download for AI Aircraft texture files, from another link:

http://web.archive.org/web/2005031511294...php?ac=Two Digit Number&cat=6

There are AI Aircraft categories (Boeings, Airbus, etc.), and each aircraft type has a different two-digit number after ac=. The category, i.e. cat=, is always the same within a section: in Current Liveries it is 6, whereas in the Vintage Liveries section cat=1.

Then when you click through, the individual livery texture .zip files all share the following common file path:

http://web.archive.org/web/2004111419514...ileid=Four Digit Number

And that is the same for all aircraft and textures, in both the Current and Vintage Liveries sections.
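For reference, the two patterns described above can be written as format strings. A small sketch (the full base addresses are the ones quoted later in this thread; the ac, cat and fileid values are examples only):

# Sketch of the two URL patterns described above; base addresses come
# from later posts in this thread, and the numbers are example values.
repaints = ('http://web.archive.org/web/20050315112710/'
            'http://www.projectai.com:80/libraries/repaints.php?ac={ac}&cat={cat}')
download = ('http://web.archive.org/web/20041108074847/'
            'http://www.projectai.com:80/libraries/download.php?file_id={fileid}')

print(repaints.format(ac=89, cat=6))   # one aircraft type, Current Liveries
print(repaints.format(ac=89, cat=1))   # the same type, Vintage Liveries
print(download.format(fileid=6082))    # one texture .zip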

Could you tell me how I can code this in Python, to download all the .zip files? I don't mind them all being downloaded to the same place. Your help would be very much appreciated, and many thanks for taking the time to help me out.

Eddie
(May-19-2018, 07:13 PM)eddywinch82 Wrote: Could you tell me how I can code this in Python, to download all the .zip files?
It doesn't work like that here; we help people who write some code on their own, so they learn. Just asking for finished code is something for the Jobs section.

That said, I can give you a hint to build on. These are a couple of the .zip files for the Airbus A300-200.
import requests

# Two example file ids for the Airbus A300-200; each is formatted into
# the archived download URL and saved as <id>.zip
file_id = [6082, 6177]
for _id in file_id:
    a_zip = 'http://web.archive.org/web/20041108074847/http://www.projectai.com:80/libraries/download.php?file_id={}'.format(_id)
    with open('{}.zip'.format(_id), 'wb') as f:
        f.write(requests.get(a_zip).content)
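One caveat worth adding to the hint: the Wayback Machine can serve an HTML error page where an archived zip used to be, so it may pay to check the payload before saving it. A sketch using only requests and the standard library (save_zip is a hypothetical helper name):

import io
import zipfile
import requests

def save_zip(url, dest):
    # Verify the payload really is a zip before writing it to disk
    r = requests.get(url)
    r.raise_for_status()
    if not zipfile.is_zipfile(io.BytesIO(r.content)):
        print('skipping {}: not a zip'.format(url))
        return
    with open(dest, 'wb') as f:
        f.write(r.content)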
That's fine, sir,

I understand, and many thanks for all your help. I think I can manage to work out what I need to do. All the best, Eddie
I think I may have cracked it! More than two types of Airbus aircraft .zip files have downloaded so far when I run the module. Hopefully all of the .zip files will download. Here is the amended Python code:

from bs4 import BeautifulSoup
import requests


def get_zips(link_root, zips_suffix):
    # e.g. 'http://web.archive.org/web/20050315112710/http://www.projectai.com:80/libraries/repaints.php?ac=89&cat=6'
    zips_page = link_root + zips_suffix
    zips_source = requests.get(zips_page).text
    zip_soup = BeautifulSoup(zips_source, "html.parser")
    # The attribute value needs quoting, or newer soupsieve versions reject the selector
    for zip_file in zip_soup.select('a[href*="download.php?fileid="]'):
        zip_url = link_root + zip_file['href']
        print('downloading', zip_file.text, '...')
        r = requests.get(zip_url)
        with open(zip_file.text, 'wb') as f:
            f.write(r.content)


def download_links(root, cat):
    url = ''.join([root, cat])
    source_code = requests.get(url)
    soup = BeautifulSoup(source_code.text, "html.parser")
    # Follow every aircraft-type link on the category page
    for zips_suffix in soup.select('a[href*="repaints.php?ac="]'):
        get_zips(root, zips_suffix['href'])


link_root = 'http://web.archive.org/web/20041225023002/http://www.projectai.com:80/libraries/'

# Example category; all categories should be read from the first page into a list and iterated
category = 'acfiles.php?cat=6'
download_links(link_root, category)
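The comment above leaves reading the categories from the first page as an open step. A possible sketch building on the code above, assuming the libraries index page links to each category with 'acfiles.php?cat=' in the href (an assumption based on the example category):

# Discover every category link on the libraries index page, then reuse
# download_links() from the code above for each one.
index_soup = BeautifulSoup(requests.get(link_root).text, "html.parser")
categories = {a['href'] for a in index_soup.select('a[href*="acfiles.php?cat="]')}
for cat in sorted(categories):
    download_links(link_root, cat)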


Many thanks for your very useful hints, snippsat

Eddie


Sorry snippsat, I have just noticed that I posted the latest code in the wrong thread. Could you transfer it to the latest thread I posted? Eddie