Python Forum
I Want To Download Many Files Of Same File Extension With Either Wget Or Python,
#11
Thank you so much snippsat, your code worked! Yes, I am a complete novice when it comes to Python programming, but I appreciate your efforts. Just wondering, what could I add to the code so that it shows the files downloading as they are downloading, in the Python shell?
#12
(May-19-2018, 05:01 PM)eddywinch82 Wrote: so that it shows the files downloading as they are downloading, in the Python shell?
You can use tqdm
from bs4 import BeautifulSoup
import requests
from tqdm import tqdm

url = 'http://web.archive.org/web/20070611232047/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'lxml')
b_tag = soup.find_all('b')
# tqdm wraps the loop and shows an overall progress bar, one tick per file
for a in tqdm(b_tag):
    link = a.find('a')['href']
    f_name = link.split('id=')[-1]
    with open(f_name, 'wb') as f:
        f.write(requests.get(link).content)
Running it from the command line (cmder):
[Screenshot: tqdm progress bar output in cmder]
Running it from the Python shell may give a file-by-file display instead.
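If you want to see each file's own progress as it downloads, one option is to stream the response and let tqdm count the bytes. This is only a sketch built on the same page structure as above; the archived server may not send a Content-Length header, in which case the bar falls back to a plain running byte count:
from bs4 import BeautifulSoup
import requests
from tqdm import tqdm

url = 'http://web.archive.org/web/20070611232047/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0'
soup = BeautifulSoup(requests.get(url).content, 'lxml')
for b in soup.find_all('b'):
    link = b.find('a')['href']
    f_name = link.split('id=')[-1]
    # Stream the download so tqdm can update as the bytes arrive
    r = requests.get(link, stream=True)
    total = int(r.headers.get('content-length', 0)) or None
    with open(f_name, 'wb') as f, tqdm(total=total, unit='B', unit_scale=True, desc=f_name) as bar:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)
            bar.update(len(chunk))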
#13
Thank you snippsat,

I would like to do the same downloading of AI Aircraft texture files, for another link:

http://web.archive.org/web/2005031511294...php?ac=Two Digit Number&cat=6

There are AI Aircraft categories (Boeings, Airbus, etc.), and each aircraft type has a different two-digit number after ac=, but the category, i.e. cat=, is always the same within a section: for Current Liveries it is always 6, whereas in the Vintage Liveries section cat=1.

Then when you click on one of those, the different livery texture .zip files have the following common file path:

http://web.archive.org/web/2004111419514...ileid=Four Digit Number

And that is the same for all aircraft and textures, for both Current and Vintage Liveries.

Could you tell me how I can code this in Python, to download all the .zip files? I don't mind them all being downloaded to the same place. Your help is very much appreciated, and many thanks for taking the time to help me out.

Eddie
#14
(May-19-2018, 07:13 PM)eddywinch82 Wrote: Could you tell me how I can code this in Python, to download all the .zip files?
It doesn't work like that here; we help people who write some code on their own, so they learn.
Requests for ready-made code belong in the Jobs section.

That said, I can give you a hint you can try to build on.
These are a couple of .zip files for the Airbus A300-200.
import requests

# Two known file ids; the same URL pattern works for any other id
file_id = [6082, 6177]
for _id in file_id:
    a_zip = 'http://web.archive.org/web/20041108074847/http://www.projectai.com:80/libraries/download.php?file_id={}'.format(_id)
    with open('{}.zip'.format(_id), 'wb') as f:
        f.write(requests.get(a_zip).content)
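The ids would not have to be typed in by hand: they could first be scraped from an aircraft's repaints page. This is only a sketch; the repaints URL and the selector string are assumptions about the archived page's markup, not something verified:
from bs4 import BeautifulSoup
import requests

# Hypothetical repaints page for one aircraft type; the ac/cat values vary per type
repaints_url = ('http://web.archive.org/web/20050315112710/'
                'http://www.projectai.com:80/libraries/repaints.php?ac=89&cat=6')
soup = BeautifulSoup(requests.get(repaints_url).text, 'html.parser')
# Pull the numeric id out of every link that points at the download script
file_ids = [a['href'].split('fileid=')[-1]
            for a in soup.select('a[href*="download.php?fileid="]')]
print(file_ids)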
#15
That's fine sir,

I understand. Many thanks for all your help, I think I can manage to work out what I
need to do. All the best, Eddie
#16
I think that I may have cracked it! More than two types of Airbus aircraft .zip files have downloaded so far when I run the module. Hopefully all of the .zip files will download. Here is the amended Python code:

from bs4 import BeautifulSoup
import requests


def get_zips(link_root, zips_suffix):
    # e.g. 'http://web.archive.org/web/20050315112710/http://www.projectai.com:80/libraries/repaints.php?ac=89&cat=6'
    zips_page = link_root + zips_suffix
    zips_source = requests.get(zips_page).text
    zip_soup = BeautifulSoup(zips_source, "html.parser")
    # Every link to the download script on a repaints page is one .zip file
    for zip_file in zip_soup.select('a[href*="download.php?fileid="]'):
        zip_url = link_root + zip_file['href']
        print('downloading', zip_file.text, '...')
        r = requests.get(zip_url)
        with open(zip_file.text, 'wb') as zipFile:
            zipFile.write(r.content)


def download_links(root, cat):
    url = ''.join([root, cat])
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, "html.parser")
    # Each aircraft type on the category page links to its own repaints page
    for zips_suffix in soup.select('a[href*="repaints.php?ac="]'):
        get_zips(root, zips_suffix['href'])


link_root = 'http://web.archive.org/web/20041225023002/http://www.projectai.com:80/libraries/'

# Example category, need to read all categories from first page into a list and iterate categories
category = 'acfiles.php?cat=6'
download_links(link_root, category)
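The comment about reading all categories could be handled with a small helper along these lines; this is only a sketch that reuses the imports and download_links() above, and the selector string is a guess at the first page's markup rather than tested code:
def get_categories(root):
    # Collect every category link (e.g. 'acfiles.php?cat=6') from the first page
    soup = BeautifulSoup(requests.get(root).text, "html.parser")
    return [a['href'] for a in soup.select('a[href*="acfiles.php?cat="]')]

for category in get_categories(link_root):
    download_links(link_root, category)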


Many thanks for your very useful hints, snippsat

Eddie


Sorry snippsat, I have just noticed that I have put the latest code in the wrong thread. Could you transfer it to the latest thread I posted? Eddie

