Posts: 218
Threads: 27
Joined: May 2018
May-19-2018, 05:01 PM
(This post was last modified: May-19-2018, 05:01 PM by eddywinch82.)
Thank you so much snippsat, your code worked! Yes, I am a complete novice when it comes to Python programming, but I appreciate your efforts. Just wondering, what could I add to the code so that it shows the files' download progress as they are downloading, in the Python shell?
Posts: 7,264
Threads: 122
Joined: Sep 2016
May-19-2018, 05:59 PM
(This post was last modified: May-19-2018, 05:59 PM by snippsat.)
(May-19-2018, 05:01 PM)eddywinch82 Wrote: what could I add to the code so that it shows the files' download progress as they are downloading, in the Python shell?
You can use tqdm:
from bs4 import BeautifulSoup
import requests
from tqdm import tqdm

url = 'http://web.archive.org/web/20070611232047/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'lxml')
b_tag = soup.find_all('b')
# Wrapping the iterable in tqdm() shows a progress bar as the files download
for a in tqdm(b_tag):
    link = a.find('a')['href']
    f_name = link.split('id=')[-1]
    with open(f_name, 'wb') as f:
        f.write(requests.get(link).content)
Running from the command line (e.g. cmder) gives a single updating progress bar; running from the Python shell may give a file-by-file display instead.
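If tqdm is not installed, a plain print per file gives a similar file-by-file display; a minimal sketch, under the same assumption as the code above that every <b> tag wraps a download link:

from bs4 import BeautifulSoup
import requests

url = 'http://web.archive.org/web/20070611232047/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0'
soup = BeautifulSoup(requests.get(url).content, 'lxml')
links = [b.find('a')['href'] for b in soup.find_all('b')]
for n, link in enumerate(links, 1):
    f_name = link.split('id=')[-1]
    # Simple numeric progress, one line per file
    print('[{}/{}] downloading {}'.format(n, len(links), f_name))
    with open(f_name, 'wb') as f:
        f.write(requests.get(link).content)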
Posts: 218
Threads: 27
Joined: May 2018
Thank you snippsat,
I would like to do the same, downloading AI Aircraft texture files, for another link:
http://web.archive.org/web/2005031511294...php?ac=Two Digit Number&cat=6
There are AI Aircraft categories (Boeings, Airbus, etc.), and each aircraft type has a different two-digit number after ac=, but the category, i.e. cat=, is always the same: in this case (Current Liveries) it is 6, whereas in the Vintage Liveries section cat=1.
Then when you click on one, the different livery texture .zip files share the following common file path:
http://web.archive.org/web/2004111419514...ileid=Four Digit Number
And that is the same for all aircraft and textures, for both Current and Vintage Liveries.
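For illustration, a minimal sketch of how the two patterns just described combine; the full archive.org prefixes are taken from code later in this thread, and the ac=/fileid= values are hypothetical examples:

# Sketch of the URL patterns described above; the prefixes come from
# later posts in this thread, and the ac=/fileid= values are placeholders.
repaints = ('http://web.archive.org/web/20050315112710/'
            'http://www.projectai.com:80/libraries/repaints.php?ac={}&cat={}')
download = ('http://web.archive.org/web/20041108074847/'
            'http://www.projectai.com:80/libraries/download.php?fileid={}')

print(repaints.format(89, 6))   # one aircraft type, Current Liveries (cat=6)
print(download.format(6082))    # one livery .zip (four-digit fileid)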
Could you tell me how I can code this in Python, to download all the .zip files? I don't mind them all being downloaded to the same place. Your help is very much appreciated, and many thanks for taking the time to help me out.
Eddie
Posts: 7,264
Threads: 122
Joined: Sep 2016
(May-19-2018, 07:13 PM)eddywinch82 Wrote: Could you tell me how I can code this in Python, to download all the .zip files?
It doesn't work like that here; we help people who write some code on their own, so they learn.
Just asking for code can be done in the Jobs section.
That said, I can give you a hint to build on.
These are a couple of .zip files for the Airbus A300-200:
import requests

file_id = [6082, 6177]
for _id in file_id:
    a_zip = 'http://web.archive.org/web/20041108074847/http://www.projectai.com:80/libraries/download.php?file_id={}'.format(_id)
    with open('{}.zip'.format(_id), 'wb') as f:
        f.write(requests.get(a_zip).content)
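Building on that hint, the file ids could also be scraped from a repaint page instead of hard-coded; a minimal sketch, assuming the repaint-page URL that appears later in this thread and that the download links contain a fileid= pattern:

from bs4 import BeautifulSoup
import requests

# Repaint page for one aircraft type (URL taken from a later post in this thread)
page = ('http://web.archive.org/web/20050315112710/'
        'http://www.projectai.com:80/libraries/repaints.php?ac=89&cat=6')
soup = BeautifulSoup(requests.get(page).text, 'html.parser')
# Every link whose href contains the download pattern holds one livery .zip
for a in soup.select('a[href*="download.php?fileid="]'):
    print(a['href'].split('fileid=')[-1], a.text)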
Posts: 218
Threads: 27
Joined: May 2018
That's fine sir,
I understand. Many thanks for all your help; I think I can manage to work out what I need to do.
All the best, Eddie
Posts: 218
Threads: 27
Joined: May 2018
May-20-2018, 06:05 PM
(This post was last modified: May-20-2018, 06:05 PM by eddywinch82.)
I think that I may have cracked it! More than two types of Airbus aircraft .zip files have downloaded so far when I run the module; hopefully all of the .zip files will download. Here is the amended Python code:
from bs4 import BeautifulSoup
import requests

def get_zips(link_root, zips_suffix):
    # e.g. 'http://web.archive.org/web/20050315112710/http://www.projectai.com:80/libraries/repaints.php?ac=89&cat=6'
    zips_page = link_root + zips_suffix
    zips_source = requests.get(zips_page).text
    zip_soup = BeautifulSoup(zips_source, "html.parser")
    # Every link whose href contains the download pattern is one livery .zip
    for zip_file in zip_soup.select('a[href*="download.php?fileid="]'):
        zip_url = link_root + zip_file['href']
        print('downloading', zip_file.text, '...')
        r = requests.get(zip_url)
        with open(zip_file.text, 'wb') as f:
            f.write(r.content)

def download_links(root, cat):
    url = root + cat
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    # Follow each repaint page (one per aircraft type) found on the category page
    for zips_suffix in soup.select('a[href*="repaints.php?ac="]'):
        get_zips(root, zips_suffix['href'])

link_root = 'http://web.archive.org/web/20041225023002/http://www.projectai.com:80/libraries/'
# Example category; need to read all categories from the first page into a list and iterate them
category = 'acfiles.php?cat=6'
download_links(link_root, category)
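Following the comment above, a minimal sketch of iterating every category instead of hard-coding one, reusing the names defined in the code above and assuming the category links on the index page match the acfiles.php?cat= pattern:

# Collect every category link from the libraries index page, then
# download each category in turn (the link pattern is an assumption).
index = BeautifulSoup(requests.get(link_root).text, "html.parser")
for a in index.select('a[href*="acfiles.php?cat="]'):
    download_links(link_root, a['href'])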
Many thanks for your very useful hints, snippsat.
Eddie
Sorry snippsat, I have just noticed that I have put the latest code in the wrong thread. Could you transfer it to the latest thread I posted? Eddie