Python Forum
Code Needs finishing Off Help Needed
#16
(May-22-2018, 06:30 AM)eddywinch82 Wrote: , to start downloading from the last .zip file downloaded, rather than downloading all of the downloaded .zip files again? I mean, can I put the last .zip file downloaded into the code, and then start downloading from that point?
Start over in a new folder with my code that has the progress bar, then let's say you got all the .zip files for 69 planes.
Then your connection breaks down, so now you know that you are missing the last 3 planes.
In my code I am using yield url_file_id to generate URLs for all planes.
Then you can use itertools.islice to slice out the last 3 that are missing.
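Here is a small offline sketch of what islice does with a generator, before the real code. fake_planes is a made-up stand-in for all_planes() (it is not part of my script), and it assumes there are 72 planes in total.

```python
from itertools import islice

def fake_planes(total=72):
    """Hypothetical stand-in for all_planes(): yields one made-up name per plane."""
    for number in range(total):
        yield 'plane_{}'.format(number)

# Skip the first 69 items and take indexes 69, 70 and 71 (the stop index is exclusive).
last_3 = list(islice(fake_planes(), 69, 72))
print(last_3)  # ['plane_69', 'plane_70', 'plane_71']
```

islice works on any iterator, so the generator does not have to be turned into a list first.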
Code:
from bs4 import BeautifulSoup
import requests
from tqdm import tqdm
from itertools import islice

def all_planes():
    '''Generate url links for all planes'''
    url = 'http://web.archive.org/web/20041225023002/http://www.projectai.com:80/libraries/acfiles.php?cat=6'
    url_get = requests.get(url)
    soup = BeautifulSoup(url_get.content, 'lxml')
    td = soup.find_all('td', width="50%")
    plain_link = [link.find('a').get('href') for link in td]
    for ref in tqdm(plain_link):
        url_file_id = 'http://web.archive.org/web/20041114195147/http://www.projectai.com:80/libraries/{}'.format(ref)
        yield url_file_id

def download(all_planes):
    '''Download the zip files for 1 plane; feed it more urls to download all planes'''
    # A_300 = next(all_planes())  # Test with first link
    # Skip the first 69 plane urls and take the remaining 3 (indexes 69, 70, 71)
    last_3 = islice(all_planes(), 69, 72)
    for plane_url in last_3:
        url_get = requests.get(plane_url)
        soup = BeautifulSoup(url_get.content, 'lxml')
        td = soup.find_all('td', class_="text", colspan="2")
        zip_url = 'http://web.archive.org/web/20041108022719/http://www.projectai.com:80/libraries/download.php?fileid={}'
        for item in tqdm(td):
            zip_name = item.text
            zip_number = item.find('a').get('href').split('=')[-1]
            with open(zip_name, 'wb') as f_out:
                down_url = requests.get(zip_url.format(zip_number))
                f_out.write(down_url.content)

if __name__ == '__main__':
    download(all_planes) 
Now look at the progress bar.
After 1 plane is downloaded it's at 97%, because we start at 69 and the total is 72.
[Image: EVDkJu.jpg]
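To see where the 97% comes from, here is a rough sketch without any network access. counting_planes is a hypothetical stand-in for all_planes() that just records how many items it has handed out; the total of 72 is assumed from the example above.

```python
from itertools import islice

consumed = []  # records every item the generator hands out

def counting_planes(total=72):
    """Hypothetical stand-in for all_planes() that logs each item it yields."""
    for number in range(total):
        consumed.append(number)
        yield number

last_3 = islice(counting_planes(), 69, 72)
first_missing = next(last_3)  # islice runs through items 0..68 before returning item 69
percent = int(len(consumed) / 72 * 100)
print(first_missing, len(consumed), percent)  # 69 70 97
```

The skipped items still pass through the generator, and because tqdm wraps the loop inside all_planes() it counts them too. As far as I know, tqdm counts an item when the loop comes back for the next one, so once the first missing plane has finished downloading the bar should read 70/72, which is about 97%.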
Messages In This Thread
RE: Code Needs finishing Off Help Needed - by buran - May-21-2018, 12:40 PM
RE: Code Needs finishing Off Help Needed - by snippsat - May-22-2018, 10:52 AM
