Python Forum
(solved) Download images from a server and rename
#1
Hi everyone,

This is my first post here. I have never used Python before.

I have 12,000 images to download from a server. Each image is named by an ID.
For this job, we've made a CSV file with 2 columns:
- the absolute URL of each file,
- the new name, so that /id.ext becomes /name.ext

So the script has to:
- pick each URL in turn,
- download the file to disk under its new name.

I suppose this is very simple to do with Python. Where can I find help or some example scripts?

Thanks for your help.
Reply
#2
(Jun-17-2024, 02:43 PM)charled Wrote: I suppose it is very simple to do with Python. Where to find the help or some scripts ?
We usually like to see some effort first; otherwise this is more of a small job description.
It's not a hard task, but it can be if you have never used Python.

To help you get started, here is code that reads the .csv file. Run it and check that the output looks right before downloading anything.
Also use a smaller sample; do not test with all 12,000 rows.
#import requests
import csv
from pathlib import Path

csv_file = Path('your.csv')
#output_dir = Path('downloaded_images')
# Create the directory if it doesn't exist
#output_dir.mkdir(parents=True, exist_ok=True)

# Read the CSV file and print each row (downloading with Requests is not added yet)
with csv_file.open(mode='r', newline='') as fp:
    reader = csv.reader(fp)
    # Skip the header row
    header = next(reader)
    for row in reader:
        url = row[0]
        new_name = row[1]
        print(url, new_name)
Output:
https://example.com/image1.jpg image1_new_name.jpg
https://example.com/image2.png image2_new_name.png
https://example.com/image3.gif image3_new_name.gif
https://example.com/image4.jpg image4_new_name.jpg
https://example.com/image5.png image5_new_name.png
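One way to follow the "smaller sample" advice without editing the CSV is to stop after the first few rows. This is only a sketch: the in-memory CSV and the helper name first_rows are made up for illustration, and a comma delimiter is assumed, as in the code above.

```python
import csv
import io
from itertools import islice

# Hypothetical in-memory CSV standing in for 'your.csv'
sample_csv = io.StringIO(
    "url,new_name\n"
    "https://example.com/image1.jpg,image1_new_name.jpg\n"
    "https://example.com/image2.png,image2_new_name.png\n"
    "https://example.com/image3.gif,image3_new_name.gif\n"
)


def first_rows(fp, limit):
    """Return at most `limit` data rows, skipping the header row."""
    reader = csv.reader(fp)
    next(reader)  # skip the header row
    return list(islice(reader, limit))


rows = first_rows(sample_csv, 2)
for url, new_name in rows:
    print(url, new_name)
```

With a real file, the same helper could be called inside the `with csv_file.open(...) as fp:` block, raising the limit once the output looks right.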
Reply
#3
Hi Snippsat.

I would prefer to have time to learn Python properly. But my client asked me for help yesterday, with a deadline of June 30... of course... After that, the photos will be erased.
So thanks for your help. I'll try it right away.
Reply
#4
So I tried the code and got this error
Error:
python3 '/home/jluc/Documents/CLIENTS/Découvertes/Ezus/recup_images_ezus.py'
Traceback (most recent call last):
  File "/home/jluc/Documents/CLIENTS/Découvertes/Ezus/recup_images_ezus.py", line 17, in <module>
    new_name = row[1]
IndexError: list index out of range
It sounds like a problem with the CSV file. The field delimiter is ; . I tried some others but got the same error. Here is the content
"url";"new_name"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675942561904.jpeg";"AL_Colmar_1"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675942561915.jpeg";"AL_Colmar_2"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675942561919.jpeg";"AL_Colmar_3"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675942561923.jpeg";"AL_Colmar_4"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1677684808533.jpeg";"AL_Colmar_5"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1677684808539.jpeg";"AL_Colmar_6"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1677684808591.jpeg";"AL_Colmar_7"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1677684808629.jpeg";"AL_Colmar_8"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678457387970.jpg";"AL_Domremy_1"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678457387974.jpg";"AL_Domremy_2"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678457387978.jpg";"AL_Domremy_3"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678457387985.jpg";"AL_Domremy_4"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678785998140.jpg";"AL_Luxembourg_1"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678785998145.jpg";"AL_Luxembourg_2"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678785998167.jpg";"AL_Luxembourg_3"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678785998211.jpg";"AL_Luxembourg_4"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675938805803.jpg";"AL_Metz_1"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675938805807.jpg";"AL_Metz_2"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675938805810.jpeg";"AL_Metz_3"
In case it helps, here is my code
#import requests
import csv
from pathlib import Path

csv_file = Path('/home/jluc/Documents/CLIENTS/Découvertes/Ezus/testrecup.csv')
#output_dir = Path('/home/jluc/Documents/CLIENTS/Découvertes/Ezus/images')
# Create the directory if it doesn't exist
#output_dir.mkdir(parents=True, exist_ok=True)

# Read the CSV file and download images(not finish "Requests")
with csv_file.open(mode='r', newline='') as fp:
    reader = csv.reader(fp)
    # Skip the header row
    header = next(reader)
    for row in reader:
        url = row[0]
        new_name = row[1]
        print(url, new_name)
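For reference, the IndexError is most likely because csv.reader splits on commas by default, so each semicolon-separated line comes back as a single field and row[1] does not exist. A small sketch showing the difference, using two lines from the CSV above (the helper name parse is only for illustration):

```python
import csv
import io

# Header and first data line copied from the CSV posted above
text = (
    '"url";"new_name"\n'
    '"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675942561904.jpeg";"AL_Colmar_1"\n'
)


def parse(text, delimiter):
    """Parse CSV text with the given field delimiter and return all rows."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


comma_rows = parse(text, ",")      # default delimiter: the whole line is one field
semicolon_rows = parse(text, ";")  # matching delimiter: two fields per row

print(len(comma_rows[1]))      # number of fields with ','
print(len(semicolon_rows[1]))  # number of fields with ';'
```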
Reply
#5
This gets your files OK. The problem, as I see it, is that the files may have different extensions: .jpg, .jpeg, .png

Better to fetch the files under their original names first, then rename them with a loop if you really need to!

import requests
import csv
from pathlib import Path

path2csv = '/home/pedro/myPython/requests/csv/french_photos.csv'
savepath = '/home/pedro/myPython/requests/csv/downloaded_images'
savep = Path(savepath)
# from snippsat with small changes by me
csv_file = Path(path2csv)
output_dir = Path(savepath)
# Create the directory if it doesn't exist
output_dir.mkdir(parents=True, exist_ok=True)
 
# Read the CSV file and download each image under its original name
with csv_file.open(mode='r', newline='') as fp:
    # your csv delimiter is ;
    reader = csv.reader(fp, delimiter=';')
    # Skip the header row
    header = next(reader)
    for row in reader:
        url = row[0]
        savename = url.split('/')[-1]
        save_file = savep / savename
        #new_name = row[1]
        print(url)
        print(save_file)
        with open(save_file, 'wb') as f:    
            f.write(requests.get(url).content)

# now run a loop to rename the files if you wish
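The rename loop itself could look something like the sketch below. It assumes the files were saved under the last part of their URL, as in the code above, and that the CSV is semicolon-delimited; the function name rename_downloads, the utf-8 encoding, and the demo file in a temporary directory are assumptions for illustration.

```python
import csv
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def rename_downloads(csv_path, image_dir):
    """Rename files saved under their URL name to the name in column 2, keeping the extension."""
    image_dir = Path(image_dir)
    renamed = []
    with open(csv_path, newline="", encoding="utf-8") as fp:
        reader = csv.reader(fp, delimiter=";")
        next(reader)  # skip the header row
        for url, new_name in reader:
            old_file = image_dir / Path(urlparse(url).path).name
            if not old_file.exists():
                continue  # not downloaded, so nothing to rename
            new_file = old_file.with_name(new_name + old_file.suffix)
            old_file.rename(new_file)
            renamed.append(new_file.name)
    return renamed


# Small self-contained demo with a made-up file in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "1675942561904.jpeg").write_bytes(b"fake image")
    csv_path = tmp / "test.csv"
    csv_path.write_text(
        '"url";"new_name"\n'
        '"https://example.com/media/1675942561904.jpeg";"AL_Colmar_1"\n',
        encoding="utf-8",
    )
    result = rename_downloads(csv_path, tmp)
    print(result)
```

Because the original suffix is reused, .jpg, .jpeg and .png files each keep their own extension.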
Hope the client is happy!
Reply
#6
Just for fun.

import csv
from bisect import bisect_left as bisect
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def read_csv(file):
    with open(file, newline="", encoding="utf-8") as fd:
        reader = csv.reader(fd, delimiter=";")
        # skipping header
        next(reader)
        yield from reader


def transform_rows(csv_file):
    for url, name in read_csv(csv_file):
        source_file = Path(urlparse(url).path)
        yield url, Path(name).with_suffix(source_file.suffix.lower())


def get_size(response):
    headers = dict(response.getheaders())
    return int(headers["Content-Length"]) if "Content-Length" in headers else None


class Progress:
    def __init__(self, response):
        self.size = get_size(response)
        self.last_msg = ""
        self.percentages = [0.25, 0.5, 0.75, 1.0]

    def update(self, transferred):
        if self.size is None:
            return

        relative = transferred / self.size
        value = self.percentages[bisect(self.percentages, relative)]
        current_msg = f"{value:.0%}"

        if self.last_msg != current_msg:
            print(current_msg, end=" ", flush=True)
            self.last_msg = current_msg


def download(url, target_dir, target_file):
    with open(target_dir / target_file, "wb") as fd:
        with urlopen(url) as response:
            transferred = 0
            progress = Progress(response)

            while chunk := response.read(1024):
                transferred += len(chunk)
                fd.write(chunk)
                progress.update(transferred)


def main(csv_file, target_dir):
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    for url, file in transform_rows(csv_file):
        print(f"Downloading {file}", end=" ")
        download(url, target_dir, file)
        print()


if __name__ == "__main__":
    main(
        r"C:\Users\YOUR_USER\Desktop\testrecup.csv",
        r"C:\Users\YOUR_USER\Desktop\XYZFK",
    )
Reply
#7
Thanks Pedro. I don't understand why I can't save the files directly under the right name.
Reply
#8
Here is working code based on the .csv you posted.
import requests
import csv
from pathlib import Path

csv_file = Path('url_am.csv')
output_dir = Path('downloaded_images')
# Create the directory if it doesn't exist
output_dir.mkdir(parents=True, exist_ok=True)
with csv_file.open(mode='r', newline='') as file:
    reader = csv.reader(file, delimiter=';')
    header = next(reader)
    for row in reader:
        url = row[0]
        new_name = row[1]
        #print(url, new_name)
        response = requests.get(url)
        if response.status_code == 200:
            # Create the full path for the new image
            new_name = f'{new_name}.jpg'
            file_path = output_dir / new_name
            # Save the image to disk
            with file_path.open('wb') as image_file:
                image_file.write(response.content)
            print(f'Successfully downloaded --> {new_name}')
        else:
            print(f'Failed to download {url} - Status code: {response.status_code}')
Output:
Successfully downloaded --> AL_Colmar_1.jpg
Successfully downloaded --> AL_Colmar_2.jpg
Successfully downloaded --> AL_Colmar_3.jpg
Successfully downloaded --> AL_Colmar_4.jpg
Successfully downloaded --> AL_Colmar_5.jpg
Successfully downloaded --> AL_Colmar_6.jpg
Successfully downloaded --> AL_Colmar_7.jpg
Successfully downloaded --> AL_Colmar_8.jpg
Successfully downloaded --> AL_Domremy_1.jpg
Successfully downloaded --> AL_Domremy_2.jpg
Successfully downloaded --> AL_Domremy_3.jpg
Successfully downloaded --> AL_Domremy_4.jpg
Successfully downloaded --> AL_Luxembourg_1.jpg
Successfully downloaded --> AL_Luxembourg_2.jpg
Successfully downloaded --> AL_Luxembourg_3.jpg
Successfully downloaded --> AL_Luxembourg_4.jpg
Successfully downloaded --> AL_Metz_1.jpg
Successfully downloaded --> AL_Metz_2.jpg
Successfully downloaded --> AL_Metz_3.jpg
Reply
#9
Thanks Snippsat. It works well.

Just one more thing: not all images are .jpg; some are .png. Is it possible to keep the original extension?

Thanks.
Reply
#10
(Jun-19-2024, 06:55 PM)charled Wrote: Just one more thing : not all images are jpg, some are .png. Is it possible to copy the right extension ?
Change line 19 (the new_name = f'{new_name}.jpg' line) to this.
new_name = f"{new_name}.{response.url.split('.')[-1]}"
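Splitting on '.' works for the URLs in this CSV. If some URLs carried a query string or had no extension at all, a slightly more defensive variant might parse the URL path first. This is a sketch only: the helper name name_with_url_suffix and the ".jpg" fallback are assumptions, not part of the code above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def name_with_url_suffix(new_name, url, default=".jpg"):
    """Append the URL's file extension (lower-cased) to new_name.

    Falls back to `default` (an assumed choice) when the URL path has no extension.
    """
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return f"{new_name}{suffix or default}"


a = name_with_url_suffix("AL_Metz_3", "https://example.com/media/1675938805810.jpeg")
b = name_with_url_suffix("logo", "https://example.com/media/logo.PNG?version=2")
c = name_with_url_suffix("x", "https://example.com/media/noext")
print(a, b, c)
```

In the download loop this would replace line 19 with new_name = name_with_url_suffix(new_name, response.url).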
Reply

