(solved) Download images from a server and rename

charled · (This post was last modified: Jun-21-2024, 05:36 PM by charled.)

Hi everyone,

This is my first post here. I have never used Python before.

I have 12,000 images to download from a server. Each image is named by an ID.
For this job, we've made a CSV file with 2 columns:
- the absolute URL of each file,
- the name that should replace the ID, so that /id.ext becomes /name.ext

So the script has to:
- pick the first URL,
- download the file to disk, renaming it,
- and repeat for every row.

I suppose this is very simple to do with Python. Where can I find help or some example scripts?

Thanks for your help.

***snippsat*** · Jun-17-2024, 05:19 PM

(Jun-17-2024, 02:43 PM)charled Wrote: I suppose this is very simple to do with Python. Where can I find help or some example scripts?

We usually like to see some effort first; otherwise it reads more like a small job description.
It's not a hard task, but if you have never used Python, it can be.

To help you get started, here is code that reads the .csv file. Run it and check that the output looks right before downloading any URL.
And use a smaller sample; do not test with all 12,000.

#import requests
import csv
from pathlib import Path

csv_file = Path('your.csv')
#output_dir = Path('downloaded_images')
# Create the directory if it doesn't exist
#output_dir.mkdir(parents=True, exist_ok=True)

# Read the CSV file and print each row (the download step with Requests is not added yet)
with csv_file.open(mode='r', newline='') as fp:
    reader = csv.reader(fp)
    # Skip the header row
    header = next(reader)
    for row in reader:
        url = row[0]
        new_name = row[1]
        print(url, new_name)

Output:
https://example.com/image1.jpg image1_new_name.jpg
https://example.com/image2.png image2_new_name.png
https://example.com/image3.gif image3_new_name.gif
https://example.com/image4.jpg image4_new_name.jpg
https://example.com/image5.png image5_new_name.png
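
One way to follow the "smaller sample" advice without editing the CSV is to stop after the first few rows. This is only a sketch: the helper name `first_rows` and the in-memory sample data are made up for illustration, and the delimiter is a parameter because the real file may not use commas.

```python
import csv
import io
from itertools import islice


def first_rows(fp, limit=5, delimiter=','):
    """Return at most `limit` data rows from an open CSV file, skipping the header."""
    reader = csv.reader(fp, delimiter=delimiter)
    next(reader, None)  # skip the header row (no error if the file is empty)
    return list(islice(reader, limit))


# Hypothetical in-memory CSV standing in for the real 12,000-row file
sample = io.StringIO(
    "url,new_name\n"
    + "".join(f"https://example.com/{i}.jpg,name_{i}\n" for i in range(10))
)
for url, new_name in first_rows(sample, limit=3):
    print(url, new_name)
```

With a real file, the same function can be called on the object returned by `csv_file.open(mode='r', newline='')`.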

charled · Jun-17-2024, 08:11 PM

Hi Snippsat.

I would have preferred to have time to learn Python first. But my client asked me for help yesterday, with a deadline of June 30... of course... After that, the photos will be erased.
So thanks for your help. I'll try it right away.

charled · Jun-18-2024, 10:44 PM

So I tried the code and got this error:

Error:
python3 '/home/jluc/Documents/CLIENTS/Découvertes/Ezus/recup_images_ezus.py' 
Traceback (most recent call last):
  File "/home/jluc/Documents/CLIENTS/Découvertes/Ezus/recup_images_ezus.py", line 17, in <module>
    new_name = row[1]
IndexError: list index out of range

Sounds like this is a problem with the CSV file. The field delimiter is ; . I tried some others but got the same error. Here is the content:

"url";"new_name"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675942561904.jpeg";"AL_Colmar_1"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675942561915.jpeg";"AL_Colmar_2"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675942561919.jpeg";"AL_Colmar_3"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675942561923.jpeg";"AL_Colmar_4"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1677684808533.jpeg";"AL_Colmar_5"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1677684808539.jpeg";"AL_Colmar_6"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1677684808591.jpeg";"AL_Colmar_7"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1677684808629.jpeg";"AL_Colmar_8"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678457387970.jpg";"AL_Domremy_1"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678457387974.jpg";"AL_Domremy_2"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678457387978.jpg";"AL_Domremy_3"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678457387985.jpg";"AL_Domremy_4"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678785998140.jpg";"AL_Luxembourg_1"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678785998145.jpg";"AL_Luxembourg_2"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678785998167.jpg";"AL_Luxembourg_3"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1678785998211.jpg";"AL_Luxembourg_4"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675938805803.jpg";"AL_Metz_1"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675938805807.jpg";"AL_Metz_2"
"https://ezus-cmtyhfgfzxdnjtahdgpfmgfj.s3.amazonaws.com/media/1675938805810.jpeg";"AL_Metz_3"

In case it helps, here is my code:

#import requests
import csv
from pathlib import Path

csv_file = Path('/home/jluc/Documents/CLIENTS/Découvertes/Ezus/testrecup.csv')
#output_dir = Path('/home/jluc/Documents/CLIENTS/Découvertes/Ezus/images')
# Create the directory if it doesn't exist
#output_dir.mkdir(parents=True, exist_ok=True)

# Read the CSV file and print each row (the download step with Requests is not added yet)
with csv_file.open(mode='r', newline='') as fp:
    reader = csv.reader(fp)
    # Skip the header row
    header = next(reader)
    for row in reader:
        url = row[0]
        new_name = row[1]
        print(url, new_name)
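
For reference, the IndexError can be reproduced without touching the server. `csv.reader` splits on commas by default, so a semicolon-separated line comes back as a single field and `row[1]` does not exist. The sketch below uses two lines in the same shape as the posted file; the URL is a shortened placeholder and `read_rows` is a hypothetical helper name.

```python
import csv
import io

# Header plus one data line, quoted and semicolon-separated like the posted file
sample = '"url";"new_name"\n"https://example.com/media/1.jpeg";"AL_Colmar_1"\n'


def read_rows(text, delimiter):
    """Parse CSV text and return the data rows (header skipped)."""
    reader = csv.reader(io.StringIO(text, newline=''), delimiter=delimiter)
    next(reader)  # skip the header row
    return list(reader)


comma_rows = read_rows(sample, ',')
semicolon_rows = read_rows(sample, ';')

print(len(comma_rows[0]))      # 1 field: row[1] would raise IndexError
print(len(semicolon_rows[0]))  # 2 fields: url and new_name
```

Passing `delimiter=';'` to `csv.reader` in the script above should therefore be enough to get past this error.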

Pedroski55 · Jun-19-2024, 07:31 AM

This gets your files OK. The problem, as I see it, is that the files may have different extensions: .jpg, .jpeg, .png

Better to get the files with their original names, then rename them if you really need to!

After fetching, rename them if you really want to, using a loop.

import requests
import csv
from pathlib import Path

path2csv = '/home/pedro/myPython/requests/csv/french_photos.csv'
savepath = '/home/pedro/myPython/requests/csv/downloaded_images'
# from snippsat with small changes by me
csv_file = Path(path2csv)
output_dir = Path(savepath)
# Create the directory if it doesn't exist
output_dir.mkdir(parents=True, exist_ok=True)

# Read the CSV file and download each image under its original name
with csv_file.open(mode='r', newline='') as fp:
    # your csv delimiter is ;
    reader = csv.reader(fp, delimiter=';')
    # Skip the header row
    header = next(reader)
    for row in reader:
        url = row[0]
        # Keep the file name from the end of the URL, e.g. 1675942561904.jpeg
        savename = url.split('/')[-1]
        save_file = output_dir / savename
        #new_name = row[1]
        print(url)
        print(save_file)
        with open(save_file, 'wb') as f:
            f.write(requests.get(url).content)

# now run a loop to rename the files if you wish

Hope the client is happy!
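
The rename loop mentioned above could look roughly like this. It is a sketch under a few assumptions: the files were saved under the last part of their URL, the CSV still has the two columns and the ; delimiter shown earlier, and the function name `rename_downloaded` is made up for illustration. Each file keeps its own extension, which sidesteps the .jpg/.jpeg/.png problem.

```python
import csv
from pathlib import Path
from urllib.parse import urlparse


def rename_downloaded(csv_path, image_dir):
    """Rename files in image_dir from their URL-derived name to the name in the CSV."""
    image_dir = Path(image_dir)
    renamed = []
    with open(csv_path, newline='') as fp:
        reader = csv.reader(fp, delimiter=';')
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or incomplete lines
            url, new_name = row[0], row[1]
            old_file = image_dir / Path(urlparse(url).path).name
            if not old_file.exists():
                print(f'Missing, skipped: {old_file.name}')
                continue
            # Keep the original extension (.jpg, .jpeg, .png, ...)
            new_file = old_file.with_name(new_name + old_file.suffix)
            old_file.rename(new_file)
            renamed.append(new_file.name)
    return renamed
```

Calling `rename_downloaded(path2csv, savepath)` after the download loop would then return the list of new file names.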

DeaD_EyE · Jun-19-2024, 09:35 AM

Just for fun.

import csv
from bisect import bisect_left as bisect
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def read_csv(file):
    with open(file, newline="", encoding="ascii") as fd:
        reader = csv.reader(fd, delimiter=";")
        # skipping header
        next(reader)
        yield from reader


def transform_rows(csv_file):
    for url, name in read_csv(csv_file):
        source_file = Path(urlparse(url).path)
        yield url, Path(name).with_suffix(source_file.suffix.lower())


def get_size(response):
    headers = dict(response.getheaders())
    return int(headers["Content-Length"]) if "Content-Length" in headers else None


class Progress:
    def __init__(self, response):
        self.size = get_size(response)
        self.last_msg = ""
        self.percentages = [0.25, 0.5, 0.75, 1.0]

    def update(self, transferred):
        if self.size is None:
            return

        relative = transferred / self.size
        value = self.percentages[bisect(self.percentages, relative)]
        current_msg = f"{value:.0%}"

        if self.last_msg != current_msg:
            print(current_msg, end=" ", flush=True)
            self.last_msg = current_msg


def download(url, target_dir, target_file):
    with open(target_dir / target_file, "wb") as fd:
        with urlopen(url) as response:
            transferred = 0
            progress = Progress(response)

            while chunk := response.read(1024):
                transferred += len(chunk)
                fd.write(chunk)
                progress.update(transferred)


def main(csv_file, target_dir):
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    for url, file in transform_rows(csv_file):
        print(f"Downloading {file}", end=" ")
        download(url, target_dir, file)
        print()


if __name__ == "__main__":
    main(
        r"C:\Users\YOUR_USER\Desktop\testrecup.csv",
        r"C:\Users\YOUR_USER\Desktop\XYZFK",
    )

charled · Jun-19-2024, 01:02 PM

Thanks Pedro. I don't understand why I can't save the files directly with the right name.

***snippsat*** · Jun-19-2024, 01:55 PM

Here is working code based on the .csv you posted.

import requests
import csv
from pathlib import Path

csv_file = Path('url_am.csv')
output_dir = Path('downloaded_images')
# Create the directory if it doesn't exist
output_dir.mkdir(parents=True, exist_ok=True)
with csv_file.open(mode='r', newline='') as file:
    reader = csv.reader(file, delimiter=';')
    header = next(reader)
    for row in reader:
        url = row[0]
        new_name = row[1]
        #print(url, new_name)
        response = requests.get(url)
        if response.status_code == 200:
            # Create the full path for the new image
            new_name = f'{new_name}.jpg'
            file_path = output_dir / new_name
            # Save the image to disk
            with file_path.open('wb') as image_file:
                image_file.write(response.content)
            print(f'Successfully downloaded --> {new_name}')
        else:
            print(f'Failed to download {url} - Status code: {response.status_code}')

Output:
Successfully downloaded --> AL_Colmar_1.jpg
Successfully downloaded --> AL_Colmar_2.jpg
Successfully downloaded --> AL_Colmar_3.jpg
Successfully downloaded --> AL_Colmar_4.jpg
Successfully downloaded --> AL_Colmar_5.jpg
Successfully downloaded --> AL_Colmar_6.jpg
Successfully downloaded --> AL_Colmar_7.jpg
Successfully downloaded --> AL_Colmar_8.jpg
Successfully downloaded --> AL_Domremy_1.jpg
Successfully downloaded --> AL_Domremy_2.jpg
Successfully downloaded --> AL_Domremy_3.jpg
Successfully downloaded --> AL_Domremy_4.jpg
Successfully downloaded --> AL_Luxembourg_1.jpg
Successfully downloaded --> AL_Luxembourg_2.jpg
Successfully downloaded --> AL_Luxembourg_3.jpg
Successfully downloaded --> AL_Luxembourg_4.jpg
Successfully downloaded --> AL_Metz_1.jpg
Successfully downloaded --> AL_Metz_2.jpg
Successfully downloaded --> AL_Metz_3.jpg

charled · Jun-19-2024, 06:55 PM

Thanks Snippsat. It works well.

Just one more thing: not all images are .jpg, some are .png. Is it possible to keep the right extension?

Thanks.

***snippsat*** · (This post was last modified: Jun-19-2024, 07:18 PM by snippsat.)

(Jun-19-2024, 06:55 PM)charled Wrote: Just one more thing: not all images are .jpg, some are .png. Is it possible to keep the right extension?

Change line 19 (new_name = f'{new_name}.jpg') to this.

new_name = f"{new_name}.{response.url.split('.')[-1]}"
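
The one-liner works for the URLs posted in this thread. If a URL ever carries a query string or has no extension at all, splitting on the last dot would pick up the wrong text, so a slightly more defensive variant is sketched below. The helper name `name_with_url_suffix` and the `.jpg` fallback are assumptions for illustration, not part of the original code.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def name_with_url_suffix(new_name, url, default='.jpg'):
    """Append the extension found in the URL path to new_name.

    Falls back to `default` (an assumed choice) when the URL path has no extension.
    """
    # urlparse(...).path drops any ?query or #fragment before the suffix is read
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return f'{new_name}{suffix or default}'


print(name_with_url_suffix('AL_Colmar_1', 'https://example.com/media/1675942561904.jpeg'))
print(name_with_url_suffix('AL_Metz_1', 'https://example.com/media/photo.PNG?version=1.2'))
```

In the script above, line 19 could then become `new_name = name_with_url_suffix(new_name, response.url)`.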
