Python Forum
Product Image Download Help Required
#1
Hi everyone. I love Python and experiment with it constantly. I have been researching my latest experiment for 10 days, but I have not been successful.

What my Python code is supposed to do, and where it fails: the code should download product images from the BoohooMan website, create a folder for each product, and save the images in those folders. It should also convert the images to base64 format and save them to a file.
The product list I want: https://www.boohooman.com/us/mens/tops/t...20T-Shirts
I need help urgently. Thank you in advance.

Pulling Image URLs from HTML Content:
Issue: When pulling image URLs, we could not properly separate the product IDs from the data-url attribute.

Creating an Image URL Format:
Issue: The image URL format was incompatible with the previous method. The new format needed to include ?$product_image_category_category_category_page_tablet_landscape_pro_2x$.

Downloading and Saving Images:
Issue: The download worked, but the files were not saved in the specified folder. This can often be caused by file path or permission issues.

Saving with Base64:
Issue: Images were not saved in base64 format. The script should have been set to convert the image to base64 and save it to a txt file.
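
The saving and base64 steps listed above can be tried out without touching the site. What follows is only a minimal sketch, not the poster's code: `safe_folder_name` and `save_image_and_base64` are hypothetical helpers, and the sketch assumes the image bytes have already been downloaded (for example from `response.content`). It cleans the product name before using it as a folder name, since characters such as `/` or `:` are a common reason files do not land in the expected folder.

```python
import base64
import re
import tempfile
from pathlib import Path


def safe_folder_name(name):
    """Replace characters that are not allowed in folder names on common systems."""
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", name).strip(" .")
    return cleaned or "unnamed_product"


def save_image_and_base64(image_bytes, dest_root, product_name, image_name):
    """Write the raw image plus a .txt file holding its base64 text.

    Returns the two paths so the caller can check where the files went.
    """
    folder = Path(dest_root) / safe_folder_name(product_name)
    folder.mkdir(parents=True, exist_ok=True)

    image_path = folder / image_name
    image_path.write_bytes(image_bytes)

    # Same file name, but with a .txt suffix, for the base64 copy.
    b64_path = image_path.with_suffix(".txt")
    b64_text = base64.b64encode(image_bytes).decode("ascii")
    b64_path.write_text(b64_text, encoding="ascii")
    return image_path, b64_path


if __name__ == "__main__":
    # Placeholder bytes stand in for a downloaded image.
    fake_image = b"\xff\xd8\xff\xe0 not a real JPEG"
    with tempfile.TemporaryDirectory() as tmp:
        img, txt = save_image_and_base64(fake_image, tmp, "Tee: Palm/Tree", "image_1.jpg")
        print(img)
        print(txt)
```

Printing the returned paths is a quick way to confirm whether a "files not saved" problem is really a path problem.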
#2
Show us what you've tried, working or not.
#3
Are you also parsing the HTML to get the full URL path from the website? Some websites block direct access to the picture URL, and the saved file extension then becomes .mhtml. If you are trying to retrieve the full URL path from Python code, what you want is to link the listing page to the product page that holds the pictures and the proper categories.

Which module or file are you using to get this link?
Programs are like instructions or rules. Learning it gets us closer to a solution. Desired outcome. Computer talk.
#4
import os
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

# Constants
BASE_URL = "https://www.boohooman.com/us/mens/tops/t-shirts?prefn1=style&prefv1=Printed%20T-Shirts"
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"}
DEST_FOLDER = os.path.join(os.path.expanduser("~"), "Desktop", "BoohooMan_TShirts")

# Create destination folder if it doesn't exist
os.makedirs(DEST_FOLDER, exist_ok=True)

def get_soup(url):
    response = requests.get(url, headers=HEADERS)
    response.raise_for_status()
    return BeautifulSoup(response.text, 'html.parser')

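def save_image(image_url, folder_path, image_name):
    # Send the same headers as get_soup; some servers reject requests without a User-Agent
    response = requests.get(image_url, headers=HEADERS, timeout=30)
    response.raise_for_status()
    with open(os.path.join(folder_path, image_name), 'wb') as f:
        f.write(response.content)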

def download_product_images():
    soup = get_soup(BASE_URL)
    products = soup.find_all('div', class_='product-item')

    for product in products:
        product_link = product.find('a', class_='product-item-link')['href']
        product_name = product.find('a', class_='product-item-link').get_text(strip=True)
        product_folder = os.path.join(DEST_FOLDER, product_name)

        # Create a folder for each product
        os.makedirs(product_folder, exist_ok=True)

        product_soup = get_soup(urljoin(BASE_URL, product_link))
        image_elements = product_soup.find_all('img', class_='primary-image')

        for idx, img in enumerate(image_elements):
            img_url = img['src']
            img_url = urljoin(BASE_URL, img_url)
            save_image(img_url, product_folder, f'image_{idx + 1}.jpg')

if __name__ == "__main__":
    download_product_images()
    print(f"Images downloaded and saved in {DEST_FOLDER}")
#5
??????????????????????????????????????
#6
The site has changed, so you must update and test your code.
For example, this line will no longer find anything:
products = soup.find_all('div', class_='product-item')
Output:
(Pdb) products
[]
So inspect the site in dev tools and start testing for the changes.
import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
}
url = 'https://www.boohooman.com/us/mens/tops/t-shirts?prefn1=style&prefv1=Printed%20T-Shirts'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
products = soup.find_all('div', class_='product-tile js-product-tile')
>>> print(products[0].select_one('.product-tile-name').text.strip())
Oversized Boxy Extended Neck Palm Tree T-shirt
>>> products[0].get('data-itemid')
'BMM85247'
>>> 
>>> print(products[1].select_one('.product-tile-name').text.strip())
Oversized Heavyweight Paisley Applique T-shirt
>>> products[1].get('data-itemid')
'BMM83448'
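
The interactive checks above can be folded into a loop over every tile. This is a minimal sketch, assuming each tile carries a `data-itemid` attribute and a `.product-tile-name` child as in the session above. `SAMPLE_HTML` is made-up markup for offline testing and `extract_products` is a hypothetical helper; the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup that mimics the attributes used above.
SAMPLE_HTML = """
<div class="product-tile js-product-tile" data-itemid="BMM85247">
  <div class="product-tile-name"> Oversized Boxy Extended Neck Palm Tree T-shirt </div>
</div>
<div class="product-tile js-product-tile" data-itemid="BMM83448">
  <div class="product-tile-name">Oversized Heavyweight Paisley Applique T-shirt</div>
</div>
<div class="product-tile js-product-tile">
  <span>tile without a name or id</span>
</div>
"""


def extract_products(html):
    """Return a list of (item_id, name) pairs, skipping incomplete tiles."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for tile in soup.select("div.product-tile.js-product-tile"):
        name_tag = tile.select_one(".product-tile-name")
        item_id = tile.get("data-itemid")
        if name_tag is None or item_id is None:
            # Skip the tile instead of raising when the markup is incomplete.
            continue
        products.append((item_id, name_tag.get_text(strip=True)))
    return products


if __name__ == "__main__":
    for item_id, name in extract_products(SAMPLE_HTML):
        print(item_id, name)
```

Checking for `None` before reading `.text` keeps one malformed tile from stopping the whole run, which helps when the site changes again.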