Mar-22-2024, 11:07 AM
(This post was last modified: Mar-23-2024, 08:07 AM by Gribouillis.)
Hello there,
First of all, I am not an expert in Python; I learned it ad hoc. I have a script that scrapes the name, the sizes, and the price for each size of a set of products. The script reads an Excel file of links that I create, opens each link, gets the data, and saves it to a new Excel file. The issue is that, when run on the server, the script cannot click the cookie popup that appears and accept the cookies; when I run it on my PC, the script works fine.
Here is the code:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import StaleElementReferenceException
from bs4 import BeautifulSoup
from xlwt import Workbook
import pandas as pd
from selenium.webdriver.chrome.options import Options

# Set up Chrome options for headless browsing
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")  # Add this line if running as root user

# Initialize WebDriver
driver = webdriver.Chrome(options=chrome_options)

# Reading the file where the links of products are saved
df = pd.read_excel('OttyDuvets_Links.xlsx')

# Create a single Workbook and Worksheet
wb = Workbook()
ws = wb.add_sheet('Sheet1')

# Defining the headers in the Excel worksheet
ws.write(0, 0, 'Name')
ws.write(0, 1, 'Size')
ws.write(0, 2, 'Price')

row = 1

# Iterate through all product links
for i in range(len(df['Links'])):
    driver.get(df['Links'][i])
    time.sleep(5)

    # Wait for the "cc-btn-decision" cookie button to appear, then click it
    try:
        decision_btn = WebDriverWait(driver, 5).until(
            EC.presence_of_element_located((By.XPATH, '/html/body/div[2]/div/div[2]/a[3]'))
        )
        decision_btn.click()
    except Exception as e:
        print("Error clicking on 'cc-btn-decision':", str(e))

    time.sleep(3)  # Adjust this delay as needed

    # Parsing the source code of the webpage
    soup = BeautifulSoup(driver.page_source, 'html.parser')

    # Name of the product
    name_element = soup.find('h1')
    name = name_element.text.strip() if name_element else "Name not found"
    print(name)

    # Find the button that triggers the dropdown
    button = driver.find_element(By.XPATH, '/html[1]/body[1]/main[1]/section[1]/section[1]/div[1]/div[2]/product-info[1]/div[3]/variant-selects[1]/div[1]/div[1]/select[1]')
    button.click()

    # Add a delay to allow the dropdown to appear
    time.sleep(3)

    # Iterate through all size options using a for loop
    for j in range(len(driver.find_elements(By.XPATH, '/html[1]/body[1]/main[1]/section[1]/section[1]/div[1]/div[2]/product-info[1]/div[3]/variant-selects[1]/div[1]/div[1]/select[1]/option'))):
        # Find the dropdown container and size options each time
        dropdown_container = driver.find_element(By.CLASS_NAME, 'select')
        size_options = dropdown_container.find_elements(By.TAG_NAME, 'option')
        option = size_options[j]

        # Get the size from the option
        selected_size = option.text.strip()
        print(selected_size)

        # Scroll into view using JavaScript
        driver.execute_script("arguments[0].scrollIntoView();", option)
        time.sleep(1)

        # Click on the size option to select it
        try:
            option.click()
        except StaleElementReferenceException:
            # If the element is stale, re-find the dropdown and the option
            dropdown_container = driver.find_element(By.CLASS_NAME, 'select')
            size_options = dropdown_container.find_elements(By.TAG_NAME, 'option')
            option = size_options[j]
            option.click()

        time.sleep(3)

        # Find all the price elements for the selected size
        price_elements = driver.find_elements(By.XPATH, '/html[1]/body[1]/main[1]/section[1]/section[1]/div[1]/div[2]/product-info[1]/div[3]/div[1]/div[1]/div[1]/span[2]')

        # Iterate through all price elements for the selected size
        for price_element in price_elements:
            price = price_element.text.strip()
            print(price)

            # Saving the name, size, and price of the product in the worksheet
            ws.write(row, 0, name)
            ws.write(row, 1, selected_size)
            ws.write(row, 2, price)
            row += 1

        # Click on the dropdown again to show the rest of the options
        driver.find_element(By.XPATH, '/html[1]/body[1]/main[1]/section[1]/section[1]/div[1]/div[2]/product-info[1]/div[3]/variant-selects[1]/div[1]/div[1]/select[1]').click()
        time.sleep(3)

# Save the single workbook with its name and type
wb.save('OttyDuvets_Details.xls')

# Close the browser
driver.quit()
This is the error log I get:
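As a side note, the stale-element handling in the loop above (catch, re-find, click again) could be factored into a small retry helper. This is just a sketch: `find` is any zero-argument callable that re-locates and returns a clickable element, so it works with the script's `driver.find_element(...)` lookups but does not depend on Selenium itself.

```python
import time

def click_fresh(find, attempts=3, delay=0.5):
    """Click the element returned by find(), re-finding it on each failure.

    find: zero-argument callable returning an object with a .click() method
          (e.g. lambda: driver.find_element(By.CLASS_NAME, 'select')).
    Re-raises the last exception if every attempt fails.
    """
    last_exc = None
    for _ in range(attempts):
        try:
            find().click()
            return
        except Exception as exc:  # e.g. StaleElementReferenceException
            last_exc = exc
            time.sleep(delay)
    raise last_exc
```

Used as `click_fresh(lambda: driver.find_element(By.CLASS_NAME, 'select'))`, this replaces the duplicated re-find block inside the `except` clause.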
Error: root@ip-xxx-xx-xx-xxx:/path/path/scrappingProjects/Otty# python3 scrappingOttyDuvets.py
Deluxe Microfibre Duvet
Traceback (most recent call last):
File "scrappingOttyDuvets.py", line 50, in <module>
button.click()
File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/webelement.py", line 94, in click
self._execute(Command.CLICK_ELEMENT)
File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/webelement.py", line 395, in _execute
return self._parent.execute(command, params)
File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/webdriver.py", line 347, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementClickInterceptedException: Message: element click intercepted: Element <select id="Option-template--14439267893323__main-0" class="select__select" name="options[Size]" form="product-form-template--14439267893323__main" data-forced-variant="">...</select> is not clickable at point (600, 580). Other element would receive the click: <iframe referrerpolicy="origin" id="attentive_creative" title="Sign Up via Text for Offers" src="https://creatives.attn.tv/creatives-dynamic/multiPage/index.html" style="width: 100%; height: 100%; position: fixed; inset: 0px; opacity: 1; transition: opacity 500ms ease 0s; border: 0px; overflow: hidden; visibility: visible;"></iframe>
(Session info: chrome-headless-shell=122.0.6261.128)
Stacktrace:
#0 0x56443887bf33 <unknown>
#1 0x564438573ce6 <unknown>
#2 0x5644385c5a73 <unknown>
#3 0x5644385c398e <unknown>
#4 0x5644385c1367 <unknown>
#5 0x5644385c0732 <unknown>
#6 0x5644385b3d27 <unknown>
#7 0x5644385e0eb2 <unknown>
#8 0x5644385b36b8 <unknown>
#9 0x5644385e107e <unknown>
#10 0x5644385ff899 <unknown>
#11 0x5644385e0c53 <unknown>
#12 0x5644385b1db3 <unknown>
#13 0x5644385b277e <unknown>
#14 0x56443884186b <unknown>
#15 0x564438845885 <unknown>
#16 0x56443882f181 <unknown>
#17 0x564438846412 <unknown>
#18 0x56443881325f <unknown>
#19 0x56443886a528 <unknown>
#20 0x56443886a723 <unknown>
#21 0x56443887b0e4 <unknown>
#22 0x7f7d88b75609 start_thread
To me, it looks like it cannot click on the dropdown element because the cookie popup overlays it. Can someone help? I also have a few other scripts that work great on my PC but not on the server. The URL I am trying to scrape is: https://otty.com/products/luxury-microfi...uble-duvet
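One thing worth noting: the error message says the click is intercepted by the full-viewport `attentive_creative` iframe (a sign-up banner), not by the cookie popup itself. A possible workaround (a sketch, untested against the live site; only the iframe id is taken from the error message above) is to delete that overlay with JavaScript before clicking the dropdown:

```python
def remove_overlay(driver, element_id="attentive_creative"):
    """Remove a full-viewport overlay element by id so that clicks on
    elements underneath it are no longer intercepted.
    Returns True if an element with that id was found and removed."""
    return driver.execute_script(
        "var el = document.getElementById(arguments[0]);"
        "if (el) { el.remove(); return true; }"
        "return false;",
        element_id,
    )
```

Calling `remove_overlay(driver)` right before `button.click()` should stop the `ElementClickInterceptedException`, though the banner's id and timing may differ between page loads.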
Gribouillis write Mar-23-2024, 08:07 AM:
Please post all code, output and errors (in its entirety) between their respective tags. Refer to BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.