Webscraping news articles by using selenium

cate16 · Aug-28-2023, 07:19 AM

(Aug-25-2023, 05:48 PM)snippsat Wrote:

(Aug-25-2023, 08:04 AM)cate16 Wrote: However, if I run the code I keep receiving the same kind of error:

Don't run the first messy code; it's not updated and will not work at all.
Run my test code, so for you it will be this (you could have chosen a shorter path).
First, in cmd do:

pip install selenium --upgrade
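
To confirm that the upgrade reached the interpreter that will run the script (an IDE such as PyCharm may use its own virtual environment), a quick check along these lines may help. This is only a sketch; the helper name `installed_version` is made up for illustration and is not part of Selenium.

```python
# check_version.py
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None if selenium is not installed for this interpreter.
    print("selenium:", installed_version("selenium"))
```

Run it with the same interpreter that runs the test script, so that both look at the same site-packages folder.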

Code to test.

# sel_test.py
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
 
# Setup
#https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing/116.0.5795.0/win64/chromedriver-win64.zip
options = Options()
options.add_argument("--headless=new")
ser = Service(r"C:\Users\cmosca\AppData\Local\Programs\Python\Python311\chromedriver-win64\chromedriver.exe")
browser = webdriver.Chrome(service=ser, options=options)
# Parse or automation
url = 'https://www.palottery.state.pa.us/Draw-Games/Treasure-Hunt.aspx'
browser.get(url)
lotto_number = browser.find_element(By.CSS_SELECTOR, 'div.details')
print(lotto_number.text)

Output:
0508142227

Thank you for your reply. I have done as you said.
This part worked:

pip install selenium --upgrade

This didn't:

# sel_test.py
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
 
# Setup
#https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing/116.0.5795.0/win64/chromedriver-win64.zip
options = Options()
options.add_argument("--headless=new")
ser = Service(r"C:\Users\cmosca\AppData\Local\Programs\Python\Python311\chromedriver-win64\chromedriver.exe")
browser = webdriver.Chrome(service=ser, options=options)
# Parse or automation
url = 'https://www.palottery.state.pa.us/Draw-Games/Treasure-Hunt.aspx'
browser.get(url)
lotto_number = browser.find_element(By.CSS_SELECTOR, 'div.details')
print(lotto_number.text)

I got the following error:

C:\Users\cmosca\PycharmProjects\pythonProject1\venv\Scripts\python.exe C:\Users\cmosca\PycharmProjects\pythonProject1\main.py 
Traceback (most recent call last):
  File "C:\Users\cmosca\PycharmProjects\pythonProject1\main.py", line 17, in <module>
    lotto_number = browser.find_element(By.CSS_SELECTOR, 'div.details')
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cmosca\PycharmProjects\pythonProject1\venv\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 739, in find_element
    return self.execute(Command.FIND_ELEMENT, {"using": by, "value": value})["value"]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cmosca\PycharmProjects\pythonProject1\venv\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 345, in execute
    self.error_handler.check_response(response)
  File "C:\Users\cmosca\PycharmProjects\pythonProject1\venv\Lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 229, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"div.details"}
  (Session info: chrome=116.0.5845.111); For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors#no-such-element-exception
Stacktrace:
	GetHandleVerifier [0x00007FF7024A5282+57250]
	(No symbol) [0x00007FF70241CB92]
	(No symbol) [0x00007FF7022EDEAB]
	(No symbol) [0x00007FF70232739E]
	(No symbol) [0x00007FF70232748C]
	(No symbol) [0x00007FF7023600C7]
	(No symbol) [0x00007FF70234665F]
	(No symbol) [0x00007FF70235E172]
	(No symbol) [0x00007FF7023463F3]
	(No symbol) [0x00007FF70231C991]
	(No symbol) [0x00007FF70231DB74]
	GetHandleVerifier [0x00007FF7027550A2+2874818]
	GetHandleVerifier [0x00007FF7027A6C74+3209620]
	GetHandleVerifier [0x00007FF70279FAAF+3180495]
	GetHandleVerifier [0x00007FF7025378E6+656902]
	(No symbol) [0x00007FF702428228]
	(No symbol) [0x00007FF702424374]
	(No symbol) [0x00007FF7024244A6]
	(No symbol) [0x00007FF702414873]
	BaseThreadInitThunk [0x00007FFDF06426AD+29]
	RtlUserThreadStart [0x00007FFDF130AA68+40]


Process finished with exit code 1
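
A possible next step, offered as an assumption and not a confirmed fix: NoSuchElementException means that `div.details` was not in the page at the moment `find_element` ran. Either the content had not loaded yet, or headless Chrome was served a different page. The sketch below polls for a short time instead of failing at once. `wait_for_elements` is a hypothetical helper written for this example; Selenium ships its own equivalent, `WebDriverWait` combined with `expected_conditions.presence_of_element_located`.

```python
import time


def wait_for_elements(find, timeout=10.0, poll=0.5):
    """Call find() until it returns a non-empty list or the timeout expires.

    find is any zero-argument callable that returns a list, for example
    lambda: browser.find_elements(By.CSS_SELECTOR, 'div.details').
    Returns the list of matches, or an empty list on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = find()
        if found:
            return found
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll)
```

In the test script this could replace the `find_element` line: call `wait_for_elements(lambda: browser.find_elements(By.CSS_SELECTOR, 'div.details'))` and print `.text` of the first match. Unlike `find_element`, `find_elements` returns an empty list when nothing matches, so the helper never raises. If the result is still empty, printing `browser.title` and the start of `browser.page_source` shows what page the headless browser actually received.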
