Output 'None' - Printable Version
Python Forum (https://python-forum.io), Forum: Web Scraping & Web Development, Thread: Output 'None' (/thread-40026.html)
Output 'None' - liketocode - May-21-2023

Hello, I have a problem with this site and would like to know what I'm doing wrong. The same approach works for other sites. I'm practicing simple web scraping and would like to scrape the temperature here, but I'm getting None as output, so I can't use

    text = s.getText()
    print(text)

afterwards, because it raises the error 'NoneType' object has no attribute 'getText'.

    import requests
    from bs4 import BeautifulSoup

    r = requests.get('https://freemeteo.com.hr/vrijeme/zagreb/trenutno-vrijeme/mjesto/?gid=3186886&language=croatian&country=croatia')
    soup = BeautifulSoup(r.content, 'html.parser')
    s = soup.find('div', class_='temp metric')
    print(s)
RE: Output 'None' - snippsat - May-21-2023

Turn off JavaScript in your browser and reload the page: that is what you scrape, so the result is also None there. Look at this Thread. Also, using APIs is easier when it comes to weather data, e.g. wttr.in or OpenWeather.

    G:\div_code\hex
    λ curl wttr.in/Zagrep?format=3
    Zagrep: ☀️ +25°C

In Python this curl command would be:

    import requests

    params = {
        'format': '3',
    }
    response = requests.get('http://wttr.in/Zagrep', params=params)
    print(response.text)
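The "turn off JavaScript" check can also be approximated from Python by testing whether the element is present in the HTML the server sends. Below is a minimal sketch using only the standard library; the names `ClassFinder` and `has_element` and the two HTML snippets are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Record whether any <tag> carries all of the wanted CSS classes."""

    def __init__(self, tag, classes):
        super().__init__()
        self.tag = tag
        self.wanted = set(classes)
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # The class attribute may be missing or empty, hence the `or ""`.
        present = set((dict(attrs).get("class") or "").split())
        if self.wanted <= present:
            self.found = True


def has_element(html, tag, classes):
    """Return True if the static HTML contains <tag> with all given classes."""
    finder = ClassFinder(tag, classes)
    finder.feed(html)
    return finder.found


# Hypothetical snippets standing in for the raw server response and for the
# page after the browser has run its JavaScript.
static_html = '<div id="current-weather"></div>'
rendered_html = '<div id="current-weather"><div class="temp metric">25</div></div>'

print(has_element(static_html, "div", ["temp", "metric"]))    # False
print(has_element(rendered_html, "div", ["temp", "metric"]))  # True
```

In practice the `html` argument would be `r.text` from the `requests.get()` call. If the check returns False there, the element is most likely added by JavaScript, and an API or a browser-driving tool is needed.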
RE: Output 'None' - snippsat - May-26-2023

I did the task as a quick test in Selenium, because I was wondering about "Headless is Going Away!". That title is a little lie: headless is not going away, it changes to --headless=new. For anyone who has not used Selenium before, headless is a way to run without opening a browser window and just get the result, e.g. to parse it with BS.

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    import time

    #--| Setup
    options = Options()
    options.add_argument("--headless=new")
    ser = Service(r"C:\cmder\bin\chromedriver.exe")
    browser = webdriver.Chrome(service=ser, options=options)
    #--| Parse or automation
    url = 'https://freemeteo.com.hr/vrijeme/zagreb/trenutno-vrijeme/mjesto/?gid=3186886&language=croatian&country=croatia'
    browser.get(url)
    time.sleep(2)
    weather_info = browser.find_element(By.CSS_SELECTOR, '#current-weather > div.last-renew-info')
    temp = browser.find_element(By.CSS_SELECTOR, '#current-weather > div.last-renew-info > div.temp')
    print(weather_info.text)
    print('\N{snake}' * 5)
    print(temp.text)

So the new --headless=new mode works.
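The fixed `time.sleep(2)` above either waits too long or not long enough, depending on how fast the page renders. A common alternative is to poll until the content is there. The sketch below shows that idea without depending on Selenium; `wait_until` and `element_ready` are hypothetical names, and the "25°C" value is only a stand-in. Selenium ships its own version of this pattern as `WebDriverWait` together with `expected_conditions`, which would be the usual choice in a real script.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for "the element has appeared": truthy from the third check on.
attempts = []


def element_ready():
    attempts.append(1)
    return "25°C" if len(attempts) >= 3 else None


print(wait_until(element_ready, timeout=5, poll=0.01))  # prints: 25°C
```

With Selenium, the condition could be a small function that calls `browser.find_elements(By.CSS_SELECTOR, ...)` and returns the first match or None, so the script continues as soon as the temperature element exists.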
RE: Output 'None' - Gaurav_Kumar - Jul-20-2023

* You would be better off using Selenium for this.
* When you use the requests.get() method, it only fetches the initial HTML content, which might not include the data you are looking for. Since requests does not execute JavaScript, the <div> element with class 'temp metric' might not be present in the initial HTML response. As a result, soup.find() returns None, and you get the 'NoneType' object has no attribute 'getText' error when you try to call getText() on None.
* I have implemented your script using Selenium.
* Here is the code:

    pip install selenium

* Download the appropriate web driver for your browser (e.g., Chrome, Firefox).

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By

    # Set up options for a headless browser
    chrome_options = Options()
    chrome_options.add_argument('--headless')  # Run the browser in headless mode
    chrome_options.add_argument('--disable-gpu')  # Disable GPU to avoid potential issues

    # Initialize the browser
    driver = webdriver.Chrome(options=chrome_options)

    # Load the webpage
    driver.get('https://freemeteo.com.hr/vrijeme/zagreb/trenutno-vrijeme/mjesto/?gid=3186886&language=croatian&country=croatia')

    # Let element lookups retry for up to 10 seconds while the dynamic content loads
    # (adjust the time if necessary)
    driver.implicitly_wait(10)

    # Find the temperature element
    # (find_element_by_class_name was removed in Selenium 4; use By.CLASS_NAME)
    temperature_element = driver.find_element(By.CLASS_NAME, 'temp')
    temperature = temperature_element.text

    # Print the temperature
    print(temperature)

    # Close the browser
    driver.quit()
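Whichever way the page is fetched, the AttributeError from the original script can be avoided by checking for None before reading the text. Here is a minimal sketch; `text_or_default` is a hypothetical helper name, and it assumes the object passed in is either None or a BeautifulSoup Tag, which provides `get_text()`.

```python
def text_or_default(node, default="n/a"):
    """Return the stripped text of a BeautifulSoup Tag, or `default` if `node` is None."""
    if node is None:
        return default
    return node.get_text(strip=True)


# soup.find() returns None when nothing matches, as in the original question,
# so None is passed here directly to show the fallback.
print(text_or_default(None))  # prints: n/a
```

In the original script this would be used as `text_or_default(soup.find('div', class_='temp metric'))`, which prints the fallback instead of raising when the element is missing from the response.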