Output 'None' - Printable Version

Output 'None' - liketocode - May-21-2023

Hello

I have a problem with this site and would like to know what I'm doing wrong. The same approach works for other sites.
I'm practicing simple web scraping and would like to scrape the temperature here, but I'm getting None as output, so I can't use
text = s.getText()
print(text)
afterwards, because it raises the error 'NoneType' object has no attribute 'getText'.

import requests
from bs4 import BeautifulSoup

r = requests.get('https://freemeteo.com.hr/vrijeme/zagreb/trenutno-vrijeme/mjesto/?gid=3186886&language=croatian&country=croatia')

soup = BeautifulSoup(r.content, 'html.parser')

s = soup.find('div', class_='temp metric')

print(s)

Output:
None
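
A small guard avoids the 'NoneType' error while debugging. This is only a sketch: safe_text is a hypothetical helper name, not part of BeautifulSoup, and it assumes the element comes from soup.find() as in the code above.

```python
def safe_text(element, default=''):
    """Return the element's stripped text, or `default` when find() returned None."""
    if element is None:
        return default
    # get_text() is the BeautifulSoup method behind the getText() alias
    return element.get_text(strip=True)


# With the code from the post it would be called as:
#   s = soup.find('div', class_='temp metric')
#   print(safe_text(s, default='temperature not found'))
print(safe_text(None, default='temperature not found'))
```

This does not make the temperature appear; it only turns the crash into a readable message when the element is missing.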

RE: Output 'None' - snippsat - May-21-2023

Turn off JavaScript in your browser and reload the page; what you see then is what requests gets, and the temperature is missing there too, hence None.
Look at this Thread.
Also, using an API is easier when it comes to weather data, e.g. wttr.in or OpenWeather.

G:\div_code\hex
λ curl wttr.in/Zagrep?format=3
Zagrep: ☀️   +25°C

In Python, this curl command would be:

import requests

params = {
    'format': '3',
}
response = requests.get('http://wttr.in/Zagrep', params=params)
print(response.text)

Output:
Zagrep: ☀️   +25°C
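
If the number is needed rather than the whole line, the response text can be split up. A minimal sketch, assuming the line always looks like the output above (location, colon, icon, signed temperature in °C); parse_wttr_line is a made-up helper name, not something wttr.in provides.

```python
import re


def parse_wttr_line(line):
    """Split a line such as 'Zagrep: ☀️   +25°C' into (location, degrees)."""
    location, _, rest = line.partition(':')
    # Assumed shape: an optional sign, digits, then the °C unit
    match = re.search(r'([+-]?\d+)\s*°C', rest)
    if match is None:
        return location.strip(), None
    return location.strip(), int(match.group(1))


print(parse_wttr_line('Zagrep: ☀️   +25°C'))  # ('Zagrep', 25)
```

With the request above, the call would be parse_wttr_line(response.text.strip()).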

RE: Output 'None' - snippsat - May-26-2023

I did the task as a quick test in Selenium, because I was wondering about "Headless is Going Away!".
That title is a little lie: headless is not going away, it changes to --headless=new.
For anyone who has not used Selenium before: headless is a way to run without opening a browser window and just get the result, e.g. to parse with BS.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time

#--| Setup
options = Options()
options.add_argument("--headless=new")
ser = Service(r"C:\cmder\bin\chromedriver.exe")
browser = webdriver.Chrome(service=ser, options=options)
#--| Parse or automation
url = 'https://freemeteo.com.hr/vrijeme/zagreb/trenutno-vrijeme/mjesto/?gid=3186886&language=croatian&country=croatia'
browser.get(url)
time.sleep(2)
weather_info = browser.find_element(By.CSS_SELECTOR, '#current-weather > div.last-renew-info')
temp = browser.find_element(By.CSS_SELECTOR, '#current-weather > div.last-renew-info > div.temp')
print(weather_info.text)
print('\N{snake}' * 5)
print(temp.text)

Output:
Zagreb
20°C Vedro vrijeme Vjetar:
7 Km/h
Relativna vlažnost: 60% | Vidljivost: > 10000m | Tlak: 1019,0mb
🐍🐍🐍🐍🐍
20°C

So we see that the new --headless=new mode works.
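
The details line in the output above can be turned into a dict if single values are needed. A rough sketch, assuming the "Key: value | Key: value" layout shown in the output; parse_details is a hypothetical helper, and the comma handling assumes the page keeps using a decimal comma for pressure.

```python
def parse_details(line):
    """Split 'Key: value | Key: value' into a dict of stripped strings."""
    details = {}
    for part in line.split('|'):
        key, sep, value = part.partition(':')
        if sep:  # skip fragments that have no 'key: value' shape
            details[key.strip()] = value.strip()
    return details


line = 'Relativna vlažnost: 60% | Vidljivost: > 10000m | Tlak: 1019,0mb'
details = parse_details(line)
print(details['Relativna vlažnost'])  # 60%

# The pressure uses a decimal comma, so swap it before converting
pressure = float(details['Tlak'].replace('mb', '').replace(',', '.'))
print(pressure)  # 1019.0
```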

RE: Output 'None' - Gaurav_Kumar - Jul-20-2023

* You would be better off using Selenium for this page.

* When you use the requests.get() method, it only fetches the initial HTML content, which might not include the data you are looking for.
Since requests does not execute JavaScript, the <div> element with class 'temp metric' might not be present in the initial HTML response. As a result, soup.find() returns None, and you get the 'NoneType' object has no attribute 'getText' error when you call getText() on None.

* I have implemented your script using Selenium.

* Here is the code:

pip install selenium

* Download the appropriate web driver for your browser (e.g., Chrome, Firefox).

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

# Set up options for a headless browser
chrome_options = Options()
chrome_options.add_argument('--headless') # To run the browser in headless mode
chrome_options.add_argument('--disable-gpu') # Disable GPU to avoid potential issues

# Initialize the browser
driver = webdriver.Chrome(options=chrome_options)

# Let element lookups retry for up to 10 seconds while dynamic content loads
driver.implicitly_wait(10)

# Load the webpage
driver.get('https://freemeteo.com.hr/vrijeme/zagreb/trenutno-vrijeme/mjesto/?gid=3186886&language=croatian&country=croatia')

# Find the temperature element (find_element_by_class_name was removed in Selenium 4.3)
temperature_element = driver.find_element(By.CLASS_NAME, 'temp')
temperature = temperature_element.text

# Print the temperature
print(temperature)

# Close the browser
driver.quit()
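
To check the explanation above (the element is missing from the HTML that requests receives), the raw response can be scanned for the class before reaching for Selenium. This is a sketch using only the standard library; ClassFinder and html_has_class are hypothetical names, and the two HTML strings are made-up stand-ins, not the real page source.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any tag carries all of the wanted CSS classes."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted.split())
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The class attribute may be absent or valueless, so fall back to ''
        classes = set((dict(attrs).get('class') or '').split())
        if self.wanted <= classes:
            self.found = True


def html_has_class(html, wanted):
    finder = ClassFinder(wanted)
    finder.feed(html)
    return finder.found


# Made-up examples of what requests might get vs. what a browser renders
static_html = '<div id="current-weather"></div>'
rendered_html = '<div id="current-weather"><div class="temp metric">20°C</div></div>'

print(html_has_class(static_html, 'temp metric'))    # False
print(html_has_class(rendered_html, 'temp metric'))  # True
```

With requests, the call would be html_has_class(r.text, 'temp metric'); a False result suggests the value is filled in by JavaScript after the page loads.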