No Internet connection when running a Python script

***snippsat*** · (This post was last modified: Mar-11-2024, 11:02 AM by snippsat.)

Some improvements: time.sleep() is blocking, so it is not the best choice for scheduling.
So this version uses schedule for the scheduling and loguru (great) for logging.

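import os
import time

import requests
import schedule
from bs4 import BeautifulSoup
from loguru import logger

# lxml is only imported to fail early if the parser used by BeautifulSoup is missing.
try:
    from lxml import etree
except ImportError:
    raise RuntimeError("Please install lxml with `pip install lxml`")

# Also write log messages to a file; start a new file every 2 days.
logger.add("log_file.log", rotation="2 days")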

URL_TO_MONITOR = "https://hckrnews.com/"
CHECK_INTERVAL = 15

def process_html(site_content):
    soup = BeautifulSoup(site_content, features="lxml")
    # Combining tag selections
    for s in soup(["script", "meta"]):
        s.extract()
    return str(soup).replace("\r", "")

def webpage_was_changed():
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36",
        "Pragma": "no-cache",
        "Cache-Control": "no-cache",
    }
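    # Assumption: a 10-second timeout, so a dropped connection raises an error instead of hanging.
    response = requests.get(URL_TO_MONITOR, headers=headers, timeout=10)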
    # Create the file on the first run; the name must match the file opened below.
    if not os.path.exists("previous_content.html"):
        open("previous_content.html", "w+").close()
    with open("previous_content.html", "r+") as filehandle:
        previous_response_html = filehandle.read()
        processed_response_html = process_html(response.content)
        if processed_response_html != previous_response_html:
            filehandle.seek(0)
            filehandle.write(processed_response_html)
            filehandle.truncate()
            return True
    return False

def check_webpage():
    try:
        if webpage_was_changed():
            logger.info("WEBPAGE WAS CHANGED.")
        else:
            logger.info("Webpage was not changed.")
    except Exception as e:
        logger.exception(e)

def main():
    schedule.every(CHECK_INTERVAL).seconds.do(check_webpage)
    logger.info("Running Website Monitor")
    while True:
        schedule.run_pending()
        time.sleep(1)

if __name__ == "__main__":
    main()
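
If you want to try the compare-and-store step from webpage_was_changed() without any network access, something like this should work. It is only a minimal sketch: the helper name content_changed and the temporary folder are made up for illustration and are not part of the script above.

```python
import tempfile
from pathlib import Path


def content_changed(store: Path, new_content: str) -> bool:
    """Return True and update `store` if `new_content` differs from what is saved."""
    # A missing file is treated as empty, so the first run counts as a change.
    previous = store.read_text(encoding="utf-8") if store.exists() else ""
    if new_content == previous:
        return False
    store.write_text(new_content, encoding="utf-8")
    return True


# Hypothetical demo in a throwaway folder: same content twice, then new content.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "previous_content.html"
    first = content_changed(store, "<p>v1</p>")
    second = content_changed(store, "<p>v1</p>")
    third = content_changed(store, "<p>v2</p>")
    print(first, second, third)  # True False True
```

The check and the write use the same path object here, so the file name cannot get out of sync between the two.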

Also, a tip: I would say that soup.prettify() is broken; it puts new lines inside tags, so the output does not look like standard HTML at all.
Prettier has a command-line tool, so just run prettier --write . in the folder to get correctly formatted HTML.
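
To see the prettify() behaviour for yourself, here is a small sketch. It assumes only that bs4 is installed; the sample HTML string is made up, and html.parser is used so lxml is not needed for the demo.

```python
from bs4 import BeautifulSoup

# Made-up sample markup, just for illustration.
html = "<p>Hello <b>world</b></p>"
soup = BeautifulSoup(html, "html.parser")

compact = str(soup)       # serialized as-is, no extra whitespace added
pretty = soup.prettify()  # every tag and text string goes on its own line

print(compact)
print(pretty)
```

Comparing the two printed results shows how the inline markup gets split over several lines, which is why the script above stores str(soup) rather than the prettified output.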
