How to implement APScheduler in Python 3.6? - Printable Version (https://python-forum.io/thread-10755.html)
How to implement APScheduler in Python 3.6? - PrateekG - Jun-05-2018

Hi All,

I have written a Python script (myfile.py) which scrapes product data from an e-commerce site and stores it in a MySQL db. Now I want to schedule this script to run once a week. I have installed APScheduler for this, but I need your help to implement it: https://apscheduler.readthedocs.io/en/latest/userguide.html

Can anyone please share their knowledge?

RE: How to implement APScheduler in Python 3.6? - buran - Jun-05-2018

You should know the drill already - what have you tried? Post code and ask questions, etc.

RE: How to implement APScheduler in Python 3.6? - DeaD_EyE - Jun-05-2018

Read the provided examples first:
https://github.com/agronholm/apscheduler/tree/master/examples/schedulers
https://github.com/agronholm/apscheduler/blob/master/examples/schedulers/background.py

RE: How to implement APScheduler in Python 3.6? - PrateekG - Jun-05-2018

Yes, I have seen the examples. But I am not sure where to use my Python script (myfile.py) in a scheduler.

RE: How to implement APScheduler in Python 3.6? - PrateekG - Jun-05-2018

The following is the content of my Python script (myfile.py):

import requests
from bs4 import BeautifulSoup

URL = 'https://example.com'  # base URL of the site being scraped (placeholder; not shown in the original post)


def get_soup(url):
    soup = None
    try:
        response = requests.get(url)
        if response.status_code == 200:
            html = response.content
            soup = BeautifulSoup(html, "html.parser")
    except requests.RequestException:
        pass
    return soup


def get_category_urls(url):
    soup = get_soup(url)
    cat_urls = []
    try:
        categories = soup.find('div', attrs={'id': 'menu_oc'})
        if categories is not None:
            for c in categories.findAll('a'):
                if c['href'] is not None:
                    cat_urls.append(c['href'])
    except AttributeError:
        pass
    return cat_urls


def get_product_urls(url):
    soup = get_soup(url)
    prod_urls = []
    if soup.find('div', attrs={'class': 'pagination'}):
        for link in soup.select('div.links a'):
            if link.string.isdecimal():  # dump next and last links
                prod_urls.append(link['href'])
    print("Found following product urls::", prod_urls)
    return prod_urls


if __name__ == '__main__':
    category_urls = get_category_urls(URL)
    product_urls = get_product_urls(URL)
    # TODO upload in db

Now I have created a scheduler-refresh.py with the following content:

import schedule
import time


def job():
    # how to call myfile.py here?
    print("refreshing...")


schedule.every().monday.at("10:30").do(job)  # schedule's .at() needs a day unit or weekday, not .week

while True:
    schedule.run_pending()
    time.sleep(1)

Here I don't know how to call myfile.py. Can you help me?
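
A minimal sketch answering the last question, assuming myfile.py sits in the same directory as scheduler-refresh.py and exposes get_category_urls, get_product_urls and the placeholder URL constant shown above (the weekday and time are arbitrary choices): since the scraping calls in myfile.py sit under if __name__ == '__main__':, the module can be imported without side effects and its functions called from inside job().

import schedule
import time

import myfile  # the scraper module posted above


def job():
    print("refreshing...")
    category_urls = myfile.get_category_urls(myfile.URL)
    product_urls = myfile.get_product_urls(myfile.URL)
    # TODO: upload the scraped data to MySQL here


schedule.every().monday.at("10:30").do(job)  # once a week, Monday 10:30

while True:
    schedule.run_pending()
    time.sleep(1)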
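
Alternatively, with APScheduler itself (the library the thread title asks about), a BlockingScheduler with a cron trigger can run the same job once a week. This is only a sketch based on the linked user guide; the refresh function, weekday and time are assumptions, not part of the original post.

from apscheduler.schedulers.blocking import BlockingScheduler

import myfile  # the scraper module posted above


def refresh():
    print("refreshing...")
    category_urls = myfile.get_category_urls(myfile.URL)
    product_urls = myfile.get_product_urls(myfile.URL)
    # TODO: upload the scraped data to MySQL here


sched = BlockingScheduler()
# run every Monday at 10:30
sched.add_job(refresh, 'cron', day_of_week='mon', hour=10, minute=30)
sched.start()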