Get latest version off website and save it as variable [SOLVED]
Python Forum (https://python-forum.io), General Coding Help, thread /thread-35534.html
Get latest version off website and save it as variable [SOLVED] - AlphaInc - Nov-14-2021

Hello everybody,

I am trying to get the latest version number from a website (not the download itself, just the version). For example, this is the download site "https://gpac.wp.imt.fr/downloads/", and when I manually click on "Windows 64 bits" it downloads version 1.0.1. Is there a way to detect this information and save it as a variable in a Python script?

So far I have used the following, but it comes with two flaws: (1) it is not entirely Python, and (2) it only gives the entire download link:

    import os

    # Shell out to curl and filter the page for lines containing x64
    os.system("curl -s https://gpac.wp.imt.fr/downloads/ | grep x64")

I could get the information for this particular software from somewhere else, but there are a few sites from which I would like to grab the release version.

RE: Get latest version off website and save it as variable - snippsat - Nov-14-2021

A more normal way is to web-scrape the info you want. Example:

    import requests
    from bs4 import BeautifulSoup

    url = 'https://gpac.wp.imt.fr/downloads/'
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'lxml')
    # Whole paragraph
    print(soup.select_one('#post-147 > div > p:nth-child(3)').text)
    # Just version
    version = soup.select_one('#post-147 > div > p:nth-child(3) > strong').text
    print(version)

The selector #post-147 > div > p:nth-child(3) can simply be copied from the browser: right-click the element and choose "copy selector". Then in BeautifulSoup, use it the CSS selector way with select or select_one.
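As a side note on the original curl | grep attempt: the same kind of filtering can be done without leaving Python. The sketch below is only an illustration under assumptions. SAMPLE_HTML is a made-up fragment, not the real markup of the GPAC page, and find_version is a hypothetical helper name. In a real script the text would come from something like requests.get(url).text.

```python
import re
from typing import Optional

# Hypothetical fragment standing in for a downloaded page (not the real GPAC markup)
SAMPLE_HTML = """
<p>Current release: <strong>1.0.1</strong></p>
<a href="https://example.com/files/tool-1.0.1-x64.exe">Windows 64 bits</a>
<a href="https://example.com/files/tool-1.0.1-win32.exe">Windows 32 bits</a>
"""

# Dotted version such as 1.0.1 (three numeric parts assumed)
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def find_version(html: str, keyword: str) -> Optional[str]:
    """Return the first x.y.z version on a line containing keyword, else None."""
    for line in html.splitlines():
        if keyword in line:  # same idea as `grep keyword`
            match = VERSION_RE.search(line)
            if match:
                return match.group(0)
    return None


version = find_version(SAMPLE_HTML, "x64")
print(version)  # 1.0.1
```

Line-based matching like this is fragile compared with a CSS selector, because it depends on how the HTML happens to be broken into lines, so it is best seen as a quick check rather than a replacement for the BeautifulSoup approach.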
RE: Get latest version off website and save it as variable - AlphaInc - Nov-14-2021

(Nov-14-2021, 10:55 AM)snippsat Wrote: A more normal way is to web-scrape the info you want.

Okay, yeah, that works, thanks. I haven't understood how I can do it for other sites tho. For another example, how do I get the information for this site: https://www.makemkv.com/download/
Do I need to find out which CSS part I'm looking for using a browser?

Edit: Alright, I got it (sorry for getting it so late). My selector is "#content > ul:nth-child(3) > li > a" and it prints "MakeMKV 1.16.5 for Windows". How do I cut it down to show only the version? I didn't understand that in your part.

RE: Get latest version off website and save it as variable - snippsat - Nov-14-2021

(Nov-14-2021, 08:02 PM)AlphaInc Wrote: I haven't understood how I can do it for other sites tho. For another example, how do I get the information for this site: https://www.makemkv.com/download/

It will be the same way. For some training in web scraping, you can look at Web-Scraping part-1. Here is an example using two ways:

    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.makemkv.com/download/'
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'lxml')
    # Way 1: pick the sixth <li> on the page by index
    li_ver = soup.find_all('li')[5]
    print(li_ver.text)
    # Way 2: CSS selector
    print(soup.select_one('#content > li:nth-child(6)').text)
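On the question of trimming "MakeMKV 1.16.5 for Windows" down to the number: one simple option, assuming the link text always contains the version as its own whitespace-separated word, is to split the text and keep the token that looks like a version. The helper name version_from_label is made up for this sketch.

```python
from typing import Optional


def version_from_label(label: str) -> Optional[str]:
    """Return the first whitespace-separated token made only of digits and dots."""
    for token in label.split():
        parts = token.split(".")
        # Require at least one dot and only numeric parts, e.g. "1.16.5"
        if len(parts) > 1 and all(part.isdigit() for part in parts):
            return token
    return None


version = version_from_label("MakeMKV 1.16.5 for Windows")
print(version)  # 1.16.5
```

If the text on a given site has a different shape, the function returns None instead of raising, which makes it easy to notice when a page layout has changed.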
RE: Get latest version off website and save it as variable - AlphaInc - Nov-14-2021

(Nov-14-2021, 08:37 PM)snippsat Wrote: (Nov-14-2021, 08:02 PM)AlphaInc Wrote: I haven't understood how I can do it for other sites tho. For another example, how do I get the information for this site: https://www.makemkv.com/download/
It will be the same way. For some training in web scraping, you can look at Web-Scraping part-1.

Yeah, sorry, it took me a second to get what you mean. I have it like this:

    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.makemkv.com/download/'
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'lxml')
    version = soup.select_one('#content > ul:nth-child(3) > li > a').text
    print(version)

It gets the output:

    MakeMKV 1.16.5 for Windows

Is there a way to get only 1.16.5 (without the other strings), or is this the best I can get?

RE: Get latest version off website and save it as variable - DeaD_EyE - Nov-14-2021

    import re
    from collections import namedtuple

    import bs4
    import requests

    MAKEMKV_BASE = "https://www.makemkv.com/download/"
    VERSION_REG = re.compile(r"(\d+\.\d+\.\d+)")


    def parse_version(file_name: str) -> str:
        # Return the first x.y.z match, or an empty string if there is none
        if match := VERSION_REG.search(file_name):
            return match.group(1)
        else:
            return ""


    MakeMKV = namedtuple("makemkv", "url version version_tuple os")


    def get_makemv():
        content = requests.get(MAKEMKV_BASE).content
        doc = bs4.BeautifulSoup(content, "lxml")
        # a[href] keeps only links that actually carry an href attribute
        selector = "div#content > ul.bullets > li > a[href]"
        for element in doc.select(selector):
            href = element["href"]
            if href.endswith(".txt"):
                continue
            version_str = parse_version(href)
            if not version_str:
                # No version in the link, so there is nothing to convert to a tuple
                continue
            version_tuple = tuple(map(int, version_str.split(".")))
            name = element.text.lower()
            if "windows" in name:
                os_type = "windows"
            elif "mac os x" in name:
                os_type = "macos"
            else:
                os_type = "unknown"
            yield MakeMKV(href, version_str, version_tuple, os_type)


    for result in get_makemv():
        print(result.os, result.version, result.url)

The inspector from Firefox helps a lot to find the elements. I used this information to make the selector.
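One likely reason to keep a version_tuple next to the version string, as the namedtuple in the post above does: tuples of integers compare numerically, while plain strings compare character by character. A small sketch with made-up version numbers (to_tuple is a hypothetical helper, not part of the posted code):

```python
def to_tuple(version: str) -> tuple:
    """Turn "1.16.5" into (1, 16, 5) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


installed = "1.9.10"  # made-up example values
latest = "1.16.5"

# String comparison goes character by character, so "9" > "1" decides it
print(installed < latest)  # False
# Integer tuples compare element by element: 9 < 16
print(to_tuple(installed) < to_tuple(latest))  # True
```

This only covers purely numeric dotted versions. Versions with suffixes such as "beta" would need a more complete parser.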