Get latest version off website and save it as variable [SOLVED]

AlphaInc · (This post was last modified: May-11-2022, 01:18 PM by AlphaInc.)

Hello everybody,

I'm trying to get the latest version number from a website (not the download itself, just the version).
For example, this is the download site "https://gpac.wp.imt.fr/downloads/", and when I manually click on Windows 64 bits it downloads version 1.0.1.

Is there a way to detect this information and save it as a variable in a python script?

So far I have used this, but it has two flaws: (1) it's not entirely Python, and (2) it only gives me the entire download link:

import os

os.system("curl -s https://gpac.wp.imt.fr/downloads/ | grep x64")

I could get the information for this particular software from somewhere else, but there are a few sites whose release version I would like to grab.

***snippsat*** · (This post was last modified: Nov-14-2021, 10:55 AM by snippsat.)

A more usual way is to web-scrape the info you want.
Example:

import requests
from bs4 import BeautifulSoup

url = 'https://gpac.wp.imt.fr/downloads/'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
print(soup.select_one('#post-147 > div > p:nth-child(3)').text)
# Just version
version = soup.select_one('#post-147 > div > p:nth-child(3) > strong').text
print(version)

Output:
The current GPAC release is 1.0.1 (released in September 2020).
1.0.1

The selector #post-147 > div > p:nth-child(3) can be copied straight from the browser: right-click the element in the inspector and use the copy-selector option.
Then in BeautifulSoup, pass that CSS selector to select or select_one.
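
One thing to watch with a copied selector: if the page layout changes, select_one returns None and calling .text on it raises AttributeError. Below is a minimal sketch of a guard against that. The HTML string is made up to stand in for the real page, the helper name extract_text is hypothetical, and the built-in html.parser is used here only so the sketch runs without lxml.

```python
from bs4 import BeautifulSoup

# Made-up HTML that only imitates the structure of the real download page.
SAMPLE_HTML = """
<div id="post-147"><div>
<p>Intro paragraph</p>
<p>Second paragraph</p>
<p>The current GPAC release is <strong>1.0.1</strong> (released in September 2020).</p>
</div></div>
"""


def extract_text(html, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.select_one(selector)
    if tag is None:
        # The selector no longer fits the page, so report that instead of crashing.
        return None
    return tag.get_text(strip=True)


version = extract_text(SAMPLE_HTML, "#post-147 > div > p:nth-child(3) > strong")
missing = extract_text(SAMPLE_HTML, "#post-999 > strong")
print(version)
print(missing)
```

With the real site you would pass response.content instead of SAMPLE_HTML and check for None before using the result.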

AlphaInc · (This post was last modified: Nov-14-2021, 08:35 PM by AlphaInc.)

(Nov-14-2021, 10:55 AM)snippsat Wrote: A more normal way is to web-scrape the info you want.

Okay, yeah, that works, thanks.
I haven't understood how I can do it for other sites, though. As another example, how do I get the information for this site: https://www.makemkv.com/download/
Do I need to find out which CSS part I'm looking for using a browser?

Edit: Alright, I got it (sorry for getting it so late). My selector is "#content > ul:nth-child(3) > li > a" and it prints "MakeMKV 1.16.5 for Windows". How do I cut it down to show only the version? I didn't understand that part of your example.

***snippsat*** · Nov-14-2021, 08:37 PM

(Nov-14-2021, 08:02 PM)AlphaInc Wrote: I haven't understood how I can do it for other sites tho. For another example, how do I get the information for this site: https://www.makemkv.com/download/
Do I need to find out what CSS part I'm looking for using a browser?

It works the same way. For some practice with web-scraping, have a look at Web-Scraping part-1.
Here is an example using two ways.

import requests
from bs4 import BeautifulSoup

url = 'https://www.makemkv.com/download/'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
li_ver = soup.find_all('li')[5]
print(li_ver.text)
print(soup.select_one('#content > li:nth-child(6)').text)

Output:
MakeMKV v1.16.5 (1.11.2021 )
MakeMKV v1.16.5 (1.11.2021 )
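
The list item text still carries the product name and the date. A rough sketch of how it could be split into a version and a release date with a regular expression follows. The pattern and the helper name parse_release_line are assumptions based only on the one output line shown here, so other sites will likely need a different pattern.

```python
import re

# Assumes text shaped like "MakeMKV v1.16.5 (1.11.2021 )":
# a "v", a dotted version number, then a dotted date in parentheses.
RELEASE_RE = re.compile(r"v(\d+(?:\.\d+)+)\s*\(\s*([\d.]+)\s*\)")


def parse_release_line(text):
    """Return (version, date) as strings, or None if the text does not fit."""
    match = RELEASE_RE.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


result = parse_release_line("MakeMKV v1.16.5 (1.11.2021 )")
print(result)
```

In the script above this would be called as parse_release_line(li_ver.text).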

AlphaInc · Nov-14-2021, 08:55 PM

(Nov-14-2021, 08:37 PM)snippsat Wrote: Here is an example using two ways.

Yeah, sorry, it took me a second to get what you meant. I have it like this:

import requests
from bs4 import BeautifulSoup

url = 'https://www.makemkv.com/download/'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
version = soup.select_one('#content > ul:nth-child(3) > li > a').text
print(version)

It gets the output:

MakeMKV 1.16.5 for Windows

Is there a way to get only 1.16.5 (without the other text), or is this the best I can get?

DeaD_EyE · Nov-14-2021, 09:00 PM

import re
from collections import namedtuple

import bs4
import requests

MAKEMKV_BASE = "https://www.makemkv.com/download/"
VERSION_REG = re.compile(r"(\d+\.\d+\.\d+)")


def parse_version(file_name: str) -> str:
    if match := VERSION_REG.search(file_name):
        return match.group(1)
    else:
        return ""


MakeMKV = namedtuple("makemkv", "url version version_tuple os")


def get_makemkv():
    # Fetch the download page and yield one MakeMKV record per download link.
    content = requests.get(MAKEMKV_BASE).content
    doc = bs4.BeautifulSoup(content, "lxml")
    # a[href] restricts the match to anchors that actually carry a link.
    selector = "div#content > ul.bullets > li > a[href]"
    for element in doc.select(selector):
        href = element["href"]
        if href.endswith(".txt"):
            continue

        version_str = parse_version(href)
        if not version_str:
            # No version number in the link; int("") would raise ValueError.
            continue
        version_tuple = tuple(map(int, version_str.split(".")))
        name = element.text.lower()
        if "windows" in name:
            os_type = "windows"
        elif "mac os x" in name:
            os_type = "macos"
        else:
            os_type = "unknown"

        yield MakeMKV(href, version_str, version_tuple, os_type)


for result in get_makemkv():
    print(result.os, result.version, result.url)

The Firefox inspector helps a lot in finding the elements.
I used that information to build the selector.
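
The same regex idea also works directly on the link text "MakeMKV 1.16.5 for Windows", and the tuple of ints is what makes versions comparable. A small sketch, where the helper name version_tuple and the "installed" string are made up for illustration:

```python
import re

# Matches dotted numbers such as 1.16.5 anywhere in the text.
VERSION_RE = re.compile(r"\d+(?:\.\d+)+")


def version_tuple(text):
    """Return the first dotted version in text as a tuple of ints, or ()."""
    match = VERSION_RE.search(text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


installed = version_tuple("MakeMKV 1.9.0 for Windows")  # made-up older version
latest = version_tuple("MakeMKV 1.16.5 for Windows")

# Tuples compare number by number, so 16 > 9 is handled correctly.
update_available = latest > installed
# Plain strings compare character by character, which gets this case wrong.
string_compare = "1.16.5" > "1.9.0"

print(latest, update_available, string_compare)
```

Comparing the strings directly would claim that 1.9.0 is newer than 1.16.5, which is why converting to a tuple first is worth the extra line.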
