Python Forum
Get latest version off website and save it as variable [SOLVED]
#1
Hello everybody,

I am trying to get the latest version number from a website (not the download itself, just the version).
For example, this is the download page, "https://gpac.wp.imt.fr/downloads/", and when I manually click on Windows 64 bits, it downloads version 1.0.1.

Is there a way to detect this information and save it as a variable in a Python script?

So far I have used this, but it has two flaws: (1) it's not entirely Python, and (2) it only gives me the entire download link:

import os

# Run curl through the shell and keep only the lines that mention x64
os.system("curl -s https://gpac.wp.imt.fr/downloads/ | grep x64")
I could get the information for this particular software from somewhere else, but there are a few sites from which I would like to grab the release version.
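A pure-Python stand-in for the curl and grep pipeline could look roughly like the sketch below. It uses only the standard library. The helper names `fetch_html` and `grep_lines` are made up for illustration, and the sample HTML is invented rather than copied from the real page.

```python
import urllib.request


def fetch_html(url, timeout=10.0):
    """Download a page and return it as text (rough equivalent of `curl -s`)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def grep_lines(text, needle):
    """Return the lines of `text` that contain `needle` (rough equivalent of `grep`)."""
    return [line.strip() for line in text.splitlines() if needle in line]


# Invented sample markup, standing in for the real download page.
sample = """<ul>
<li><a href="https://example.com/tool-1.0.1-x64.exe">Windows 64 bits</a></li>
<li><a href="https://example.com/tool-1.0.1-win32.exe">Windows 32 bits</a></li>
</ul>"""

matches = grep_lines(sample, "x64")
print(matches)

# Against the live site (needs network access):
# matches = grep_lines(fetch_html("https://gpac.wp.imt.fr/downloads/"), "x64")
```

Unlike `os.system`, which only prints the result, this keeps the matching lines in a variable. It still returns whole lines, so the version would have to be cut out in a further step.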
#2
A more normal way is to web-scrape the info you want.
Example.
import requests
from bs4 import BeautifulSoup

url = 'https://gpac.wp.imt.fr/downloads/'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
print(soup.select_one('#post-147 > div > p:nth-child(3)').text)
# Just version
version = soup.select_one('#post-147 > div > p:nth-child(3) > strong').text
print(version)
Output:
The current GPAC release is 1.0.1 (released in September 2020).
1.0.1
The selector #post-147 > div > p:nth-child(3) can simply be copied from the browser: right-click the element in the inspector and copy its selector.
Then in BeautifulSoup, use it with the CSS selector methods select or select_one.
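Selectors copied from the browser are tied to the page's current layout, and `select_one` returns `None` when nothing matches, so a small guard may be worth adding. The sketch below runs against an invented HTML snippet (not the real GPAC page) and uses the built-in `html.parser` so that it does not depend on lxml. `text_or_none` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Invented markup that mimics the structure the copied selector expects.
html = """
<article id="post-147">
  <div>
    <h1>Downloads</h1>
    <p>Intro text.</p>
    <p>The current release is <strong>1.0.1</strong> (released in September 2020).</p>
  </div>
</article>
"""


def text_or_none(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    element = soup.select_one(selector)
    return element.get_text(strip=True) if element is not None else None


soup = BeautifulSoup(html, "html.parser")
version = text_or_none(soup, "#post-147 > div > p:nth-child(3) > strong")
missing = text_or_none(soup, "#post-999 > div > p")
print(version)
print(missing)
```

Calling `.text` directly on a failed `select_one` would raise an `AttributeError`, which is the usual symptom when a site changes its layout.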
#3
(Nov-14-2021, 10:55 AM)snippsat Wrote: A more normal way is to web-scrape the info you want.

Okay, yeah, that works, thanks.
I haven't understood how I can do it for other sites, though. As another example, how do I get the information for this site: https://www.makemkv.com/download/
Do I need to find out which CSS selector I'm looking for using a browser?

Edit: Alright, I got it (sorry for getting it so late). My selector is "#content > ul:nth-child(3) > li > a" and it prints "MakeMKV 1.16.5 for Windows". How do I cut it down to show only the version? I didn't understand that in your part.
#4
(Nov-14-2021, 08:02 PM)AlphaInc Wrote: I haven't understood how I can do it for other sites, though. As another example, how do I get the information for this site: https://www.makemkv.com/download/
Do I need to find out which CSS selector I'm looking for using a browser?
It works the same way. For some practice in web scraping, have a look at Web-Scraping part-1.
Here is an example using two ways.
import requests
from bs4 import BeautifulSoup

url = 'https://www.makemkv.com/download/'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
li_ver = soup.find_all('li')[5]
print(li_ver.text)
print(soup.select_one('#content > li:nth-child(6)').text)
Output:
MakeMKV v1.16.5 (1.11.2021 )
MakeMKV v1.16.5 (1.11.2021 )
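To keep only the version number from a line like the one printed above, a regular expression is one option. This is a minimal sketch: the pattern assumes a three-part dotted version, and `extract_version` is a hypothetical helper name.

```python
import re

# Assumes versions look like "1.16.5": three dot-separated integers.
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def extract_version(text):
    """Return the first three-part version found in `text`, or None."""
    match = VERSION_PATTERN.search(text)
    return match.group(0) if match else None


print(extract_version("MakeMKV v1.16.5 (1.11.2021 )"))
print(extract_version("MakeMKV 1.16.5 for Windows"))
print(extract_version("no version here"))
```

Note that the date 1.11.2021 also fits the pattern. The sketch only works here because `search` returns the first match and the version comes before the date.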
#5
(Nov-14-2021, 08:37 PM)snippsat Wrote: Here is an example using two ways.

Yeah, sorry, it took me a second to get what you meant. I have it like this:

import requests
from bs4 import BeautifulSoup

url = 'https://www.makemkv.com/download/'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
version = soup.select_one('#content > ul:nth-child(3) > li > a').text
print(version)
It gets the output:
MakeMKV 1.16.5 for Windows
Is there a way to get only 1.16.5 (without the rest of the string), or is this the best I can get?
#6
import re
from collections import namedtuple

import bs4
import requests

MAKEMKV_BASE = "https://www.makemkv.com/download/"
VERSION_REG = re.compile(r"(\d+\.\d+\.\d+)")


def parse_version(file_name: str) -> str:
    if match := VERSION_REG.search(file_name):
        return match.group(1)
    else:
        return ""


MakeMKV = namedtuple("makemkv", "url version version_tuple os")


def get_makemkv():
    content = requests.get(MAKEMKV_BASE).content
    doc = bs4.BeautifulSoup(content, "lxml")
    # a[href] matches only links that actually carry an href attribute
    selector = "div#content > ul.bullets > li > a[href]"
    for element in doc.select(selector):
        href = element["href"]
        # Skip links to .txt files
        if href.endswith(".txt"):
            continue

        version_str = parse_version(href)
        if not version_str:
            # No version in the link, so there is nothing to convert to integers
            continue
        version_tuple = tuple(map(int, version_str.split(".")))
        name = element.text.lower()
        if "windows" in name:
            os_type = "windows"
        elif "mac os x" in name:
            os_type = "macos"
        else:
            os_type = "unknown"

        yield MakeMKV(href, version_str, version_tuple, os_type)


for result in get_makemkv():
    print(result.os, result.version, result.url)
The inspector in Firefox helps a lot in finding the elements.
I used this information to build the selector.
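A tuple of integers is handy for the version because such tuples compare numerically, while version strings compare character by character. Here is a short sketch of an update check. The `installed` and `latest` values are hypothetical, and `to_tuple` is just an illustrative helper.

```python
def to_tuple(version):
    """Convert a dotted version string such as "1.16.5" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


installed = "1.16.5"   # hypothetical locally installed version
latest = "1.16.10"     # hypothetical version scraped from the site

# String comparison goes character by character, so "1.16.10" sorts before "1.16.5".
print(latest > installed)                      # False
# Integer tuples compare element by element: (1, 16, 10) > (1, 16, 5).
print(to_tuple(latest) > to_tuple(installed))  # True
```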