Python Forum

Full Version: get any domain's Alexa rank via command line
alexa_check GitHub Repository


The cool thing, though, is that it scrapes country ranks that are not visible on the alexa.com/siteinfo/[domain-name] page, as shown here:

[Image: output_screenshot.png]
Thanks for sharing your project. Since the project's code is contained within a single file, you are welcome to post it in the thread, along with basic usage instructions, if you wish.
OK, thank you! I just fixed the "yield" part... it's a little less code now:

#! /usr/bin/python3
# check a domain's Alexa rank via command line
# pass domain as argument with -d option
# usage:
# python3 alexa_check.py -d example.com
import requests
import argparse
import re


class AlexaCheck:
    def __init__(self, domain):
        self.url = "https://www.alexa.com/siteinfo/" + domain

    def __search_regex(self, regex, phrase):
        match = re.search(regex, phrase)
        if match:
            return match.group(1)

    def check(self):
        # a timeout keeps the script from hanging on a dead connection
        r = requests.get(self.url, timeout=10)
        r.raise_for_status()
        page = r.text.splitlines()
        found_global = False
        for line in page:
            if "</strong>" in line and any(char.isdigit() for char in line):
                perspective = line.replace(',', '')
                found = self.__search_regex(r"(\d+)\s+<", perspective)
                if found and found[:1] != "0":
                    if not found_global:
                        found_global = True
                        yield ("global_rank", found)
            if "Flag" in line and "nbsp" in line:
                perspective = line.replace(',', '')
                country = self.__search_regex(r".*\w+;(.*)</a>", perspective)
                country_rank = self.__search_regex(r".*>(\d+)<", perspective)
                # skip lines where either regex failed to match
                if country and country_rank:
                    yield (country, country_rank)


if __name__ == "__main__":
    message = 'domain to look up, e.g. example.com'
    parser = argparse.ArgumentParser(description="check a domain's Alexa rank")
    parser.add_argument('-d', '--domain', required=True, help=message)
    domain = vars(parser.parse_args())
    for rank_tuple in AlexaCheck(domain['domain']).check():
        print(rank_tuple)
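
For anyone who wants to try the parsing rules without hitting the network, here is a minimal sketch that applies the same regexes to a couple of sample lines. The HTML in SAMPLE is made up for illustration and is only an assumption about what the page lines look like; the real Alexa markup may differ. The helper names search_regex and parse_lines are hypothetical stand-ins for the class methods above.

```python
import re


def search_regex(regex, phrase):
    """Return the first capture group, or None if there is no match."""
    match = re.search(regex, phrase)
    return match.group(1) if match else None


def parse_lines(lines):
    """Yield (label, rank) tuples using the same rules as AlexaCheck.check()."""
    found_global = False
    for line in lines:
        cleaned = line.replace(",", "")  # drop thousands separators
        if "</strong>" in line and any(c.isdigit() for c in line):
            found = search_regex(r"(\d+)\s+<", cleaned)
            # only the first non-zero number counts as the global rank
            if found and not found.startswith("0") and not found_global:
                found_global = True
                yield ("global_rank", found)
        if "Flag" in line and "nbsp" in line:
            country = search_regex(r".*\w+;(.*)</a>", cleaned)
            rank = search_regex(r".*>(\d+)<", cleaned)
            if country and rank:
                yield (country, rank)


# Hypothetical markup, invented for illustration only
SAMPLE = [
    '<strong class="metrics-data">1,234 </strong>',
    '<a href="/topsites/countries/US"><img class="Flag">'
    '&nbsp;United States</a></td><td>567</td>',
]

results = list(parse_lines(SAMPLE))
print(results)
```

Keeping the parsing in a plain function that takes a list of lines makes it easy to check the regexes against saved page source before running the full script.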