Python Forum
Git clone all of a Github user's public repositories (download all repositories)
#11
Axel, your gists are awesome.

Originally I was scraping for a line containing the phrase "repository_nwo" to identify the lines with repo names. For some reason that line doesn't show up in the HTML of everyone's repositories page. I thought it might have to do with being logged in vs. logged out... but no. So now it just looks at each link in the HTML. Returning the results as a set may not be necessary, but I left it in just in case. My apologies again.

Also, I might add a check in the future so that it doesn't attempt to download forked repositories.
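
For readers following along, here is a minimal sketch of the "look at each link and return a set" idea described above. It is not the script's actual code: the function name repo_names, the username exampleuser, and the sample markup are all made up for illustration, and GitHub's real page layout may differ.

```python
import re


def repo_names(html, user):
    """Return the set of repository names linked as /<user>/<name>."""
    # re.escape guards against regex metacharacters in the username.
    # The capture stops at a quote, slash, '?' or '#', so deeper links
    # such as /<user>/<name>/stargazers are not mistaken for repos.
    pattern = r'<a\s[^>]*href="/%s/([^"/?#]+)"' % re.escape(user)
    return set(re.findall(pattern, html, flags=re.IGNORECASE))


# Hypothetical sample markup, not a copy of GitHub's real HTML
sample = '''
<a href="/exampleuser/repo_one" itemprop="name codeRepository">repo_one</a>
<a href="/exampleuser/repo_one">repo_one</a>
<a class="x" href="/exampleuser/repo_two">repo_two</a>
<a href="/exampleuser/repo_two/stargazers">stars</a>
<a href="/someoneelse/other">other</a>
'''

print(sorted(repo_names(sample, "exampleuser")))
```

The set removes the duplicate link to repo_one, which is the reason a set can be handy when the same repository is linked more than once on a page.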
#12
Thanks, it works now.
#13
(Jul-04-2019, 06:15 PM)rootVIII Wrote: Yup it's the regex... should have a fix in a little bit... sorry for that
ugh... those "unanticipated requirements", again.
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#14
Hahaha, I know. Sometimes I get too excited and post the code too early. Anyway, this one will only attempt to download the user's own repos and not forks. I changed the URL to display only source repos. It might be fun to get the forked ones too, though.

# rootVIII
# Download/clone all of a user's public source repositories
# Pass the Github user's username with the -u option
# Usage: python git_clones.py -u <github username>
# Example: python git_clones.py -u rootVIII
#
from argparse import ArgumentParser
from sys import exit, version_info
from re import findall
from subprocess import call
try:
    from urllib.request import urlopen
except ImportError:
    from urllib2 import Request, urlopen


class GitClones:
    def __init__(self, user):
        self.url = "https://github.com/%s" % user
        self.url += "?&tab=repositories&q=&type=source"
        self.git_clone = "git clone https://github.com/%s/" % user
        self.git_clone += "%s.git"
        self.user = user

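    def http_get(self):
        # Python 3: urlopen() accepts the URL directly and returns bytes
        if version_info[0] != 2:
            response = urlopen(self.url)
            return response.read().decode('utf-8')
        # Python 2: wrap the URL in a urllib2 Request first
        request = Request(self.url)
        response = urlopen(request)
        return response.read()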

    def get_repo_data(self):
        try:
            response = self.http_get()
        except Exception:
            print("Unable to make request to %s's Github page" % self.user)
            exit(1)
        else:
            pattern = r"<a\s?href\W+%s/(.*)\"\s+" % self.user
            for line in findall(pattern, response):
                yield line.split('\"')[0]

    def get_repositories(self):
        return list(self.get_repo_data())

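    def download(self, git_repos):
        for git in git_repos:
            cmd = self.git_clone % git
            try:
                # call() returns git's exit status; it only raises if
                # the git executable itself cannot be started
                status = call(cmd.split())
            except Exception as e:
                print(e)
                status = 1
            if status != 0:
                print('unable to download: %s\n' % git)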


if __name__ == "__main__":
    message = 'Usage: python git_clones.py -u <github username>'
    h = 'Github Username'
    parser = ArgumentParser(description=message)
    parser.add_argument('-u', '--user', required=True, help=h)
    d = parser.parse_args()
    clones = GitClones(d.user)
    repositories = clones.get_repositories()
    clones.download(repositories)
#15
Just to mention: given that there is an API, it's better to make use of it. It would be better on so many counts...
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

#16
Nice to know there is an API. They don't do much to let people know it's there.
#17
Yup, I had no idea.

I can also guarantee that it was more fun writing the code than making API calls :)
#18
(Jul-11-2019, 05:27 AM)rootVIII Wrote: I can also guarantee that it was more fun writing the code than making API calls :)
You can write code that uses API calls, or even write a Python wrapper (although there are many available).
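
As a rough illustration of the API route, here is a minimal sketch, not a replacement for the script in post #14. It assumes GitHub's public REST endpoint for listing a user's repositories and the "fork" and "clone_url" fields of each returned item. The names API_URL, source_clone_urls, and fetch_repos, the username exampleuser, and the sample data are made up for this example. Unauthenticated requests are rate limited, so treat it as a starting point only.

```python
import json

try:
    from urllib.request import urlopen
except ImportError:  # Python 2
    from urllib2 import urlopen

# Assumed endpoint: lists a user's public repositories, 100 per page
API_URL = "https://api.github.com/users/%s/repos?per_page=100&page=%d"


def source_clone_urls(repos):
    """Pick the clone URLs of non-fork repos from decoded API JSON."""
    return [r["clone_url"] for r in repos if not r.get("fork")]


def fetch_repos(user):
    """Fetch every page of a user's public repositories (network call)."""
    repos, page = [], 1
    while True:
        raw = urlopen(API_URL % (user, page)).read()
        data = json.loads(raw.decode("utf-8"))
        if not data:  # an empty page means there is nothing left
            return repos
        repos.extend(data)
        page += 1


# Hypothetical items shaped like the API's response, for an offline demo
sample = [
    {"name": "repo_one", "fork": False,
     "clone_url": "https://github.com/exampleuser/repo_one.git"},
    {"name": "forked_repo", "fork": True,
     "clone_url": "https://github.com/exampleuser/forked_repo.git"},
]
print(source_clone_urls(sample))
```

With network access, source_clone_urls(fetch_repos("some_username")) would give the list of URLs to hand to git clone, and the fork check happens on structured data instead of scraped HTML.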
#19
I am now wondering why not:
        self.git_clone = "git clone https://github.com/%s/%%s" % user
instead of lines 21-22 in post #14.
#20
I wanted to leave that last %s for the loop/filling in the repository name. But to be honest I didn't know that format specifier existed. Pretty neat.
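
To make the %% trick concrete, here is a small sketch of the two-step formatting. The username exampleuser and the repository name repo_one are placeholders, and the ".git" suffix is kept from post #14 (the one-liner in post #19 leaves it off).

```python
user = "exampleuser"  # hypothetical username

# First pass: %s is filled with the user, and %% collapses to a
# literal %, which leaves a fresh %s behind in the result
template = "git clone https://github.com/%s/%%s.git" % user
print(template)

# Second pass: the surviving %s takes the repository name
cmd = template % "repo_one"
print(cmd.split())
```

So a single line can still leave the last %s open for the loop that fills in each repository name.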