Posts: 22
Threads: 9
Joined: Oct 2016
I need to loop through pull requests in GitHub to determine if any new art files are submitted to the correct folder based on the markdown file(s) in the pull request.
For instance, in the code below, if I get back a list of filenames such as:
Output:
folder1/folder2/test.md
folder1/folder2/media/test/art.png
folder1/folder2/hello.md
The art.png file is submitted to the right folder because /media/test/art.png has the folder name test, which matches test in the test.md filename.
I know I need to use an any statement when searching for both the art files and the markdown files and then split the filenames for comparison, but I can't get the syntax quite right.
image_ext = [".png", ".jpg", ".jpeg", ".gif"]
new_art_wrong_folder = False
for file_information in repo.pull_request(pr.number).files():
    fname, extname = os.path.splitext(file_information.filename)
    if extname.lower() in (image.lower() for image in image_ext) and file_information.status == 'added':
        new_art_wrong_folder = True
Posts: 3,458
Threads: 101
Joined: Sep 2016
I don't know if it's intended, but "(image.lower() for image in image_ext)" will always give you the same result as just "image_ext".
>>> image_ext = ['.png', '.gif', '.jpg']
>>> [image.lower() for image in image_ext]
['.png', '.gif', '.jpg']
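A small check makes the point concrete. It relies on every entry of image_ext already being lowercase, as in the original list; the mixed-case list is a hypothetical counter-example that does not appear in the thread.

```python
image_ext = [".png", ".jpg", ".jpeg", ".gif"]

# Lowercasing a list whose entries are already lowercase changes nothing.
lowered = [ext.lower() for ext in image_ext]
same_when_lowercase = lowered == image_ext

# Hypothetical mixed-case list: here the extra lower() call would matter.
mixed_ext = [".PNG", ".jpg"]
same_when_mixed = [ext.lower() for ext in mixed_ext] == mixed_ext

print(same_when_lowercase, same_when_mixed)
```

So the extra lower() is only redundant as long as the extension list itself stays lowercase.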
Posts: 22
Threads: 9
Joined: Oct 2016
Yes, this is intended, as the check splits the extension from the filename and tests whether it's in "image_ext". That's the reason for these lines:
fname, extname = os.path.splitext(file_information.filename)
if extname.lower() in (image.lower() for image in image_ext)
Posts: 3,458
Threads: 101
Joined: Sep 2016
Ok, but...
# this
if extname.lower() in (image.lower() for image in image_ext)
# is completely identical to this:
if extname.lower() in image_ext
That's beside the point, it's just a minor thing that jumped out at me.
...using your example file structure, what does repo.pull_request(pr.number).files() return? Is it just the base directory, or is it a recursive list of everything in the repo?
Posts: 22
Threads: 9
Joined: Oct 2016
(Oct-19-2016, 09:34 PM) nilamo wrote:
Ok, but...
# this
if extname.lower() in (image.lower() for image in image_ext)
# is completely identical to this:
if extname.lower() in image_ext
That's beside the point, it's just a minor thing that jumped out at me.
...using your example file structure, what does repo.pull_request(pr.number).files() return? Is it just the base directory, or is it a recursive list of everything in the repo?
Ah, now I see what you were asking. I was trying to take everything to lower case, but with image_ext already being lower case, I only need to apply lower() on the extname.
repo.pull_request(pr.number).files() returns the filenames associated with an individual pull request (as in the output of my original post).
I need to make sure that any new art (status = added) is submitted to a path that has the name of one of the markdown files in the pull request after /media/name_of_a_markdown_file_in_pull_request/art.png.
So in my example, the path for art.png ends in /media/test/art.png. That should leave new_art_wrong_folder = False, because the pull request contains the markdown file folder1/folder2/test.md and test appears directly after media in the art filename.
I hope this makes sense.
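The rule described above can be sketched as a small standalone function. This is only an illustration under assumptions: the helper name find_misplaced_art is hypothetical, the input is a plain list of repo-relative paths with forward slashes rather than the objects returned by the GitHub library, the check for status == 'added' is left out, and an image that is not under a media folder at all is treated as misplaced.

```python
import posixpath

IMAGE_EXT = {".png", ".jpg", ".jpeg", ".gif"}


def find_misplaced_art(filenames):
    """Return image paths whose folder after 'media' matches no markdown file.

    Hypothetical helper: `filenames` is a list of repo-relative paths.
    """
    # Base names of the markdown files, without extension ('test', 'hello').
    md_names = {
        posixpath.splitext(posixpath.basename(name))[0]
        for name in filenames
        if name.lower().endswith(".md")
    }

    misplaced = []
    for name in filenames:
        ext = posixpath.splitext(name)[1].lower()
        if ext not in IMAGE_EXT:
            continue
        parts = name.split("/")
        folder = None
        # Only directory components count, so the filename itself is excluded.
        if "media" in parts[:-1]:
            idx = parts.index("media")
            if idx + 1 < len(parts) - 1:
                folder = parts[idx + 1]
        # The folder directly after 'media' must be named like a markdown file.
        if folder not in md_names:
            misplaced.append(name)
    return misplaced


files = [
    "folder1/folder2/test.md",
    "folder1/folder2/media/test/art.png",
    "folder1/folder2/hello.md",
]
new_art_wrong_folder = bool(find_misplaced_art(files))
print(new_art_wrong_folder)
```

With the three example filenames nothing is reported as misplaced, so new_art_wrong_folder stays False.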
Posts: 3,458
Threads: 101
Joined: Sep 2016
fname, extname = os.path.splitext(file_information.filename)
That gives you the filename/path and the extension, so you know if it's an image or a markdown file, and you need to parse the path part to make sure it's in the right place, correct? I think the first step would be getting the path of each file, since that's the part you're validating.
So, roughly something like this?
import os

# in real life,
# files = [f.filename for f in repo.pull_request(pr.number).files()]
files = [
    'folder1/folder2/test.md',
    'folder1/folder2/media/test/art.png',
    'folder1/folder2/hello.md'
]

# markdown files that control where things are saved
control_files = []
# the actual images (keys are the path they're stored in)
asset_files = {}

image_ext = [".png", ".jpg", ".jpeg", ".gif"]
for file_information in files:
    fname, ext = os.path.splitext(file_information)
    # separate the filename from the directory the file's in
    path, control_file = os.path.split(fname)
    # separate the stored directory from the repo-base dir
    _, dir = os.path.split(path)

    if ext.lower() == '.md':
        control_files.append(control_file)
    elif ext.lower() in image_ext:
        if dir not in asset_files:
            asset_files[dir] = []
        asset_files[dir].append(fname)

# now that we know all the new control files and assets,
# we can check to make sure they match
for dir, assets in asset_files.items():
    if dir not in control_files:
        print("Control file for '{0}' does not exist!".format(dir))

for fname in control_files:
    if fname not in asset_files:
        print("Control file found, but no assets added for '{0}'".format(fname))
Output:
>python test.py
Control file found, but no assets added for 'hello'
Posts: 22
Threads: 9
Joined: Oct 2016
Thanks for the reply. I came up with the code below that's working.
image_ext = [".png", ".jpg", ".jpeg", ".gif"]
new_art_wrong_folder = False
art_lst = []
md_lst = []
for file_information in repo.pull_request(pr.number).files():
    fname, extname = os.path.splitext(file_information.filename)
    if extname.lower() in image_ext and file_information.status == 'added':
        art_dir = fname.split('media')[1].split('/')[1]
        art_lst.append(art_dir)
    if extname.lower() == '.md':
        base_md = os.path.basename(file_information.filename)
        md_filename, md_file_ext = os.path.splitext(base_md)
        md_lst.append(md_filename)
if lambda art_lst, md_lst: bool(set(art_path_lst).intersection(md_file_lst)):
    pass
else:
    new_art_wrong_folder = True
Posts: 3,458
Threads: 101
Joined: Sep 2016
Is it working? The if at the bottom will always be true, regardless of what's in the repo. You create a function with lambda, but never call it, and a function object is always truthy.
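A minimal sketch of the difference, using made-up list contents (the values in art_lst and md_lst are hypothetical). It also shows one possible replacement that needs no lambda at all; the subset test is a suggestion, not the only way to write the check.

```python
art_lst = ["other"]          # hypothetical folder names taken from the art paths
md_lst = ["test", "hello"]   # hypothetical markdown base names

# A lambda that is never called is just a function object, and that is truthy.
check = lambda a, b: bool(set(a).intersection(b))
uncalled_is_truthy = bool(check)

# Calling it gives the real answer: no folder name matches here.
called_result = check(art_lst, md_lst)

# No lambda is needed. A subset test also catches the case where only
# some of the art folders match a markdown file.
new_art_wrong_folder = not set(art_lst).issubset(md_lst)

print(uncalled_is_truthy, called_result, new_art_wrong_folder)
```

Note that the subset test treats a pull request with no new art as fine, whereas an intersection test would come back empty in that case.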
Posts: 22
Threads: 9
Joined: Oct 2016
I've submitted several pull requests to the repo and everything comes back as expected. Is there another way I should use lambda to check if the elements in the art_path_lst match an element in the md_file_lst?
Posts: 12,034
Threads: 486
Joined: Sep 2016
Oct-21-2016, 07:06 PM
(This post was last modified: Oct-21-2016, 07:09 PM by Larz60+.)
Hello - you can use the built-in filecmp module.
See https://pymotw.com/3/filecmp/ for an example.
I'm not sure it will work out of the box for remote directories.
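For reference, a minimal sketch of what filecmp does. It compares the contents of files on the local disk, so it would only apply once the pull request files have been downloaded, and it does not look at folder names; the temporary files below are made up for the illustration.

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "a.txt")
    second = os.path.join(tmp, "b.txt")
    third = os.path.join(tmp, "c.txt")
    for path, text in ((first, "same"), (second, "same"), (third, "different")):
        with open(path, "w") as handle:
            handle.write(text)

    # shallow=False forces a comparison of the file contents,
    # not just of the os.stat() signatures.
    identical = filecmp.cmp(first, second, shallow=False)
    differs = filecmp.cmp(first, third, shallow=False)

print(identical, differs)
```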