Python Forum
Get the filename from a path
#5
pathlib does not handle URLs correctly. You can deconstruct a URL with urllib.parse.urlparse and reconstruct it with urllib.parse.urlunparse. pathlib can, however, handle the ParseResult.path returned by urlparse().
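A minimal sketch of that round trip, assuming a made-up example URL with a query string and a hypothetical replacement name "two.file.pdf". It also shows what goes wrong when a whole URL is passed to pathlib: the double slash after the scheme is collapsed.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse, urlunparse

# Hypothetical example URL (not from the original post)
url = "http://www.456.com/hello/one.file.pdf?page=2"

# Passing the whole URL to pathlib collapses the "//" after the scheme
mangled = str(PurePosixPath(url))
print(mangled)  # http:/www.456.com/hello/one.file.pdf?page=2

# Deconstruct the URL and hand only the path component to pathlib
parts = urlparse(url)
path = PurePosixPath(parts.path)
print(path.name)  # one.file.pdf

# Change the file name, then reconstruct the URL from the modified parts
new_parts = parts._replace(path=str(path.with_name("two.file.pdf")))
rebuilt = urlunparse(new_parts)
print(rebuilt)  # http://www.456.com/hello/two.file.pdf?page=2
```

PurePosixPath is used here instead of Path so the result is the same on Windows, where Path would use backslashes.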

Helper functions without pathlib and urllib:
def get_file(url):
    return url.rpartition("/")[2]

def is_pdf(file):
    # endswith() avoids matching a bare name such as "pdf" that has no dot
    return file.endswith(".pdf")


paths = ('http://www.123.com/file.pdf', 'http://www.123.com/pdfhello',
         'http://www.456.com/hello/one.file.pdf', 'http://www.123.com',
         'http://www.456.com/hello/one.file.pdf')

for url in paths:
    file = get_file(url)
    if is_pdf(file):
        print(file)
The same, but preserving the absolute path, with urllib and pathlib:
from pathlib import Path
from urllib.parse import urlparse


def convert_to_paths(urls):
    # Yield only the URL paths that end in ".pdf"
    for url in urls:
        path = Path(urlparse(url).path)
        if path.suffix == ".pdf":
            yield path

urls = ('http://www.123.com/file.pdf', 'http://www.123.com/pdfhello',
        'http://www.456.com/hello/one.file.pdf', 'http://www.123.com',
        'http://www.456.com/hello/one.file.pdf')

paths = list(convert_to_paths(urls))
paths_filenames = [file.name for file in paths]
paths_filenames_stem = [file.stem for file in paths]

print(paths, paths_filenames, paths_filenames_stem, sep="\n\n")
Output:
[PosixPath('/file.pdf'), PosixPath('/hello/one.file.pdf'), PosixPath('/hello/one.file.pdf')]

['file.pdf', 'one.file.pdf', 'one.file.pdf']

['file', 'one.file', 'one.file']
Depending on what you want to do with the data later, you can decide whether or not to use Path and urlparse.
The function urlparse() can also handle relative URLs.
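A short sketch of the relative-URL case, assuming a made-up relative URL with a query string. urlparse() leaves scheme and netloc empty and still separates the path from the query, so the path can be inspected with pathlib as before.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical relative URL (no scheme, no host)
relative = "hello/one.file.pdf?download=1"

parts = urlparse(relative)
print(repr(parts.scheme), repr(parts.netloc))  # '' ''
print(parts.path)   # hello/one.file.pdf
print(parts.query)  # download=1

# The path component works with pathlib just like in the absolute case
rel_path = PurePosixPath(parts.path)
print(rel_path.name, rel_path.suffix)  # one.file.pdf .pdf
```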


Messages In This Thread
Get the filename from a path - by 12237ee1 - Jul-12-2020, 07:03 AM
RE: Get the filename from a path - by ndc85430 - Jul-12-2020, 07:08 AM
RE: Get the filename from a path - by DeaD_EyE - Jul-12-2020, 10:11 AM
RE: Get the filename from a path - by 12237ee1 - Jul-13-2020, 02:58 PM
RE: Get the filename from a path - by DeaD_EyE - Jul-13-2020, 04:10 PM
RE: Get the filename from a path - by 12237ee1 - Jul-13-2020, 06:01 PM
RE: Get the filename from a path - by Yoriz - Jul-12-2020, 10:33 AM
RE: Get the filename from a path - by DeaD_EyE - Jul-12-2020, 11:41 AM
