Python Forum

Full Version: check if a file exist on the internet and get the size
Hi, sorry for my bad English.
I have written this code:
#image_name = 'https://www.google.co.id/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png'
image_name = r'C:\DumpStack.log'  # raw string, so the backslash is taken literally
from os.path import exists
if exists(image_name):
    import os
    statinfo = os.stat(image_name)
    print(image_name + " exist, with size : " + str(statinfo.st_size))
else:
    print(image_name + " do not exist")
but somehow it does not work for a file on the internet.
Can someone convert this code, or give me a clue about which keywords to search for on Google?
Thank you, have a nice day.
(Apr-16-2022, 12:36 PM) kucingkembar wrote: but somehow does not work at a file on the internet,
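For a file on the internet you need something like Requests.
A HEAD request returns the headers, which include Content-Length.
The header does not always give the true size (and some servers leave it out), so you can also use stream=True with a GET request, which fetches only the response headers at first and not the whole response body.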
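import requests

url = 'https://www.google.co.id/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png'
# stream=True defers the body download, so only the headers are fetched here.
with requests.get(url, stream=True) as response:
    size = response.headers.get('Content-Length')  # None if the server omits it
print(size)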
Output:
5969
To download the image, it would look like this.
import requests

img = requests.get(url)  # url as defined above
img.raise_for_status()  # stop here on a 404 or other HTTP error
with open('logo.png', 'wb') as f:
    f.write(img.content)
Some tips about your code: imports should always come first in the code.
Also look into f-strings.
>>> image_name = 'logo.png'
>>> stat = 200
>>> # print(image_name + " exist, with size : " + str(statinfo.st_size))
>>> # Become
>>> print(f'{image_name} exist, with size: {stat}kb')
logo.png exist, with size: 200kb
Thank you snippsat for the reply.
Now I know how to get the size of a file on the internet,
but I still don't understand how to check whether the URL really points to a file.
If I change a letter in the URL and get an Error 404 (Not Found),
the code still reports that the file exists, with size xxxx.

And about "imports should always come first in the code":
- Is there any problem with putting an import later in the code?
- What happens if the same package is imported twice?

thank you again for your reply
An HTTP response need not be the contents of a file and the Content-Length header simply contains the size of that response. So, in the case of a 404 where the server sent something in the response body, you'll still get a non-zero value for Content-Length.
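A minimal sketch of that point, assuming the status code is checked before Content-Length is trusted. The helper name remote_size and the sample header values are made up for illustration; the helper takes a status code and a headers mapping so it can be tried without a network connection.

```python
def remote_size(status_code, headers):
    """Return the reported size in bytes, or None if the request did not
    succeed or the server sent no Content-Length header."""
    # A 404 page can still have a body, so look at the status first.
    if status_code != 200:
        return None
    # HTTP header names are case-insensitive; normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    length = lowered.get("content-length")
    return int(length) if length is not None else None


# Hypothetical responses, written out by hand for illustration.
print(remote_size(200, {"Content-Length": "5969"}))  # 5969
print(remote_size(404, {"Content-Length": "1565"}))  # None
```

With Requests, the two arguments would be response.status_code and response.headers. Note that requests.head() does not follow redirects unless allow_redirects=True is passed.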

import statements should be listed first mainly for readability purposes - they interrupt your flow of reading and understanding the code if they're in the middle.
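On the question of importing the same package twice, a small sketch using the standard json module (any module would do): Python keeps loaded modules in sys.modules, so a repeated import is only a lookup and the module's code is not run again.

```python
import sys
import json
import json  # second import: no error, and the module is not loaded again

# Loaded modules are cached in sys.modules, so a repeated import
# simply returns the module object that is already there.
first = sys.modules["json"]
import json as json_again

print(json_again is first)  # True
```

So an import inside a function costs the real loading time only on the first call. It can shorten startup when the function is never called, but it does not make the code itself run faster.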
@ndc85430 thank you for the reply.
My problem is that I need to download the jpg only if the URL really points to a jpg;
if it does not, I don't need to download it.

I always put the import where it is needed, especially inside a function/def.
People say Python is slow to run and, CMIIW, my guess is that importing packages only when they are really needed is a way to make it faster.
You might want to try checking the value of the Content-Type header then.
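A rough sketch of such a check, under the assumption that the server labels JPEG files as image/jpeg. The helper name is_jpeg_response and the sample headers are invented for illustration, and it takes the status code and headers as plain values so it runs without a network connection.

```python
def is_jpeg_response(status_code, headers):
    """Return True if the response succeeded and is labelled as a JPEG."""
    if status_code != 200:
        return False
    # HTTP header names are case-insensitive; normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    content_type = lowered.get("content-type", "")
    # The header may carry parameters after a semicolon; keep the media type only.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "image/jpeg"


# Hypothetical responses, written out by hand for illustration.
print(is_jpeg_response(200, {"Content-Type": "image/jpeg"}))                # True
print(is_jpeg_response(200, {"Content-Type": "text/html; charset=UTF-8"}))  # False
print(is_jpeg_response(404, {"Content-Type": "image/jpeg"}))                # False
```

With Requests, this would be called with response.status_code and response.headers before deciding to save response.content. The header is only what the server claims, so a stricter check would also look at the first bytes of the body, which are FF D8 for a JPEG.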
@ndc85430 Dude, thank you. I gave you and snippsat +1 reputation point each.