Python Forum

Full Version: Help with requests module
Hi, I have a program that I'm trying to switch from the http.client module to requests. The program takes a website as input and scans the index file to determine what filetype it is, pretty simple.

My issue is that since I switched to the requests module, when I give it a website to scan it tells me the index is every possible filetype, so something is going wrong somewhere. I've been messing around and looking for answers for a few hours now, so I figured somebody here might be able to tell me why.

I would guess it's somewhere between line 23 & 30.
#import http.client
import requests


files = ['index.html','index.php','index.js']

html = ["index.html", "Index is a html file"]
php = ["index.php", "Index is a php file"]
js = ["index.js", "Index is a javascript file"]

site = input("Enter URL-> ")

print ("Scanning " + "{}...\n".format(site))
print ("-"*60)

# HTTP.Client function
#def connection_status(site, index_file):
#    connection = http.client.HTTPSConnection(site)
#    connection.request("HEAD", index_file)
#    return connection.getresponse().status

# Requests function
def connection_status(site):
    connection = requests.head(site)
    return connection.status_code	


def scan(site, upload):
	status = connection_status(site)
	if status == 200:
		if upload == "index.html":
			print (html[0])
			print (html[1])
			print ("-"*60)
		elif upload == "index.php":
			print (php[0])
			print (php[1])
			print ("-"*60)
		elif upload == "index.js":
			print (js[0])
			print (js[1])
			print ("-"*60)					
		else:
			pass
	else:
		pass

for upload in files:
	scan(site, upload)

print ("")
print ("Scan complete!")
print ("")
exit()
Output:
Scanning http://localhost/...

------------------------------------------------------------
index.html
Index is a html file
------------------------------------------------------------
index.php
Index is a php file
------------------------------------------------------------
index.js
Index is a javascript file
------------------------------------------------------------

Scan complete!

The index file on my localhost is index.html, so that's the only one I'm expecting it to report. Thanks to anyone taking the time to read this thread!
Not sure what you think the problem is

for upload in files:
    scan(site, upload)
You pass every string from files to scan() as the upload argument in a loop, but connection_status() only ever requests site itself. Obviously response status code is 200, so in every iteration one of the if/elifs is hit and you get the print.

Probably you want to add the file when you make the request in connection_status() as extra param and include it in the url too.
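Something along these lines is what I mean. It's only a minimal sketch: build_url is a helper name I made up, and I'm assuming site is a full URL with a scheme (like the http://localhost/ in your output). urljoin comes from the standard library.

```python
from urllib.parse import urljoin


def build_url(site, index_file):
    # Hypothetical helper: make sure the base ends with "/" so urljoin
    # appends the file name instead of replacing the last path segment.
    base = site if site.endswith("/") else site + "/"
    return urljoin(base, index_file)


def connection_status(site, index_file):
    # Imported here so build_url can be tried without requests installed.
    import requests

    # Request site + file, not just site.
    response = requests.head(build_url(site, index_file))
    return response.status_code
```

In scan() you would then call connection_status(site, upload), so each file gets its own HEAD request.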
(Mar-14-2021, 05:56 AM)buran Wrote: [ -> ]Obviously response status code is 200, so in every iteration one of the if/elifs is hit and you get the print.
Okay, so that was a bad example I provided. The actual program takes the URL and looks for certain files on that URL, which is why I have the status check in place: if the status is 200, then that file is there.

I'm sure the for statement is probably not the cleanest and probably hurts your head to look at, however that hasn't been touched when switching modules and it worked perfectly with the http.client module so that's why I don't believe that to be the issue. The only lines I changed were 23 through 25. You can see the old function I used with the http.client module commented out above which works.

Better example:
#import http.client
import requests


files = ['index.html','contact.html','login.html']

index = ["index.html", "Index page is found"]
contact = ["contact.html", "Contact page is found"]
login = ["login.html", "Login page is found"]

site = input("Enter URL-> ")

print ("Scanning " + "{}...\n".format(site))
print ("-"*60)

# HTTP.Client function
#def connection_status(site, index_file):
#    connection = http.client.HTTPSConnection(site)
#    connection.request("HEAD", index_file)
#    return connection.getresponse().status

# Requests function
def connection_status(site):
    connection = requests.head(site)
    return connection.status_code 	


def scan(site, upload):
	status = connection_status(site, upload)
	if status == 200:
		if upload == "index.html":
			print (index[0])
			print (index[1])
			print ("-"*60)
		elif upload == "contact.html":
			print (contact[0])
			print (contact[1])
			print ("-"*60)
		elif upload == "login.html":
			print (login[0])
			print (login[1])
			print ("-"*60)					
		else:
			pass
	else:
		pass

for upload in files:
	scan(site, upload)

print ("")
print ("Scan complete!")
print ("")
exit()
Let me know if that changes your thoughts on the possible issue, thanks again for your input!
(Mar-14-2021, 06:35 AM)0xB9 Wrote: [ -> ]Let me know if that changes your thoughts on the possible issue, thanks again for your input!
I already did
(Mar-14-2021, 05:56 AM)buran Wrote: [ -> ]Probably you want to add the file when you make the request in connection_status() as extra param and include it in the url too.

In the original code you were making requests to url/file, while in the new code you only request url, so the status never depends on the file.
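To see the difference without touching the network, here is a small simulation. EXISTING_PATHS and fake_head_status are made up for illustration and stand in for a server that only has an index.html; they are not part of requests or http.client.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the server: only these paths exist.
EXISTING_PATHS = {"/", "/index.html"}


def fake_head_status(url):
    # Pretend to send a HEAD request and return the status for the URL's path.
    path = urlsplit(url).path or "/"
    return 200 if path in EXISTING_PATHS else 404


site = "http://localhost/"
files = ["index.html", "index.php", "index.js"]

# New code: the file name never reaches the URL, so every check sees the same page.
without_file = [fake_head_status(site) for _ in files]

# Old behaviour: each check asks for site + file.
with_file = [fake_head_status(site + name) for name in files]

print(without_file)  # [200, 200, 200]
print(with_file)     # [200, 404, 404]
```

The first list is why all three messages were printed: the request was always for the site root, which exists.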