Python Forum
Need Help Understanding Python Code - Printable Version




Need Help Understanding Python Code - samlee916 - Jan-03-2020

I need help understanding this code that was given to me. I understand the simpler lines, such as 1-4 and 9-10, but I have a tough time following the rest of this portion of the code.
indexHTML = requests.get("https://www.gosolarcalifornia.ca.gov/equipment/documents/").text
fileName = "Grid_Support_Inverter_List_Full_Data.xlsm"
fileNameHTML = '<td><a href="{0}">{0}</a></td>'.format(fileName)
fileNameHTMLIndex = indexHTML.find(fileNameHTML)
if fileNameHTMLIndex != -1:
    dateStartIndex = indexHTML.find(">", fileNameHTMLIndex+len(fileNameHTML)) + 1
    dateEndIndex = indexHTML.find("<", dateStartIndex)
    webFileDate = datetime.datetime.strptime(indexHTML[dateStartIndex:dateEndIndex].strip(), "%Y-%m-%d %H:%M")
else:
    raise Exception("Unable to find fileName {}".format(fileName))



RE: Need Help Understanding Python Code - stullis - Jan-03-2020

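Line 4 returns -1 if fileNameHTML is not found in indexHTML, so line 5 checks whether the <td> cell containing the file name is present. If it is (the value is not -1), line 6 searches for the first ">" after the end of that cell, presumably the end of the next cell's opening tag, and adds 1 to get the index where that cell's text starts. Line 7 then finds the first "<" after that point, which is where the text ends. Line 8 slices indexHTML between those two indexes, strips the leading and trailing whitespace, and parses the text into a datetime object using the format "%Y-%m-%d %H:%M". If the file name is not found, lines 9-10 raise an exception instead.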
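
To watch the index arithmetic work without hitting the website, here is a minimal sketch that runs the same steps on a made-up table row. The sample HTML, the date in it, and the names index_html and find_file_date are assumptions for illustration only; the real page may be laid out differently.

```python
import datetime

# Hypothetical stand-in for the downloaded page: one row of a directory
# listing, with the file link in one cell and the date in the next cell.
index_html = (
    '<tr><td><a href="Grid_Support_Inverter_List_Full_Data.xlsm">'
    'Grid_Support_Inverter_List_Full_Data.xlsm</a></td>'
    '<td align="right">2020-01-02 09:30  </td></tr>'
)


def find_file_date(html, file_name):
    """Return the date that follows the link cell for file_name."""
    cell = '<td><a href="{0}">{0}</a></td>'.format(file_name)
    cell_index = html.find(cell)  # -1 when the cell is not in the page
    if cell_index == -1:
        raise ValueError("Unable to find fileName {}".format(file_name))
    # Skip past the link cell, then past the ">" that ends the next tag.
    date_start = html.find(">", cell_index + len(cell)) + 1
    # The date text runs up to the next "<".
    date_end = html.find("<", date_start)
    date_text = html[date_start:date_end].strip()
    return datetime.datetime.strptime(date_text, "%Y-%m-%d %H:%M")


# prints 2020-01-02 09:30:00
print(find_file_date(index_html, "Grid_Support_Inverter_List_Full_Data.xlsm"))
```

Passing a file name that is not in the sample row takes the other branch and raises the exception. Searching a page with str.find like this depends on the exact markup, so it will stop working if the site changes how the table is written.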