Python Forum

Help with understanding basic IP script
Hello there!

For my computer studies, I received homework where I have to pick any simple computer task and offer a solution on how to create a program for its automation in either Flowgorithm or Python. I can either write my own script, or analyze and explain one I've pulled from the internet. After much consideration, I chose this one: Get the Geo Location of an IP Address

Now, our class hasn't done anything but the basic statements and core principles of Python, so I'm having a bit of trouble understanding what most of this means and would greatly appreciate anyone giving me a hand.

if len(sys.argv)!=2:
    print(usage)
    sys.exit(0)

if len(sys.argv) > 1:
    ipaddr = sys.argv[1]
This is the main culprit, namely I don't understand what the role of !=2 and >1 is supposed to be.

paragraph = soup('p')[3]

I believe that this line is about BeautifulSoup finding the correct paragraph with the location information, which is the third one?
geo_txt = re.sub(r'<.*?>', '', str(paragraph))
print geo_txt[32:].strip()
This is where a regex is used to remove the HTML tag, but I can't fathom how to explain what (r'<.*?>', '', str(paragraph)) and print geo_txt[32:].strip() are supposed to mean.

If anyone could walk me through this or even just point me in the right direction, I would greatly appreciate it.

Cheers
len(x) != 2 is a test for there not being exactly two items in x (the length of x is not equal to 2). len(x) > 1 is a test for there being more than one item in x. Python is 0-indexed, so the first item in x is x[0], then x[1], and so on. So x[3] is the fourth item in x, not the third.
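To make the length tests and 0-based indexing concrete, here is a small sketch. The list x is a made-up stand-in, not anything from the script:

```python
# Hypothetical four-item list, used only to illustrate len() and indexing.
x = ["a", "b", "c", "d"]

not_exactly_two = len(x) != 2   # True: x has 4 items, not 2
more_than_one = len(x) > 1      # True: 4 is greater than 1

first_item = x[0]    # "a" (indexing starts at 0)
fourth_item = x[3]   # "d" (index 3 is the fourth item)

print(not_exactly_two, more_than_one, first_item, fourth_item)
```

The same rule is why soup('p')[3] picks the fourth paragraph on the page.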

The key part of the regex is .*?. The dot means match any character (except a newline, by default), the asterisk means match zero or more of those characters, and the ? means don't be greedy (stop as soon as you can, that is at the first >). The sub function replaces every instance of that pattern in the text with nothing ('').
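A quick way to see what the ? changes is to run both versions on the same text. This is only a sketch: the html string below is invented and stands in for str(paragraph).

```python
import re

# Hypothetical HTML snippet standing in for str(paragraph); not from the real page.
html = "<p>Location: <b>Berlin</b>, Germany</p>"

# Non-greedy: each match stops at the first '>', so only the tags are removed.
non_greedy = re.sub(r'<.*?>', '', html)

# Greedy: one match runs from the first '<' to the last '>', eating the text too.
greedy = re.sub(r'<.*>', '', html)

print(repr(non_greedy))  # 'Location: Berlin, Germany'
print(repr(greedy))      # ''
```

Without the ?, the whole line would be treated as one giant tag and deleted.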

The [32:] means skip the first 32 items in the sequence (characters in the string), and the strip method gets rid of any trailing or leading white space.
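Here is a minimal sketch of the slice followed by strip. The contents of geo_txt are made up (a 32-character filler prefix plus a padded location); the real string depends on the page the script downloads.

```python
# Hypothetical text: 32 filler characters, then the part we actually want.
prefix = "x" * 32
geo_txt = prefix + "  City: Berlin \n"

tail = geo_txt[32:]      # drops the first 32 characters, keeps the rest
cleaned = tail.strip()   # removes leading/trailing spaces and the newline

print(repr(tail))     # '  City: Berlin \n'
print(repr(cleaned))  # 'City: Berlin'
```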
Thanks a ton for your reply!

I still have a few questions though. I understand what the !=2 and >1 parts mean, but why do I have to include them in the script? What exactly is their role?

Also, what's the role of ipaddr?
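Well, sys.argv is the list of command-line arguments passed to the program, with the script's own name at sys.argv[0]. So it's doing different things based on the number of arguments: if there isn't exactly one argument after the script name, it prints the usage message and exits; otherwise it stores that argument, sys.argv[1], in ipaddr, which is the IP address to look up.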
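The argument check can be sketched as a small function that takes a list shaped like sys.argv. The function name get_ip_argument and the script name geo.py are invented for illustration, and returning None stands in for the script's "print usage and exit" branch.

```python
def get_ip_argument(argv):
    """Return the single argument after the script name, or None if the count is wrong."""
    # argv[0] is the script name, so exactly one real argument means len(argv) == 2.
    if len(argv) != 2:
        return None
    return argv[1]

# Lists shaped like sys.argv for three hypothetical ways of running the script.
no_args = get_ip_argument(["geo.py"])
one_arg = get_ip_argument(["geo.py", "8.8.8.8"])
too_many = get_ip_argument(["geo.py", "8.8.8.8", "extra"])

print(no_args, one_arg, too_many)
```

Once the != 2 check has passed, the later > 1 check in the script is always true, so it is redundant but harmless.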
Thanks!

One last thing though, when I try to run the program, I get a syntax error in the last line (print geo_txt). What exactly is the cause of this?
Probably a version issue. print was a statement in Python 2 and became a function in Python 3, so print geo_txt without parentheses is a syntax error there. I couldn't say for sure without seeing the full text of the error though. Try print(geo_txt[32:].strip()) instead and see if that works.
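Assuming the error really is the Python 2 print statement, the difference looks like this. The value of geo_txt is invented for the example.

```python
# Hypothetical value; in the script this comes from the regex step.
geo_txt = "  Berlin, Germany  "

# Python 2 only (raises SyntaxError in Python 3):
#     print geo_txt.strip()

# Python 3: print is an ordinary function, so its argument goes in parentheses.
message = geo_txt.strip()
print(message)
```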
All right, that's all I wanted to know. Thanks a lot for your help!