Python Forum
Problem : IndexError...
#1
I'm parsing a URL string whose path segments are only known at runtime, but I get the following error: IndexError: list index out of range. How do I fix this? The code is below:

# Native Module : urlparse -> ( https://docs.python.org/3/library/urllib.parse.html#module-urllib.parse )
from urllib.parse import urlparse

url = 'https://kosmics.tech/'
urco = urlparse(url)

doco = urco.netloc
paco = urco.path

neco = doco.split('.')[0]
reap = paco.split('/')[1]
reco = paco.split('/')[2]
reme = paco.split('/')[3]

print(neco)
#2
(Nov-16-2022, 03:52 PM)JohnnyCoffee Wrote: I'm capturing a string (url) where the array will be assembled at runtime, but I'm facing the following error ( IndexError: list index out of range )

paco.split('/') returns a list with two empty strings, ['', ''], and that is the reason. Index 1 still works (it is an empty string), but indexes 2 and 3 are outside the list's valid range (0 and 1), so paco.split('/')[2] raises the IndexError.

This happens because urco.path contains only the path of the root address, which is '/'.
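
To see this for yourself, here is a minimal sketch using the URL from the original post. The names path, parts and segments are only illustrative; filtering out the empty strings is one possible way to keep just the real segments, not the only one.

```python
from urllib.parse import urlparse

path = urlparse('https://kosmics.tech/').path
parts = path.split('/')

print(repr(path))   # '/'
print(parts)        # ['', '']
print(len(parts))   # 2, so only indexes 0 and 1 exist

# Dropping the empty strings leaves only the real path segments.
segments = [p for p in parts if p]
print(segments)     # []
```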
#3
urco = urlparse(url) does everything that you need:

>>> from urllib.parse import urlparse
>>> url = 'https://kosmics.tech/'
>>> urco = urlparse(url)
>>> urco.scheme
'https'
>>> urco.netloc
'kosmics.tech'
>>> urco.path
'/'
>>> urco.params
''
>>> urco.query
''
>>>
As module:
from urllib.parse import urlparse


def split_url(url):
    return urlparse(url)

def main():
    urco = split_url('https://kosmics.tech/')

    print(f"\nurco.scheme: {urco.scheme}")
    print(f"\nurco.netloc: {urco.netloc}")
    print(f"\nurco.path: {urco.path}")
    print(f"\nurco.query: {urco.query}")


if __name__ == '__main__':
    main()
Results of running as python mymodule.py:
Output:

urco.scheme: https

urco.netloc: kosmics.tech

urco.path: /

urco.query:
#4
(Nov-16-2022, 04:17 PM)carecavoador Wrote: paco.split('/') returns a list with two empty strings, ['', ''], and that is the reason. You are trying to access indexes (2, 3) that are out of the list range (0, 1).

The reason is that urco.path returns just the path to the root address, which is '/'.

I was able to partly resolve it by refactoring to the code below:

# Native Module : urlparse -> ( https://docs.python.org/3/library/urllib.parse.html#module-urllib.parse )
from urllib.parse import urlparse

url = 'https://kosmics.tech/'
urco = urlparse(url)

doco = urco.netloc
suco = doco.split('.')[0]

paco = urco.path
redo = paco.split('/')[0]
reap = paco.split('/')[1]
reco = paco.split('/')[2]
reme = paco.split('/')[3]

print(reap)
The problem is that I still need indexes 2 and 3 to be available for when the path looks like one of these:

1 -> /api or apis/controller/method
2 -> /appname/api or apis/method
3 -> /appname/controller/api or apis
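
For illustration, this is roughly how those three layouts split compared with the bare root path. The example URLs are assumptions built from the layouts above (using the "api" variant) on the same host.

```python
from urllib.parse import urlparse

# Example URLs for the three layouts, plus the root URL that fails.
urls = [
    'https://kosmics.tech/api/controller/method',
    'https://kosmics.tech/appname/api/method',
    'https://kosmics.tech/appname/controller/api',
    'https://kosmics.tech/',
]

split_paths = [urlparse(url).path.split('/') for url in urls]

for parts in split_paths:
    # The first three give 4 items (indexes 0-3); the root gives only 2.
    print(len(parts), parts)
```

So indexes 2 and 3 only exist when the path really has that many segments.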
#5
See post #3; you are creating more work than needed.
Once you set 'urco', you can access its attributes directly.
#6
As Lar60+ said, far more kindly than I, reading the documentation for the tool you are using and using it as designed is the best solution.

The code below points to you not knowing about Python unpacking. Or maybe you just forgot. That is unfortunate because packing and unpacking are very useful.
reap = paco.split('/')[1]
reco = paco.split('/')[2]
reme = paco.split('/')[3]
This is a nice, light discussion of packing and unpacking.

https://realpython.com/lessons/tuple-ass...unpacking/

After you read it you'd know that the above code should be written as:
reap, reco, reme = paco.split('/')[1:]
It is still the wrong solution for your problem, because the unpacking raises a ValueError whenever the path does not have exactly three segments, but it is a better Python coding of the wrong solution.
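
If you do want three names no matter how short the path is, one possible approach is to pad the segment list before unpacking. This is only a sketch: first_segments is a made-up helper, not part of urllib, and using an empty string as the default is an assumption about what you want for missing segments.

```python
from urllib.parse import urlparse


def first_segments(url, count=3, default=''):
    """Return exactly `count` path segments, padded with `default`.

    Hypothetical helper for illustration; it is not part of urllib.
    """
    # Drop the empty strings produced by leading or trailing slashes.
    segments = [s for s in urlparse(url).path.split('/') if s]
    # Ignore extra segments, then pad so unpacking always succeeds.
    segments = segments[:count]
    return segments + [default] * (count - len(segments))


reap, reco, reme = first_segments('https://kosmics.tech/')
print(repr(reap), repr(reco), repr(reme))  # '' '' ''

reap, reco, reme = first_segments('https://kosmics.tech/appname/api/method')
print(reap, reco, reme)  # appname api method
```

Unpacking now always gets exactly three values, so neither the IndexError nor a ValueError can occur for short paths.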