Python Forum

IndexError: list index out of range bug?
Hello!

I have some code that reads a string cell containing a date and converts it to a different date format. The problem is that I get an index-out-of-range error and have no idea why. This is the code:

import pandas

df = pandas.read_csv('XYZ.csv')
numberofrows = len(df.index)
rownumber = -1

for line in range(numberofrows):

	MonthsDict = dict(January = '01', February = '02', March = '03', April = '04', May = '05', June = '06', July = '07', August = '08', September = '09', October = '10', November = '11', December = '12')

	if str(df.at[rownumber + 1,"StateStreet_Date"]) != "None" or str(df.at[rownumber + 1,"StateStreet_Date"]) != "NaN":
		SSdatelist = str(df.at[rownumber + 1,"StateStreet_Date"]).split(' ')
		df.at[rownumber + 1,"StateStreet_Date"] = SSdatelist[2] + '.' + str(MonthsDict.get(SSdatelist[1])) + '.' + SSdatelist[0] + '.'
The string we get from the cell is: "09 August 2019"
The end modified string should be: "2019.08.09."
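For reference, the conversion described above can also be sketched with the standard library's datetime module instead of a hand-written month dictionary. This is only an illustration: the helper name reformat_date is made up, and %B expects English month names, which holds in Python's default locale.

```python
from datetime import datetime


def reformat_date(text):
    """Turn a string such as '09 August 2019' into '2019.08.09.'."""
    # %d = day of month, %B = full month name, %Y = four-digit year
    parsed = datetime.strptime(text, "%d %B %Y")
    # The trailing '.' is literal text in the output format
    return parsed.strftime("%Y.%m.%d.")


print(reformat_date("09 August 2019"))  # 2019.08.09.
```

Unlike indexing into the result of split(' '), strptime raises a ValueError that names the offending string when the input is not in the expected format, which can make this kind of problem easier to spot.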

With this I get the error IndexError: list index out of range. I gathered some information about the list; the debugging code and what it printed are below. The weird thing is that it often prints two lists, as if one of them overwrites the good one and that is why I get the error (I shouldn't get two lists for one variable):

df = pandas.read_csv('Data_Acquisition.csv')
	MonthsDict = dict(January = '01', February = '02', March = '03', April = '04', May = '05', June = '06', July = '07', August = '08', September = '09', October = '10', November = '11', December = '12')
	if str(df.at[rownumber + 1,"StateStreet_Date"]) != "None" or str(df.at[rownumber + 1,"StateStreet_Date"]) != "NaN":
		SSdatelist = str(df.at[rownumber + 1,"StateStreet_Date"]).split(' ')
		print(SSdatelist)
		print(len(SSdatelist))
		df.at[rownumber + 1,"StateStreet_Date"] = SSdatelist[2] + '.' + str(MonthsDict.get(SSdatelist[1])) + '.' + SSdatelist[0] + '.'
Output:
['09', 'August', "19"]
3
["2019.08.09."]
1
IndexError: list index out of range
I even went as far as listing the elements with their indexes in a loop. Everything seemed fine:

Output:
[[0, '09'], [1, 'August'], [2, '2019']]
I tried every combination, and it seems that the error appears when I try to print SSdatelist[1], which means the list has only one element (index 0). When I print SSdatelist[0] I get:

Output:
09
2019.08.09.
Traceback (most recent call last):
  File "as.py", line 36, in <module>
    df.at[rownumber + 1,"StateStreet_Date"] = SSdatelist[2] + '.' + str(MonthsDict.get(str(SSdatelist[1]))) + '.' + SSdatelist[0] + '.' #makes the date the correct format
IndexError: list index out of range
How can one variable hold two things? How is this possible? The first value it prints is correct (09), but the second one (2019.08.09.) should not be printed at all, because it does not exist yet; it only gets created after that line.
You are not showing how you increment rownumber, and I expect this is related to the index being zero-based.
Since 'line' iterates over range(numberofrows), it spans the whole dataframe, so why not use it as your index?
Example (line 11):
if str(df.at[line,"StateStreet_Date"]) != "None" or str(df.at[line,"StateStreet_Date"]) != "NaN":
Also, use a more meaningful name: instead of line, use rownum or idx.
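A side note on the condition itself: written with "or", it is true for every string, because no string can equal both "None" and "NaN" at the same time. In addition, str() of a float NaN gives "nan" in lowercase. A minimal sketch of the difference, using plain strings instead of a dataframe; the function names old_check and has_date are hypothetical:

```python
def old_check(text):
    # The condition from the thread. It can never be False:
    # a string that equals "None" is automatically not "NaN", and vice versa.
    return text != "None" or text != "NaN"


def has_date(text):
    # Hypothetical replacement: treat the usual missing-value spellings
    # as "no date", comparing case-insensitively.
    return text.strip().lower() not in {"", "none", "nan"}


for value in ["09 August 2019", "None", str(float("nan"))]:
    # Columns printed: the value, the old check, the replacement check
    print(repr(value), old_check(value), has_date(value))
```

With the replacement check, rows without a date are skipped instead of being passed on to split(' ').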
It's a fair question and I have no excuse for it. I wrote the code when I was tired and working late, and did not realise that adding the +1s was a waste of time and that I should simply have set rownumber to 0. I only started learning a few months ago, when I have some time, but I do realise this is very much a "noob" mistake.

But why does the error keep happening? Sorry if I did not understand something.

EDIT: The full code is not here because it contains many other things and would be too long. That is why I did not include the rownumber = rownumber + 1 part: it is a bit further down.
I can't see where you are incrementing rownumber; you haven't shown that code.
But as I previously stated, it most likely has to do with the index starting at zero.
If you keep incrementing rownumber, at some point it will run past the last row of the frame, and you will get an error.
if you surround your code with:
try:
    ... your code here ...
except IndexError:
    print(f"Index error, rownumber: {rownumber}")
    raise
This will show what rownumber is equal to when the index error occurs.
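As a runnable illustration of that diagnostic, here is a small sketch that uses a plain list in place of the dataframe column. It assumes the row counter is never advanced inside the loop, which is only one possible cause; the names MONTHS and convert are made up for the example. Printing the cell value alongside the index shows what is actually being split when the error occurs.

```python
MONTHS = {
    "January": "01", "February": "02", "March": "03", "April": "04",
    "May": "05", "June": "06", "July": "07", "August": "08",
    "September": "09", "October": "10", "November": "11", "December": "12",
}


def convert(cells):
    rownum = 0
    for _ in range(len(cells)):
        try:
            parts = cells[rownum].split(" ")
            cells[rownum] = parts[2] + "." + MONTHS[parts[1]] + "." + parts[0] + "."
        except IndexError:
            # Report both the index and the value that failed to split
            print(f"Index error, rownum: {rownum}, value: {cells[rownum]!r}")
            raise
    # Assumed bug, kept on purpose: the increment runs once, after the loop
    rownum += 1
    return cells


cells = ["09 August 2019", "10 August 2019"]
try:
    convert(cells)
except IndexError:
    print("conversion stopped; cells are now:", cells)
```

In this sketch the first pass converts row 0, and the second pass splits the already converted "2019.08.09." again, which gives a one-element list and the IndexError.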
Also, you should always post your error tracebacks complete and untouched, as they almost always pinpoint the cause of the error.
(Use error tags)
You were absolutely right. It's kind of embarrassing to admit, but I wrote the rownumber = rownumber + 1 outside the loop, and that is what caused the error. Thank you for the help!
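For anyone finding this later, a minimal sketch of the corrected loop, again with a plain list standing in for the StateStreet_Date column. Letting the loop supply the index removes the separate counter, so there is nothing to forget to increment. MONTHS and convert_all are illustrative names, not from the original script.

```python
MONTHS = {
    "January": "01", "February": "02", "March": "03", "April": "04",
    "May": "05", "June": "06", "July": "07", "August": "08",
    "September": "09", "October": "10", "November": "11", "December": "12",
}


def convert_all(cells):
    # enumerate yields a fresh index on every pass, so each row is visited once
    for rownum, text in enumerate(cells):
        day, month, year = text.split(" ")
        cells[rownum] = f"{year}.{MONTHS[month]}.{day}."
    return cells


print(convert_all(["09 August 2019", "10 September 2019"]))
# ['2019.08.09.', '2019.09.10.']
```

With a dataframe the same idea applies: use the loop variable itself as the row label in df.at, as suggested earlier in the thread.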