Negative indexing/selecting working and not working
Python Forum (https://python-forum.io), thread /thread-40332.html
Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

Hi, I am new here, so please bear with me as I am a beginner. The code below contains a function called compute_percentages. I would like to ask why perc[-2] or perc[-1], for example, works perfectly inside that function, but when I run this:

    perc = (source["value"] / source["value"].sum()) * 100

then this:

    perc[-2]

throws an error: "ValueError: -2 is not in range". I would be very grateful for ideas, thank you.

    import pandas as pd
    import altair as alt

    source = pd.DataFrame([
        {"question": "Question 1", "type": "Strongly disagree", "value": 24},
        {"question": "Question 1", "type": "Disagree", "value": 294},
        {"question": "Question 1", "type": "Neither agree nor disagree", "value": 594},
        {"question": "Question 1", "type": "Agree", "value": 1927},
        {"question": "Question 1", "type": "Strongly agree", "value": 376},
        {"question": "Question 2", "type": "Strongly disagree", "value": 2},
        {"question": "Question 2", "type": "Disagree", "value": 2},
        {"question": "Question 2", "type": "Neither agree nor disagree", "value": 0},
        {"question": "Question 2", "type": "Agree", "value": 7},
        {"question": "Question 2", "type": "Strongly agree", "value": 11},
        {"question": "Question 3", "type": "Strongly disagree", "value": 2},
        {"question": "Question 3", "type": "Disagree", "value": 0},
        {"question": "Question 3", "type": "Neither agree nor disagree", "value": 2},
        {"question": "Question 3", "type": "Agree", "value": 4},
        {"question": "Question 3", "type": "Strongly agree", "value": 2},
        {"question": "Question 4", "type": "Strongly disagree", "value": 0},
        {"question": "Question 4", "type": "Disagree", "value": 2},
        {"question": "Question 4", "type": "Neither agree nor disagree", "value": 1},
        {"question": "Question 4", "type": "Agree", "value": 7},
        {"question": "Question 4", "type": "Strongly agree", "value": 6},
        {"question": "Question 5", "type": "Strongly disagree", "value": 0},
        {"question": "Question 5", "type": "Disagree", "value": 1},
        {"question": "Question 5", "type": "Neither agree nor disagree", "value": 3},
        {"question": "Question 5", "type": "Agree", "value": 16},
        {"question": "Question 5", "type": "Strongly agree", "value": 4},
        {"question": "Question 6", "type": "Strongly disagree", "value": 1},
        {"question": "Question 6", "type": "Disagree", "value": 1},
        {"question": "Question 6", "type": "Neither agree nor disagree", "value": 2},
        {"question": "Question 6", "type": "Agree", "value": 9},
        {"question": "Question 6", "type": "Strongly agree", "value": 3},
        {"question": "Question 7", "type": "Strongly disagree", "value": 0},
        {"question": "Question 7", "type": "Disagree", "value": 0},
        {"question": "Question 7", "type": "Neither agree nor disagree", "value": 1},
        {"question": "Question 7", "type": "Agree", "value": 4},
        {"question": "Question 7", "type": "Strongly agree", "value": 0},
        {"question": "Question 8", "type": "Strongly disagree", "value": 0},
        {"question": "Question 8", "type": "Disagree", "value": 0},
        {"question": "Question 8", "type": "Neither agree nor disagree", "value": 0},
        {"question": "Question 8", "type": "Agree", "value": 0},
        {"question": "Question 8", "type": "Strongly agree", "value": 2},
    ])

    # Add type_code that we can sort by
    source["type_code"] = source.type.map({
        "Strongly disagree": -2,
        "Disagree": -1,
        "Neither agree nor disagree": 0,
        "Agree": 1,
        "Strongly agree": 2
    })
    source

    def compute_percentages(df):
        # Set type_code as index and sort
        df = df.set_index("type_code").sort_index()
        # Compute percentage of value within the question group
        perc = (df["value"] / df["value"].sum()) * 100
        df["percentage"] = perc
        # Compute percentage end, centered on "Neither agree nor disagree" (type_code 0)
        df["percentage_end"] = perc.cumsum() - (perc[-2] + perc[-1] + perc[0] / 2)
        # Compute percentage start by subtracting percent
        df["percentage_start"] = df["percentage_end"] - perc
        return df

    source = (
        source
        .groupby("question", group_keys=True)
        .apply(compute_percentages)
        .reset_index(drop=True)
    )

RE: Negative indexing/selecting working and not working - Larz60+ - Jul-12-2023

Try:

    print(perc)

An index of -2 is the last element minus one, so there must be at least two elements. The error message is telling you that perc does not contain enough elements to access perc[-2]. Example:

    >>> perc = []
    >>> perc
    []
    >>> perc.append(123)
    >>> perc
    [123]
    >>> perc[-2]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: list index out of range
    >>> perc.append(456)
    >>> perc
    [123, 456]
    >>> perc[-2]
    123
    >>>

RE: Negative indexing/selecting working and not working - deanhystad - Jul-12-2023

Looking at your code, perc is a pandas Series, not a list. Indexing for a Series does not work like indexing in a list. Indexing for a Series works more like indexing in a dictionary. The index value is a key, not a position. The error message would be:

This error will occur if you have a question that contains no "Strongly Disagree" type.

But I don't get an error when I run your code. I get this:

Are you sure the error you are getting is associated with the code in your post?

In the future, please post the entire error message, including the traceback.

RE: Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

In my case/code:

    len(perc)

gives 40 elements, so I want to get the second from the end, which would be the value 0.000000. But when I do perc[-2] it errors:

What am I missing?
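The list-versus-Series difference described in the replies above can be sketched as follows. This is a minimal illustration, not the poster's code: the one-question DataFrame and the names perc_outside, perc_inside and outside_ok are made up for the example, and it assumes a pandas version in which an integer key on an integer-labelled Series is treated as a label (so a missing label raises KeyError).

```python
import pandas as pd

# Hypothetical miniature of the thread's data: one question, five answer types.
df = pd.DataFrame({
    "type_code": [-2, -1, 0, 1, 2],
    "value": [24, 294, 594, 1927, 376],
})

# Outside the function: the Series keeps the default labels 0..4.
perc_outside = df["value"] / df["value"].sum() * 100
try:
    perc_outside[-2]
    outside_ok = True
except KeyError:
    # -2 is looked up as a label, and no row is labelled -2.
    outside_ok = False

# Inside the function: type_code becomes the index, so -2 is a real label.
indexed = df.set_index("type_code").sort_index()
perc_inside = indexed["value"] / indexed["value"].sum() * 100
strongly_disagree = perc_inside[-2]  # label lookup, equivalent to perc_inside.loc[-2]

print(list(perc_outside.index))
print(list(perc_inside.index))
print(outside_ok, round(strongly_disagree, 6))
```

In both cases perc[-2] means "the row whose label is -2"; the only thing that changes between the two is whether such a label exists.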
RE: Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

(Jul-12-2023, 05:05 PM)deanhystad Wrote: Are you sure the error you are getting is associated with the code in your post?

I have included all the code, and you are right about the error, but my question is: why does this code work inside the function compute_percentages but not separately, outside of it?

RE: Negative indexing/selecting working and not working - deanhystad - Jul-12-2023

I don't understand your question. Inside the function, perc is a Series. It is going to look like this:

    type_code
    -2     0.746501   <- This is perc[-2]
    -1     9.144635
     0    18.475894
     1    59.937792   <- This is not perc[-2]
     2    11.695179

It makes one of these for each group (question). perc does not exist outside the function. You cannot call the function unless you make the groups (you will have duplicate labels in the row index). Can you post the code that makes a list named "perc" that contains 40 elements? Is it something like this?

    perc = source["percentage"]
    print(perc[-2])

This would raise a KeyError because the row indices are 0, 1, ..., 39. -2 is not a valid index. If you want just the values, ask for them.

    perc = source["percentage"].values  # Returns a NumPy array of the values in the "percentage" column.
    print(perc[-2])

RE: Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

(Jul-12-2023, 05:28 PM)deanhystad Wrote: Can you post the code that makes a list named "perc" that contains 40 elements.

I have posted it previously, but here you are:

    perc = (source["value"] / source["value"].sum()) * 100
    perc[-2]

I tried to change perc to a dataframe but could not get it to work. I should admit that I am learning Python, so sometimes even basic subjects are a struggle for me. I apologize for those basic questions. What I want to understand is why perc[-2] works inside a function and why it errors when it is written separately, that is, not inside a function.
The code I provided in my first post works perfectly; I just want to understand what is happening in it and why I got those errors when I started to experiment with that code. I hope this clarifies it a bit.

RE: Negative indexing/selecting working and not working - deanhystad - Jul-12-2023

perc is a Series, essentially a single-column dataframe. Indexing for a Series uses keys, not positional indexing. If you want to do positional indexing, get the values. That will return a NumPy array.

    perc.values[-2]

When you have questions like this, try printing the thing, or the type of the thing. Printing perc would show you why you cannot do perc[-2].

    print(perc)

Notice that the row indices do not include -2.
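The advice above (look at the type and the index first, then ask for positions explicitly) might look like the sketch below. The six-element Series is a hypothetical stand-in for the 40-row column, and .iloc and .to_numpy() are standard pandas accessors that the thread itself does not use; the thread's .values works the same way for this purpose.

```python
import pandas as pd

# Hypothetical stand-in for the 40-row "value" column; default labels are 0..5.
values = pd.Series([24, 294, 594, 1927, 0, 2], name="value")
perc = values / values.sum() * 100

print(type(perc))  # the type tells you which indexing rules apply
print(perc.index)  # the labels that perc[...] will accept

# Position-based access that stays within pandas.
second_to_last_iloc = perc.iloc[-2]

# Position-based access on the underlying NumPy array.
second_to_last_array = perc.to_numpy()[-2]

# Label-based access: the row labelled 4, which happens to sit second from the end.
by_label = perc.loc[4]

print(second_to_last_iloc, second_to_last_array, by_label)
```

Writing .iloc or .loc explicitly makes it clear to a reader whether a position or a label is meant, which plain square brackets do not.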
RE: Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

OK, thank you very much for your kind explanations. Why was it used like this in the function compute_percentages:

    # Compute percentage end, centered on "Neither agree nor disagree" (type_code 0)
    df["percentage_end"] = perc.cumsum() - (perc[-2] + perc[-1] + perc[0] / 2)

and not written as:

    perc.values[-2]

Inside that function, the perc that is created is of class "pandas.core.series.Series", isn't it? I have included that function below:

    def compute_percentages(df):
        # Set type_code as index and sort
        df = df.set_index("type_code").sort_index()
        # Compute percentage of value within the question group
        perc = (df["value"] / df["value"].sum()) * 100
        df["percentage"] = perc
        # Compute percentage end, centered on "Neither agree nor disagree" (type_code 0)
        df["percentage_end"] = perc.cumsum() - (perc[-2] + perc[-1] + perc[0] / 2)
        # Compute percentage start by subtracting percent
        df["percentage_start"] = df["percentage_end"] - perc
        return df

RE: Negative indexing/selecting working and not working - deanhystad - Jul-12-2023

Maybe this will make the indexing clearer. I made a small change to your code. Instead of computing a numeric type_code, I left type_code as a string ("Strongly disagree", ...). Now when compute_percentages() reindexes the dataframe to use "type_code", the row indices are words, not ints. This required a change to the "percentage_end" calculation because there is no perc[-2], perc[-1], or perc[0]. These are now perc["Strongly disagree"], perc["Disagree"] and perc["Neither agree nor disagree"].
    source["type_code"] = source["type"]  # Changed so type_code is words, not -2, -1, 0, 1, 2

    def compute_percentages(df):
        # Set type_code as index (no sort; the rows are already in answer order)
        df = df.set_index("type_code")
        # Compute percentage of value within the question group
        perc = (df["value"] / df["value"].sum()) * 100
        df["percentage"] = perc
        # Compute percentage end, centered on "Neither agree nor disagree"
        # Notice that we have to use words to index perc, because the row indices are words, not numbers.
        df["percentage_end"] = perc.cumsum() - (
            perc["Strongly disagree"]
            + perc["Disagree"]
            + perc["Neither agree nor disagree"] / 2
        )
        # Compute percentage start by subtracting percent
        df["percentage_start"] = df["percentage_end"] - perc
        return df

When computing perc outside the function, you still created a Series. The row indices just happened to be numbers from 0 to 39, but they were still keys, not positions in some array or list.

One more attempt to make this clear. Here I sort the source dataframe by the "value" column.

    source = source.sort_values("value")
    print(source)

Now 30 is the index of the first row and 3 is the index of the last row. If I print values[3], it prints the last value, not the fourth:

    values = source["value"]
    print(values[3])

Notice that it prints the value from the last row, not the 4th row.
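The sorting demonstration can be reproduced on a smaller scale. The sketch below uses a made-up five-value Series rather than the thread's data, with distinct values so the sorted order is unambiguous; reset_index(drop=True) is shown as one possible way to make labels and positions agree again, not as something the thread recommends.

```python
import pandas as pd

# Hypothetical five-row column; the labels start out as 0..4.
values = pd.Series([594, 24, 1927, 294, 376], name="value")

# Sorting moves the rows, and each row keeps its original label.
sorted_values = values.sort_values()
print(list(sorted_values.index))

by_label = sorted_values[2]                # the row labelled 2, wherever it ended up
by_position = sorted_values.iloc[2]        # the third row after sorting
last_by_position = sorted_values.iloc[-1]  # the last row after sorting

# Rebuilding the default index makes labels match positions again.
renumbered = sorted_values.reset_index(drop=True)
print(by_label, by_position, last_by_position, renumbered[2])
```

Here the row labelled 2 holds the largest value, so after sorting it is the last row even though its label has not changed.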