Python Forum
Negative indexing/selecting working and not working - Printable Version



Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

Hi,
I am new here, so please bear with me, as I am a beginner.
The code below contains a function called compute_percentages.
I would like to ask why perc[-2] or perc[-1], for example, works perfectly inside that function, but when I compute perc outside the function:
perc = (source["value"] / source["value"].sum()) * 100
and then run this:
perc[-2]
it throws the error "ValueError: -2 is not in range".
I would be very grateful for any ideas. Thank you.

import pandas as pd
import altair as alt

source = pd.DataFrame([
      {
        "question": "Question 1",
        "type": "Strongly disagree",
        "value": 24,
      },
      {
        "question": "Question 1",
        "type": "Disagree",
        "value": 294,
      },
      {
        "question": "Question 1",
        "type": "Neither agree nor disagree",
        "value": 594,
      },
      {
        "question": "Question 1",
        "type": "Agree",
        "value": 1927,
      },
      {
        "question": "Question 1",
        "type": "Strongly agree",
        "value": 376,
      },
      {
        "question": "Question 2",
        "type": "Strongly disagree",
        "value": 2,
      },
      {
        "question": "Question 2",
        "type": "Disagree",
        "value": 2,
      },
      {
        "question": "Question 2",
        "type": "Neither agree nor disagree",
        "value": 0,
      },
      {
        "question": "Question 2",
        "type": "Agree",
        "value": 7,
      },
      {
        "question": "Question 2",
        "type": "Strongly agree",
        "value": 11,
      },
      {
        "question": "Question 3",
        "type": "Strongly disagree",
        "value": 2,
      },
      {
        "question": "Question 3",
        "type": "Disagree",
        "value": 0,
      },
      {
        "question": "Question 3",
        "type": "Neither agree nor disagree",
        "value": 2,
      },
      {
        "question": "Question 3",
        "type": "Agree",
        "value": 4,
      },
      {
        "question": "Question 3",
        "type": "Strongly agree",
        "value": 2,
      },

      {
        "question": "Question 4",
        "type": "Strongly disagree",
        "value": 0,
      },
      {
        "question": "Question 4",
        "type": "Disagree",
        "value": 2,
      },
      {
        "question": "Question 4",
        "type": "Neither agree nor disagree",
        "value": 1,
      },
      {
        "question": "Question 4",
        "type": "Agree",
        "value": 7,
      },
      {
        "question": "Question 4",
        "type": "Strongly agree",
        "value": 6,
      },

      {
        "question": "Question 5",
        "type": "Strongly disagree",
        "value": 0,
      },
      {
        "question": "Question 5",
        "type": "Disagree",
        "value": 1,
      },
      {
        "question": "Question 5",
        "type": "Neither agree nor disagree",
        "value": 3,
      },
      {
        "question": "Question 5",
        "type": "Agree",
        "value": 16,
      },
      {
        "question": "Question 5",
        "type": "Strongly agree",
        "value": 4,
      },

      {
        "question": "Question 6",
        "type": "Strongly disagree",
        "value": 1,
      },
      {
        "question": "Question 6",
        "type": "Disagree",
        "value": 1,
      },
      {
        "question": "Question 6",
        "type": "Neither agree nor disagree",
        "value": 2,
      },
      {
        "question": "Question 6",
        "type": "Agree",
        "value": 9,
      },
      {
        "question": "Question 6",
        "type": "Strongly agree",
        "value": 3,
      },

      {
        "question": "Question 7",
        "type": "Strongly disagree",
        "value": 0,
      },
      {
        "question": "Question 7",
        "type": "Disagree",
        "value": 0,
      },
      {
        "question": "Question 7",
        "type": "Neither agree nor disagree",
        "value": 1,
      },
      {
        "question": "Question 7",
        "type": "Agree",
        "value": 4,
      },
      {
        "question": "Question 7",
        "type": "Strongly agree",
        "value": 0,
      },
      {
        "question": "Question 8",
        "type": "Strongly disagree",
        "value": 0,
      },
      {
        "question": "Question 8",
        "type": "Disagree",
        "value": 0,
      },
      {
        "question": "Question 8",
        "type": "Neither agree nor disagree",
        "value": 0,
      },
      {
        "question": "Question 8",
        "type": "Agree",
        "value": 0,
      },
      {
        "question": "Question 8",
        "type": "Strongly agree",
        "value": 2,
      }
])

# Add type_code that we can sort by
source["type_code"] = source.type.map({
    "Strongly disagree": -2, 
    "Disagree": -1, 
    "Neither agree nor disagree": 0,
    "Agree": 1,
    "Strongly agree": 2
})
source

def compute_percentages(df):
    # Set type_code as index and sort
    df = df.set_index("type_code").sort_index()
    
    # Compute percentage of value with question group
    perc = (df["value"] / df["value"].sum()) * 100
    df["percentage"] = perc

    # Compute percentage end, centered on "Neither agree nor disagree" (type_code 0)
    df["percentage_end"] = perc.cumsum() - (perc[-2] + perc[-1] + perc[0] / 2)
    
    # Compute percentage start by subtracting percent
    df["percentage_start"] = df["percentage_end"] - perc

    return df

source = (
    source
    .groupby("question", group_keys=True)
    .apply(compute_percentages)
    .reset_index(drop=True)
)



RE: Negative indexing/selecting working and not working - Larz60+ - Jul-12-2023

Try:
print(perc)
On a list, index -2 is the second-to-last element (-1 is the last), so there must be at least two elements.

For a list, that kind of error means perc does not contain enough elements to access perc[-2].
Example:
>>> perc = []
>>> perc
[]
>>> perc.append(123)
>>> perc
[123]
>>> perc[-2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> perc.append(456)
>>> perc
[123, 456]
>>> perc[-2]
123
>>>



RE: Negative indexing/selecting working and not working - deanhystad - Jul-12-2023

Looking at your code, perc is a pandas Series, not a list. Indexing a Series does not work like indexing a list; it works more like indexing a dictionary. The index value is a key, not a position. The error message would be
Error:
KeyError: -2
This error will occur if you have a question that contains no "Strongly disagree" type.
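
A minimal sketch of that dictionary-like lookup, using made-up percentages (the numbers and the names strongly_disagree, incomplete and lookup_failed are assumptions for illustration, not from the thread). It uses .loc to make the label lookup explicit:

```python
import pandas as pd

# Hypothetical percentages for one question, keyed by type_code.
perc = pd.Series({-2: 10.0, -1: 20.0, 0: 30.0, 1: 25.0, 2: 15.0})

# -2 is matched against the labels, much like a dict key.
strongly_disagree = perc.loc[-2]

# A question with no "Strongly disagree" row has no -2 label at all.
incomplete = perc.drop(labels=-2)
try:
    incomplete.loc[-2]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(strongly_disagree, lookup_failed)
```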

But I don't get an error when I run your code. I get this.
Output:
    question    type                        value  percentage  percentage_end  percentage_start
0   Question 1  Strongly disagree              24    0.746501      -18.382582        -19.129082
1   Question 1  Disagree                      294    9.144635       -9.237947        -18.382582
2   Question 1  Neither agree nor disagree    594   18.475894        9.237947         -9.237947
3   Question 1  Agree                        1927   59.937792       69.175739          9.237947
4   Question 1  Strongly agree                376   11.695179       80.870918         69.175739
5   Question 2  Strongly disagree               2    9.090909       -9.090909        -18.181818
6   Question 2  Disagree                        2    9.090909        0.000000         -9.090909
7   Question 2  Neither agree nor disagree      0    0.000000        0.000000          0.000000
8   Question 2  Agree                           7   31.818182       31.818182          0.000000
9   Question 2  Strongly agree                 11   50.000000       81.818182         31.818182
10  Question 3  Strongly disagree               2   20.000000      -10.000000        -30.000000
11  Question 3  Disagree                        0    0.000000      -10.000000        -10.000000
12  Question 3  Neither agree nor disagree      2   20.000000       10.000000        -10.000000
13  Question 3  Agree                           4   40.000000       50.000000         10.000000
14  Question 3  Strongly agree                  2   20.000000       70.000000         50.000000
15  Question 4  Strongly disagree               0    0.000000      -15.625000        -15.625000
16  Question 4  Disagree                        2   12.500000       -3.125000        -15.625000
17  Question 4  Neither agree nor disagree      1    6.250000        3.125000         -3.125000
18  Question 4  Agree                           7   43.750000       46.875000          3.125000
19  Question 4  Strongly agree                  6   37.500000       84.375000         46.875000
20  Question 5  Strongly disagree               0    0.000000      -10.416667        -10.416667
21  Question 5  Disagree                        1    4.166667       -6.250000        -10.416667
22  Question 5  Neither agree nor disagree      3   12.500000        6.250000         -6.250000
23  Question 5  Agree                          16   66.666667       72.916667          6.250000
24  Question 5  Strongly agree                  4   16.666667       89.583333         72.916667
25  Question 6  Strongly disagree               1    6.250000      -12.500000        -18.750000
26  Question 6  Disagree                        1    6.250000       -6.250000        -12.500000
27  Question 6  Neither agree nor disagree      2   12.500000        6.250000         -6.250000
28  Question 6  Agree                           9   56.250000       62.500000          6.250000
29  Question 6  Strongly agree                  3   18.750000       81.250000         62.500000
30  Question 7  Strongly disagree               0    0.000000      -10.000000        -10.000000
31  Question 7  Disagree                        0    0.000000      -10.000000        -10.000000
32  Question 7  Neither agree nor disagree      1   20.000000       10.000000        -10.000000
33  Question 7  Agree                           4   80.000000       90.000000         10.000000
34  Question 7  Strongly agree                  0    0.000000       90.000000         90.000000
35  Question 8  Strongly disagree               0    0.000000        0.000000          0.000000
36  Question 8  Disagree                        0    0.000000        0.000000          0.000000
37  Question 8  Neither agree nor disagree      0    0.000000        0.000000          0.000000
38  Question 8  Agree                           0    0.000000        0.000000          0.000000
39  Question 8  Strongly agree                  2  100.000000      100.000000          0.000000
Are you sure the error you are getting is associated with the code in your post?

In the future please post the entire error message, including the trace.


RE: Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

In my case/code:
len(perc)
gives 40 elements, so I want to get the second from the end, which would be the value 0.000000.
But when I do perc[-2] it raises:
Error:
ValueError: -2 is not in range
What am I missing?


RE: Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

(Jul-12-2023, 05:05 PM)deanhystad Wrote: Are you sure the error you are getting is associated with the code in your post?

I have included all the code, and you are right about the error, but my question is: why does this code work inside the function compute_percentages but not separately, outside of it?


RE: Negative indexing/selecting working and not working - deanhystad - Jul-12-2023

I don't understand your question. Inside the function perc is a series. It is going to look like this:
type_code
-2     0.746501  <- This is perc[-2]
-1     9.144635
 0    18.475894
 1    59.937792  <- This is not perc[-2]
 2    11.695179
It makes one of these for each group (question).
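
A small sketch of where those labels come from, assuming a single made-up group (the frame `group` and its numbers are hypothetical, not taken from the thread). Before set_index the rows carry the default labels 0..4; afterwards the labels are the type codes themselves, which is why -2 can be looked up inside the function:

```python
import pandas as pd

# Hypothetical single-question group, shaped like one group of `source`.
group = pd.DataFrame({
    "type_code": [2, -2, 0, 1, -1],   # deliberately unsorted
    "value": [10, 20, 30, 25, 15],
})

# Before set_index: default labels 0..4, so there is no label -2.
labels_before = list(group.index)

# After set_index + sort_index: the labels ARE the type codes.
df = group.set_index("type_code").sort_index()
perc = (df["value"] / df["value"].sum()) * 100
labels_after = list(perc.index)

# The same lookups as in compute_percentages, spelled with .loc
# to make it explicit that they go by label.
offset = perc.loc[-2] + perc.loc[-1] + perc.loc[0] / 2

print(labels_before, labels_after, offset)
```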

perc does not exist outside the function. You also cannot call the function on the whole dataframe without grouping first: the row index would contain duplicate labels, so perc[-2] would return several rows instead of one value.

Can you post the code that makes a list named "perc" that contains 40 elements? Is it something like this?
perc = source["percentage"]
print(perc[-2])
This would raise a KeyError because the row indices are 0, 1, ..., 39; -2 is not a valid index label.
If you want an array of the values, ask for that.
perc = source["percentage"].values  # Returns a NumPy array of the values in the "percentage" column.
print(perc[-2])
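
Besides .values, pandas also offers .iloc for positional access directly on the Series, and .to_numpy() as a method for getting the underlying array. A minimal sketch with made-up numbers (the Series below is a hypothetical stand-in for source["percentage"]):

```python
import pandas as pd

# Hypothetical stand-in for source["percentage"]: default labels 0, 1, 2, 3.
perc = pd.Series([5.0, 15.0, 30.0, 50.0])

# .iloc indexes by position, so negative numbers count from the end.
by_position = perc.iloc[-2]

# .to_numpy() returns a NumPy array; the labels are discarded,
# so ordinary positional indexing applies.
as_array = perc.to_numpy()
from_array = as_array[-2]

print(by_position, from_array)
```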



RE: Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

(Jul-12-2023, 05:28 PM)deanhystad Wrote: Can you post the code that makes a list named "perc" that contains 40 elements.

I have posted it previously, but here you are:
perc = (source["value"] / source["value"].sum()) * 100
perc[-2]

I tried to change perc to a dataframe but could not get it to work. I admit that I am still learning Python, so sometimes even basic subjects are a struggle for me. I apologize for these basic questions. What I want to understand is why perc[-2] works inside the function but raises an error when written separately, outside the function.
The code I provided in my first post works perfectly; I just want to understand what is happening in it and why I got those errors when I started to experiment with it.
I hope this clarifies it a bit.


RE: Negative indexing/selecting working and not working - deanhystad - Jul-12-2023

perc is a Series, essentially a single-column dataframe. Indexing a Series uses keys (labels), not positions. If you want positional indexing, get the values. That returns a NumPy array.
perc.values[-2]
When you have questions like this, try printing the thing, or the type of the thing. Printing perc would show you why you cannot do perc[-2].
print(perc)
Output:
0    0.725076
1    8.882175
2   17.945619
3   58.217523
4   11.359517
5    0.060423
6    0.060423
7    0.000000
8    0.211480
9    0.332326
10   0.060423
11   0.000000
12   0.060423
13   0.120846
14   0.060423
15   0.000000
16   0.060423
17   0.030211
18   0.211480
19   0.181269
20   0.000000
21   0.030211
22   0.090634
23   0.483384
24   0.120846
25   0.030211
26   0.030211
27   0.060423
28   0.271903
29   0.090634
30   0.000000
31   0.000000
32   0.030211
33   0.120846
34   0.000000
35   0.000000
36   0.000000
37   0.000000
38   0.000000
39   0.060423
Name: value, dtype: float64
Notice that the row indices do not include -2.
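
The same check can be done without reading the printout. A rough sketch, assuming a tiny made-up `source` with four rows (not the 40-row frame from the thread): inspect the type and the index, then test whether the label exists.

```python
import pandas as pd

# Hypothetical miniature of `source`: rows keep the default labels 0..3.
source = pd.DataFrame({"value": [1, 2, 3, 4]})
perc = (source["value"] / source["value"].sum()) * 100

kind = type(perc).__name__          # the class name of perc
labels = list(perc.index)           # the row labels, as a plain list
has_minus_two = -2 in perc.index    # membership test against the labels

print(kind, labels, has_minus_two)
```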


RE: Negative indexing/selecting working and not working - Andrzej_Andrzej - Jul-12-2023

OK, thank you very much for your kind explanations.

Why, in the function compute_percentages, was it written as:

    # Compute percentage end, centered on "Neither agree nor disagree" (type_code 0)
    df["percentage_end"] = perc.cumsum() - (perc[-2] + perc[-1] + perc[0] / 2)
and not as:
perc.values[-2]
Inside that function, the perc that is created is of class "pandas.core.series.Series", isn't it?

I have included the function below:

def compute_percentages(df):
    # Set type_code as index and sort
    df = df.set_index("type_code").sort_index()
    
    # Compute percentage of value with question group
    perc = (df["value"] / df["value"].sum()) * 100
    df["percentage"] = perc

    # Compute percentage end, centered on "Neither agree nor disagree" (type_code 0)
    df["percentage_end"] = perc.cumsum() - (perc[-2] + perc[-1] + perc[0] / 2)
    
    # Compute percentage start by subtracting percent
    df["percentage_start"] = df["percentage_end"] - perc

    return df



RE: Negative indexing/selecting working and not working - deanhystad - Jul-12-2023

Maybe this will make the indexing clearer.

I made a small change to your code. Instead of computing a numeric type_code, I left type_code as a string ("Strongly disagree", ...). Now when compute_percentages() sets the row index, the row labels are words, not ints. This required a change to the "percentage_end" calculation because there is no perc[-2], perc[-1], or perc[0]. These are now perc["Strongly disagree"], perc["Disagree"] and perc["Neither agree nor disagree"].
source["type_code"] = source["type"]  # Changed so type_code is words, not -2, -1, 0, 1, 2

def compute_percentages(df):
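    # Set type as the index; the row labels are now words, not numbers.
    # No sort here: cumsum() below relies on the rows already being in order.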
    df = df.set_index("type")

    # Compute percentage of value with question group
    perc = (df["value"] / df["value"].sum()) * 100
    df["percentage"] = perc

    # Compute percentage end, centered on "Neither agree nor disagree"
    # Notice that we have to use words to index perc, because the row indices are words, not numbers.
    df["percentage_end"] = perc.cumsum() - (
        perc["Strongly disagree"]
        + perc["Disagree"]
        + perc["Neither agree nor disagree"] / 2
    )

    # Compute percentage start by subtracting percent
    df["percentage_start"] = df["percentage_end"] - perc

    return df
When you computed perc outside the function, you still created a Series. The row indices just happened to be numbers from 0 to 39, but they were still keys, not positions in some array or list.

One more attempt to make this clear. Here I sort the source dataframe by the "value" column.
source = source.sort_values("value")
print(source)
Output:
    question    value  type_code                   percentage  percentage_end  percentage_start
30  Question 7      0  Strongly disagree             0.000000      -10.000000        -10.000000
37  Question 8      0  Neither agree nor disagree    0.000000        0.000000          0.000000
36  Question 8      0  Disagree                      0.000000        0.000000          0.000000
35  Question 8      0  Strongly disagree             0.000000        0.000000          0.000000
34  Question 7      0  Strongly agree                0.000000       90.000000         90.000000
31  Question 7      0  Disagree                      0.000000      -10.000000        -10.000000
38  Question 8      0  Agree                         0.000000        0.000000          0.000000
7   Question 2      0  Neither agree nor disagree    0.000000        0.000000          0.000000
15  Question 4      0  Strongly disagree             0.000000      -15.625000        -15.625000
20  Question 5      0  Strongly disagree             0.000000      -10.416667        -10.416667
11  Question 3      0  Disagree                      0.000000      -10.000000        -10.000000
17  Question 4      1  Neither agree nor disagree    6.250000        3.125000         -3.125000
32  Question 7      1  Neither agree nor disagree   20.000000       10.000000        -10.000000
25  Question 6      1  Strongly disagree             6.250000      -12.500000        -18.750000
26  Question 6      1  Disagree                      6.250000       -6.250000        -12.500000
21  Question 5      1  Disagree                      4.166667       -6.250000        -10.416667
39  Question 8      2  Strongly agree              100.000000      100.000000          0.000000
14  Question 3      2  Strongly agree               20.000000       70.000000         50.000000
27  Question 6      2  Neither agree nor disagree   12.500000        6.250000         -6.250000
12  Question 3      2  Neither agree nor disagree   20.000000       10.000000        -10.000000
10  Question 3      2  Strongly disagree            20.000000      -10.000000        -30.000000
6   Question 2      2  Disagree                      9.090909        0.000000         -9.090909
5   Question 2      2  Strongly disagree             9.090909       -9.090909        -18.181818
16  Question 4      2  Disagree                     12.500000       -3.125000        -15.625000
22  Question 5      3  Neither agree nor disagree   12.500000        6.250000         -6.250000
29  Question 6      3  Strongly agree               18.750000       81.250000         62.500000
13  Question 3      4  Agree                        40.000000       50.000000         10.000000
33  Question 7      4  Agree                        80.000000       90.000000         10.000000
24  Question 5      4  Strongly agree               16.666667       89.583333         72.916667
19  Question 4      6  Strongly agree               37.500000       84.375000         46.875000
18  Question 4      7  Agree                        43.750000       46.875000          3.125000
8   Question 2      7  Agree                        31.818182       31.818182          0.000000
28  Question 6      9  Agree                        56.250000       62.500000          6.250000
9   Question 2     11  Strongly agree               50.000000       81.818182         31.818182
23  Question 5     16  Agree                        66.666667       72.916667          6.250000
0   Question 1     24  Strongly disagree             0.746501      -18.382582        -19.129082
1   Question 1    294  Disagree                      9.144635       -9.237947        -18.382582
4   Question 1    376  Strongly agree               11.695179       80.870918         69.175739
2   Question 1    594  Neither agree nor disagree   18.475894        9.237947         -9.237947
3   Question 1   1927  Agree                        59.937792       69.175739          9.237947
Now 30 is the index of the first row and 3 is the index of the last row. If I print values[3], it prints the last value, not the fourth.
values = source["value"]
print(values[3])
Output:
1927
Notice that it prints the value from the last row, not the 4th row.
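
The same effect on a four-element Series, as a minimal sketch with made-up numbers (the names ordered, by_label and by_position are assumptions for illustration). Sorting moves the rows, but each label travels with its row; .loc follows the label and .iloc follows the position:

```python
import pandas as pd

# Hypothetical values; the default labels are 0, 1, 2, 3.
values = pd.Series([40, 10, 30, 20])
ordered = values.sort_values()      # rows move, labels travel with them

labels = list(ordered.index)        # label order after sorting
by_label = ordered.loc[0]           # the row labelled 0, wherever it ended up
by_position = ordered.iloc[0]       # whichever row is now first

print(labels, by_label, by_position)
```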