Python Forum

Full Version: 2 Dimensional NumPy for beginners
Hi there, I don't understand the answer to this task:

This is the task:

import numpy as np

# Subway ridership for 5 stations on 10 different days
ridership = np.array([
[ 0, 0, 2, 5, 0],
[1478, 3877, 3674, 2328, 2539],
[1613, 4088, 3991, 6461, 2691],
[1560, 3392, 3826, 4787, 2613],
[1608, 4802, 3932, 4477, 2705],
[1576, 3933, 3909, 4979, 2685],
[ 95, 229, 255, 496, 201],
[ 2, 0, 1, 27, 0],
[1438, 3785, 3589, 4174, 2215],
[1342, 4043, 4009, 4665, 3033]
])

if True:
    print(ridership[1:3, 3:5])

Answer:
[[2328 2539]
[6461 2691]]

But why?
If we have [1:3] that means Row: 1 column 3 --> The answer should be: 2328 and not [2328 2539]
If we have [3:5] that means Row: 3 column 5 --> No answer....

Can somebody explain to me, a new Python user, why the answer is correct? Or maybe send me a link to a video tutorial or blog post?

Thank you!
J
If you have an array like arr = np.array([1,2,3,4,5]) and write arr[1:3], you select everything from index 1 up to, but not including, index 3, so you get [2,3]. The : always tells the array where the selection starts and where it stops. You can also add a step: arr[0:4:2] picks everything from index 0 up to index 4 (excluding index 4), taking only every second value, which gives [1,3].
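Here is a small sketch of the two slices described above, assuming NumPy is installed. The names first and stepped are only for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# start:stop selects indices 1 and 2; index 3 is excluded
first = arr[1:3]
# start:stop:step selects indices 0 and 2 (every second value before index 4)
stepped = arr[0:4:2]

print(first)    # [2 3]
print(stepped)  # [1 3]
```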
In your example you have a 2-dimensional array, so ridership[1,3] selects the item at row 1 and column 3. You separate the indices for each dimension (x, y, z, ...) with a comma. Since you select ridership[1:3, 3:5], you pick rows 1:3 (each row from index 1 to 3, excluding index 3) and columns 3:5 (each column from index 3 to 5, excluding index 5). That covers ridership[1,3] == 2328, ridership[1,4] == 2539, ridership[2,3] == 6461 and ridership[2,4] == 2691. What you are doing is slicing your array, and it is extremely efficient compared with using loops :)
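To see that the slice really picks those four values, here is a rough sketch that compares it with the same selection written as explicit loops. It uses only the first three rows of the data from the question, and the names sliced and looped are made up for this example.

```python
import numpy as np

# First three rows of the ridership data from the question
ridership = np.array([
    [   0,    0,    2,    5,    0],
    [1478, 3877, 3674, 2328, 2539],
    [1613, 4088, 3991, 6461, 2691],
])

# Slice: rows 1 and 2, columns 3 and 4
sliced = ridership[1:3, 3:5]

# The same selection written out with explicit loops
looped = [[int(ridership[r, c]) for c in range(3, 5)] for r in range(1, 3)]

print(sliced)
print(sliced.tolist() == looped)  # True
```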
Try to put your code in [python][/python] tags, otherwise the indentation is lost...

The answer is correct. You need to take into account two important aspects of Python (and NumPy):
- Indexes start counting at 0
- Ranges never include the end value, so 1:4 is 1, 2 and 3

So when you write ridership[1:3, 3:5] you are asking for the rows from 1 to 3 excluded (so the 2nd and 3rd rows) and the columns from 3 to 5 excluded (so the 4th and 5th columns).
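One way to check which indices a slice covers is to spell them out with range, which follows the same "start included, end excluded" rule. The sketch below uses a made-up 10x5 array called grid instead of the ridership data, and np.ix_ to select by explicit index lists.

```python
import numpy as np

grid = np.arange(50).reshape(10, 5)  # hypothetical 10x5 array holding 0..49

rows = list(range(1, 3))  # [1, 2] -> 2nd and 3rd rows
cols = list(range(3, 5))  # [3, 4] -> 4th and 5th columns

by_slice = grid[1:3, 3:5]
by_index = grid[np.ix_(rows, cols)]  # same block, selected by explicit index lists

print(rows, cols)
print(np.array_equal(by_slice, by_index))  # True
```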

This may feel strange, but it is really powerful. The reason for not including the upper bound of a slice becomes obvious when you notice that, for a list a, this holds:
a[0:3] + a[3:10] == a[0:10]
And that helps a lot to avoid missing or repeating elements when working with lists or matrices.
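A quick sketch of that identity, with one caveat: for plain lists + concatenates, but for NumPy arrays + adds element by element, so the array version below joins the pieces with np.concatenate instead. The variable names are only illustrative.

```python
import numpy as np

a = list(range(10))
# For lists, + concatenates, so the two pieces rebuild the original
lists_match = a[0:3] + a[3:10] == a[0:10]

arr = np.arange(10)
# For NumPy arrays, + adds element-wise, so join the pieces with np.concatenate
rejoined = np.concatenate([arr[0:3], arr[3:10]])
arrays_match = np.array_equal(rejoined, arr[0:10])

print(lists_match, arrays_match)  # True True
```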