Hello all, this is my first time here so apologies if something is missed or in the wrong place.
I have recently started to learn to code and I am struggling with writing a function that will allow me to get all the data from a column in a data set by calling a function with inputs of the data set and a column number. I am completely stuck and any help would be appreciated.
My function starts like this:
def get_column(data, col_num):
I already have the data, so it does not need to be imported. col_num is the column number I would pass in, for example:
print(get_column(data, 5))
and it would return the 6th column (column numbers start at 0).
Thanks.
What is data? You call it a data set, but I am not familiar with that data type. Is it a Pandas DataFrame? Please provide more information about your data. If possible, provide the code that creates data.
(Oct-06-2021, 06:08 PM) deanhystad wrote: What is data? [...] If possible provide the code that creates data
Sorry, the data is a .csv file. I am not currently using Pandas, but I am using NumPy.
Could you please provide code that shows how you are reading the csv file?
If data is a 2D NumPy array, getting the column at index 2 is as simple as column = data[:, 2]. For example:
import numpy as np
data = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
# Print the columns one at a time.
for i in range(len(data[0])):
    print(i, data[:, i])
Output:
0 [1 5 9]
1 [ 2  6 10]
2 [ 3  7 11]
3 [ 4  8 12]
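For reference, the index [:, i] reads as "all rows, column i". Here is a small sketch on the same array, assuming zero-based column numbers; the variable names are only illustrative and not part of the original code:

```python
import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

n_rows, n_cols = data.shape   # number of rows and columns
second_col = data[:, 1]       # all rows, column index 1
last_col = data[:, -1]        # negative indexes count from the end
first_row = data[0, :]        # row 0, all columns, for comparison
two_cols = data[:, [0, 2]]    # a list of indexes selects several columns

print(second_col)
print(two_cols.shape)
```

Using data.shape[1] for the column count would also work in the loop above in place of len(data[0]).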
I am reading the file like this:
data = np.loadtxt('data.csv', delimiter = ',', skiprows = 1)
I need my function to return a single column of data when I call it,
e.g. print(get_column(data, 3)) should give me only the data from column 4.
If your data.csv file looks like mine, you can get columns the same way. No need to write a function.
import numpy as np
# data.csv looks like this: "these, are, column, headers\n1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12"
data = np.loadtxt('data.csv', delimiter = ',', skiprows = 1)
print(f'np Array\n{data}\n\nColumns')
# Print the columns one at a time.
for i in range(len(data[0])):
    print(i, data[:, i])  # You don't need a function. This does all you need
Output:
np Array
[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]]

Columns
0 [1. 5. 9.]
1 [ 2.  6. 10.]
2 [ 3.  7. 11.]
3 [ 4.  8. 12.]
But if you need to write a function:
def get_column(data, col_num):
    # All rows, column col_num (counting from 0).
    return data[:, col_num]
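If you want the function to fail with a clearer message on bad input, here is a slightly more defensive sketch. It assumes data is a 2D NumPy array and col_num is zero-based (so 3 returns the 4th column). The checks and error messages are my own additions, not something NumPy requires:

```python
import numpy as np

def get_column(data, col_num):
    """Return column col_num (zero-based) of a 2D array."""
    data = np.asarray(data)
    if data.ndim != 2:
        raise ValueError(f"expected a 2D array, got {data.ndim} dimension(s)")
    # Assumption: only non-negative column numbers are accepted here.
    if not 0 <= col_num < data.shape[1]:
        raise IndexError(f"column {col_num} is out of range for {data.shape[1]} columns")
    return data[:, col_num]

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(get_column(data, 3))  # the 4th column: [ 4  8 12]
```

Plain data[:, col_num] would also accept negative indexes such as -1 for the last column; drop the range check if you want that behaviour.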
Thank you so much for the help! It works perfectly now!