Python Forum
Homework using only the numpy package
#1
Hi, I'm having difficulty with my loop and would appreciate any help.

My isnumeric() call is not working; I just need to show that isnumeric is False.

My loop is meant to count the total number of unique values in every column of a CSV file.
Thanks.

import numpy as np

### Read the hdb resale price index csv file with the genfromtxt() function
hdbrpi = "CA1data/housing-and-development-board-resale-price-index-1q2009-100-quarterly.csv"
data = np.genfromtxt(hdbrpi, delimiter=",", skip_header=1, dtype=[('quarter', 'U50'), ('index', 'U50')])

### Print out total rows and columns of data in the file
print("***HDB Resale Price Index***")
print()
print(f"There are {len(data)} rows and {len(data[0])} columns of data in this dataset {hdbrpi}")
print()

### Print out the names of the columns in the file
print("The names of the columns are:")

# The column names come from the dtype passed to genfromtxt(), so the file does not need to be reopened
for name in data.dtype.names:
    # A column name such as 'quarter' contains letters, so str.isnumeric() returns False
    print(name, type(name), "isnumeric:", name.isnumeric())

### Count the unique values in every column
for name in data.dtype.names:
    unique_elements, counts_elements = np.unique(data[name], return_counts=True)
    print(f"{name}: {len(unique_elements)} unique values")
 
Reply
#2
You can use pandas to load the doc and find the number of unique values, e.g.:

import pandas as pd
data = pd.read_csv('path_to_your_csv_file.csv')
uniques_by_column = {col: len(data.loc[:, col].unique()) for col in data.columns}
print(uniques_by_column) # this dictionary contains the number of unique values in each column
Reply
#3
Hi,

Thanks for the reply. I forgot to mention this, but I can only use the numpy package for this homework. I am not allowed to use the pandas package.
Reply
#4
The pandas package relies heavily on NumPy,
so the solution to the problem will be almost the same:

import numpy as np
# dtype=str keeps text columns readable; skiprows=1 skips the header row
data = np.loadtxt('path_to_your_csv_file.csv', delimiter=',', dtype=str, skiprows=1) # check delimiter
# data assumed to be 2D array
uniques_by_column = {j: len(np.unique(data[:, j])) for j in range(data.shape[-1])}
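
If you keep the structured array from genfromtxt() in the first post, the same idea can be applied per named column. The following is only a rough sketch: the sample rows are made up (the real file is not shown in this thread), and unique_counts and all_values_numeric are hypothetical helper names, not part of NumPy.

```python
import io
import numpy as np

# Made-up stand-in for the CSV file; the real contents are not shown in the thread.
sample_csv = io.StringIO(
    "quarter,index\n"
    "2009-Q1,100.0\n"
    "2009-Q2,101.4\n"
    "2009-Q3,101.4\n"
)

# Same call shape as in the first post: skip the header, read both columns as text.
data = np.genfromtxt(
    sample_csv,
    delimiter=",",
    skip_header=1,
    dtype=[("quarter", "U50"), ("index", "U50")],
)


def unique_counts(table):
    """Map each column name to the number of distinct values in that column."""
    return {name: len(np.unique(table[name])) for name in table.dtype.names}


def all_values_numeric(table):
    """Map each column name to whether every value passes str.isnumeric()."""
    return {
        name: all(value.isnumeric() for value in table[name])
        for name in table.dtype.names
    }


print(unique_counts(data))
print(all_values_numeric(data))
```

Note that str.isnumeric() is False for any string containing a character that is not numeric, such as "-", "Q" or ".", so a value like "101.4" also reports False even though it looks like a number.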
Reply
#5
Hi,

Thank you so much, I will use your code for my homework.
Reply

