Python Forum
How to add data to the categorical index of dataframe as data arrives?
#1
Python 3.7.3, pandas 0.25.1

I need to collect per-client statistics, so I created a DataFrame in which I group the necessary data for each client:

import csv
import numpy as np
import pandas as pd

# Create an empty dataframe indexed by client
dtype = np.dtype([('day_begin', 'u4'), ('day_end', 'u4'), ('price_begin', 'f4'), ('price_end', 'f4'), ('Client', 'U13')])
auxiliary_array = np.empty(0, dtype=dtype)
periods_clients = pd.DataFrame(auxiliary_array)
periods_clients.set_index(['Client'], inplace=True)

# Fill the dataframe from the file
with open(path_file) as csv_file:
    fieldnames = ['Date', 'Client', 'Price']
    reader = csv.DictReader(csv_file, fieldnames=fieldnames, delimiter=';')
    for dict_str in reader:
        Client = dict_str['Client']
        # Assumption: Date and Price hold plain numeric text;
        # the original post does not show how these two values were derived
        current_date = int(dict_str['Date'])
        current_price = float(dict_str['Price'])

        if Client not in periods_clients.index:
            periods_clients.loc[Client] = [current_date, current_date, current_price, current_price]
        else:
            # A single .loc call with row and column; chained assignment
            # (.loc[Client].day_end = ...) may write to a temporary copy
            periods_clients.loc[Client, 'day_end'] = current_date
            periods_clients.loc[Client, 'price_end'] = current_price
Client is a string field, and I suspect that is why the program runs for a very long time. I tried to replace this field with a categorical variable, but that failed because I could not add categories while reading data from the file (and I do not know all the clients in advance).

What should I do?

If I create a categorical index, I get an error:

#create dataframe
dtype=np.dtype([('day_begin','u4'), ('day_end','u4'), ('price_begin','f4'), ('price_end','f4')])
auxiliary_array = np.empty(0, dtype=dtype)
periods_clients = pd.DataFrame(auxiliary_array)

periods_clients['Client'] = pd.Series('Client', dtype='category')
periods_clients.set_index(['Client'], inplace=True)
....
#fill dataframe from file
...
if Client not in periods_clients.index:
    periods_clients.index.add_categories(Client, inplace=True) #ERROR!!!
Error:
ValueError: cannot use inplace with CategoricalIn
Therefore, I can't add values to the categorical index as they become available.
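
A minimal sketch of one way around this error, assuming `add_categories` is called without `inplace` and the index it returns is assigned back to the DataFrame. The sample client names and values are hypothetical, not taken from the post.

```python
import pandas as pd

# Hypothetical one-row frame with a categorical Client index
index = pd.CategoricalIndex(["alice"], name="Client")
periods_clients = pd.DataFrame({"day_end": [1], "price_end": [10.0]}, index=index)

new_client = "bob"  # hypothetical client that shows up later in the file
if new_client not in periods_clients.index.categories:
    # add_categories returns a new index instead of modifying the old one,
    # so the result has to be assigned back explicitly
    periods_clients.index = periods_clients.index.add_categories([new_client])

categories = list(periods_clients.index.categories)
```

This only registers the new category. The row for the new client still has to be added in a separate step.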
#2
Maybe there is some way?
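
One possible way, sketched under assumptions: the slow part is probably the row-by-row enlargement through `.loc`, which rebuilds the frame on each new client, rather than the string key itself. The sketch below collects the values in a plain dict and builds the DataFrame once at the end. The helper name `collect_periods`, the sample data, and the assumption that `Date` and `Price` are plain numeric text are all hypothetical.

```python
import csv
import io

import pandas as pd


def collect_periods(csv_file):
    """Return one row per client with first/last day and first/last price."""
    fieldnames = ["Date", "Client", "Price"]
    reader = csv.DictReader(csv_file, fieldnames=fieldnames, delimiter=";")
    periods = {}  # client -> [day_begin, day_end, price_begin, price_end]
    for row in reader:
        day = int(row["Date"])       # assumption: integer day number
        price = float(row["Price"])  # assumption: plain decimal text
        record = periods.get(row["Client"])
        if record is None:
            # First time this client is seen: begin and end coincide
            periods[row["Client"]] = [day, day, price, price]
        else:
            # Client already known: only the end values move
            record[1] = day
            record[3] = price
    frame = pd.DataFrame.from_dict(
        periods,
        orient="index",
        columns=["day_begin", "day_end", "price_begin", "price_end"],
    )
    frame.index.name = "Client"
    return frame


# Hypothetical sample input in the same Date;Client;Price layout
sample = io.StringIO("1;alice;10.0\n2;bob;5.5\n3;alice;12.0\n")
periods_clients = collect_periods(sample)
```

With this layout the set of clients does not need to be known in advance, and the index can still be converted to a categorical one after the frame is built if that is needed.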