Python Forum
is writing a function a pythonic thing to do?
#1
So I'm new to Python and obviously new to "pythonic" code.
The problem I am facing is converting a DataFrame value to a tuple.

For example, a feature called MSZoning has the following options:
A, C, FV, I, RH, RL, RP, RM

I want to convert any value of that feature to a tuple,
for example if the value for MSZoning in a row is "RH" I want to convert it to: (0, 0, 0, 0, 1, 0, 0, 0).

The code I have right now is this:

# converting string values to tuples
train_df["Zone"] = train_df['MSZoning'].map({
    'A':  (1, 0, 0, 0, 0, 0, 0, 0),
    'C':  (0, 1, 0, 0, 0, 0, 0, 0),
    'FV': (0, 0, 1, 0, 0, 0, 0, 0),
    'I':  (0, 0, 0, 1, 0, 0, 0, 0),
    'RH': (0, 0, 0, 0, 1, 0, 0, 0),
    'RL': (0, 0, 0, 0, 0, 1, 0, 0),
    'RP': (0, 0, 0, 0, 0, 0, 1, 0),
    'RM': (0, 0, 0, 0, 0, 0, 0, 1),
})
# dropping the unnecessary column (drop returns a new DataFrame, so reassign it)
train_df = train_df.drop(columns=['MSZoning'])
Now I want to put more than 10 features through the same procedure, which seems tedious and not very pythonic.

"Write a function!" my Java-oriented brain said.
"It will take a DF, a feature name and a new name as input and will return the DF after converting the column's values!"

This is the "half-pythonic" function I wrote:

def feature_to_boolean_tuple(df, feature_name, new_name):

    tuple_list = [] #each tuple will represent an option
    feature_options = df[feature_name].unique()
    feature_options_length = len(feature_options)

    # creating a list the size of feature_options_length, all 0's
    list_to_be_tuple = [0 for i in range(feature_options_length)]

    for i in range(feature_options_length):
        list_to_be_tuple[i] = 1 # inserting 1 representing option number i
        tuple_list.append(tuple(list_to_be_tuple))
        list_to_be_tuple[i] = 0

    mapping = dict(zip(feature_options, tuple_list)) # dict from values to vectors
    df[new_name] = df[feature_name].map(mapping)
    df.drop([feature_name], axis=1, inplace=True)
But is it really the pythonic way of solving this problem?
Is writing small functions yourself something a Python developer would do?

Thank you so much for taking the time to help.
#2
Avivlevi815 Wrote: is writing small functions by yourself something a python developer will do?
It is a very good thing to do in almost every programming language, and this includes Python. Your function can be shortened because
>>> import numpy as np
>>> dict(zip(['A', 'B', 'C'], (tuple(x) for x in np.eye(3, dtype=int))))
{'C': (0, 0, 1), 'B': (0, 1, 0), 'A': (1, 0, 0)}
Using this, you can remove lines 7 to 14 of your function (the zero-filled list and the loop that builds tuple_list).
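As a rough sketch of what the shortened function might look like with the np.eye idea folded in (the sample DataFrame at the bottom is made up purely for illustration, and the int(v) conversion is my own addition so the tuples hold plain Python ints):

```python
import numpy as np
import pandas as pd


def feature_to_boolean_tuple(df, feature_name, new_name):
    # Distinct values of the column, in order of first appearance.
    feature_options = df[feature_name].unique()
    # Each row of the identity matrix is the vector for one option.
    # int(v) turns NumPy integers into plain Python ints.
    tuple_list = [tuple(int(v) for v in row)
                  for row in np.eye(len(feature_options), dtype=int)]
    mapping = dict(zip(feature_options, tuple_list))  # dict from values to vectors
    df[new_name] = df[feature_name].map(mapping)
    df.drop([feature_name], axis=1, inplace=True)


# Hypothetical sample data, not taken from the original post.
train_df = pd.DataFrame({"MSZoning": ["RL", "RH", "RL", "FV"]})
feature_to_boolean_tuple(train_df, "MSZoning", "Zone")
print(train_df)
```

One thing to keep in mind: the position of the 1 depends on the order in which values first appear in that particular DataFrame, so two DataFrames with different value orders would get different encodings.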

There may be even shorter ways to get the same result. Unfortunately, I don't know the Pandas library well enough ;)
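One candidate for such a shorter way is probably pandas' own get_dummies. Note that this is a different technique from the tuple mapping above: it produces one 0/1 column per option instead of a single column of tuples, so it only fits if separate columns are acceptable. A minimal sketch, with made-up sample data and a "Zone" prefix chosen just for illustration:

```python
import pandas as pd

# Hypothetical sample data, not taken from the original post.
train_df = pd.DataFrame({"MSZoning": ["RL", "RH", "RL", "FV"]})

# One indicator column per distinct value, named Zone_<value>.
dummies = pd.get_dummies(train_df["MSZoning"], prefix="Zone", dtype=int)

# Replace the original column with the indicator columns.
train_df = pd.concat([train_df.drop(columns=["MSZoning"]), dummies], axis=1)
print(train_df)
```

For more than 10 features, get_dummies also accepts a whole DataFrame together with a columns list, which would avoid calling it once per feature.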