Python Forum
dynamically create variables' names in python
#1
Hi guys,
I want to create variables in the following way:

assign a name (e.g. var_1), then append that name to a prefix to form the variable name:

name = 'var_1'
this_is_+name = pd.DataFrame()
The outcome I would like is to have the variable named on the fly, as if I had written:

this_is_var_1 = pd.DataFrame()
The error I get is:

Error:
File "<ipython-input-28-322e5421cff2>", line 4
    df+name = pd.DataFrame()
    ^
SyntaxError: can't assign to operator
I tried many different ways, including 'print', but nothing works and I can't find a solution on the web.
Maybe I am formulating the question in the wrong way?
Thanks for your help.
Reply
#2
Although it is possible, creating variable names dynamically is a really bad idea. Use a proper data structure such as a dict or a list instead (your question implies that you will have multiple variables that you want to create dynamically).
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein

Reply
#3
this_is_+name = pd.DataFrame()

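This statement raises a SyntaxError because the + operator appears on the left-hand side (LHS) of the = sign: the target of an assignment must be a name (or an attribute, index, or slice), not an expression.
As Buran replied, it's not a good idea to create variables at run time.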
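To make the point above concrete, here is a minimal sketch. It assumes a hypothetical make_frame() helper standing in for pd.DataFrame() so that it runs without pandas. It shows that the original line is rejected at compile time, while the same string expression works fine as a dictionary key. The exact wording of the error message depends on the Python version.

```python
# Hypothetical stand-in for pd.DataFrame(), so the sketch runs without pandas.
def make_frame():
    return {"rows": []}

name = "var_1"

# 1) The original line is rejected when it is compiled, before anything runs.
try:
    compile("this_is_+name = make_frame()", "<example>", "exec")
    compiled = True
except SyntaxError as exc:
    compiled = False
    print("SyntaxError:", exc.msg)

# 2) A string expression is allowed as a dictionary key.
frames = {}
frames["this_is_" + name] = make_frame()
print(list(frames))
```

The dictionary key plays the role of the variable name, and it can be built from any expression.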
Reply
#4
Just to add why dynamically creating variable names is not such a good idea: presumably you want to access the variables dynamically as well. You can't do that easily if you keep data in your variable names instead of in data structures such as list or dict.

More elaborate and elegant explanations:

Keep data out of your variable names

Why you don't want to dynamically create variables
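
As a rough sketch of what "accessing dynamically" looks like with a dict: the years and the placeholder lists below are assumptions for illustration only (in the real case the values would be DataFrames).

```python
# Hypothetical years; placeholder lists stand in for the per-year DataFrames.
years = range(2001, 2004)
frames = {f"df_{year}": [] for year in years}

# Names are built at run time and used to look the data up again.
wanted = "df_" + str(2002)
frames[wanted].append("some row")

# Iterating over all "variables" needs no knowledge of their names.
for frame_name, rows in frames.items():
    print(frame_name, len(rows))
```

With separate variables named df_2001, df_2002, ... neither the lookup by a computed name nor the loop over all of them would be this direct.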
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Reply
#5
Thanks for your recommendations, guys.
I created a list in the following way:

dfs = []
for df_gen in np.arange(min_range_v, max_range_v):
    df_name = 'df_'+str(int(df_gen+min_year))
    dfs.append(df_name)
dfs
It returns this:
Output:
['df_2001', 'df_2002', 'df_2003', 'df_2004', 'df_2005', 'df_2006', 'df_2007', 'df_2008', 'df_2009', 'df_2010', 'df_2011', 'df_2012', 'df_2013', 'df_2014', 'df_2015', 'df_2016', 'df_2017', 'df_2018', 'df_2019']
Between the code above and the code below there is some code that populates the dataframe df.
The list above contains exactly the dataframe names I want to create, BUT when I try to access them I can't use the names (for example df_2001); instead I must use dfs[0]. That creates an issue, because the data I add in each iteration of the for loop gets mixed with the data from the previous iterations.
This is what I do after the above:
To insert the new data in each iteration, I created a new df: I reset it at the beginning of each iteration, populate it, and copy it into the list at the end of the iteration:

some dynamically generated code to populate the dataframe df.
Then, at the end of the loop body (just before the iteration ends):

dfs[i] = df.copy()
But this way I can't obtain several different dataframes, and additionally, if I try to call

df_2001
I get the following:

Error:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-27-81242f59523c> in <module>
----> 1 df_2001

NameError: name 'df_2001' is not defined
I also tried the following:
str(dfs[0])
but it still doesn't create a variable/dataframe name...
I am really struggling with it...

If I call
dfs[0]
I get the dataframe, but I can't use the name df_2001,
and if I call
dfs[0].shape
I get the output below, which is far larger than dfs[0] is supposed to be: it should contain at most 365 rows (days).
Output:
(4784, 62)
The same output appears for every other dfs[i].

How can I sort this out so that I can access each dfs[i] as if it were an independent dataframe?
And, if possible, how can I access them by name (df_2001 etc.)?
Many thanks in advance for your great help,
Marco.
Reply
#6
Simple question: do you really need separate dataframes for each year? You can add a 'year' column and have one dataframe. Why bother with so many dataframes when you have a maximum of 365 * 19 = 6935 rows in total?
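
A minimal sketch of the single-dataframe idea, assuming pandas is installed. The "date" and "value" columns and the sample rows are made up for illustration and are not the poster's real data.

```python
import pandas as pd

# Hypothetical sample data: a few daily rows spanning two years.
df = pd.DataFrame({
    "date": pd.to_datetime(["2001-01-01", "2001-01-02", "2002-01-01"]),
    "value": [1.0, 2.0, 3.0],
})

# Derive the year from the date instead of encoding it in a variable name.
df["year"] = df["date"].dt.year

# Select one year on demand...
df_2001 = df[df["year"] == 2001]
print(df_2001.shape)

# ...or process every year in turn.
for year, group in df.groupby("year"):
    print(year, len(group))
```

Filtering or grouping on the 'year' column gives the per-year view whenever it is needed, without keeping 19 separate objects in sync.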
Reply
#7
Here are four ways to access data symbolically in Python, the first two being the most common:

1) Using a dictionary
data = {}
data['df_2001'] = "spam"    # create, update
print(data['df_2001'])      # read
del data['df_2001']         # delete
2) Using a class instance
class Data:
    pass
data = Data()
data.df_2001 = "spam"    # create, update
print(data.df_2001)      # read
del data.df_2001         # delete
setattr(data, "df_2001", "ham")   # create, update
getattr(data, "df_2001")    # read
delattr(data, "df_2001")    # delete
3) Using directly a class
class data:
    pass
data.df_2001 = "spam"    # create, update
print(data.df_2001)      # read
del data.df_2001         # delete
setattr(data, "df_2001", "ham")   # create, update
getattr(data, "df_2001")    # read
delattr(data, "df_2001")    # delete
4) Creating a global variable (usually frowned upon as stated above)
globals()["df_2001"] = "spam"   # create, update
print(df_2001)   # read
del df_2001      # delete
df_2001 = "ham"    # create, update
globals()["df_2001"] # read
del globals()["df_2001"]   # delete
Reply
#8
Thank you so much, all.
Eventually I found a way to call each individual df from within the list through its index, e.g.:

dfs[i]

To get rid of the thousands of unwanted rows, I had to create a for loop in which the temporary dataframe is restarted from scratch at the beginning of each iteration.

At the end of each iteration I copy the temporary df into dfs[i].

This way it works as I wanted, except that I can't generate the names df_2001 etc., but that doesn't really matter anymore at this point.
Thanks a lot for your help and inspiration!
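
For anyone landing here later, the loop described in this post might look roughly like the sketch below, assuming pandas is installed. The years, the rows_per_year data and the "value" column are hypothetical stand-ins for the real input. Storing the result in a dict instead of a list also gives access by the names df_2001 etc.

```python
import pandas as pd

# Hypothetical inputs standing in for the poster's real data.
years = [2001, 2002, 2003]
rows_per_year = {2001: [1, 2], 2002: [3], 2003: [4, 5, 6]}

dfs = {}
for year in years:
    # Start from a fresh frame on every iteration so rows never accumulate.
    df = pd.DataFrame({"value": rows_per_year[year]})
    df["year"] = year
    dfs[f"df_{year}"] = df

print(dfs["df_2001"].shape)
print({name: len(frame) for name, frame in dfs.items()})
```

Because a new DataFrame object is created inside the loop, each entry in dfs is independent and no explicit copy is needed in this sketch.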
Reply

