Python Forum
build pandas dataframe from a for loop - Printable Version


build pandas dataframe from a for loop - vaison - Apr-14-2018

Hello,

this is my first post about Python, and also my first post in English, so thank you for your patience :) Sorry for my low level of English; I'm practicing to improve.

I want to build a pandas DataFrame, but the row data arrives one row at a time (in a for loop), in the form of a dictionary (or JSON).

In each iteration I receive a dictionary where the keys are the column names and the values are that row's values. For example:

In the 1st iteration I receive:
d_val = {'key1': 1.1, 'key2':2.2, 'key3':3.3}

And I want to build a dataframe (called df) whose column names are d_val.keys() and whose first row is d_val.values().

In the 2nd iteration I receive:
{'key1': 4.4, 'key2':5.5, 'key3': 6.6}

And I want to append these values as a new row in the dataframe.

Desired outcome:

key1   key2   key3
1.1    2.2    3.3
4.4    5.5    6.6

Each new iteration should append one new row. I can be sure that the dict keys are the same in every iteration.

I tried creating an empty dataframe, or a dataframe with only the column names, and then using the DataFrame.append method, but it never works. I'm lost, because I thought it would not be so difficult.

I hope I have explained this well.

Thanks


RE: build pandas dataframe from a for loop - scidam - Apr-14-2018

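The example below should help you:

import pandas as pd

# Start with an empty frame that already has the expected columns.
# Note: DataFrame.append requires pandas < 2.0.
df = pd.DataFrame({'key1': [], 'key2': [], 'key3': []})
for i in range(10):
    # append returns a new frame, so reassign the result to df
    df = df.append({'key1': i, 'key2': i*2, 'key3': i**3}, ignore_index=True)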
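
DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on newer versions the loop above raises an AttributeError. One way to write the same loop there is to collect the dictionaries in a plain list and build the frame once at the end. This is a minimal sketch, not part of the original reply; the name rows is an illustrative choice, and the loop body stands in for whatever produces each dictionary.

```python
import pandas as pd

# Collect one dictionary per iteration in an ordinary Python list.
rows = []
for i in range(10):
    # Stand-in for the dictionary received in each iteration.
    rows.append({'key1': i, 'key2': i * 2, 'key3': i ** 3})

# Build the frame once; the dict keys become the column names.
df = pd.DataFrame(rows)
print(df)
```

Building the frame once also avoids copying the whole frame on every iteration, which is what a row-by-row append does.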



RE: build pandas dataframe from a for loop - vaison - Apr-14-2018

Hello scidam,

I've adapted it to my program and it works! Thank you.

But in my first message I simplified the problem a little... ;) I was afraid of not explaining myself well, so I simplified it.

What I really need is this, but I don't know a priori how many items the dict will have. So I don't know whether I will have 'key1', 'key2'... 'keyn'.

In each iteration I will receive a message with the information. When I receive the first message (in the first iteration), I analyze it and see the dict/JSON items. Before that, I don't know how many columns the dataframe will have.

Is there a more generic way?

Thanks


RE: build pandas dataframe from a for loop - scidam - Apr-14-2018

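Pandas is quite smart: a column is added automatically for each new key, and missing values are filled with NaN. You can even start with an empty data frame, e.g.

import pandas as pd
import random

# Start from a completely empty frame; columns are created as new keys appear.
# Note: DataFrame.append requires pandas < 2.0.
df = pd.DataFrame()
for j in range(10):
    # each row has a single, randomly chosen key
    df = df.append({'key_%s'%random.choice('abcde'): j}, ignore_index=True)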
Output:
>>> df
   key_d  key_b  key_e  key_c
0    0.0    NaN    NaN    NaN
1    1.0    NaN    NaN    NaN
2    2.0    NaN    NaN    NaN
3    NaN    3.0    NaN    NaN
4    NaN    NaN    4.0    NaN
5    5.0    NaN    NaN    NaN
6    6.0    NaN    NaN    NaN
7    NaN    NaN    7.0    NaN
8    NaN    NaN    NaN    8.0
9    NaN    NaN    NaN    9.0
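
The same behaviour is available without DataFrame.append (which is gone in pandas 2.0 and later): when a frame is built from a list of dictionaries, pandas takes the union of all keys as the columns and fills the gaps with NaN. The sketch below assumes that approach; the messages list is hypothetical sample data standing in for the messages received in each iteration.

```python
import pandas as pd

# Hypothetical messages; each one may carry a different set of keys.
messages = [
    {'key_a': 0},
    {'key_b': 1},
    {'key_a': 2, 'key_c': 3},
]

# Collect the dictionaries as they arrive; no column names are needed up front.
records = []
for message in messages:
    records.append(message)

# Columns are the union of all keys seen; missing entries become NaN.
df = pd.DataFrame(records)
print(df)
```

Because NaN is a floating-point value, columns with missing entries come out as floats, which is why the output above shows 0.0 rather than 0.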



RE: build pandas dataframe from a for loop - vaison - Apr-14-2018

Hello scidam,

great! Thank you!