May-09-2021, 06:46 PM
(This post was last modified: May-09-2021, 06:52 PM by Yoriz. Edit Reason: Added code tags)
Hi, I have a dataset with several columns (time series) and I would like to synchronize them, with 'col2' as the reference.
![[Image: dzJE1.png]](https://i.stack.imgur.com/dzJE1.png)
Here is my df:
![[Image: VDXo9.png]](https://i.stack.imgur.com/VDXo9.png)
With the code below I am able to synchronize only two columns: 'col3' is aligned to 'col2' (both are time series).
```python
import numpy as np
import pandas as pd
from fastdtw import fastdtw                    # pip install fastdtw
from scipy.spatial.distance import euclidean

# Example data: 'col2' is the reference series
df = pd.DataFrame({
    'ID': range(0, 25),
    'col2': np.random.randn(25) + 3,
    'col3': np.random.randn(25) + 3,
    'col4': np.random.randn(25) + 3,
    'col5': np.random.randn(25) + 3
})

x = np.array(df['col2'].fillna(0))
y = np.array(df['col3'].fillna(0))

# path is a list of (index into x, index into y) pairs
distance, path = fastdtw(x, y, dist=euclidean)

# Pair each reference row with the 'col3' value it was matched to
result = []
for i, j in path:
    result.append([df['ID'].iloc[i], df['col2'].iloc[i], df['col3'].iloc[j]])

df_synchronized = pd.DataFrame(data=result, columns=['ID', 'col2', 'col3']).dropna()
df_synchronized = df_synchronized.drop_duplicates(subset=['ID'])  # keep first match per ID
df_synchronized = df_synchronized.sort_values(by='ID')
df_synchronized = df_synchronized.reset_index(drop=True)
df_synchronized.head(n=3)
```
Here is the df_synchronized:
![[Image: 0SpmE.png]](https://i.stack.imgur.com/0SpmE.png)
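To make clear what ends up in df_synchronized, here is a tiny sketch of how the warp path is turned into rows. The `path`, `reference` and `other` values are made up by hand for illustration (they are not real fastdtw output); the point is only that each reference index keeps the first partner it was matched with, which is the effect of `drop_duplicates(subset=['ID'])` in the code above.

```python
# Hypothetical warp path, written by hand for illustration only.
# Each pair is (index into the reference series, index into the other series).
path = [(0, 0), (1, 0), (2, 1), (2, 2), (3, 3)]

reference = [3.1, 2.9, 3.4, 3.0]   # stands in for 'col2'
other = [3.0, 3.3, 3.5, 2.8]       # stands in for 'col3'

# Keep only the first partner of every reference index.
first_match = {}
for i, j in path:
    first_match.setdefault(i, j)

# One row per reference index: (index, reference value, matched value)
synchronized = [(i, reference[i], other[j]) for i, j in sorted(first_match.items())]
print(synchronized)
# [(0, 3.1, 3.0), (1, 2.9, 3.0), (2, 3.4, 3.3), (3, 3.0, 2.8)]
```

Reference index 2 is matched twice, to positions 1 and 2 of the other series, and only the first match (3.3) survives.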
I would like to iterate over all columns in the DataFrame and do for 'col4' and 'col5' what was done for 'col3'. Put simply, 'col3' needs to be replaced in a loop by 'col4' and then 'col5'. The goal is a df_synchronized that contains all columns from df.
Is there any way to do this?
The line `distance, path = fastdtw(x, y, dist=euclidean)` can't simply be changed to `distance, path = fastdtw(x, y, z, aa, dist=euclidean)`. The synchronization needs to be done on one column, saved into df_synchronized, then repeated with the next column, and so on.
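One possible way to do the per-column loop is sketched below; it is an untested idea, not a definitive answer. The helper names `warp_path` and `synchronize` are hypothetical. Two assumptions differ from the code above: the distance is passed as `abs(a - b)` (the same value as the Euclidean distance for single samples), and if fastdtw is not installed a plain exact DTW is used as a stand-in, so the matched values may differ slightly from fastdtw's approximation.

```python
import numpy as np
import pandas as pd

try:
    from fastdtw import fastdtw  # pip install fastdtw

    def warp_path(x, y):
        # Assumption: abs difference as the distance between two samples
        _, path = fastdtw(x, y, dist=lambda a, b: abs(a - b))
        return path
except ImportError:
    def warp_path(x, y):
        # Stand-in: plain exact DTW (O(n*m)), not the fastdtw approximation
        n, m = len(x), len(y)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                best = min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
                cost[i, j] = abs(x[i - 1] - y[j - 1]) + best
        # Walk back from the end to recover the (i, j) index pairs
        i, j, path = n, m, []
        while i > 0 and j > 0:
            path.append((i - 1, j - 1))
            _, i, j = min((cost[i - 1, j - 1], i - 1, j - 1),
                          (cost[i - 1, j], i - 1, j),
                          (cost[i, j - 1], i, j - 1))
        return path[::-1]


def synchronize(df, col, reference='col2', id_col='ID'):
    """Align one column to the reference; returns a frame with [id_col, col]."""
    x = df[reference].fillna(0).to_numpy()
    y = df[col].fillna(0).to_numpy()
    rows = [(df[id_col].iloc[i], df[col].iloc[j]) for i, j in warp_path(x, y)]
    return (pd.DataFrame(rows, columns=[id_col, col])
              .dropna()
              .drop_duplicates(subset=[id_col]))  # keep first match per ID


np.random.seed(0)  # only to make the example repeatable
df = pd.DataFrame({
    'ID': range(0, 25),
    'col2': np.random.randn(25) + 3,
    'col3': np.random.randn(25) + 3,
    'col4': np.random.randn(25) + 3,
    'col5': np.random.randn(25) + 3
})

# Start from the reference and add one synchronized column per iteration
df_synchronized = df[['ID', 'col2']].dropna()
for col in ['col3', 'col4', 'col5']:
    df_synchronized = df_synchronized.merge(synchronize(df, col), on='ID', how='inner')

df_synchronized = df_synchronized.sort_values(by='ID').reset_index(drop=True)
print(df_synchronized.head(n=3))
```

Each column is still warped against 'col2' on its own, so fastdtw only ever sees two series at a time; the merge on 'ID' is what collects the results into a single df_synchronized.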