![]() |
Select data from between two timestamps - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Data Science (https://python-forum.io/forum-44.html) +--- Thread: Select data from between two timestamps (/thread-24063.html) Pages:
1
2
|
Select data from between two timestamps - Makada - Jan-29-2020 Hi, I am currently reading data from a .dat file as CSV. With the script below i manage to get data from the source, but i cant get data when i select to get data between 2 timestamps. For example data between 2020-01-20 12:00:00 and 2020-01-25 12:15:00 With the script below i get the whole datarange from the source, see second field showing the output from the script. Maybe someone have a hint on how to accomplish that? Thanks. import pandas as pd import numpy as np import datetime as dt import csv df = pd.read_csv("C:\\Users\\Makada\\Desktop\\CR1000_Table1 - kopie .dat", skiprows=1, index_col=['TIMESTAMP'], parse_dates=True, header=0, usecols=['TIMESTAMP','WS_kph_Max']) df['2020-01-20 20:59:00':'2020-01-22 21:01:00'] print(df) Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 22:39:24) [MSC v.1916 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license()" for more information. >>> =========== RESTART: C:\Users\Makada\Desktop\pppooooooooooooooooo.py =========== WS_kph_Max TIMESTAMP TS kilometers/hour NaN Max 2020-01-19 20:59:00 4.609 2020-01-19 21:00:00 4.294 2020-01-19 21:01:00 4.24 ... ... 2020-01-29 16:30:00 20.09 2020-01-29 16:31:00 30.92 2020-01-29 16:32:00 24.65 2020-01-29 16:33:00 21.34 2020-01-29 16:34:00 25.74 [14138 rows x 1 columns] RE: Select data from between two timestamps - Larz60+ - Jan-30-2020 here's an example to see if date lies between two others: example: https://stackoverflow.com/a/5464467 RE: Select data from between two timestamps - Makada - Jan-30-2020 Hi, Thanks for your reply. But with the examples given in the link i dont know how to get it to work with my code. My knowledge of python is limited ![]() Thanks. I managed with the code below to get the desired dates, but the values given for that dates (WS_kph_Max) doesnt match from the source. Print(df['2020-01-20 20:59:00':'2020-01-22 21:01:00'])Output: WS_kph_Max TIMESTAMP 2020-01-20 20:59:00 0.078 2020-01-20 21:00:00 0.078 2020-01-20 21:01:00 0.078 Yes its working. ![]() Now lets see if i can get it to plot. Thanks. RE: Select data from between two timestamps - Makada - Jan-30-2020 I see the data type of timestamp and WS_kph_Max are "object". Do i have to convert the timestamp to "datetime64" and WS_kph_Max to "float64" before i start trying to plot them ? Thanks. RE: Select data from between two timestamps - Makada - Jan-31-2020 Hi, I have changed my code a bit, i can plot the data, but when i add code to have data plot in between 2 dates, i get an error.(for error, see second view) import pandas as pd import numpy as np import datetime as dt import csv from pandas import read_csv from datetime import datetime import matplotlib import matplotlib.pyplot as plt df = pd.read_csv("C:\\Users\\Makada\\Desktop\\CR1000_Table1 - kopie - kopie.dat",skiprows=1, parse_dates=[0]) df=pd.DataFrame(df, columns=['TIMESTAMP','RH_Avg']) df['TIMESTAMP']=pd.to_datetime(df.TIMESTAMP, errors='coerce') df['RH_Avg']=pd.to_numeric(df.RH_Avg, errors='coerce') df=df.dropna()#to remove Nan values df.info() print(df) print(df.dtypes) start_date = pd.to_datetime('2020-01-19 20:59:00', dayfirst=True) end_date = pd.to_datetime('2020-01-19 23:00:00', dayfirst=True) (df[(df.index > start_date) & (df.index > end_date)]) df.plot('TIMESTAMP','RH_Avg',kind='line', linewidth=0.5) plt.show() <class 'pandas.core.frame.DataFrame'> Int64Index: 181 entries, 2 to 182 Data columns (total 2 columns): TIMESTAMP 181 non-null datetime64[ns] RH_Avg 181 non-null float64 dtypes: datetime64[ns](1), float64(1) memory usage: 4.2 KB TIMESTAMP RH_Avg 2 2020-01-19 20:59:00 87.16161 3 2020-01-19 21:00:00 87.18287 4 2020-01-19 21:01:00 87.18132 5 2020-01-19 21:02:00 87.20342 6 2020-01-19 21:03:00 87.21060 .. ... ... 178 2020-01-19 23:55:00 89.14359 179 2020-01-19 23:56:00 89.15077 180 2020-01-19 23:57:00 89.15542 181 2020-01-19 23:58:00 89.17583 182 2020-01-19 23:59:00 89.18188 [181 rows x 2 columns] TIMESTAMP datetime64[ns] RH_Avg float64 dtype: object Traceback (most recent call last): File "C:\Users\Makada\Desktop\pppooooooooooooooooolllllllllllllllllllllllllllllllllllllll.py", line 32, in <module> (df[(df.index > start_date) & (df.index > end_date)]) File "C:\Python\lib\site-packages\pandas\core\indexes\base.py", line 111, in cmp_method result = op(self.values, np.asarray(other)) TypeError: '>' not supported between instances of 'int' and 'Timestamp' >>> Yes, got it working with code below. Although the code is working ok, i get a warning showing text in second view below. import pandas as pd import numpy as np import datetime as dt i mport csv from pandas import read_csv from datetime import datetime import matplotlib import matplotlib.pyplot as plt df = pd.read_csv("C:\\Users\\Makada\\Desktop\\CR1000_Table1 - kopie .dat",skiprows=1, parse_dates=[0]) df=pd.DataFrame(df, columns=['TIMESTAMP','RH_Avg']) df['TIMESTAMP']=pd.to_datetime(df.TIMESTAMP, errors='coerce') df['RH_Avg']=pd.to_numeric(df.RH_Avg, errors='coerce') df=df.dropna()#to remove Nan values df.info() print(df) print(df.dtypes) start_date = pd.to_datetime('2020-01-19 21:00:00', dayfirst=True) end_date = pd.to_datetime('2020-01-19 22:00:00', dayfirst=True) mask=(df['TIMESTAMP'] > start_date) & (df['TIMESTAMP'] <= end_date) df.loc[mask] df=df.loc[mask] df.plot('TIMESTAMP','RH_Avg',kind='line', linewidth=0.5) plt.show() Warning (from warnings module): File "<string>", line 1 DtypeWarning: Columns (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47) have mixed types. Specify dtype option on import or set low_memory=False. RE: Select data from between two timestamps - Makada - Jan-31-2020 Hi, I solved the warning by adding "dtype='unicode'" to the import. I think "low_memory=False" is possible also . Thanks. RE: Select data from between two timestamps - Makada - Jan-31-2020 Hi, I have made some adjustments to the script so the plot is updating in the same view every minute or so. But now the plot is updating in a new screen every minute. Whats wrong with the code to have the plot update in its own screen? Thanks. import pandas as pd import numpy as np import datetime as dt import csv from pandas import read_csv from datetime import datetime import matplotlib import matplotlib.pyplot as plt df = pd.read_csv("C:\\Users\\Makada\\Desktop\\CR1000_Table1 - kopie .dat",skiprows=1, parse_dates=[0], dtype='unicode') df=pd.DataFrame(df, columns=['TIMESTAMP','WS_kph_S_WVT']) df['TIMESTAMP']=pd.to_datetime(df.TIMESTAMP, errors='coerce') df['WS_kph_S_WVT']=pd.to_numeric(df.WS_kph_S_WVT, errors='coerce') df=df.dropna()#to remove Nan values df.info() plt.ion() fig, ax= plt.subplots() while(True): start_date = pd.to_datetime('2020-01-29 12:00:00', dayfirst=True) end_date = pd.to_datetime('2020-02-01 12:00:00', dayfirst=True) mask=(df['TIMESTAMP'] > start_date) & (df['TIMESTAMP'] <= end_date) df.loc[mask] df=df.loc[mask] df.plot('TIMESTAMP','WS_kph_S_WVT',kind='line', linewidth=0.5) plt.ylim(0,150) plt.pause(10) plt.draw() plt.clf() print(df) print(df.dtypes) RE: Select data from between two timestamps - Makada - Jan-31-2020 Hi, I got it working with code below. ![]() import pandas as pd import numpy as np import datetime as dt import csv from pandas import read_csv from datetime import datetime import matplotlib import matplotlib.pyplot as plt import time starttime=time.time() plt.ion() fig, ax=plt.subplots() while(True): df = pd.read_csv("C:\\Users\\Makada\\Desktop\\CR1000_Table1 - kopie .dat",skiprows=1, parse_dates=[0], dtype='unicode') df=pd.DataFrame(df, columns=['TIMESTAMP','WS_kph_S_WVT']) df['TIMESTAMP']=pd.to_datetime(df.TIMESTAMP, errors='coerce') df['WS_kph_S_WVT']=pd.to_numeric(df.WS_kph_S_WVT, errors='coerce') df=df.dropna()#to remove Nan values df.info() start_date = pd.to_datetime('2020-01-31 12:00:00', dayfirst=True) end_date = pd.to_datetime('2020-02-28 12:00:00', dayfirst=True) mask=(df['TIMESTAMP'] > start_date) & (df['TIMESTAMP'] <= end_date) df.loc[mask] df=df.loc[mask] ax.clear() df.plot(x='TIMESTAMP',y='WS_kph_S_WVT',kind='line', linewidth=0.5, ax=ax) plt.ylim(0,150) plt.draw() plt.pause(10 - time.time() % 10) plt.show() print(df) print(df.dtypes) RE: Select data from between two timestamps - Larz60+ - Jan-31-2020 Thank you for sharing the answer RE: Select data from between two timestamps - Makada - Jan-31-2020 Hi, maybe not the cleanest/best/fastest script, but its working. ![]() Thanks. |