Jun-10-2022, 07:59 PM
When I try to run the following code, I get an error.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import seasonal_decompose
import time
from tqdm import tqdm
from scipy import stats
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import RFE
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import f1_score
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve, auc
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from xgboost import XGBClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("sensor.csv")
print('here')
df.head()

# Find Duplicate Values
# Results will be the list of duplicate values
# If no duplicate values, nothing will list.
df[df['timestamp'].duplicated(keep=False)]
df.isnull().sum()
df['machine_status'].value_counts()

# Convert timestamp column into data type into datetime
df['timestamp'] = pd.to_datetime(df['timestamp'])

# Create a Series
time_period = pd.Series([])

# Assign values to series
for i in tqdm(range(df.shape[0])):
    if (df["timestamp"][i].hour >= 4) and (df[timestamp][i].hour < 10):
        time_period[i]="Morning"
    elif (df["timestamp"][i].hour >= 10) and (df[timestamp][i].hour < 16):
        time_period[i]="Noon"
    elif (df["timestamp"][i].hour >= 16) and (df[timestamp][i].hour < 22):
        time_period[i]="Evening"
    else:
        time_period[i]="Night"

# Insert new column time period
df.Insert(2, 'time_period', time_period)

I get an error. The error is:
Error:
C:\Users\james\AppData\Local\Temp\ipykernel_24076\1118037779.py:50: FutureWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
  time_period = pd.Series([])
0%| | 240/220320 [00:00<01:59, 1848.32it/s]
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Input In [1], in <cell line: 53>()
     52 # Assign values to series
     53 for i in tqdm(range(df.shape[0])):
---> 54     if (df["timestamp"][i].hour >= 4) and (df[timestamp][i].hour < 10):
     55         time_period[i]="Morning"
     56     elif (df["timestamp"][i].hour >= 10) and (df[timestamp][i].hour < 16):
NameError: name 'timestamp' is not defined
Now it says timestamp is not defined, but I think it is. This is not my code; it is somebody else's. I am not sure how to correct it. I believe it has something to do with these lines:

# Convert timestamp column into data type into datetime
df['timestamp'] = pd.to_datetime(df['timestamp'])

How can I fix it?
Any help appreciated.
Respectfully,
LZ
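
For anyone who lands here with the same traceback: the NameError comes from the second half of each condition, where the code says df[timestamp] without quotes. Python then looks for a variable named timestamp instead of the column "timestamp". The to_datetime line is not the cause. Two more things in the posted code would likely fail next: df.Insert should be lowercase df.insert, and the FutureWarning goes away if the empty Series is given an explicit dtype. Below is a minimal sketch of one possible fix. It assumes a tiny made-up DataFrame in place of sensor.csv (the sample rows are hypothetical), and it swaps the row-by-row loop for a vectorized numpy.select call, which is a substitution on my part rather than what the original author wrote.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for sensor.csv: four timestamps, one per time period.
df = pd.DataFrame({
    "timestamp": ["2018-04-01 05:00:00", "2018-04-01 11:00:00",
                  "2018-04-01 17:00:00", "2018-04-01 23:00:00"],
    "machine_status": ["NORMAL"] * 4,
})
df["timestamp"] = pd.to_datetime(df["timestamp"])

# The column name must be a quoted string: df["timestamp"], not df[timestamp].
hour = df["timestamp"].dt.hour

# Same hour boundaries as the posted loop, evaluated for all rows at once.
conditions = [
    (hour >= 4) & (hour < 10),
    (hour >= 10) & (hour < 16),
    (hour >= 16) & (hour < 22),
]
labels = ["Morning", "Noon", "Evening"]
time_period = np.select(conditions, labels, default="Night")

# DataFrame.insert is lowercase; df.Insert would raise AttributeError.
df.insert(2, "time_period", time_period)
print(df)
```

If you would rather keep the original loop, adding the missing quotes in all three conditions and changing Insert to insert should be enough to get past these errors. The vectorized version also avoids the 220320 loop iterations shown in the progress bar.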