Trying to delete rows above a specific datetime value - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: General Coding Help (https://python-forum.io/forum-8.html) +--- Thread: Trying to delete rows above a specific datetime value (/thread-37148.html) Pages:
1
2
|
Trying to delete rows above a specific datetime value - cubangt - May-05-2022 So i have a csv file, and the data is structured like this 2022-04-25, 7:06 AM, bla bla bla 2022-04-25, 7:07 AM, ha ha ha 2022-04-25, 7:08 AM, ha ha ha 2022-04-25, 7:09 AM, ha ha ha And so on.. I need to add a "delete" option that i can provide the date and time and have it find that line and delete it as well as everything above that line.. So if i provided: 2022-04-25, 7:07 AM I want it to delete that line and all the lines above it so my file will end up with only 2022-04-25, 7:08 AM, ha ha ha 2022-04-25, 7:09 AM, ha ha ha Im currently using these imports, can this be done with pandas? import re import pandas as pd RE: Trying to delete rows above a specific datetime value - deanhystad - May-05-2022 You can make a new dataframe that only contains the values you want. Here I "delete" all values >= 50 and save the results in "low". You could do the same thing by comparing datetime objects. import pandas as pd import numpy as np df = pd.DataFrame({"Letters":[a for a in 'ABCDEFG'], "Numbers":np.random.randint(1, 100, 7)}) low = df[df["Numbers"] < 50] high = df[df["Numbers"] >= 50] print("Low", low, sep="\n") print("\nHigh", high, sep="\n")
RE: Trying to delete rows above a specific datetime value - Pedroski55 - May-06-2022 I sometimes do a similar thing in MySQL, if I forget to set the correct date first. As far as I know, you can compare dates directly。 Assuming your data is in ascending order, you can find the position where you want to cut off an take a slice. def myApp(): import csv, glob path2file = '/home/pedro/myPython/csv/' files = glob.glob(path2file + '*.csv') for f in files: print('The files are', f) myfile = input('Copy and paste a file here ... ') with open(myfile) as infile: my_csv_read = csv.reader(infile) data = [] # csv reader is burned as soon as you read it, so make a list, lasts longer for line in my_csv_read: data.append(line) print(line) # working principle x = '2022-04-25 7:07 PM' y = '2022-04-25 7:08 AM' if x > y: print('x is older than y') if x < y: print('x is younger than y') nums2delete = [] # start at 1 if you have a headers row for i in range(1, len(data)): if data[i][0] >= '2022-04-25 7:07 AM': break wanted = data[0:i] RE: Trying to delete rows above a specific datetime value - snippsat - May-06-2022 (May-06-2022, 07:45 AM)Pedroski55 Wrote: As far as I know, you can compare dates directly。Not using pandas,can compare but most make the string to a datetime object first,or will just compare strings. Also using pendulum make it easier as can parse automatically to a datetime object. That also will handle time zones correctly. >>> import pendulum >>> >>> x = '2022-04-25 7:07 PM' >>> y = '2022-04-25 7:08 AM' >>> x_date = pendulum.parse(x, strict=False) >>> y_date = pendulum.parse(y, strict=False) >>> x_date DateTime(2022, 4, 25, 19, 7, 0, tzinfo=Timezone('UTC')) >>> >>> x_date < y_date False >>> x_date.diff_for_humans(y_date) '11 hours after'If using pandas most also remember to covert dates to Pandas datetime. Example: >>> import pandas as pd >>> >>> d = {'Dates': ['2022-04-25 7:07 PM', '2022-04-25 7:08 AM'], 'Date_value': [111, 999]} >>> df = pandas.DataFrame(d) >>> df Dates Date_value 0 2022-04-25 7:07 PM 111 1 2022-04-25 7:08 AM 999 >>> >>> df.dtypes Dates object Date_value int64 dtype: object >>> >>> # Convert column to Pandas datetime64 >>> df['Dates'] = pd.to_datetime(df['Dates']) >>> df.dtypes Dates datetime64[ns] Date_value int64 dtype: object RE: Trying to delete rows above a specific datetime value - cubangt - May-06-2022 So this is what i currently have that works as removing any records prior to the date provided, but unclear how to add the "time" portion into the equation so that i dont have to manually find the exact date/time and remove them after the file is created. # THIS CLOSES THE CLEANED DATES FILE file2.close() # THIS OPENS THE NEWLY CLEAN DATA IN ORDER TO REMOVE OLD RECORDS df = pd.read_csv("Dates.txt", usecols=range(3), header=None, names=["Date", "Time", "Comment"]) # THIS CONVERTS THE DATE COLUMN INTO A DATETIME FORMAT df['Date'] = pd.to_datetime(df['Date']) # HERE YOU NEED TO PROVIDE THE DATE YOU WANT TO KEEP GOING FORWARD mask = (df['Date'] > '2022-04-24') # THIS RETURNS ALL ROWS GREATER THAN THE DATE PROVIDED ABOVE df = df.loc[mask]What im at a loss with is, since the date and time are in separate columns, how do i adjust the mask to consider both as the criteria? If i change the mask line to this i get a syntax error mask = (df['Date'] > '2022-04-24') & (df['Time'] = '7:06 AM') RE: Trying to delete rows above a specific datetime value - deanhystad - May-06-2022 Make a datetime column RE: Trying to delete rows above a specific datetime value - cubangt - May-06-2022 As in a virtual column? I cant mess with the original csv.. so if you mean make a new virtual column out of those 2 columns, i can do that, but how? RE: Trying to delete rows above a specific datetime value - cubangt - May-06-2022 Like this? # THIS CLOSES THE CLEANED DATES FILE file2.close() # THIS OPENS THE NEWLY CLEAN DATA IN ORDER TO REMOVE OLD RECORDS df = pd.read_csv("Dates.txt", usecols=range(3), header=None, names=["Date", "Time", "Comment"]) # THIS CONVERTS THE DATE COLUMN INTO A DATETIME FORMAT df['Date'] = pd.to_datetime(df['Date']) # THIS CONVERTS THE TIME COLUMN INTO A DATETIME FORMAT df['DT'] = df.apply(lambda row: row.Date + row.Time, axis=1) # HERE YOU NEED TO PROVIDE THE DATE YOU WANT TO KEEP GOING FORWARD #mask = (df['Date'] > '2022-04-24') mask = (df['DT'] > '2022-04-24 7:06 AM') # THIS RETURNS ALL ROWS GREATER THAN THE DATE PROVIDED ABOVE df = df.loc[mask] RE: Trying to delete rows above a specific datetime value - deanhystad - May-06-2022 https://stackoverflow.com/questions/17978092/combine-date-and-time-columns-using-python-pandas RE: Trying to delete rows above a specific datetime value - cubangt - May-06-2022 If i run my updated code above, i get this error:
|