Python Forum
Trying to delete rows above a specific datetime value
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Trying to delete rows above a specific datetime value
#11
OK, i think this works.. so far the few tests dates and times have returned what i expected

# THIS OPENS THE NEWLY CLEAN DATA IN ORDER TO REMOVE OLD RECORDS
df = pd.read_csv("Dates.txt", usecols=range(3), header=None, names=["Date", "Time", "Comment"])
# THIS CONVERTS THE DATE COLUMN INTO A DATETIME FORMAT
df['Date'] = pd.to_datetime(df['Date'] + ' ' + df['Time'])
# HERE YOU NEED TO PROVIDE THE DATE TIME YOU WANT TO KEEP GOING FORWARD
mask = (df['Date'] > '2022-04-28  7:06 AM')
# THIS RETURNS ALL ROWS GREATER THAN THE DATE PROVIDED ABOVE
df = df.loc[mask]
I spoke to soon..

Even though the above does return only records that are greater than the datetime provided in the mask, i just noticed that in the output file, now the date column value is the combined value, i cant have that..

I thought that combining the two columns was only for the purpose of the filtering.. i need to have the output exactly as it was prior to filtering out the older dates..

What did i do wrong?
Reply
#12
Create a new column which combines date and time. Remove items that are older than you threshold. Drop the datetime column you created. Write dataframe to CSV file.
Reply
#13
So to be clear, when you say create, thats just a temp column for the purpose of the filtering correct?
Did i not do that above in post #8?
Reply
#14
Got it working.. thank you for the suggestion, i updated the code to this and produced the correct results in the updated/new csv file

# THIS OPENS THE NEWLY CLEAN DATA IN ORDER TO REMOVE OLD RECORDS
df = pd.read_csv("Dates.txt", usecols=range(3), header=None, names=["Date", "Time", "Comment"])
# THIS CONVERTS THE DATE COLUMN INTO A DATETIME FORMAT
df['DT'] = pd.to_datetime(df['Date'] + ' ' + df['Time'])
# HERE YOU NEED TO PROVIDE THE DATE YOU WANT TO KEEP GOING FORWARD
mask = (df['DT'] > '2022-04-28  7:06 AM')
# THIS RETURNS ALL ROWS GREATER THAN THE DATE PROVIDED ABOVE
ndf = df.loc[mask]
Reply
#15
MAN i really suck at this, the above code does work for filtering out the data, but i just tried to import it into the excel file and notice that the combined column was appended to the end of the columns, so instead of a 3 column CSV i now have a 4 column CSV and thats a no go type of problem..

What is wrong with the above code that its adding that 4th column?
Reply
#16
You must have forgot to drop the "DT" column before writing the CSV.

Another thing you could do is create the date-time as a separate dataframe.
import pandas as pd
import numpy as np

df = pd.DataFrame({"A":np.random.randint(1, 100, 10), "B":np.random.randint(1, 100, 10)})
apb = df["A"] + df["B"]  # <- This creates a new dataframe instead of adding a column to df.

df = df[apb < 100]
print(df)
Output:
A B 1 30 25 2 75 1 3 9 73 4 43 3 5 26 63 7 50 48
Reply
#17
It was the drop that i was missing..

# DROP TEMP COLUMN BEFORE WRITING CSV
df['DT'].drop
Added this and now works as expected. thank you
Reply
#18
Well i spoke to fast, it was working fine in my test script, but now that i implemented the changes above to my actual script, im getting the datetime column at the end even though i have the df.drop before writing the new csv file..

How can that happen? What is wrong with this now? The replacing works, the datetime filter works, but when its written to the new text file, the temp column df['DT'] is appended to the end of the 3 original columns.

Unless i opened an old text file this morning when i checked, i thought this was already working..

# THIS OPENS THE NEWLY CLEAN DATA IN ORDER TO REMOVE OLD RECORDS
df = pd.read_csv("Dates.txt", usecols=range(3), names=["Date", "Time", "Comment"])

df['Comment'] = [re.sub(r'(?:^|\W)Someone Else:(?:$|\W)','Dan: ', str(x)) for x in df['Comment']]
df['Comment'] = [re.sub(r'(?:^|\W)Elizabeth:(?:$|\W)','Ross: ', str(x)) for x in df['Comment']]

# THIS CONVERTS THE DATE COLUMN INTO A DATETIME FORMAT
df['DT'] = pd.to_datetime(df['Date'] + ' ' + df['Time'])

# HERE YOU NEED TO PROVIDE THE DATE YOU WANT TO KEEP GOING FORWARD
mask = (df['DT'] > '2022-05-02  9:32 AM')

# THIS RETURNS ALL ROWS GREATER THAN THE DATE PROVIDED ABOVE
df = df.loc[mask]

# DROP TEMP COLUMN BEFORE WRITING CSV
df['DT'].drop

# THIS IS THE FILTERED DATA RESULTS TO IMPORT INTO EXCEL
df.to_csv(r'C:\Users\mynewfile.txt', header=None, index=None, mode='a')
Reply
#19
Added this parameter to my to_csv and it seems to be working as expected

columns=["Date", "Time", "Comment"], header=None, index=None, mode='a')
ill run a few more tests on the production file and make sure.
Reply
#20
This returns the drop function. It does nothing to modify df.
# DROP TEMP COLUMN BEFORE WRITING CSV
df['DT'].drop
To drop 'DT' from df you can do either of these:
df = df.drop('DT')
# or
df.drop('DT', inplace=True)
Try reading the documentation. I mean really reading until you understand what it says. For me that usually takes 4-5 times through the first time I try something new.

https://pandas.pydata.org/pandas-docs/st....drop.html
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  delete specific row of entries jacksfrustration 3 396 Feb-13-2024, 11:13 PM
Last Post: deanhystad
  Reading Specific Rows In a CSV File finndude 3 990 Dec-13-2022, 03:19 PM
Last Post: finndude
  Pymysql delete specific rows in tableview stsxbel 2 1,091 Aug-18-2022, 09:50 AM
Last Post: ibreeden
  The code I have written removes the desired number of rows, but wrong rows Jdesi1983 0 1,634 Dec-08-2021, 04:42 AM
Last Post: Jdesi1983
Question [SOLVED] Delete specific characters from string lines EnfantNicolas 4 2,221 Oct-21-2021, 11:28 AM
Last Post: EnfantNicolas
  openpyxl incorrect delete rows VladislavM 6 4,123 Jul-19-2021, 08:54 AM
Last Post: VladislavM
  Pandas DataFrame combine rows by column value, where Date Rows are NULL rhat398 0 2,116 May-04-2021, 10:51 PM
Last Post: rhat398
  Indexing [::-1] to Reverse ALL 2D Array Rows, ALL 3D, 4D Array Columns & Rows Python Jeremy7 8 7,128 Mar-02-2021, 01:54 AM
Last Post: Jeremy7
  How to sort rows based on specific order Mekala 3 2,228 Jul-31-2020, 01:01 AM
Last Post: bowlofred
  How to extract specific rows and columns from a text file with Python Farhan 0 3,392 Mar-25-2020, 09:18 PM
Last Post: Farhan

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020