Change item in column based on the item before [solved]

amdi40 · (This post was last modified: Jan-11-2022, 04:51 PM by amdi40.)

Hello, I would like to compare each number in a column with the number before it, and if the two are equal, add a number to the later one. Example:

[Var]
1
1
3
4
5
I would first compare lines 1 and 2, and since they are equal I would add a number, for example 1, to line 2. Then I would compare lines 2 and 3, and since they are now different (line 2 is now 2), line 3 should not be changed.
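The rule described above can be sketched in plain Python. This is only a minimal illustration on the example column; the function name `bump_repeats` and the `step` parameter are made up for the sketch and do not come from any library.

```python
def bump_repeats(values, step=1):
    """Return a copy in which any item equal to the (already updated)
    item before it is increased by `step`."""
    result = list(values)
    for i in range(1, len(result)):
        # Compare with the previous item as it stands after earlier updates
        if result[i] == result[i - 1]:
            result[i] += step
    return result


print(bump_repeats([1, 1, 3, 4, 5]))  # [1, 2, 3, 4, 5]
```

Because each comparison uses the already updated neighbour, a run of three equal values such as `[1, 1, 1]` becomes `[1, 2, 1]`, so longer runs of repeats would need a running offset rather than a single pairwise comparison.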

I have tried doing it like this:

df = pd.read_csv("newfile1.csv", dtype=str)
n = 2
with open('newfile1.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    with open('newfile2.csv', 'w') as new_file:
        fieldnames = ['datetime', 'intensity']
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter=',')
        csv_writer.writeheader()
        for line in csv_reader:
            for i in range(len('datetime') - 1):
                if 'datetime'[i + 1] == 'datetime'[i]:
                    'datetime'[i + 1] = 'datetime'[i] + pd.DateOffset(minutes=n)
            csv_writer.writerow(line)

However, the datetime value that I would like to change does not change.
Looking forward to hearing from you
Kind regards
Amdi

**Larz60+** · Jan-10-2022, 12:01 PM

Your code can be greatly simplified.
The following opens the input spreadsheet you supplied, and writes it back out as a csv.
It does nothing to modify the data, you will have to add that.
The print statement only shows what was read and should be removed once you are satisfied.

import os
import pandas as pd

# make sure starting directory set
os.chdir(os.path.abspath(os.path.dirname(__file__)))
n = 2
df = pd.read_excel("newfile1.xlsx", dtype=str)
print(df)
df.to_csv('newfile2.csv')

Output:     datetime,intensity
0   19790111-1007,3.333
1   19790111-1007,3.333
2   19790111-1007,3.333
3   19790111-1007,0.556
4   19790111-1007,0.556
5   19790111-1007,0.556
6   19790111-1007,0.556
7   19790111-1007,0.556
8   19790111-1007,0.556
9   19790111-1007,3.333
10  19790111-1007,0.370
11  19790111-1007,0.370
12  19790111-1007,0.370
13  19790111-1007,0.370
14  19790111-1007,0.370
15  19790111-1007,0.370
16  19790111-1007,0.370
17  19790111-1007,0.370
18  19790111-1007,0.370
19  19790111-1007,0.476
20  19790111-1007,0.476
21  19790111-1007,0.476
22  19790111-1007,0.476
23  19790111-1007,0.476
24  19790111-1007,0.476
25  19790111-1007,0.476

amdi40 · Jan-11-2022, 10:03 AM

Thanks for the reply. I think I was unclear about what I wanted help with. My problem is that the time in the file is wrong. The timestep between lines should be 2 minutes but is currently 0; when a new event is recorded, the time switches to that event's timestamp and then stays constant. You have already helped a lot with this in another post https://python-forum.io/thread-35960-pos...#pid151576, which I am very thankful for! The problem there, however, was that the minutes field went above 59:
19790111 1061 1.667
should be:
19790111 1101 1.667

To fix this, I thought I could remove the next function you made, so that the timestamp would be constant within each recorded event, and then add a time delta to each timestamp afterwards. However, I cannot figure out how to do it.
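The overflow in the example (1061 instead of 1101) is what happens when minutes are added to the HHMM value as a plain number. A small sketch of the alternative, using the standard library's `datetime` and `timedelta`, lets the hour roll over by itself. The start time 1007 and the count of 27 steps are assumptions, chosen only because 07 + 54 minutes reproduces the 61 from the example.

```python
from datetime import datetime, timedelta

# Assumed starting point: the event begins at 10:07 on 1979-01-11
start = datetime.strptime("19790111 1007", "%Y%m%d %H%M")

# 27 steps of 2 minutes = 54 minutes; as a plain number 1007 + 54 gives 1061
stamp = start + 27 * timedelta(minutes=2)

print(stamp.strftime("%Y%m%d %H%M"))  # 19790111 1101
```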

amdi40 · (This post was last modified: Jan-11-2022, 04:50 PM by amdi40.)

I have also tried:

import os
from pathlib import Path
import pandas as pd
import datetime

def convert_file():
    # Set start directory same as script
    os.chdir(os.path.abspath(os.path.dirname(__file__)))

    infile = Path('.') / 'gauga20211_19790101-20120101.txt'
    outfile = Path('.') / 'newfile.csv'

    with infile.open() as fp, outfile.open('w') as fout:
        startdate = None
        starttime = None
        nexttime = 0
        for line in fp:
            line = line.strip().split()
            # extract header
            if line[0] == '1':
                startdate = pd.to_datetime(line[1], format='%Y%m%d')
                starttime = pd.to_datetime(line[2], format='%H%M')
                nexttime = starttime
            else:
                for item in line:
                    data = f"{startdate}-{starttime},{item}\n"
                    fout.write(data)
                    nexttime += datetime.timedelta(minutes=2)
if __name__ == '__main__':
    convert_file()

However, the date is not formatted the way I intended, and the desired time delta is not added:

Output:1979-01-11 00:00:00-1900-01-01 10:07:00,3.333
1979-01-11 00:00:00-1900-01-01 10:07:00,3.333
1979-01-11 00:00:00-1900-01-01 10:07:00,3.333
1979-01-11 00:00:00-1900-01-01 10:07:00,0.556
1979-01-11 00:00:00-1900-01-01 10:07:00,0.556
1979-01-11 00:00:00-1900-01-01 10:07:00,0.556
1979-01-11 00:00:00-1900-01-01 10:07:00,0.556
1979-01-11 00:00:00-1900-01-01 10:07:00,0.556
1979-01-11 00:00:00-1900-01-01 10:07:00,0.556
1979-01-11 00:00:00-1900-01-01 10:07:00,3.333
1979-01-11 00:00:00-1900-01-01 10:07:00,0.370
1979-01-11 00:00:00-1900-01-01 10:07:00,0.370
1979-01-11 00:00:00-1900-01-01 10:07:00,0.370
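Two things in the attempt above would explain this output: the f-string writes `startdate` and `starttime`, so the incremented `nexttime` is never used, and the two pandas timestamps are printed in their default form instead of being formatted. A possible way around both, sketched here with the standard library only, is to keep one combined `datetime`, format it explicitly, and advance it after each value. The function name `convert_lines` and the sample lines are made up for illustration, and the input layout (a header line starting with `1`, then date and time, followed by lines of intensities) is assumed from the code above, not checked against the real file.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=2)  # the 2-minute timestep described in the thread


def convert_lines(lines):
    """Yield 'YYYYMMDD-HHMM,value' rows, advancing the time by STEP per value."""
    current = None
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "1":
            # Header line (assumed layout): combine date and time in one datetime
            current = datetime.strptime(f"{fields[1]} {fields[2]}", "%Y%m%d %H%M")
        elif current is not None:
            for item in fields:
                # Format explicitly so the row reads e.g. 19790111-1007,3.333
                yield f"{current:%Y%m%d-%H%M},{item}"
                current += STEP


# Hypothetical sample input, not taken from the real data file
sample = ["1 19790111 1007", "3.333 3.333 0.556"]
for row in convert_lines(sample):
    print(row)
```

Writing the rows to a file would then only need a loop such as `fout.write(row + "\n")` inside the existing `with` block.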
