Python Forum

Full Version: How to read rainfall time series and insert missing data points
Hello,

I am trying to write code that reads .km2 files, which contain a header row with columns of information and an array of precipitation measurements. The file is structured as shown in the attached example file.
Each rainfall intensity is given at a 2-minute interval.
I would like to get the start time and end time of the rainfall time series and be able to insert zeros at the indexes with no value.
I would also like to restructure the data as shown in the example below:

[ TIME ] [Intensity]
19790125 1208 3.333
19790125 1210 0.833
19790125 1212 0.833
... ...

I hope that somebody can help!

Best regards
MadsM
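The zero-filling step described above can be sketched on its own, independent of how the file is read. This is only a rough illustration under stated assumptions: `fill_gaps` and the `events` list are hypothetical names, each event is assumed to be a (start datetime, list of intensities) pair with one value per 2 minutes, and events are assumed not to overlap. Zeros are stepped forward from the end of one event until the next event starts, so events do not need to begin on the same even/odd minute.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=2)  # sampling interval stated in the question


def fill_gaps(events):
    """Join rainfall events into one series, writing 0.0 between events.

    `events` is a hypothetical list of (start, intensities) pairs, where
    `start` is a datetime and `intensities` holds one value per 2 minutes.
    """
    series = []
    for start, intensities in sorted(events, key=lambda e: e[0]):
        if series:
            # Pad with zeros from the last recorded time up to this event
            t = series[-1][0] + STEP
            while t < start:
                series.append((t, 0.0))
                t += STEP
        for i, value in enumerate(intensities):
            series.append((start + i * STEP, value))
    return series


# Made-up second event, only to show a gap being filled
events = [
    (datetime(1979, 1, 25, 12, 8), [3.333, 0.833, 0.833]),
    (datetime(1979, 1, 25, 12, 18), [1.667]),
]
series = fill_gaps(events)
for t, value in series:
    print(t.strftime("%Y%m%d %H%M"), f"{value:.3f}")
```

The start and end time of the whole series are then `series[0][0]` and `series[-1][0]`. In this example the gap between 12:12 and 12:18 is filled with zeros at 12:14 and 12:16. Note that filling every dry period in a record spanning several decades produces millions of rows.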
Will you please attach a .km2 file? I can't find any examples on the web. Also, if you have any code of your own, it would be helpful to attach that as well. Thank you.
(Jan-03-2022, 04:01 PM) BashBedlam wrote: Will you please attach a .km2 file? I can't find any examples on the web. Also, if you have any code of your own, it would be helpful to attach that as well. Thank you.

It is attached now. I am still quite new to Python programming, so I do not have any helpful code to attach. Thank you for the response, by the way!
Please show a sample of the desired output, with the new header and a couple of data lines.

I shouldn't do this, see https://python-forum.io/misc.php?action=help&hid=52

There are no headers in the output file, but the data is as requested; you can finish the code.
import os
from pathlib import Path


def convert_file():
    # Set start directory same as script
    os.chdir(os.path.abspath(os.path.dirname(__file__)))

    infile = Path('.') / 'gauga20211_19790101-20120101.txt'
    outfile = Path('.') / 'newfile.txt'

    with infile.open() as fp, outfile.open('w') as fout:
        startdate = None
        starttime = None
        nexttime = 0

        for line in fp:
            line = line.strip().split()
            # extract header
            if line[0] == '1':
                startdate = line[1]
                starttime = line[2]
                nexttime = int(starttime)
            else:
                for item in line:
                    data = f"{startdate} {nexttime} {item}\n"
                    fout.write(data)
                    nexttime += 2


if __name__ == '__main__':
    convert_file()
Sample output
Output:
19790111 1007 3.333
19790111 1009 3.333
19790111 1011 3.333
19790111 1013 0.556
19790111 1015 0.556
19790111 1017 0.556
19790111 1019 0.556
19790111 1021 0.556
19790111 1023 0.556
19790111 1025 3.333
19790111 1027 0.370
19790111 1029 0.370
19790111 1031 0.370
19790111 1033 0.370
19790111 1035 0.370
19790111 1037 0.370
19790111 1039 0.370
19790111 1041 0.370
19790111 1043 0.370
19790111 1045 0.476
19790111 1047 0.476
19790111 1049 0.476
19790111 1051 0.476
19790111 1053 0.476
19790111 1055 0.476
19790111 1057 0.476
19790111 1059 6.667
19790111 1061 1.667
19790111 1063 1.667
19790111 1065 6.667
19790111 1067 0.667
19790111 1069 0.667
19790111 1071 0.667
19790111 1073 0.667
19790111 1075 0.667
19790111 1077 0.417
19790111 1079 0.417
19790111 1081 0.417
19790111 1083 0.417
19790111 1085 0.417
19790111 1087 0.417
19790111 1089 0.417
19790111 1091 0.417
19790125 1208 3.333
19790125 1210 0.833
19790125 1212 0.833
...
Hello, I am working together with Mads, and wanted to say thanks!
I will try to see if the post can be moved to the homework section :)
There is, however, a problem with the code that I am unsure how to fix.
As it is now, the time is handled as a plain integer rather than a clock time: as 2 is added repeatedly, the value rolls over at 100 and not at 60 as I need it to (in the output, 1059 is followed by 1061).
I thought I could change your script to use datetime instead, but I cannot figure out how to do this. I thought I could use the pandas package and write something like this:

import os
from pathlib import Path
import pandas as pd

def convert_file():
    # Set start directory same as script
    os.chdir(os.path.abspath(os.path.dirname(__file__)))

    infile = Path('.') / 'gauga20211_19790101-20120101.km2'
    outfile = Path('.') / 'newfile.txt'

    with infile.open() as fp, outfile.open('w') as fout:
        startdate = None
        starttime = None
        nexttime = 0

        for line in fp:
            line = line.strip().split()
            # extract header
            if line[0] == '1':
                startdate = line[1]
                fout[line[1]]=pd.to_datetime(fout[line[1]], format='%Y%m%d')
                starttime = line[2]
                fout[line[2]] = pd.to_datetime(fout[line[2]], format='%H%M')
                nexttime = int(starttime)
            else:
                for item in line:
                    data = f"{startdate}{nexttime};{item}\n"
                    fout.write(data)
                    nexttime += 2


if __name__ == '__main__':
    convert_file()



But when I try this, I get a TypeError:
Traceback (most recent call last):
  File "C:\Users\kaspe\PycharmProjects\pythonproject_project\KM2.py", line 34, in <module>
    convert_file()
  File "C:\Users\kaspe\PycharmProjects\pythonproject_project\KM2.py", line 22, in convert_file
    fout[line[1]]=pd.to_datetime(fout[line[1]], format='%Y%m%d')
TypeError: '_io.TextIOWrapper' object is not subscriptable

I am aware that I have not changed nexttime to be minutes yet, but the error is reported for line 22.
Do you know how to do this?
Looking forward to hearing from you!
Kind regards
Kasper
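The TypeError arises because `fout` is an open text file, not a pandas DataFrame, so it cannot be indexed with `fout[...]`. One possible way around both that error and the rollover at 100 is to parse the header fields into a `datetime` from the standard library and advance it with a `timedelta`; pandas is not needed for this. The following is a minimal sketch, not a tested solution for real .km2 files: it assumes, as in the script earlier in the thread, that a header row starts with `1` and carries the date in the second field and the start time in the third. The name `convert_lines` and the sample lines are made up for illustration.

```python
from datetime import datetime, timedelta
from pathlib import Path

STEP = timedelta(minutes=2)  # sampling interval stated in the question


def convert_lines(lines):
    """Yield 'YYYYMMDD HHMM intensity' strings from .km2-style lines."""
    current = None  # timestamp of the next intensity value
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == '1':
            # Assumed header layout: fields[1] = date, fields[2] = start time
            stamp = fields[1] + fields[2].zfill(4)
            current = datetime.strptime(stamp, '%Y%m%d%H%M')
        elif current is not None:
            for item in fields:
                yield f"{current:%Y%m%d %H%M} {item}"
                current += STEP  # rolls over correctly at 60 minutes


def convert_file(infile, outfile):
    """Read `infile` and write the converted rows to `outfile`."""
    with Path(infile).open() as fp, Path(outfile).open('w') as fout:
        for row in convert_lines(fp):
            fout.write(row + '\n')


# Hypothetical sample; only the first three header fields are used
sample = [
    "1 19790111 1057 2",
    "0.476 6.667 1.667",
]
rows = list(convert_lines(sample))
for row in rows:
    print(row)
```

With the sample above, the third value is stamped 1101 rather than 1061. For the real file, `convert_file('gauga20211_19790101-20120101.km2', 'newfile.txt')` would be the equivalent call, assuming the file follows the layout described.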