Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Preprocess script
#1
Hi all.

I'm using Python to process climate data and use it in a weather model. To do this, there was a script in Python2, which I modified to adapt it to Python3. I have a part of the code, which when I execute it gives me the following error:

Error:
File "./preprocess.py", line 223, in calc grbout.write('msg', 'utf-8') TypeError: write() takes exactly one argument (2 given)
The part of the code to which it refers is the following:

        #logger.info("Calculating soil moisture and filling skintemp for %s" % ffile)
        logger.info("Calculating soil moisture for %s" % ffile)
        smfile = prefix + "calc_" + infix + suffix
        if os.path.isfile(smfile):
            os.remove(smfile)
        grbout = open(smfile, "w")
        grbs = pygrib.open(os.path.join(DATADIR, ffile))
        grbs.seek(0)
        #have_sst = False
        #have_skt = False
        for grb in grbs:
            if grb.indicatorOfParameter == 140:
                sm_values = grb.values/sm_at_field_capacity
                grb.indicatorOfParameter = 142
                grb.indicatorOfTypeOfLevel = 'sfc'
                grb.typeOfLevel = 'depthBelowLand'
                for level in SM_LEVELS.keys():
                    grb.level = level
                    grb.values = sm_values*SM_LEVELS[level]
                    msg = grb.tostring()
                    grbout.write(msg)
            #elif grb.indicatorOfParameter == 139:
            #    skt_grb = grb
            #    have_skt = True
            #elif grb.indicatorOfParameter == 103:
            #    sst_grb = grb
            #    have_sst = True
            #if have_sst and have_skt:
            #    values = skt_grb.values*sea_land_mask + sst_grb.values*(1-sea_land_mask)
            #    skt_grb.values = values
            #    skt_grb.indicatorOfParameter = 143
            #    msg = skt_grb.tostring()
            #    grbout.write(msg)
            #    del sst_grb
            #    del skt_grb
            #    have_sst = False
            #    have_skt = False
        grbs.close()
        grbout.close()
I searched for the solution, but I'm not sure what coding to use.
Reply
#2
open.write only takes one argument
Quote:Help on built-in function write:

write(text, /) method of _io.TextIOWrapper instance
Write string to stream.
Returns the number of characters written (which is always equal to
the length of the string).

I would do it as you have on line 21

PS You can also automatically translate python2 to python3 with the built in 2to3. No need to manually do it.
Recommended Tutorials:
Reply
#3
(Jun-27-2019, 10:32 PM)metulburr Wrote: PS You can also automatically translate python2 to python3 with the built in 2to3. No need to manually do it.
Thanks for your answer.

I used 2to3 to change the code of the script, the truth is much easier than doing it manually. The problem is that I keep giving the same error. I'm going to copy the entire code of the script, I can not see the error.

Error:
INFO: Calculating soil moisture for rcp45_r1i1p1-MR_eh6_rcm_etc_121.grb Traceback (most recent call last): File "./preprocess.py", line 376, in <module> main() File "./preprocess.py", line 369, in main calc(sm_at_field_capacity, sea_land_mask) File "./preprocess.py", line 223, in calc grbout.write(msg) TypeError: write() argument must be str, not bytes
#!/home/python3/bin/python3.7

HELPTEXT = """
Tool to preprocess echam6 grib data. Currently tested with MR data for rcp45,
rcp85 and hist, should also be working with LR data. All commandline arguments
are mandatory. The data directory must contain (a) a sub-directory static with
the corresponding static data file in it, and (b) the complete set of four files
for each month of interest as .grb.gz files.

Minimum example to process the first month of rcp85 data (2006/01):

./datadir
    ./rcp85_r1i1p1-MR_eh6_rcm_c5_0001.grb.gz
    ./rcp85_r1i1p1-MR_eh6_rcm_c5_133_0001.grb.gz
    ./rcp85_r1i1p1-MR_eh6_rcm_etc_0001.grb.gz
    ./rcp85_r1i1p1-MR_jsb_rcm_land_0001.grb.gz
    ./static/rcp85-MR_rcm_fx_1.grb

For historical data (1949/01):
./datadir
    ./hist_r1i1p1-MR_eh6_rcm_c5_001.grb.gz
    ./hist_r1i1p1-MR_eh6_rcm_c5_133_001.grb.gz
    ./hist_r1i1p1-MR_eh6_rcm_etc_001.grb.gz
    ./hist_r1i1p1-MR_jsb_rcm_land_001.grb.gz
    ./static/hist-MR_rcm_fx_1.grb

The tool will convert all available data in datadir, i.e., simply place all
the files of interest (0001=2006/01 until 1140=2100/12 for rcpXY) in datadir.

Copyright Dominikus Heinzeller, IMK-IFU, KIT, 2013 ([email protected]).
"""

import argparse
import datetime
import numpy
import os
import pygrib
import re
import shutil
import sys

# Provide own logging class for simplicity
class Logger:
    def debug(self, msg):
        print("DEBUG: %s" % msg)
    def info(self, msg):
        print("INFO: %s" % msg)
    def warning(self, msg):
        print("WARNING: %s" % msg)
    def error(self, msg):
        print("ERROR: %s" % msg)
        raise Exception(msg)
    def critical(self, msg):
        print("CRITICAL: %s" % msg)
        raise Exception(msg)
logger = Logger()

###########################################################################################
# Parse command line arguments and set corresponding global variables                     #
###########################################################################################
PARSER = argparse.ArgumentParser(description = HELPTEXT)
PARSER.add_argument('-s', '--scenario', default = None,
                    help='Which scenario (rcp45, rcp85, hist)')
PARSER.add_argument('-r', '--resolution', default = None,
                    help='Which resolution (MR or LR)')
PARSER.add_argument('-d', '--datadir', default = None,
                    help='Top-level data directory for this run')
args = PARSER.parse_args()

if not vars(args)['scenario'] or not vars(args)['datadir'] \
    or not vars(args)['resolution']:
    PARSER.print_help()
    logger.error("Not all mandatory arguments specified")
elif not vars(args)['scenario'] in ['rcp45', 'rcp85', 'hist' ]:
    logger.error("Invalid scenario specified, valid are 'rcp45', 'rcp85' and 'hist'")
elif not vars(args)['resolution'] in ['LR', 'MR']:
    logger.error("Invalid resolution specified, valid are 'LR' and 'MR'")
elif not os.path.isdir(vars(args)['datadir']):
    logger.error("Specified data directory does not exist")

SCENARIO = vars(args)['scenario']
RESOLUTION = vars(args)['resolution']
DATADIR = vars(args)['datadir']
os.chdir(DATADIR)

###########################################################################################
# Global variables, derived or user set                                                   #
###########################################################################################

# Filename patterns
SMCALC_PATTERN = re.compile("(%s_r1i1p1-%s_eh6_rcm_etc_)(\d{3,4})(\.grb)" % \
                            (SCENARIO, RESOLUTION))
DVCONV_PATTERN = re.compile("(%s_r1i1p1-%s_eh6_rcm_c5_)(\d{3,4})(\.grb)" % \
                            (SCENARIO, RESOLUTION))
FILENAME_PATTERN = re.compile("(.+_)(\d{3,4})(\.grb)")

# Input and output filename/pattern for static data
STATIC_DATA_INPUT_FILE = os.path.join(DATADIR, "static/%s-%s_rcm_fx_1.grb" % \
                                      (SCENARIO, RESOLUTION))
STATIC_DATA_OUTPUT_FILE = "%s-%s_rcm_static.grb" % (SCENARIO, RESOLUTION)
if not os.path.exists(STATIC_DATA_INPUT_FILE):
    logger.error("Static data input file '%s' does not exist" % STATIC_DATA_INPUT_FILE)

# Model timestep in hours
MODEL_TIMESTEP = 6

# Soil moisture levels and weights for interpolation from the integrated column;
# the way this is done at the moment only makes sense with equal weights 1 for all
SM_LEVELS = {
      3 : 1.0,
     19 : 1.0,
     78 : 1.0,
    268 : 1.0,
    698 : 1.0,
    }

# Files are numbered by month, need to know which month is the first one.
if SCENARIO in ['rcp45', 'rcp85']:
    # For rcpXY: 0001 = 2006/01; 1140 = 2100/12;
    STARTDATE = datetime.date(year = 2006, month = 1, day = 1)
elif SCENARIO == 'hist':
    # For hist: 001 = 1949/01; 684 = 2005/12;
    STARTDATE = datetime.date(year = 1949, month = 1, day = 1)
else:
    logger.error("Unknown scenario %s" % SCENARIO)

###########################################################################################
# Subroutines                                                                             #
###########################################################################################

def unpack():
    """Copy .gz files from DATADIR to subdirectory gz , extract .gz files in DATADIR"""
    if not os.path.isdir("gz"):
        os.makedirs("gz")
    files = os.listdir(DATADIR)
    for ffile in files:
        if not(ffile.endswith(".gz")):
            continue
        logger.info("Copying file %s to gz/" % ffile)
        os.system('cp -f %s gz/' % ffile)
        logger.info("Unpacking file %s" % ffile)
        os.system('gunzip -f %s' % ffile)
    
def convert():
    """Convert relative vorticity and divergence of wind to u and v wind components
    and convert u, v and air temperature from spherical harmonics to regular grid"""
    files = os.listdir(DATADIR)
    for ffile in files:
        if not(ffile.endswith(".grb")):
            continue
        fnmatch = DVCONV_PATTERN.match(ffile)
        if not fnmatch:
            continue
        prefix = fnmatch.group(1)
        infix = fnmatch.group(2)
        suffix = fnmatch.group(3)
        
        logger.info("Converting file %s" % ffile)
        tmpfile = ffile.replace(".grb", ".tmp")
        if os.path.isfile(tmpfile):
            os.remove(tmpfile)
        cnvfile = prefix + "cnv_" + infix + suffix
        if os.path.isfile(cnvfile):
            os.remove(cnvfile)
        
        status = os.system("cdo dv2uv %s %s" % (ffile, tmpfile))
        status += os.system("cdo sp2gp %s %s" % (tmpfile, cnvfile))
        
        if not status == 0:
            logger.error("Error during conversion of %s" % ffile)
        
        # Remove temporary and original grib file
        os.remove(tmpfile)
        os.remove(ffile)
    
def read_static_data():
    """Read required information soil_moisture_at_field_capacity and 
    sea_land_mask from the static data file for later calculations"""
    grbs = pygrib.open(STATIC_DATA_INPUT_FILE)
    grbs.seek(0)
    for grb in grbs:
        if grb.indicatorOfParameter == 11:
            sm_at_field_capacity = grb.values
        elif grb.indicatorOfParameter == 4:
            sea_land_mask = grb.values
    grbs.close()
    return (sm_at_field_capacity, sea_land_mask)

def calc(sm_at_field_capacity, sea_land_mask):
    """Calculate 5-layer soil moisture field"""
    files = os.listdir(DATADIR)
    for ffile in files:
        if not(ffile.endswith(".grb")):
            continue

        fnmatch = SMCALC_PATTERN.match(ffile)
        if not fnmatch:
            continue
        prefix = fnmatch.group(1)
        infix = fnmatch.group(2)
        suffix = fnmatch.group(3)
        
        #logger.info("Calculating soil moisture and filling skintemp for %s" % ffile)
        logger.info("Calculating soil moisture for %s" % ffile)
        smfile = prefix + "calc_" + infix + suffix
        if os.path.isfile(smfile):
            os.remove(smfile)
        grbout = open(smfile, "w")
        grbs = pygrib.open(os.path.join(DATADIR, ffile))
        grbs.seek(0)
        #have_sst = False
        #have_skt = False
        for grb in grbs:
            if grb.indicatorOfParameter == 140:
                sm_values = grb.values/sm_at_field_capacity
                grb.indicatorOfParameter = 142
                grb.indicatorOfTypeOfLevel = 'sfc'
                grb.typeOfLevel = 'depthBelowLand'
                for level in list(SM_LEVELS.keys()):
                    grb.level = level
                    grb.values = sm_values*SM_LEVELS[level]
                    msg = grb.tostring()
                    grbout.write(msg)
            #elif grb.indicatorOfParameter == 139:
            #    skt_grb = grb
            #    have_skt = True
            #elif grb.indicatorOfParameter == 103:
            #    sst_grb = grb
            #    have_sst = True
            #if have_sst and have_skt:
            #    values = skt_grb.values*sea_land_mask + sst_grb.values*(1-sea_land_mask)
            #    skt_grb.values = values
            #    skt_grb.indicatorOfParameter = 143
            #    msg = skt_grb.tostring()
            #    grbout.write(msg)
            #    del sst_grb
            #    del skt_grb
            #    have_sst = False
            #    have_skt = False
        grbs.close()
        grbout.close()
        
def add_months(date, months):
    """Helper function to add number of months to a datetime object"""
    year = date.year
    month = date.month + months + 1
    dyear, month = divmod(month - 1, 12)
    rdate = datetime.date(year + dyear, month + 1, 1) - datetime.timedelta(1)
    return rdate.replace(day = min(rdate.day, date.day))

def sort():
    """Sort the grb files by year and month into subdirectories YYYY/MM.
    Return a dict with monthly counter as key and corresponding date as value"""
    dates = {}
    files = os.listdir(DATADIR)
    for ffile in files:
        if not(ffile.endswith(".grb")):
            continue

        fnmatch = FILENAME_PATTERN.match(ffile)
        if not fnmatch:
            print("ERROR, non-matching filename %s" % ffile)
            sys.exit(-1)
            
        prefix = fnmatch.group(1)
        infix = fnmatch.group(2)
        suffix = fnmatch.group(3)
        mcounter = int(infix) - 1

        date = add_months(STARTDATE, mcounter)
        targetdir = os.path.join(DATADIR, date.strftime("%Y/%m"))
        if not os.path.isdir(targetdir):
            os.makedirs(targetdir)

        logger.info("Moving %s to %s" % (ffile, targetdir))
        status = os.system("mv %s %s" % (ffile, targetdir))
        if not status == 0:
            logger.error("Error moving %s to %s" % (ffile, targetdir))
        
        if not date in dates:
            dates[mcounter] = date

    return dates
    
def add_static_data(dates):
    """Add the static data to each subdirectory YYYY/MM and modify the
    date and time values in the grb messages so that ungrib will be happy."""
    grbs = pygrib.open(STATIC_DATA_INPUT_FILE)
    for mcounter in list(dates.keys()):
        date = dates[mcounter]
        grbout_file = os.path.join(DATADIR, date.strftime("%Y/%m"),
                                   STATIC_DATA_OUTPUT_FILE)
        ### Need datetime instead of date
        startdate = datetime.datetime.combine(date, datetime.time())
        enddate = datetime.datetime.combine(add_months(startdate, 1), datetime.time())
        logger.info("Adding static data for range %s to %s" % (
                startdate.strftime("%Y%m%d"), enddate.strftime("%Y%m%d")))
        
        grbout = open(grbout_file, "w")
        current_date = startdate
        while current_date < enddate:
            grbs.seek(0)
            for grb in grbs:
                grb.dataTime = int(current_date.strftime("%H"))*100
                grb.dataDate = int(current_date.strftime("%Y%m%d"))
                grb.P1 = 0
                msg = grb.tostring()
                grbout.write(msg)
            current_date += datetime.timedelta(hours=MODEL_TIMESTEP)        
        grbout.close()
    grbs.close()

def merge_data(dates):
    """Merge the data in each subdirectory YYYY/MM/*.grb into one file
    YYYY/YYYY_MM.grb and delete the subdirectory after a successful merge"""
    for mcounter in list(dates.keys()):
        date = dates[mcounter]
        infiles = date.strftime("%Y/%m/*.grb")
        outfile = date.strftime("%Y/%Y_%m.grb")
        if os.path.isfile(outfile):
            os.remove(outfile)
        cmd = "cdo merge %s %s" % (infiles, outfile)
        logger.info("Merging data from %s to %s" % (infiles, outfile))
        status = os.system(cmd)
        if not status == 0:
            logger.error("Error merging data from %s to %s" % (infiles, outfile))
        # Remove folder containing split data after successful merge
        shutil.rmtree(os.path.join(DATADIR, date.strftime("%Y/%m")))

def fix_skt_sst(dates):
    for mcounter in list(dates.keys()):
        date = dates[mcounter]
        cdir = os.getcwd()
        fdir = os.path.join(cdir, date.strftime("%Y"))
        fname = date.strftime("%Y_%m.grb")
        logger.info("Fixing skt and sst fields in %s" % (os.path.join(fdir, fname)))
        
        os.chdir(fdir)
        (prefix, suffix) = fname.split(".")
        command = """cdo -O selcode,4 {0}.grb {0}_004.grb && \
cdo -O selcode,103 {0}.grb {0}_103.grb && \
cdo -O selcode,139 {0}.grb {0}_139.grb && \
cdo -O ifnotthen {0}_004.grb {0}_103.grb {0}_144a.grb && \
cdo -O ifthenelse {0}_004.grb {0}_139.grb {0}_144a.grb {0}_143a.grb && \
cdo -O setcode,143 {0}_143a.grb {0}_143b.grb && \
cdo -O setcode,144 {0}_144a.grb {0}_144b.grb && \
cdo -O merge {0}.grb {0}_143b.grb {0}_144b.grb {0}_final.grb && \
mv {0}_final.grb {0}.grb""".format(prefix)
        status = os.system(command)

        if not status == 0:
            print("ERROR while executing '%s'" % command)
        command = "rm -f {0}_004.grb {0}_103.grb {0}_139.grb".format(prefix)
        command += " {0}_143a.grb {0}_143b.grb {0}_144a.grb {0}_144b.grb".format(prefix)
        status = os.system(command)
        if not status == 0:
            print("ERROR while executing '%s'" % command)
        os.chdir(cdir)

###########################################################################################
# Main function                                                                           #
###########################################################################################

def main():
    """Main function. Calls the subroutines and controls the data exchange"""
    (sm_at_field_capacity, sea_land_mask) = read_static_data()
    unpack()
    convert()
    calc(sm_at_field_capacity, sea_land_mask)
    dates = sort()
    add_static_data(dates)
    merge_data(dates)
    fix_skt_sst(dates)
    
if __name__ == "__main__":
    main()
Reply


Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020