Python Forum
read complex file with both pandas and not
#1
Dear all,

I have this complex file:

# ID ,116
# Localita  ,SB16
# Lon/Lat ,11.138574,46.886774
# Quota ,1839
DATA ORA, T, RH,PSFC,DIR,VEL10, PREC, RAD, CC,FOG
yyyy-mm-dd hh:mm, °C, %, hPa, °N, m/s, mm/h,W/m², %,-
2012-01-01 06:00, -0.1,100, 815,313, 2.6, 0.0, 0, 0,0
2012-01-01 07:00, -1.2, 93, 814,314, 4.8, 0.0, 0, 0,0
2012-01-01 08:00, 1.7, 68, 815,308, 7.5, 0.0, 41, 11,0
2012-01-01 09:00, 2.4, 65, 815,308, 7.4, 0.0, 150, 33,0
2012-01-01 10:00, 3.0, 64, 816,305, 8.4, 0.0, 170, 44,0
2012-01-01 11:00, 2.6, 65, 816,303, 6.3, 0.0, 321, 22,0
....
....

I would like to read the value 1839 (from the "# Quota" line) and store it in a variable. After that, I would like to read all the data below the lines starting with # using pandas and store it in a DataFrame. However, I would like to use "DATA ORA, T, RH,PSFC,DIR,VEL10, PREC, RAD, CC,FOG" as the header and leave out the sixth line (the units).

I am able to read the file with pandas, but only if I skip the first rows and delete the sixth line from the file.

What do you think? Would it be better to switch to a simpler file format?
Thanks in advance for any help,

Diedro
#2
This is quite simple. You only need pandas if you want to save the data in a different format or want a better-looking report.
** Note ** The code uses f-strings and requires Python 3.6 or newer.
import csv
import os


# Change to the script's directory so the data file is found there
os.chdir(os.path.abspath(os.path.dirname(__file__)))

def read_data(filename, delimiter=','):
    # the delimiter can be overridden if necessary
    headerfound = False
    with open(filename, newline='') as csvdata:
        reader = csv.reader(csvdata, delimiter=delimiter)
        for row in reader:
            if not row:
                continue                        # skip blank lines
            if row[0].startswith('#'):
                key = row[0].lstrip('#').strip()    # e.g. 'ID', 'Quota'
                if key == 'ID':
                    station_id = row[1]         # This is the id
                    print(f'\nId = {station_id}')
                elif key == 'Quota':
                    quota = row[1]              # This is the quota
                    print(f'Quota: {quota}')
                    headerfound = True          # the next row is the header
                continue
            elif headerfound:
                # this is header for pandas
                print('---------------------------------------------------------------------' \
                      '---------------------------------------------------------------------' \
                      '-----------')
                for item in row:
                    print(f'{item:16}', end = '')
                print('\n---------------------------------------------------------------------' \
                      '---------------------------------------------------------------------' \
                      '-----------')
                headerfound = False
            else:
                data = row              # each iteration here is a row to insert into pandas
                for item in row:
                    print(f'{item:16}', end = '')
                print()


if __name__ == '__main__':
    read_data('cvsdata.csv')
Output:

Id = 116
Quota: 1839
-----------------------------------------------------------------------------------------------------------------------------------------------------
DATA ORA T RH PSFC DIR VEL10 PREC RAD CC FOG
-----------------------------------------------------------------------------------------------------------------------------------------------------
yyyy-mm-dd hh:mm °C % hPa °N m/s mm/h W/m² % -
2012-01-01 06:00 -0.1 100 815 313 2.6 0.0 0 0 0
2012-01-01 07:00 -1.2 93 814 314 4.8 0.0 0 0 0
2012-01-01 08:00 1.7 68 815 308 7.5 0.0 41 11 0
2012-01-01 09:00 2.4 65 815 308 7.4 0.0 150 33 0
2012-01-01 10:00 3.0 64 816 305 8.4 0.0 170 44 0
2012-01-01 11:00 2.6 65 816 303 6.3 0.0 321 22 0
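
If a DataFrame is wanted after all, the same idea can be sketched with pandas: read the leading # lines by hand, then let read_csv skip them together with the units line. This is only a sketch under some assumptions: the function name read_station and the SAMPLE string are made up for illustration, the metadata lines are assumed to sit in one block at the top of the file, and the units line is assumed to come directly after the header.

```python
import io

import pandas as pd

# Shortened copy of the file from the first post (illustration only)
SAMPLE = """\
# ID ,116
# Localita  ,SB16
# Lon/Lat ,11.138574,46.886774
# Quota ,1839
DATA ORA, T, RH,PSFC,DIR,VEL10, PREC, RAD, CC,FOG
yyyy-mm-dd hh:mm, °C, %, hPa, °N, m/s, mm/h,W/m², %,-
2012-01-01 06:00, -0.1,100, 815,313, 2.6, 0.0, 0, 0,0
2012-01-01 07:00, -1.2, 93, 814,314, 4.8, 0.0, 0, 0,0
2012-01-01 08:00, 1.7, 68, 815,308, 7.5, 0.0, 41, 11,0
"""


def read_station(source):
    """Return (meta, df) from an open text file or file-like object.

    Hypothetical helper: meta maps each '# key' to its list of values,
    df holds the data rows with 'DATA ORA, T, ...' as column names.
    """
    text = source.read()

    # Collect the leading '# key ,value[,value]' lines into a dict
    meta = {}
    n_meta = 0
    for line in text.splitlines():
        if not line.startswith('#'):
            break
        key, *values = [part.strip() for part in line.lstrip('#').split(',')]
        meta[key] = values
        n_meta += 1

    # Skip the metadata lines and the units line (the one after the header);
    # the first line that is not skipped becomes the header
    df = pd.read_csv(
        io.StringIO(text),
        skiprows=list(range(n_meta)) + [n_meta + 1],
        skipinitialspace=True,
    )
    df.columns = df.columns.str.strip()
    df['DATA ORA'] = pd.to_datetime(df['DATA ORA'])
    return meta, df


if __name__ == '__main__':
    meta, df = read_station(io.StringIO(SAMPLE))
    quota = int(meta['Quota'][0])
    print(quota)
    print(df.head())
```

With a real file, something like `with open('cvsdata.csv', encoding='utf-8') as f: meta, df = read_station(f)` should do the same; the encoding is a guess and may need adjusting because of the ° and ² characters in the units line.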