Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
coding help
#1
hello i am trying to parse the data saved in a csv file so that i can arrange them in column and rows, i have tried to create dataframe with pandas but it does not work for me. i need every string before : sign to be a column and a string after : sign to be a row.
the content of the file is as follows below

History file
Output:
CREATEDATE, MSG 2021-08-13 00:03:12.349, (S:ARUSHA_NATIONAL_PARK;D:210812;T:210000;BAT:12.2;DP:11.8;DP1HA:11.8;DP1HX:12.0;DP1HM:11.3;PA:837.4;PA1HA:837.6;PA1HX:837.8;PA1HM:837.4;PR:0.0;PR1HS:0.0;PR24HS:0.0;PR5MS00:0.0;PR5MS05:0.0;PR5MS10:0.0;PR5MS15:0.0;PR5MS20:0.0;PR5MS25:0.0;PR5MS30:0.0;PR5MS35:0.0;PR5MS40:0.0;PR5MS45:0.0;PR5MS50:0.0;PR5MS55:0.0;RH:95;RH1HA:94;RH1HX:95;RH1HM:92;SR:-1;SR1HA:0;SR1HX:0;SR1HM:-1;TA:12.6;TA1HA:12.8;TA1HX:13.0;TA1HM:12.4;WD:184;WD2MA:196;WD10MA:183;WD2MX:229;WD10MX:242;WD2MM:139;WD10MM:127;WD1HA:190;WD1HX:242;WD1HM:127;WS:3.6;WS2MA:2.7;WS10MA:2.6;WS2MX:4.5;WS10MX:5.0;WS2MM:0.8;WS10MM:0.5;QFE:837.5;QFE1HA:837.8;QFE1HX:837.9;QFE1HM:837.5;QFF:1021.6;QFF1HA:1021.8;QFF1HX:1022.2;QFF1HM:1021.5;QNH:1023.7;QNH1HA:1024.0;QNH1HX:1024.2;QNH1HM:1023.7;a:0;p:0.3;ETO:1.32) 2021-08-13 01:03:13.118, (S:ARUSHA_NATIONAL_PARK;D:210812;T:220000;BAT:12.2;DP:12.2;DP1HA:12.0;DP1HX:12.2;DP1HM:11.8;PA:837.0;PA1HA:837.2;PA1HX:837.4;PA1HM:837.0;PR:0.0;PR1HS:0.0;PR24HS:0.0;PR5MS00:0.0;PR5MS05:0.0;PR5MS10:0.0;PR5MS15:0.0;PR5MS20:0.0;PR5MS25:0.0;PR5MS30:0.0;PR5MS35:0.0;PR5MS40:0.0;PR5MS45:0.0;PR5MS50:0.0;PR5MS55:0.0;RH:98;RH1HA:97;RH1HX:99;RH1HM:95;SR:0;SR1HA:-1;SR1HX:0;SR1HM:-1;TA:12.5;TA1HA:12.4;TA1HX:12.6;TA1HM:12.2;WD:161;WD2MA:171;WD10MA:173;WD2MX:194;WD10MX:231;WD2MM:137;WD10MM:137;WD1HA:176;WD1HX:243;WD1HM:97;WS:3.2;WS2MA:2.8;WS10MA:2.4;WS2MX:5.1;WS10MX:5.1;WS2MM:1.0;WS10MM:0.4;QFE:837.2;QFE1HA:837.4;QFE1HX:837.5;QFE1HM:837.2;QFF:1021.3;QFF1HA:1021.6;QFF1HX:1021.8;QFF1HM:1021.3;QNH:1023.3;QNH1HA:1023.6;QNH1HX:1023.7;QNH1HM:1023.3;a:8;p:-0.6;ETO:1.33) 2021-08-13 02:03:12.663, (S:ARUSHA_NATIONAL_PARK;D:210812;T:230000;BAT:12.1;DP:12.2;DP1HA:12.2;DP1HX:12.3;DP1HM:12.2;PA:836.5;PA1HA:836.8;PA1HX:837.0;PA1HM:836.5;PR:0.0;PR1HS:0.0;PR24HS:0.0;PR5MS00:0.0;PR5MS05:0.0;PR5MS10:0.0;PR5MS15:0.0;PR5MS20:0.0;PR5MS25:0.0;PR5MS30:0.0;PR5MS35:0.0;PR5MS40:0.0;PR5MS45:0.0;PR5MS50:0.0;PR5MS55:0.0;RH:100;RH1HA:99;RH1HX:100;RH1HM:98;SR:0;SR1HA:0;SR1HX:0;SR1HM:0;TA:12.3;TA1HA:12.4;TA1HX:12.5;TA1HM:12.3;WD:194;WD2MA:173;WD10MA:178;WD2MX:215;WD10MX:232;WD2MM:121;WD10MM:121;WD1HA:172;WD1HX:268;WD1HM:108;WS:2.5;WS2MA:2.2;WS10MA:2.3;WS2MX:3.5;WS10MX:4.6;WS2MM:0.9;WS10MM:0.5;QFE:836.6;QFE1HA:836.9;QFE1HX:837.2;QFE1HM:836.6;QFF:1020.7;QFF1HA:1021.0;QFF1HX:1021.3;QFF1HM:1020.7;QNH:1022.6;QNH1HA:1023.0;QNH1HX:1023.3;QNH1HM:1022.6;a:7;p:-1.3;ETO:1.34) 2021-08-13 03:03:12.572, (S:ARUSHA_NATIONAL_PARK;D:210813;T:000000;BAT:12.1;DP:12.2;DP1HA:12.2;DP1HX:12.3;DP1HM:12.2;PA:836.1;PA1HA:836.3;PA1HX:836.5;PA1HM:836.1;PR:0.0;PR1HS:0.0;PR24HS:0.0;PR5MS00:0.0;PR5MS05:0.0;PR5MS10:0.0;PR5MS15:0.0;PR5MS20:0.0;PR5MS25:0.0;PR5MS30:0.0;PR5MS35:0.0;PR5MS40:0.0;PR5MS45:0.0;PR5MS50:0.0;PR5MS55:0.0;RH:101;RH1HA:100;RH1HX:101;RH1HM:100;SR:0;SR1HA:0;SR1HX:0;SR1HM:0;TA:12.1;TA1HA:12.2;TA1HX:12.3;TA1HM:12.0;WD:169;WD2MA:175;WD10MA:171;WD2MX:201;WD10MX:230;WD2MM:150;WD10MM:109;WD1HA:175;WD1HX:230;WD1HM:109;WS:2.2;WS2MA:1.8;WS10MA:2.3;WS2MX:3.1;WS10MX:4.7;WS2MM:0.7;WS10MM:0.5;QFE:836.3;QFE1HA:836.4;QFE1HX:836.6;QFE1HM:836.2;QFF:1020.5;QFF1HA:1020.6;QFF1HX:1020.7;QFF1HM:1020.4;QNH:1022.2;QNH1HA:1022.4;QNH1HX:1022.6;QNH1HM:1022.2;a:7;p:-1.3;ETO:1.36) 2021-08-13 04:03:12.356, (S:ARUSHA_NATIONAL_PARK;D:210813;T:010000;BAT:12.1;DP:11.2;DP1HA:11.8;DP1HX:12.2;DP1HM:11.2;PA:836.1;PA1HA:836.1;PA1HX:836.2;PA1HM:836.0;PR:0.0;PR1HS:0.0;PR24HS:0.0;PR5MS00:0.0;PR5MS05:0.0;PR5MS10:0.0;PR5MS15:0.0;PR5MS20:0.0;PR5MS25:0.0;PR5MS30:0.0;PR5MS35:0.0;PR5MS40:0.0;PR5MS45:0.0;PR5MS50:0.0;PR5MS55:0.0;RH:94;RH1HA:98;RH1HX:101;RH1HM:94;SR:0;SR1HA:0;SR1HX:0;SR1HM:0;TA:12.2;TA1HA:12.1;TA1HX:12.2;TA1HM:12.0;WD:174;WD2MA:185;WD10MA:185;WD2MX:220;WD10MX:239;WD2MM:157;WD10MM:98;WD1HA:176;WD1HX:271;WD1HM:43;WS:1.4;WS2MA:2.1;WS10MA:2.4;WS2MX:3.4;WS10MX:5.2;WS2MM:0.8;WS10MM:0.4;QFE:836.2;QFE1HA:836.3;QFE1HX:836.3;QFE1HM:836.2;QFF:1020.3;QFF1HA:1020.4;QFF1HX:1020.6;QFF1HM:1020.3;QNH:1022.2;QNH1HA:1022.2;QNH1HX:1022.3;QNH1HM:1022.1;a:6;p:-0.9;ETO:1.35) 2021-08-13 05:03:12.949, (S:ARUSHA_NATIONAL_PARK;D:210813;T:020000;BAT:12.1;DP:11.0;DP1HA:11.1;DP1HX:11.3;DP1HM:11.0;PA:836.3;PA1HA:836.3;PA1HX:836.4;PA1HM:836.1;PR:0.0;PR1HS:0.0;PR24HS:0.0;PR5MS00:0.0;PR5MS05:0.0;PR5MS10:0.0;PR5MS15:0.0;PR5MS20:0.0;PR5MS25:0.0;PR5MS30:0.0;PR5MS35:0.0;PR5MS40:0.0;PR5MS45:0.0;PR5MS50:0.0;PR5MS55:0.0;RH:93;RH1HA:93;RH1HX:94;RH1HM:93;SR:0;SR1HA:0;SR1HX:0;SR1HM:0;TA:12.1;TA1HA:12.2;TA1HX:12.2;TA1HM:12.1;WD:156;WD2MA:169;WD10MA:177;WD2MX:187;WD10MX:276;WD2MM:147;WD10MM:130;WD1HA:181;WD1HX:276;WD1HM:108;WS:2.3;WS2MA:2.0;WS10MA:2.5;WS2MX:3.3;WS10MX:5.5;WS2MM:0.8;WS10MM:0.2;QFE:836.5;QFE1HA:836.4;QFE1HX:836.5;QFE1HM:836.2;QFF:1020.7;QFF1HA:1020.6;QFF1HX:1020.7;QFF1HM:1020.3;QNH:1022.5;QNH1HA:1022.4;QNH1HX:1022.5;QNH1HM:1022.2;a:5;p:-0.1;ETO:1.35)
Reply
#2
Something like
import csv
import pandas as pd

with open('history.csv') as f:
    rdr = csv.DictReader(f)
    data = []
    for row in rdr:
        record = {}
        record['CreateDate'] = row['CREATEDATE']
        fields = dict(item.split(':') for item in row[' MSG'][2:-1].split(';'))
        record.update(fields)
        data.append(record)

df = pd.DataFrame(data)
print(df)
Output:
CreateDate S D T BAT DP DP1HA ... QNH QNH1HA QNH1HX QNH1HM a p ETO 0 2021-08-13 00:03:12.349 ARUSHA_NATIONAL_PARK 210812 210000 12.2 11.8 11.8 ... 1023.7 1024.0 1024.2 1023.7 0 0.3 1.32 1 2021-08-13 01:03:13.118 ARUSHA_NATIONAL_PARK 210812 220000 12.2 12.2 12.0 ... 1023.3 1023.6 1023.7 1023.3 8 -0.6 1.33 2 2021-08-13 02:03:12.663 ARUSHA_NATIONAL_PARK 210812 230000 12.1 12.2 12.2 ... 1022.6 1023.0 1023.3 1022.6 7 -1.3 1.34 3 2021-08-13 03:03:12.572 ARUSHA_NATIONAL_PARK 210813 000000 12.1 12.2 12.2 ... 1022.2 1022.4 1022.6 1022.2 7 -1.3 1.36 4 2021-08-13 04:03:12.356 ARUSHA_NATIONAL_PARK 210813 010000 12.1 11.2 11.8 ... 1022.2 1022.2 1022.3 1022.1 6 -0.9 1.35 5 2021-08-13 05:03:12.949 ARUSHA_NATIONAL_PARK 210813 020000 12.1 11.0 11.1 ... 1022.5 1022.4 1022.5 1022.2 5 -0.1 1.35 [6 rows x 72 columns]
Note, you will need to convert some columns to numeric types in order to make calculations
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#3
Photo 
Hello Buran thanks for your help on my request and sorry for my late feedback as i was sick since last week, i could not work with computer. back to the code i have run the code from jupyter notebook but i didn't work for me, the code is the same i did change any thing appart from the file name. i have include the screen shot in the attachments for the code and traceback output for you to see.

regards

david

Attached Files

Thumbnail(s)
   
Reply
#4
Have you created a CSV with the required headings?
while dad_has_cigs == True:
    happiness = True
    if dad_has_cigs == False:
    print("Dad come home!")
    happiness = not happiness
    break
Reply
#5
(Aug-18-2021, 09:26 AM)jamesaarr Wrote: Have you created a CSV with the required headings?


what headings?
Reply


Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020