txt-file: read and append missing data

**sufi** · (This post was last modified: Dec-06-2019, 05:44 PM by Larz60+.)

I am trying to insert missing data into a large text file.

The file contains information on different types of objects (curves, points and text). Each object is marked with a header, either ".CURVE XXXXX:", ".POINT XXXXX:" or ".TEXT XXXXX:", where the XXXXX is the ID number for each object.

I want to go through the lines that describe each curve (the header line starts with .CURVE) and check whether they include lines that start with "...A_Z" and "...B_Z". If they don't, I want to add "...A_Z 0.00" and "...B_Z 0.00", change the line "..NE" to "..NEZ", and then append 0.00 to the end of every line between "..NEZ" and the next header line (which starts with ".CURVE", ".POINT" or ".TEXT").

If this is a part of the text that describes a curve:

Output:
.CURVE XXXXX:
..OBJTYPE Pipe
..QUAL * * * * *
...NR XXXXX
...BB A
...THEME AA
...THEMGRP X
...LENGTH XX.XX
...TR X
...KEY A
..NE
XXXXXXXXX XXXXXXXX
XXXXXXXXX XXXXXXXX
XXXXXXXXX XXXXXXXX
.CURVE XXXXX: #This line marks the start of a new curve/object

I want it to look like this after I have run the code:

Output:
.CURVE XXXXX:
..OBJTYPE Pipe
..QUAL * * * * *
...NR XXXXX
...BB A
...THEME AA
...THEMGRP X
...LENGTH XX.XX
...TR X
...KEY A
...A_Z 0.00
...B_Z 0.00
..NEZ
XXXXXXXXX XXXXXXXX 0.00
XXXXXXXXX XXXXXXXX 0.00
XXXXXXXXX XXXXXXXX 0.00
.CURVE XXXXX: #This line marks the start of a new curve/object

I tried to use the code below, until I found out that the lines describing each object are not written in a fixed order. This means that the line starting with "...KEY" is not reliable as a marker. I think I have to split the text file into separate parts, one per header, read each part separately, append the missing data (if any is missing), and then put all the parts back together. But I am not really sure how to do that. Ideas are most welcome!

import itertools

# Read the file two lines at a time; fillvalue='' avoids a None on an odd line count.
with open('C:/Users/sufi/Desktop/info.txt', 'r') as infile, \
        open('C:/Users/sufi/Desktop/info_edit.txt', 'w') as outfile:
    for line1, line2 in itertools.zip_longest(*[infile] * 2, fillvalue=''):
        if line1.startswith('...A_Z') and line2.startswith('..N'):
            outfile.write(line1 + '...B_Z 0.00\n' + line2)
        elif line1.startswith('...KEY') and line2.startswith('...B_Z'):
            outfile.write(line1 + '...A_Z 0.00\n' + line2)
        elif line1.startswith('...KEY') and line2.startswith('..N'):
            outfile.write(line1 + '...A_Z 0.00\n...B_Z 0.00\n' + line2)
        else:
            outfile.write(line1 + line2)
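
A side note on the loop above: `zip_longest(*[f] * 2)` passes the same file iterator twice, so it yields non-overlapping pairs of lines (1-2, 3-4, and so on). A "...KEY" line that lands on an even line number is therefore never paired with the "..NE" line that follows it, whatever the order of the other lines. A minimal sketch with an in-memory file (the sample text and the helper name `pairs` are made up for illustration):

```python
import io
import itertools


def pairs(text):
    """Pair up lines the same way the loop above does."""
    f = io.StringIO(text)  # stands in for the open file
    return list(itertools.zip_longest(*[f] * 2))


# '...KEY A' is line 1 here, so it lands in the same pair as '..NE'.
aligned = pairs("...KEY A\n..NE\n1 2\n")

# One extra line in front shifts everything: '...KEY A' and '..NE'
# now fall into different pairs and are never compared with each other.
shifted = pairs(".CURVE 1:\n...KEY A\n..NE\n1 2\n")

print(aligned)
print(shifted)
```

So even with a fixed line order, a pairwise read would miss roughly half of the places where data needs to be inserted.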

**Gribouillis** · (This post was last modified: Dec-07-2019, 08:13 AM by Gribouillis.)

You need to parse the input one way or another. Here is a simple parser. It reads the input lines and calls a method line_event() for each one. Initially this method simply writes the line to the output file. When it meets a line starting with .CURVE, it switches the method to line_event_curve_header(), which writes the lines out until it meets the line ..NE, noting along the way whether any line starts with ...A_Z or ...B_Z. At the ..NE line, if neither ...A_Z nor ...B_Z has been seen, it writes the missing lines followed by ..NEZ, then switches the method to line_event_curve_body(), which appends 0.00 to each line until it meets a line that starts a new object.

Here is the (untested) code

import re

class Parser:
    object_pattern = re.compile(r'^[.](CURVE|POINT|TEXT)')
    
    def __init__(self, infile, outfile):
        self.infile = infile
        self.outfile = outfile
        self.line_event = self.line_event_base
        
    def run(self):
        for line in self.infile:
            self.line_event(line)
        self.outfile.flush()
        
    def line_event_base(self, line):
        if line.startswith('.CURVE'):
            self.start_curve(line)
        else:
            self.outfile.write(line)
            
    def line_event_curve_header(self, line):
        if self.object_pattern.match(line):
            # A new object began before any plain '..NE' line (for example the
            # curve already uses '..NEZ'): hand the line back to the base handler.
            self.line_event = self.line_event_base
            self.line_event(line)
        elif line.startswith(('...A_Z', '...B_Z')):
            self.seen_ab_z = True
            self.outfile.write(line)
        elif line.rstrip('\n') == '..NE':
            if self.seen_ab_z:
                self.line_event = self.line_event_base
                self.outfile.write(line)
            else:
                self.outfile.write('...A_Z 0.00\n...B_Z 0.00\n..NEZ\n')
                self.line_event = self.line_event_curve_body
        else:
            self.outfile.write(line)

    def line_event_curve_body(self, line):
        if self.object_pattern.match(line):
            if line.startswith('.CURVE'):
                self.start_curve(line)
            else:
                self.line_event = self.line_event_base
                self.outfile.write(line)
        else:
            self.outfile.write(line.rstrip('\n'))
            self.outfile.write(' 0.00\n')

    def start_curve(self, line):
        # Write the header line here so the header handler never sees it.
        self.line_event = self.line_event_curve_header
        self.seen_ab_z = False
        self.outfile.write(line)

if __name__ == '__main__':
    with open('C:/Users/sufi/Desktop/info.txt','r') as infile,\
            open('C:/Users/sufi/Desktop/info_edit.txt','w') as outfile:
        Parser(infile, outfile).run()
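
For comparison, the split-into-parts idea from the question could be sketched roughly as follows. This is only an illustration under a few assumptions that the thread does not confirm: a block is changed only when it is a curve, contains a plain `..NE` line, and lacks at least one of `...A_Z` / `...B_Z`; whichever of the two is missing is added just before `..NEZ`. The names `split_blocks`, `fix_curve` and `fix_lines` are made up for this sketch.

```python
import re

# Same three header kinds as in the parser above.
HEADER = re.compile(r'^\.(CURVE|POINT|TEXT)\b')


def split_blocks(lines):
    """Group lines into blocks, each starting at an object header.

    Lines before the first header (if any) form a block of their own.
    """
    blocks = []
    for line in lines:
        if HEADER.match(line) or not blocks:
            blocks.append([])
        blocks[-1].append(line)
    return blocks


def fix_curve(block):
    """Return the block with missing A_Z/B_Z lines and Z values added."""
    if not block[0].startswith('.CURVE'):
        return block
    missing = [key for key in ('...A_Z', '...B_Z')
               if not any(line.startswith(key) for line in block)]
    marker = next((i for i, line in enumerate(block)
                   if line.rstrip() == '..NE'), None)
    # Assumption: leave the block alone if nothing is missing or if
    # there is no plain '..NE' line (e.g. it already says '..NEZ').
    if not missing or marker is None:
        return block
    header, coords = block[:marker], block[marker + 1:]
    added = [key + ' 0.00\n' for key in missing]
    # Append the Z value to every non-blank coordinate line.
    coords = [line.rstrip('\n') + ' 0.00\n' if line.strip() else line
              for line in coords]
    return header + added + ['..NEZ\n'] + coords


def fix_lines(lines):
    """Split into blocks, repair each curve, and join everything again."""
    result = []
    for block in split_blocks(lines):
        result.extend(fix_curve(block))
    return result
```

It could be used as `outfile.writelines(fix_lines(infile))` with the two files opened as in the `__main__` section above. Because each block is examined as a whole, the order of the lines inside the header no longer matters, at the cost of holding the whole file in memory.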
