Python Forum
txt-file: read and append missing data
#1
I am trying to insert missing data into a large text file.

The file contains information on different types of objects (curves, points and text). Each object is marked with a header, either ".CURVE XXXXX:", ".POINT XXXXX:" or ".TEXT XXXXX:", where the XXXXX is the ID number for each object.

I want to search the lines that describe each curve (the header line starts with .CURVE) and check whether they include lines that start with "...A_Z" and "...B_Z". If they don't, I want to append "...A_Z 0.00" and "...B_Z 0.00", change the line "..NE" to "..NEZ", and then append 0.00 to the end of every line between "..NEZ" and the next header line (which starts with either ".CURVE", ".POINT" or ".TEXT").

If this is part of the text that describes a curve:
Output:
.CURVE XXXXX:
..OBJTYPE Pipe
..QUAL * * * * *
...NR XXXXX
...BB A
...THEME AA
...THEMGRP X
...LENGTH XX.XX
...TR X
...KEY A
..NE
XXXXXXXXX XXXXXXXX
XXXXXXXXX XXXXXXXX
XXXXXXXXX XXXXXXXX
.CURVE XXXXX: #This line marks the start of a new curve/object
I want it to look like this after I have run the code (the additions are marked with [b]...[/b]):
Output:
.CURVE XXXXX:
..OBJTYPE Pipe
..QUAL * * * * *
...NR XXXXX
...BB A
...THEME AA
...THEMGRP X
...LENGTH XX.XX
...TR X
...KEY A
[b]...A_Z 0.00[/b]
[b]...B_Z 0.00[/b]
..NE[b]Z[/b]
XXXXXXXXX XXXXXXXX [b]0.00[/b]
XXXXXXXXX XXXXXXXX [b]0.00[/b]
XXXXXXXXX XXXXXXXX [b]0.00[/b]
.CURVE XXXXX: #This line marks the start of a new curve/object
I tried to use the code below, until I found out that the lines describing each object are not written in a fixed order. This means that the line starting with "...KEY" is not reliable as a marker. I think what I have to do is split the text file into separate parts, one per header, read the parts separately, append the missing data (if any), and then put all the parts back together. But I am not really sure how to do that. Ideas are most welcome!

import itertools

# Read the file two lines at a time and patch in missing ...A_Z / ...B_Z lines.
# This relies on ...KEY being the last header line, which does not always hold.
with open('C:/Users/sufi/Desktop/info.txt', 'r') as infile, \
        open('C:/Users/sufi/Desktop/info_edit.txt', 'w') as outfile:
    # fillvalue='' keeps line2 a string when the file has an odd number of lines
    for line1, line2 in itertools.zip_longest(*[infile] * 2, fillvalue=''):
        if line1.startswith('...A_Z') and line2.startswith('..N'):
            outfile.write(line1 + '...B_Z 0.00\n' + line2)
        elif line1.startswith('...KEY') and line2.startswith('...B_Z'):
            outfile.write(line1 + '...A_Z 0.00\n' + line2)
        elif line1.startswith('...KEY') and line2.startswith('..N'):
            outfile.write(line1 + '...A_Z 0.00\n...B_Z 0.00\n' + line2)
        else:
            outfile.write(line1 + line2)
#2
You need to parse the input one way or another. Here is a simple parser. It reads the input line by line and calls a method line_event() for each line. Initially this method simply copies the line to the output file. When it meets a line starting with .CURVE, it switches the method to line_event_curve_header(), which copies lines until it meets the ..NE line and records along the way whether a line starting with ...A_Z or ...B_Z was seen. If neither was seen, it writes the missing ...A_Z and ...B_Z lines, writes ..NEZ in place of ..NE, and switches to line_event_curve_body(), which appends 0.00 to every line until it meets a line starting a new object.

Here is the (untested) code
import re

class Parser:
    object_pattern = re.compile(r'^[.](CURVE|POINT|TEXT)')
    
    def __init__(self, infile, outfile):
        self.infile = infile
        self.outfile = outfile
        self.line_event = self.line_event_base
        
    def run(self):
        for line in self.infile:
            self.line_event(line)
        self.outfile.flush()
        
    def line_event_base(self, line):
        if line.startswith('.CURVE'):
            self.start_curve(line)
        else:
            self.outfile.write(line)
            
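    def line_event_curve_header(self, line):
        # Inside a curve, before its ..NE line (the attribute is outfile, not output)
        if line.startswith('...A_Z') or line.startswith('...B_Z'):
            self.seen_ab_z = True
            self.outfile.write(line)
        elif line.rstrip('\n') == '..NE':
            if self.seen_ab_z:
                # Z data already present: copy the rest of the curve unchanged
                self.line_event = self.line_event_base
                self.outfile.write(line)
            else:
                # Z data missing: add it and mark the coordinate list as ..NEZ
                self.outfile.write('...A_Z 0.00\n...B_Z 0.00\n..NEZ\n')
                self.line_event = self.line_event_curve_body
        else:
            self.outfile.write(line)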
            
    def line_event_curve_body(self, line):
        if self.object_pattern.match(line):
            if line.startswith('.CURVE'):
                self.start_curve(line)
            else:
                self.line_event = self.line_event_base
                self.outfile.write(line)
        else:
            self.outfile.write(line.rstrip('\n'))
            self.outfile.write(' 0.00\n')
                
    def start_curve(self, line):
        self.line_event = self.line_event_curve_header
        self.seen_ab_z = False
        self.line_event(line)

if __name__ == '__main__':
    with open('C:/Users/sufi/Desktop/info.txt','r') as infile,\
            open('C:/Users/sufi/Desktop/info_edit.txt','w') as outfile:
        Parser(infile, outfile).run()
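The idea from the first post (split the file into one part per header, fix each part, then join the parts again) can also be sketched directly. This is only a rough sketch under a few assumptions: the names split_objects, fix_curve and fix_lines are made up for illustration, a curve that already has either ...A_Z or ...B_Z is left untouched (same as the parser above), and the coordinate list is introduced by a line that is exactly "..NE".

```python
import re

# Header lines look like ".CURVE 123:", ".POINT 123:" or ".TEXT 123:"
HEADER = re.compile(r'^\.(CURVE|POINT|TEXT)\b')


def split_objects(lines):
    """Yield one list of lines per object (hypothetical helper).

    Lines that come before the first header form a block of their own.
    """
    block = []
    for line in lines:
        if HEADER.match(line) and block:
            yield block
            block = []
        block.append(line)
    if block:
        yield block


def fix_curve(block):
    """Return the block with Z data added if it is a curve that lacks it."""
    if not block[0].startswith('.CURVE'):
        return block  # points, texts and any preamble are copied as-is
    if any(l.startswith(('...A_Z', '...B_Z')) for l in block):
        return block  # assumption: Z data already present, nothing to do
    stripped = [l.rstrip('\n') for l in block]
    if '..NE' not in stripped:
        return block  # no coordinate list found, leave the block alone
    ne = stripped.index('..NE')
    fixed = block[:ne]
    fixed += ['...A_Z 0.00\n', '...B_Z 0.00\n', '..NEZ\n']
    # Append 0.00 to every coordinate line; blank lines stay blank
    fixed += [(l + ' 0.00\n') if l.strip() else '\n' for l in stripped[ne + 1:]]
    return fixed


def fix_lines(lines):
    """Split into objects, fix each one, and yield the lines back in order."""
    for block in split_objects(lines):
        yield from fix_curve(block)


# Small in-memory example with made-up data
sample = [
    '.CURVE 1:\n',
    '...KEY A\n',
    '..NE\n',
    '100 200\n',
    '300 400\n',
    '.POINT 2:\n',
    '..NE\n',
    '500 600\n',
]
print(''.join(fix_lines(sample)), end='')
```

Because fix_lines() accepts any iterable of lines, it should also work on the real file with outfile.writelines(fix_lines(infile)) inside a with open(...) block like the one above. Only one object is held in memory at a time, so the size of the file should not matter much.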