Python Forum
XML log file --> CSV
#1
I'm trying to convert a log file to CSV and am stumped.

Below is a representation of the real file. It shows two records; records are always separated by a blank line. The number of fields in the first line is not always the same, and the total number of lines per block depends on the last field of the first line.
Quote:
1 | 2 | 3
a b c
x y z

1 | 2 | 4
aa bb cc
xx yy zz

My output should be:
Output:
1 | 2 | 3 | a b c x y z
1 | 2 | 4 | aa bb cc xx yy zz
So far all I've managed to do is stick a pipe after every line, which is not really what I want (and it leaves a trailing | at the end).

#!/usr/bin/python

import sys

usage = "       usage: " + sys.argv[0] + " <inputfile> <outputfile>"

if len(sys.argv) != 3:
    print(usage)
    sys.exit(1)

inFile = sys.argv[1]
outFile = sys.argv[2]

outStr = ''
with open(inFile) as f:
    for line in f:
        line = line.strip('\n')
        if len(line.strip()) == 0:
            # Blank line: end the current record
            outStr += (line + '\n')
        else:
            # Any other line: append it followed by a pipe
            outStr += (line + '|')

with open(outFile, 'w') as f:
    f.write(outStr.strip())


Here's the output from running it:
Output:
1 | 2 | 3|a b c|x y z|
1 | 2 | 4|aa bb cc|xx yy zz|
Am I on the right track? Any better ways to do this?
#2
Your output doesn't make sense if you're generating CSV.

That said, string concatenation inside a loop is normally frowned upon for performance reasons; collecting the pieces in a list and joining them at the end is the usual suggestion. In your case, joining also has the added benefit of removing the trailing pipe.
Here's some completely untested code, which should be pretty close to right:
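# Assumes inFile and outFile are set as in the original script (Python 3).
with open(inFile) as f, open(outFile, "w") as fout:
    parts = []
    for line in f:
        line = line.strip()
        if line:
            parts.append(line)
        elif parts:
            # A blank line ends the current record; repeated blank lines are skipped.
            print("|".join(parts), file=fout)
            parts = []
    # Write the last record if the file doesn't end with a blank line.
    if parts:
        print("|".join(parts), file=fout)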
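The code above joins every line of a record with a pipe, which is not quite the output requested in post #1. As a rough sketch of one way to get that output, the variant below keeps the first line of each block as-is and appends the remaining lines, joined by spaces, as one extra field. The names `blocks_to_rows` and `format_block` are made up for illustration, and reading " | " as the separator before the last field is an assumption based on the expected output shown in the question.

```python
import io


def format_block(block):
    """Format one record: first line, then the other lines as a single field."""
    head, rest = block[0], block[1:]
    if not rest:
        return head
    # Assumption: the remaining lines are joined by single spaces and
    # attached to the first line with " | ", as in the expected output.
    return head + " | " + " ".join(rest)


def blocks_to_rows(lines):
    """Yield one output row per blank-line-separated block of lines."""
    block = []
    for raw in lines:
        line = raw.strip()
        if line:
            block.append(line)
        elif block:
            # A blank line closes the current block.
            yield format_block(block)
            block = []
    if block:
        # The input may not end with a blank line.
        yield format_block(block)


# Sample input taken from the question, read from memory instead of a file.
sample = io.StringIO(
    "1 | 2 | 3\na b c\nx y z\n\n1 | 2 | 4\naa bb cc\nxx yy zz\n"
)
rows = list(blocks_to_rows(sample))
for row in rows:
    print(row)
```

To use it on real files, pass the open input file to `blocks_to_rows` and write each row to the output file. Because the function accepts any iterable of lines, it can be tried on a list of strings without touching the disk.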