Python Forum

Full Version: how to extract a portion of data from text lines by python 2



I have a text file consisting of a string header and a data body. The contents look like this:

this is a technical data file....................COMMENT
the creator : Adams............................COMMENT
2017.05.10.......................................THE FIRST TIME
.........................................................END OF HEADER
OA 10123.4532 12345.0102 -1827734.3475 -1893255.1023 45.12 23.01 8923.12
XB 10125.4132 13345.0702 -1843734.7875 -1834255.1913 44.12 23.02 8924.12
...
...
.........................................................END OF FILE

I need to write some Python 2 code to extract only part of the data from each line, and get output like:

10123.45 12345.01 -1827734.34 -1893255.10
10125.41 13345.07 -1843734.78 -1834255.19
...

How can I get them? Please give some suggestions. Thanks

BTW, I thought of using ‘read’ with formatted strings, but I found no resources online about that usage. I only found some modules like struct. Is there a more concise way to do this? After all, importing more modules could reduce the efficiency of the program.
I tried the following code to read the data lines, but it did not succeed:
with open('example.txt', 'r') as f:
... line = f.readline('%2s%11.2f%11.2f%14.2f%%14.2f')
....

BTW, all the dots in the header lines are actually blank spaces; I do not know why the spaces were removed automatically.
Write out each line without formatting, and then do your formatting when the data is read back in.
This can be done in a module that is called by any program that needs the data.

A better solution is to store the data beforehand in a structure, like a list or dictionary, that
can be saved in JSON format using json.dump.

Then when you read it back in with json.load, it will already be formatted.
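
A rough sketch of the json.dump / json.load round trip described above. It assumes you control the program that writes the file; the record layout (the "label" and "values" keys) and the data.json file name are made up for illustration and are not part of the original file format. It should run unchanged on Python 2.7 and Python 3.

```python
import json
import os
import tempfile

# Hypothetical structure: one dictionary per data line.
records = [
    {"label": "OA", "values": [10123.4532, 12345.0102, -1827734.3475, -1893255.1023]},
    {"label": "XB", "values": [10125.4132, 13345.0702, -1843734.7875, -1834255.1913]},
]

# Write to a scratch directory so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "data.json")

with open(path, "w") as f:
    json.dump(records, f)

with open(path) as f:
    loaded = json.load(f)

# The data comes back as lists and dictionaries, so no text parsing is needed.
first_values = loaded[0]["values"]
```

This only helps if the data can be written as JSON in the first place; for an existing text file like the one in the question, the lines still have to be parsed once.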
Do all the lines you are interested in start with 'OA' or 'XB'? I'd read each line and check whether it starts with one of these. Split the line to get it as a list. Slice it from 1 to 5. Then use a list comprehension to create a new list that gets rid of the last two characters of each number. Finally, join it. The for loop will take four lines of code and you are done.
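
The steps above could be sketched roughly like this. It assumes every data line starts with 'OA' or 'XB', that no header line does, and that each number has exactly four decimals, as in the sample; the function name extract_fields and the sample list are only for illustration. It uses nothing beyond built-in string methods and should run on both Python 2.7 and Python 3.

```python
def extract_fields(lines, prefixes=("OA", "XB")):
    """Return the first four numbers of each data line, cut to two decimals."""
    results = []
    for line in lines:
        if not line.startswith(prefixes):
            continue  # skip header lines, fillers and the END OF FILE marker
        fields = line.split()[1:5]  # drop the label, keep the four values
        # Dropping the last two characters truncates four decimals to two
        # (it does not round), which matches the output asked for.
        results.append(" ".join([field[:-2] for field in fields]))
    return results


sample = [
    "this is a technical data file                    COMMENT",
    "                                                 END OF HEADER",
    "OA 10123.4532 12345.0102 -1827734.3475 -1893255.1023 45.12 23.01 8923.12",
    "XB 10125.4132 13345.0702 -1843734.7875 -1834255.1913 44.12 23.02 8924.12",
]

for row in extract_fields(sample):
    print(row)
```

To run it on the real file, pass the open file object instead of the list, e.g. extract_fields(f) inside a with open('example.txt', 'r') as f: block. If the numbers do not always have four decimals, slicing off two characters will no longer give two decimals, and the values would need to be converted with float() and formatted instead.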