Python Forum

Full Version: Extracting data based on specific patterns in a text file
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
I have a huge report file with some data where i have to do some data processing on lines starting with the code "MLT-TRR" For now i have extracted all the lines in my script that start with that code and placed them in a separate file. The new file looks like this.

MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\top.png 63 10 Port is not registered [Folder: 'Picture']

MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\tree.png 315 10 Port is not registered [Folder: 'Picture.first_inst']

MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\top.png 315 10 Port is not registered [Folder: 'Picture.second_inst']

MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\tree.png 317 10 Port is not registered [Folder: 'Picture.third_inst']

MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\top.png 317 10 Port is not registered [Folder: 'Picture.fourth_inst']

For each of these lines i have to extract the data that lies after "[Folder: 'Picture" If there is no data after "[Folder: 'Picture" as in the case of my first line, then skip that line and move on to the next line. I also want to extract the file names for each of those lines- top.txt, tree.txt

I couldnt think of a simpler method to do this as this involves a loop and gets messier. Is there any way out i can do this? extracting just the file paths and the ending data of each line.
You can do this by learning regular expresions, the re module.