Python Forum
Help to find a string and read the next lines - Printable Version


Help to find a string and read the next lines - crlamaral - Mar-18-2020

Hello, I'm having trouble recognizing a specific line and reading two lines below it. I would like some help because I've tried many things and couldn't get where I need to.

Well, I have the test.txt file that needs to be read and every time the script finds the string "DOSIMETRY_TOTAL_DOSE" it needs to identify the line and seek the dose value reading that is located two lines below.

Below is the content of my TEST.TXT file

Quote:
TEST.TXT
CHUCK AMARAL, 2020

[DOSIMETRY_TOTAL_DOSE_B: 00]

8.9762

[DOSIMETRY_TOTAL_DOSE_E: 00]

9.7324

[DOSIMETRY_TOTAL_DOSE_B: 01]

20.5469

[DOSIMETRY_TOTAL_DOSE_E: 01]

13.2534

[DOSIMETRY_TOTAL_DOSE_B: 02]

2.2764

[DOSIMETRY_TOTAL_DOSE_E: 02]

7.3634

[DOSIMETRY_TOTAL_DOSE_B: 03]

5.8867

[DOSIMETRY_TOTAL_DOSE_E: 03]

6.2521


So the script should identify the first string on line 4 and read the dose value on line 6 and write it in a file (extractedlines.txt), then identify another string on line 8, read the value on line 10, and write it again at the end of the extractedlines.txt file. The script should continue till the end of the TEST.TXT file.

My python script is:

in_file = "test.txt"
out_file = "extractedlines.txt"
 
search_for = "DOSIMETRY_TOTAL_DOSE"
line_num = 0
lines_found = 0
with open(out_file, 'w') as out_f:
    with open(in_file, "r") as in_f:
        for line in in_f:
            line_num += 1
            if search_for in line:
                lines_found += 1
                print("String '{}' found on line {}...".format(search_for, line_num))
                print("Dose value: ")
                out_f.write(line)
                out_f.write('Dose value: {} \n')  #HERE I DO NOT KNOW HOW TO MAKE THE SCRIPT READ THE VALUE TWO LINES BELOW
 
        print("{} lines were found with the string '{}'...".format(lines_found, search_for))

When I run it, it reads the TXT file, identifies the strings, and saves them in extractedlines.txt, but I can't make it skip two lines and read the correct dose value. The result (as seen in my extractedlines.txt) looks like this:

Quote:
[DOSIMETRY_TOTAL_DOSE_B: 00]
Dose value: {}
[DOSIMETRY_TOTAL_DOSE_E: 00]
Dose value: {}
[DOSIMETRY_TOTAL_DOSE_B: 01]
Dose value: {}
[DOSIMETRY_TOTAL_DOSE_E: 01]
Dose value: {}
[DOSIMETRY_TOTAL_DOSE_B: 02]
Dose value: {}
[DOSIMETRY_TOTAL_DOSE_E: 02]
Dose value: {}
[DOSIMETRY_TOTAL_DOSE_B: 03]
Dose value: {}
[DOSIMETRY_TOTAL_DOSE_E: 03]
Dose value: {}

As you can see here, I can't get the dose values on the TEST.TXT file for each one of the observations.

Can someone help me? I've been racking my brain with this for a few days and I'm getting nowhere.

Thank you very much in advance for any help.

Best

Chuck


RE: Help to find a string and read the next lines - Larz60+ - Mar-18-2020

one way:
import os


# Make sure path is set to current directory
os.chdir(os.path.abspath(os.path.dirname(__file__)))


def read_file(filename):
    with open(filename) as fp:
        n = 0
        for line in fp:
            line = line.strip()
            if not len(line):
                continue
            if n == 0:
                n = 1
                continue
            if n == 1:
                nline = line.split(',')
                print(f"name: {nline[0]}, year: {nline[1]}")
            else:
                if (n % 2) == 0: # even
                    # print(f"n: {n}")
                    print(f"{line}: Dose: ", end = '')
                else:
                    print(f"{line}")
            n += 1


if __name__ == '__main__':
    read_file('TEST.TXT')
Output:
name: CHUCK AMARAL, year: 2020
[DOSIMETRY_TOTAL_DOSE_B: 00]: Dose: 8.9762
[DOSIMETRY_TOTAL_DOSE_E: 00]: Dose: 9.7324
[DOSIMETRY_TOTAL_DOSE_B: 01]: Dose: 20.5469
[DOSIMETRY_TOTAL_DOSE_E: 01]: Dose: 13.2534
[DOSIMETRY_TOTAL_DOSE_B: 02]: Dose: 2.2764
[DOSIMETRY_TOTAL_DOSE_E: 02]: Dose: 7.3634
[DOSIMETRY_TOTAL_DOSE_B: 03]: Dose: 5.8867
[DOSIMETRY_TOTAL_DOSE_E: 03]: Dose: 6.2521



RE: Help to find a string and read the next lines - crlamaral - Mar-19-2020

Hello Larz60+, thanks for your help and for the script. It worked here with the test file (test.txt); however, when I used the real file (download link for the real file --> https://drive.google.com/open?id=1VqpMrdOmUxylIctrYoqQeBJhYMjbwAGY), it returned an error, as you can see below:

Traceback (most recent call last):
  File "C:\Users\Administrador\Desktop\MSL\extractor2.py", line 31, in <module>
    read_file('teste2.txt')
  File "C:\Users\Administrador\Desktop\MSL\extractor2.py", line 20, in read_file
    print(f"name: {nline[0]}, year: {nline[1]}")
IndexError: list index out of range

If you check the real file, you will see that 44 'DOSIMETRY_TOTAL_DOSE' blocks are spread throughout the file, with a lot of other information between them that should be left out of the output file.

I'm trying to change your code right now, but I'm still in need of help.

Thank you... best regards, Chuck


RE: Help to find a string and read the next lines - snippsat - Mar-19-2020

Here it may be easier to write a regex.
Also, using compile and finditer keeps it efficient for larger files.
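import re
from pathlib import Path

data = Path("teste2.txt").read_text()  # whole file as one string; file name assumed
pattern = re.compile(r"\[DOSIMETRY_TOTAL.*\]\s+(\S+)")
for match in pattern.finditer(data):
    print(20 * '-')
    print(match.group(0))  # header plus value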
Output:
[DOSIMETRY_TOTAL_DOSE_B: 00]
9.30988
--------------------
[DOSIMETRY_TOTAL_DOSE_E: 00]
8.45142
--------------------
[DOSIMETRY_TOTAL_DOSE_B: 01]
9.18214
--------------------
[DOSIMETRY_TOTAL_DOSE_E: 01]
8.41000
--------------------
[DOSIMETRY_TOTAL_DOSE_B: 02]
8.87531
--------------------
[DOSIMETRY_TOTAL_DOSE_E: 02]
8.35574
.....
The data I tested with is just the whole file read into a single string.
Using group(1) instead of group(0) gives the values only.
Output:
9.30988
--------------------
8.45142
--------------------
9.18214
--------------------
8.41000
--------------------
8.87531
--------------------
8.35574
--------------------
9.15688
--------------------
8.40126
--------------------
8.88971
--------------------
8.48842
.....
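
To get from the matches to the extractedlines.txt file asked for in the first post, the same idea can feed a small writer. This is only a sketch: the names extract_doses and write_doses are made up for illustration, the pattern is a slightly tightened variant that captures the header in its own group, and it assumes each dose value is the first non-blank token after its header (true for the sample, not checked against the real file).

```python
import re

# Group 1: header text between the brackets.
# Group 2: first non-whitespace token after the closing bracket.
# [^\]]* keeps the header match inside a single pair of brackets.
PATTERN = re.compile(r"\[(DOSIMETRY_TOTAL_DOSE[^\]]*)\]\s+(\S+)")


def extract_doses(text):
    """Return a list of (header, value) string pairs found in text."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(text)]


def write_doses(in_path, out_path):
    """Read in_path as one string and write one line per match to out_path."""
    with open(in_path) as in_f:
        pairs = extract_doses(in_f.read())
    with open(out_path, "w") as out_f:
        for header, value in pairs:
            out_f.write(f"[{header}] Dose value: {value}\n")
    return len(pairs)  # number of dose blocks written


if __name__ == "__main__":
    # Small in-memory sample shaped like the start of TEST.TXT
    sample = (
        "CHUCK AMARAL, 2020\n\n"
        "[DOSIMETRY_TOTAL_DOSE_B: 00]\n\n8.9762\n\n"
        "[DOSIMETRY_TOTAL_DOSE_E: 00]\n\n9.7324\n"
    )
    for header, value in extract_doses(sample):
        print(header, value)
```

Calling write_doses("test.txt", "extractedlines.txt") would then produce the output file; the values stay strings, so wrap them in float() if numbers are needed.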



RE: Help to find a string and read the next lines - Larz60+ - Mar-19-2020

Using a regex, as snippsat suggests, should be considered.

The reason my script didn't work with the entire file is probably this:

The sample you provided had TEST.TXT as its first line. I expect that the full file doesn't have that.

I also only checked for the name line at start of file. You probably have repeated names throughout the real file.

Either of these two conditions requires changes to my code.
Is this the case?
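
For what it's worth, a line-by-line version does not have to depend on where the title line or the name line sits. Below is a rough sketch, assuming the dose value is always the first non-blank line after a line containing DOSIMETRY_TOTAL_DOSE (that holds for the sample; it is not verified against the real file). The name scan_doses is made up for illustration.

```python
def scan_doses(lines, marker="DOSIMETRY_TOTAL_DOSE"):
    """Yield (marker_line, value_line) pairs from an iterable of lines."""
    pending = None  # marker line that is still waiting for its value
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines never carry a value
        if marker in line:
            pending = line  # remember the marker; its value comes later
        elif pending is not None:
            yield pending, line  # first non-blank line after the marker
            pending = None
        # any other line is unrelated content and is skipped


if __name__ == "__main__":
    # In-memory sample; an open file object can be passed the same way
    sample = [
        "TEST.TXT\n",
        "CHUCK AMARAL, 2020\n",
        "\n",
        "[DOSIMETRY_TOTAL_DOSE_B: 00]\n",
        "\n",
        "8.9762\n",
    ]
    for header, value in scan_doses(sample):
        print(f"{header}: Dose: {value}")
```

Because the function only remembers the last marker it saw, any amount of unrelated text between blocks is ignored, and nothing is indexed by position, so a missing name line cannot raise an IndexError.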