Python Forum
capture next block of text after finding error in file
#1
I have some code to find an error message in an error log file. When the error is found, the very next block of text will be a path. I need to capture that path.

In other words, I am searching a text file for "reported errors in the". When that string is found in the file, I need the next block of text which will be something like /var/logs/[filename]. Not sure how to accomplish this.

My current code to find the error string is:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

# set variables
strFileExt=""
strFile=""
strError =""

# import required modules
import datetime
import os
import subprocess
now = datetime.datetime.now()

# for testing time format
#print (now.strftime("%m%d%y"))

strWhere = "/var/logs/error.log."+(now.strftime("%m%d%y"))
#print (strFileExt)

strWhat = "reported errors in the"
#print (strWhat)
#print (strWhere)
strResult = 0



# read file
lines = []  # stays empty if the file cannot be read
try:
    with open(strWhere, "r") as file:
        lines = file.readlines()
except IOError:
    # also raised when the file does not exist
    strError = 10
except Exception:
    strError = 12

# report a read failure, if any
if strError:
    print ( strError )

for line in lines:
    line = line.strip()
    if line.find( strWhat )!= -1:
            strResult = strResult + 1
else:   #do this when the loop is finished
# display results
    print (strResult)
So basically, if strResult == 1, I need to grab the next contiguous block of text after strWhat.

Not sure if that is clear, but thanks for any help in advance.
#2
Python 2.7

I am searching a file for an error string "reported errors in the".

When this phrase is found, I need to capture the NEXT word immediately after the search string.

Is there a way in Python to accomplish this? In my attempts thus far, I am only able to split and capture words which are in my original search string.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

strFileExt=""
strFile=""
strError = 0
strFoundwords = ""



# import required modules
import datetime
import os
import subprocess
now = datetime.datetime.now()

# for testing time format
#print (now.strftime("%m%d%y"))


strWhere = "/var/logs/error.log."+(now.strftime("%m%d%y"))
#print (strWhere)

strWhat = "reported errors in the"
#print (strWhat)
#print (strWhere)
strResult = 0



# read file
try:
    with open(strWhere, "r") as Myfile:
        lines = Myfile.readlines()
except IOError:
    strError = 10
except FileNotFoundError:
    strError = 11
except Exception:
    strError = 12
    if (strError )>5:
        print ( strError )

for line in lines:
    line = line.strip()
    if line.find( strWhat )!= -1:
            strResult = strResult + 1
            s = "reported errors in the /var/logs/"
            q = 'reported'
            res = s[s.find(q)+len(q):].split()[+3]
else:   #do this when the loop is finished
# display results
    print (strResult)
    print ( strError)
    print ( res )
The best I have been able to do is collect words that are part of the search string. I need the next word after "/var/logs/".

As is, split()[+3] gives me what I already know.

Output:
1
0
/var/logs/
If I try +4, I get an "index out of range" error.

Error:
Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
IndexError: list index out of range
Any advice would be appreciated.
#3
look at line 50, are you sure you want to slice something out of variable s ?
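
To spell the hint out: in the posted code, s is a hard-coded string that ends at "/var/logs/", so there is nothing after it to split off. The slice has to be taken from the line read from the file. A minimal sketch of that idea, using str.partition (which works in Python 2.7 and 3); the helper name next_word_after is made up for illustration, and the sample line is copied from the log excerpt posted later in this thread:

```python
# Sample line copied from the log excerpt posted later in the thread.
line = ("2019-11-18 01:46:47:ERROR:3496839:check reported errors in the "
        "/var/logs/jkl/ database. These should be rechecked to verify if "
        "the errors are accurate.")
strWhat = "reported errors in the"


def next_word_after(text, marker):
    """Return the first whitespace-delimited word after marker, or None.

    Hypothetical helper, not part of the original script.
    """
    before, found, after = text.partition(marker)
    if not found:
        return None  # marker does not occur in this text
    words = after.split()
    # An empty list means the marker was the last thing in the text.
    return words[0] if words else None


print(next_word_after(line, strWhat))
```

For the sample line this prints /var/logs/jkl/. Returning None instead of indexing blindly avoids the "list index out of range" error when nothing follows the search string.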
#4
(Nov-27-2019, 05:21 PM)ThomasL Wrote: look at line 50, are you sure you want to slice something out of variable s ?

When it comes to Python, I'm not sure of anything ;)

The way I read that, s = "my entire search string" and q = the starting point of my search string. The split()[+3] takes me to /var/logs/, but it is the next segment of the path that I need to collect, as that segment will always vary and be unpredictable.
I think I'm coming to the realization that I cannot advance past the end of the search string (s).
#5
Just to say that showing a sample input text/file would help us enormously to help you. At the moment we can only guess what your text looks like.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

#6
(Nov-27-2019, 06:25 PM)buran Wrote: Just to say that showing a sample input text/file would help us enormously to help you. At the moment we can only guess what your text looks like.

Thanks Buran.
Here is a segment of the log I am checking, the last line is the error condition I am looking for. The file is much larger than this snip but shows both conditions, error line and non-error lines.


Output:
2019-11-18 01:46:47:INFO:3496839:/var/logs/apc/ : no errors detected.
2019-11-18 01:46:47:INFO:3496839:/var/logs/xyz/ : no errors detected.
2019-11-18 01:46:47:ERROR:3496839:check reported errors in the /var/logs/jkl/ database. These should be rechecked to verify if the errors are accurate.
#7
error.log
Output:
2019-11-18 01:46:47:INFO:3496839:/var/logs/apc/ : no errors detected.
2019-11-18 01:46:47:INFO:3496839:/var/logs/xyz/ : no errors detected.
2019-11-18 01:46:47:ERROR:3496839:check reported errors in the /var/logs/jkl/ database. These should be rechecked to verify if the errors are accurate.
2019-11-18 01:46:47:ERROR:3496839:check reported errors in the /var/logs/jkl/spam database. These should be rechecked to verify if the errors are accurate.
using just str methods
log_file = 'error.log'

with open(log_file) as lf:
    for line in lf:
        log_date_hour, log_minute, log_seconds, log_type, some_code, info, *rest = line.split(':')
        if log_type == 'ERROR':
            print(info.split(' ')[5])
Output:
/var/logs/jkl/
/var/logs/jkl/spam
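
One caveat with the str-methods snippet: the starred target (*rest) is Python 3 syntax, and the question mentions Python 2.7. It also raises ValueError on blank or malformed lines. A rough sketch that avoids both by limiting the split to five separators and checking the field count; the function name error_paths is made up for illustration, and the field layout is assumed to match the sample log:

```python
def error_paths(lines):
    """Yield the word following 'reported errors in the' on each ERROR line.

    Hypothetical helper; assumes the timestamp:TYPE:code:message layout
    of the sample log shown in this thread.
    """
    marker = "reported errors in the"
    for line in lines:
        # The timestamp itself contains two colons, so the message is field 5.
        fields = line.rstrip("\n").split(":", 5)
        if len(fields) < 6 or fields[3] != "ERROR":
            continue  # blank, malformed, or non-error line
        _, found, after = fields[5].partition(marker)
        words = after.split()
        if found and words:
            yield words[0]


sample = [
    "2019-11-18 01:46:47:INFO:3496839:/var/logs/apc/ : no errors detected.\n",
    "\n",
    "2019-11-18 01:46:47:ERROR:3496839:check reported errors in the "
    "/var/logs/jkl/ database. These should be rechecked to verify if the "
    "errors are accurate.\n",
    "2019-11-18 01:46:47:ERROR:3496839:check reported errors in the "
    "/var/logs/jkl/spam database. These should be rechecked to verify if "
    "the errors are accurate.\n",
]

for path in error_paths(sample):
    print(path)
```

Passing an open file object instead of the sample list works the same way, since a file iterates line by line.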
using regex

import re
regex = re.compile(r'reported errors in the (?P<path>\S*)', flags=re.MULTILINE)
with open(log_file) as lf:
    logs = lf.read()

paths = regex.findall(logs)
print(paths)
Output:
['/var/logs/jkl/', '/var/logs/jkl/spam']
If the file is huge, it may be better to read it line by line (as in the first example) and use the regex to parse each line. It may also be possible to write a better regex pattern.
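
A minimal sketch of that line-by-line variant, so the whole file never has to sit in memory. The generator name iter_error_paths is made up for illustration, and the pattern uses \S+ rather than \S* so that an empty match is never reported:

```python
import re

# Compiled once and reused for every line.
PATH_RE = re.compile(r'reported errors in the (?P<path>\S+)')


def iter_error_paths(lines):
    """Yield each captured path, looking at one line at a time."""
    for line in lines:
        match = PATH_RE.search(line)
        if match:
            yield match.group('path')


sample = [
    "2019-11-18 01:46:47:INFO:3496839:/var/logs/xyz/ : no errors detected.\n",
    "2019-11-18 01:46:47:ERROR:3496839:check reported errors in the "
    "/var/logs/jkl/ database. These should be rechecked to verify if the "
    "errors are accurate.\n",
    "2019-11-18 01:46:47:ERROR:3496839:check reported errors in the "
    "/var/logs/jkl/spam database. These should be rechecked to verify if "
    "the errors are accurate.\n",
]

print(list(iter_error_paths(sample)))
```

With a real log, pass the open file object instead of the list, for example: with open('error.log') as lf: followed by a loop over iter_error_paths(lf).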

Also, both snippets will fail if there is a space in the path. If you expect such paths, you will have to adjust the code accordingly.
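
One possible adjustment, as a sketch only: instead of stopping at the first whitespace, capture everything up to a fixed trailer. This assumes the path is always followed by " database." as in the sample log, which the thread does not confirm for every message; it would also cut the path short if the path itself contained " database.". The names SPACED_PATH_RE and path_with_spaces are made up for illustration:

```python
import re

# Assumption: every error message ends the path with " database."
# as in the sample log. The lazy .+? stops at the first such trailer.
SPACED_PATH_RE = re.compile(r'reported errors in the (?P<path>.+?) database\.')


def path_with_spaces(line):
    """Return the path from an error line, or None if the line does not match."""
    match = SPACED_PATH_RE.search(line)
    return match.group('path') if match else None


# Hypothetical line with a space in the path, modelled on the sample log.
spaced = ("2019-11-18 01:46:47:ERROR:3496839:check reported errors in the "
          "/var/logs/my dir/ database. These should be rechecked to verify "
          "if the errors are accurate.")

print(path_with_spaces(spaced))
```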