Python Forum
[SOLVED] Find last occurrence of pattern in text file?
#1
Hello,

In a multiline text file, I need to find the last occurrence of a pattern, i.e. the Python equivalent of "tail".

None of the following works:

import re

INPUTFILE = "log.txt" 

with open(INPUTFILE) as reader:
    content = reader.read()

#====================
#very slow, even on 130KB file
p = re.compile("(?s:.*)^Some pattern.+$")
m = p.search(content)
if m.group(0):
	print(m.group(0))

#====================
#IndexError: list index out of range
p = re.compile("^Some pattern.+$")
m = p.findall(content)[-1]
if m.group(0):
	print(m.group(0))

#====================
#AttributeError: 'NoneType' object has no attribute 'group'
p = re.compile("^Some pattern.+$")
for line in content:
	m = p.search(line)
	if m.group(0):
		print(m.group(0))

#====================
#AttributeError: 'str' object has no attribute 'readlines'
for line in content.readlines():
	m = p.search(line)
	if m.group(0):
		print(m.group(0))
Does anyone know how to do this?

Thank you.
#2
tail doesn't select on patterns or show "the last occurrence" of something, so I'm not sure I understand what you're looking for.

Do you have to support patterns (just a string match is insufficient)? Can the pattern span lines, or will it always be in a single line? Do you want to display the entire line the pattern matches, or just the result of the match?

Your first one looks like you have a slow pattern. It's possible to construct a pattern that requires backtracking. If you require pattern support across the entire file, it will be possible to supply a pattern that is slow. But if the pattern only has to match within a line, that will usually limit the problems that can arise.
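
To illustrate with a hedged sketch: without re.MULTILINE, ^ and $ anchor only at the start and end of the whole string, so a pattern like ^Some pattern.+$ can only match a file that consists of that single line, which would explain the empty findall result in your second attempt. Assuming the pattern only has to match within one line, compiling with re.MULTILINE and keeping the last result of finditer avoids the leading (?s:.*) altogether. The helper name last_match and the sample text are made up for illustration.

```python
import re

def last_match(pattern, text):
    """Return the last re.Match of pattern in text, or None if there is none."""
    # re.MULTILINE makes ^ and $ anchor at every line boundary,
    # not only at the start and end of the whole string.
    regex = re.compile(pattern, re.MULTILINE)
    result = None
    for result in regex.finditer(text):
        pass  # keep overwriting; the final value is the last match
    return result

# Hypothetical file content, standing in for reader.read()
sample = "Some pattern one\nother line\nSome pattern two\n"
m = last_match(r"^Some pattern.+$", sample)
print(m.group(0) if m else "No match")  # prints: Some pattern two
```

Checking the result against None before calling group() also avoids the AttributeError from your third attempt.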

If you don't need full pattern support, and you want to see the line of last occurrence, I'd probably suggest something like:
INPUTFILE = "log.txt"

target = "print"

last_line = None
with open(INPUTFILE) as reader:
    # Iterate over the file line by line; a later hit overwrites an earlier one.
    for line in reader:
        if target in line:
            last_line = line.rstrip("\n")
if last_line is not None:
    print(last_line)
else:
    print("No match")
#3
Sorry for the confusion. I was trying to convert a Windows batch script that used grep + sed + tail into Python, but you're right, the meat was in the grep + sed.

The following works to find two similar but slightly different patterns, starting from the end of the file:

# raw string, so that \( reaches the regex engine unchanged
pattern = r"^START_A.+to (.+?) \(.+$"
p = re.compile(pattern)
for line in reversed(list(open(INPUTFILE))):
	m = p.search(line.rstrip())
	if m:
		print(m.group(1))
		break

pattern = r"^START_B.+to (.+?) \(.+$"
p = re.compile(pattern)
for line in reversed(list(open(INPUTFILE))):
	m = p.search(line.rstrip())
	if m:
		print(m.group(1))
		break
I'll see if I can refine it so as to avoid needless copy/pasting.

Thank you!
#4
If you can show the original grep/sed/tail, that might be useful.

Also, how big are the files? Reversing a MB file seems unnecessary, but acceptable. If you're scanning GB files, that starts to get silly.
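
For the GB case, a rough sketch of one alternative is to read the file backwards in fixed-size chunks and stop at the first line that matches. It assumes a UTF-8 (or other ASCII-compatible) file, where the newline byte never appears inside a multi-byte character. reverse_lines, last_matching_line and the demo file contents are hypothetical names and data, not anything taken from your script.

```python
import os
import re
import tempfile

def reverse_lines(path, chunk_size=8192, encoding="utf-8"):
    """Yield the lines of a file from last to first, reading fixed-size chunks."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        tail = b""  # partial line carried over from the chunk that follows this one
        while position > 0:
            read_size = min(chunk_size, position)
            position -= read_size
            f.seek(position)
            pieces = (f.read(read_size) + tail).split(b"\n")
            # The first piece may start mid-line, so hold it back until the
            # preceding chunk has been read.
            tail = pieces.pop(0)
            for piece in reversed(pieces):
                yield piece.decode(encoding).rstrip("\r")
        yield tail.decode(encoding).rstrip("\r")

def last_matching_line(path, pattern):
    """Return the re.Match for the last line that matches, or None."""
    regex = re.compile(pattern)
    for line in reverse_lines(path):
        m = regex.search(line)
        if m:
            return m
    return None

# Demo on a throwaway file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "log.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("START_A first\nnoise\nSTART_A second\nnoise\n")
    m = last_matching_line(demo_path, r"^START_A (.+)$")
    print(m.group(1) if m else "No match")
```

A file that ends with a newline yields one empty string first, which is harmless for patterns that require some text. For a ~MB file the plain reversed(list(...)) approach is simpler and probably just as good.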
#5
It's just a ~100KB file, so it's fast enough.

I simplified the script with a function:

import re
#pip install pyperclip
import pyperclip

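def SearchAndTell(MYFILE,mypattern):
    p = re.compile(mypattern)
    # Read all lines and close the file before searching
    with open(MYFILE) as f:
        lines = f.readlines()
    # Walk the lines from the end, so the first hit is the last occurrence
    for line in reversed(lines):
        m = p.search(line.rstrip())
        if m:
            # return leaves the function immediately, so no break is needed
            return m.group(1)
    return None  # no line matched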

INPUTFILE = "log.txt" 
clipb = None

pattern = r"^START_A.+to (.+?) \(.+$"
clipb = f"-ss {SearchAndTell(INPUTFILE,pattern)} ".replace(",",".")
pattern = r"^START_B.+to (.+?) \(.+$"
clipb += f"-to {SearchAndTell(INPUTFILE,pattern)}".replace(",",".")

pyperclip.copy(clipb)
FWIW, here's the batch script:
grep -Poha "^START_A.+$" log.txt | sed -r "s@^.+ to (.+?) \(.+$@-ss \1@" | sed -r "s@,@.@g" | tail -1
grep -Poha "^START_B.+$" log.txt | sed -r "s@^.+ to (.+?) \(.+$@-to \1@" | sed -r "s@,@.@g" | tail -1
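
A possible further refinement, only as a sketch: scan the lines once and remember the last capture for each pattern, instead of reading the file once per pattern. The helper last_captures and the sample log lines are invented to fit the patterns above, and a pattern that never matches is left out of the result instead of producing "None" in the string.

```python
import re

def last_captures(lines, patterns):
    """Map each pattern name to group(1) of its last matching line (or None)."""
    compiled = {name: re.compile(rx) for name, rx in patterns.items()}
    found = dict.fromkeys(patterns)  # every name starts as None
    for line in lines:
        text = line.rstrip()
        for name, regex in compiled.items():
            m = regex.search(text)
            if m:
                found[name] = m.group(1)  # later lines overwrite earlier ones
    return found

# Hypothetical log lines in the shape the patterns expect.
sample = [
    "START_A from 0 to 00:01:02,500 (first)\n",
    "START_B from 0 to 00:03:04,250 (first)\n",
    "START_A from 0 to 00:05:06,750 (second)\n",
]
patterns = {
    "-ss": r"^START_A.+to (.+?) \(.+$",
    "-to": r"^START_B.+to (.+?) \(.+$",
}
found = last_captures(sample, patterns)
# Skip options whose pattern was never found, and swap the decimal comma.
args = " ".join(f"{flag} {value.replace(',', '.')}"
                for flag, value in found.items() if value is not None)
print(args)  # prints: -ss 00:05:06.750 -to 00:03:04.250
```

With the real file, the list can be replaced by the open file object, e.g. with open("log.txt") as f: found = last_captures(f, patterns).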