Python Forum
[SOLVED] Find last occurrence of pattern in text file?
#1
Hello,

In a multiline text file, I need to find the last occurrence of a pattern, i.e. the Python equivalent of "tail".

None of the following works:

import re

INPUTFILE = "log.txt" 

with open(INPUTFILE) as reader:
    content = reader.read()

#====================
#very slow, even on 130KB file
p = re.compile("(?s:.*)^Some pattern.+$")
m = p.search(content)
if m.group(0):
	print(m.group(0))

#====================
#IndexError: list index out of range
p = re.compile("^Some pattern.+$")
m = p.findall(content)[-1]
if m.group(0):
	print(m.group(0))

#====================
#AttributeError: 'NoneType' object has no attribute 'group'
p = re.compile("^Some pattern.+$")
for line in content:
	m = p.search(line)
	if m.group(0):
		print(m.group(0))

#====================
#AttributeError: 'str' object has no attribute 'readlines'
for line in content.readlines():
	m = p.search(line)
	if m.group(0):
		print(m.group(0))
Does anyone know?

Thank you.
#2
tail doesn't select on patterns or show "the last occurrence" of something, so I'm not sure I understand what you're looking for.

Do you have to support patterns (just a string match is insufficient)? Can the pattern span lines, or will it always be in a single line? Do you want to display the entire line the pattern matches, or just the result of the match?

Your first one looks like you have a slow pattern. It's possible to construct a pattern that requires backtracking. If you require pattern support across the entire file, it will be possible to supply a pattern that is slow. But if the pattern only has to match within a line, that will usually limit the problems that can arise.
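
To illustrate the single-line case, here is a minimal sketch, assuming the pattern only ever needs to match within one line. With re.MULTILINE, ^ and $ anchor at every line boundary, and finditer walks the text once, so the last match is simply the final one it yields. The helper name last_match and the sample text are made up for the example.

```python
import re

def last_match(pattern, text):
    """Return the last match of pattern in text, or None if there is none."""
    # re.MULTILINE makes ^ and $ match at the start and end of every line,
    # not just at the start and end of the whole string.
    regex = re.compile(pattern, re.MULTILINE)
    found = None
    for found in regex.finditer(text):
        pass  # keep overwriting; the loop ends holding the last match
    return found

sample = "Some pattern one\nnoise\nSome pattern two\nmore noise\n"
m = last_match(r"^Some pattern.+$", sample)
if m:  # test the match object itself, not m.group(0)
    print(m.group(0))  # prints: Some pattern two
```

Because "." does not cross newlines here, each attempt is confined to one line, which avoids the whole-file backtracking that a leading (?s:.*) invites.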

If you don't need full pattern support, and you want to see the line of last occurrence, I'd probably suggest something like:
INPUTFILE = "log.txt"

target = "print"

with open(INPUTFILE) as reader:
    last_line = None
    for line in reader.read().splitlines():
        if target in line:
            last_line = line
if last_line:
    print(last_line)
else:
    print("No match")
#3
Sorry for the confusion. I was trying to turn a Windows batch script that used grep + sed + tail into Python, but you're right, the meat was in the grep + sed.

The following works to find two close but slightly different patterns, starting from the end of the file:

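# Raw strings keep the regex escape \( from being read as a string escape
pattern = r"^START_A.+to (.+?) \(.+$"
p = re.compile(pattern)
with open(INPUTFILE) as reader:
    # Walk the lines from last to first and stop at the first hit
    for line in reversed(list(reader)):
        m = p.search(line.rstrip())
        if m:
            print(m.group(1))
            break

pattern = r"^START_B.+to (.+?) \(.+$"
p = re.compile(pattern)
with open(INPUTFILE) as reader:
    for line in reversed(list(reader)):
        m = p.search(line.rstrip())
        if m:
            print(m.group(1))
            break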
I'll see if I can refine it so as to avoid needless copy/pasting.

Thank you!
#4
If you can show the original grep/sed/tail, that might be useful.

Also, how big are the files? Reversing a MB file seems unnecessary, but acceptable. If you're scanning GB files, that starts to get silly.
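
For larger files, one option is to skip the reversal entirely: scan forward line by line and remember the most recent match, so only one line is held in memory at a time. A rough sketch, where last_matching_group is a hypothetical helper name and the pattern is assumed to contain one capture group:

```python
import re

def last_matching_group(path, pattern):
    """Scan the file forward; return group 1 of the last matching line, or None."""
    regex = re.compile(pattern)
    last = None
    with open(path) as reader:
        for line in reader:  # iterating the file object reads one line at a time
            m = regex.search(line.rstrip())
            if m:
                last = m.group(1)  # overwritten by any later match
    return last
```

The trade-off is that this always reads the whole file, whereas the reversed version can stop at the first hit but has to load every line first.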
#5
It's just a ~100KB file, so it's fast enough.

I simplified the script with a function:

import re
#pip install pyperclip
import pyperclip

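def SearchAndTell(MYFILE,mypattern):
    """Return group 1 of the last line in MYFILE matching mypattern, or None."""
    p = re.compile(mypattern)
    # Read all lines, closing the file before searching
    with open(MYFILE) as reader:
        lines = reader.readlines()
    for line in reversed(lines):
        m = p.search(line.rstrip())
        if m:
            # return already exits the function, so no break is needed
            return m.group(1)
    return None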

INPUTFILE = "log.txt" 
clipb = None

pattern = r"^START_A.+to (.+?) \(.+$"
clipb = f"-ss {SearchAndTell(INPUTFILE,pattern)} ".replace(",",".")
pattern = r"^START_B.+to (.+?) \(.+$"
clipb += f"-to {SearchAndTell(INPUTFILE,pattern)}".replace(",",".")

pyperclip.copy(clipb)
FWIW, here's the batch script:
grep -Poha "^START_A.+$" log.txt | sed -r "[email protected]^.+ to (.+?) \([email protected] \[email protected]" | sed -r "[email protected],@[email protected]" | tail -1
grep -Poha "^START_B.+$" log.txt | sed -r "[email protected]^.+ to (.+?) \([email protected] \[email protected]" | sed -r "[email protected],@[email protected]" | tail -1
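
One thing to watch in the Python version: if a pattern never matches, SearchAndTell returns None and the f-string puts the literal text "None" into the clipboard. A possible variant, sketched under the assumption that both patterns are searched in the same file, scans the file once for all patterns and fails loudly on a missing match. find_last_matches and build_args are hypothetical names, and the clipboard step is left out so the sketch runs without pyperclip.

```python
import re

def find_last_matches(path, patterns):
    """Return {name: group 1 of the last line matching that pattern, or None}."""
    compiled = {name: re.compile(p) for name, p in patterns.items()}
    results = dict.fromkeys(patterns)  # every name starts out as None
    with open(path) as reader:
        for line in reader:
            stripped = line.rstrip()
            for name, regex in compiled.items():
                m = regex.search(stripped)
                if m:
                    results[name] = m.group(1)  # later lines overwrite earlier ones
    return results

def build_args(path):
    """Build the '-ss ... -to ...' string, raising if either marker is missing."""
    found = find_last_matches(path, {
        "-ss": r"^START_A.+to (.+?) \(.+$",
        "-to": r"^START_B.+to (.+?) \(.+$",
    })
    missing = [name for name, value in found.items() if value is None]
    if missing:
        raise ValueError(f"no match for: {', '.join(missing)}")
    # Same comma-to-dot substitution as the original script
    return " ".join(f"{name} {value}" for name, value in found.items()).replace(",", ".")
```

The result of build_args("log.txt") could then be passed to pyperclip.copy as before.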