Python Forum
First line with digits before last line
#1
Greetings!
I'm parsing log files. 95% of the files have the lines I'm looking for.
5% do not, and for those I must get the first line (it starts with a date and time, e.g. "2022-08-14 14:37:46")
and the last line (it also starts with a date and time, e.g. "2022-08-14 14:39:00").
The problem is that not every last line starts with a date and time.
I need to read the lines before the last line, going backwards, until I find one that starts with a date and time.

Here is what I have so far. It does not work as I wanted:
import re

with open(r"C:/01/last_line.txt") as mfiler:
    frt_ln = mfiler.readline()
    print(f" Fl -> {frt_ln}")
    
    for rn_l in mfiler: 
        if 'Start' in rn_l :
            continue        
            # do something with the lines
        last_line = rn_l
    print(f" Last Line ->{last_line}")
    if not re.search('^\d+', last_line) :
        next
    else :
        print(f" Line with the DateTime -> {last_line}")
        #break
Here is a short example of the file:

2022-08-14 14:37:46.523,17784 ,Information,"==================== Bac Start Run ===================="
2022-08-14 14:37:46.523,17784 ,Information,"Bac Info:
[DS_DK] Bac Test Result : Passed
[DS_DK] Bac Iteration Result : Passed
2022-08-14 14:37:46.524,17784 ,Warning
Condition: NO Condition
Allowed Stages: Any Stage
Set Type: Hard
2022-08-14 14:39:00.032,15060 ,Information, Available network interfaces :
[Bac Setup] Ethernet Connection -2
[Bac Setup] USB 3.0 to GB Ethernet

Could you help me with this?
Thank you.
#2
import io
import re

data = io.StringIO("""
This is not the first line
2022-08-14 14:37:46.523,17784 ,Information,"==================== Bac Start Run ===================="
2022-08-14 14:37:46.523,17784 ,Information,"Bac Info:
[DS_DK] Bac Test Result : Passed
[DS_DK] Bac Iteration Result : Passed
2022-08-14 14:37:46.524,17784 ,Warning
Condition: NO Condition
Allowed Stages: Any Stage
Set Type: Hard
2022-08-14 14:39:00.032,15060 ,Information, Available network interfaces :
[Bac Setup] Ethernet Connection -2
[Bac Setup] USB 3.0 to GB Ethernet
""")

date_pattern = re.compile(r"^\d{4}-\d{2}-\d{2}")  # raw string, so \d is not treated as a string escape

first = last = None
for line in data:
    if re.search(date_pattern, line):
        last = line
        if first is None:
            first = line

print(first)
print(last)
Output:
2022-08-14 14:37:46.523,17784 ,Information,"==================== Bac Start Run ===================="

2022-08-14 14:39:00.032,15060 ,Information, Available network interfaces :
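If only the last dated line is needed, the same pattern can also be applied while walking the lines backwards, which is closer to the original request to step back from the last line until a dated one turns up. This is a minimal sketch and not part of the post above: the name `last_dated_line` is made up for illustration, and it assumes the file is small enough to hold in memory as a list of lines.

```python
import re

# Same idea as date_pattern above: a line counts as "dated" if it starts with YYYY-MM-DD.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}")


def last_dated_line(lines):
    """Return the last line that starts with a date, or None if there is none."""
    for line in reversed(lines):
        if DATE_PATTERN.match(line):
            return line
    return None


# Sample lines taken from the log excerpt in the first post.
sample = [
    "2022-08-14 14:37:46.523,17784 ,Information\n",
    "Condition: NO Condition\n",
    "2022-08-14 14:39:00.032,15060 ,Information, Available network interfaces :\n",
    "[Bac Setup] Ethernet Connection -2\n",
    "[Bac Setup] USB 3.0 to GB Ethernet\n",
]
print(last_dated_line(sample), end="")
```

With a real file, the list could come from `readlines()` on the open file object.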
#3
Here is an example with a start date and an end date. It's unclear what should happen with trailing lines, for example when the end date has been detected but the following lines have no date. This example prints the remaining undated lines until the next line with a date is detected.

import io
import re
from datetime import datetime as DateTime


log_file_like = io.StringIO(
    """2022-08-14 14:37:46.523,17784 ,Information,"==================== Bac Start Run ===================="
2022-08-14 14:37:46.523,17784 ,Information,"Bac Info:
[DS_DK] Bac Test Result : Passed
[DS_DK] Bac Iteration Result : Passed
2022-08-14 14:37:46.524,17784 ,Warning
Condition: NO Condition
Allowed Stages: Any Stage
Set Type: Hard
2022-08-14 14:39:00.032,15060 ,Information, Available network interfaces :
[Bac Setup] Ethernet Connection -2
[Bac Setup] USB 3.0 to GB Ethernet
2022-08-14 14:39:00.033,15060 <-- this line should be excluded from results
"""
)


DATE_REGEX = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{2,8})")


def parse_date(line):
    if match := DATE_REGEX.search(line):
        return DateTime.fromisoformat(match.group(1))


def find(fd, start_date: DateTime, end_date: DateTime):
    """
    Read the file line by line and check each line for a date.
    Lines are yielded once a date at or after start_date has been found.
    Once a date at or after end_date is reached, that line and the undated lines
    after it are yielded, up to (not including) the next dated line.
    """
    start_found = end_found = False

    for line in fd:
        date = parse_date(line)

        if end_found and date:
            return

        # date could be None
        # assigning date to last_date
        # for comparison
        if date is not None:
            start_found = True
            last_date = date

        if start_found and last_date >= end_date:
            end_found = True
            yield line
        elif start_found and last_date >= start_date:
            yield line


#          int,  int,   int, int,  int,    int,    int
# datetime(year, month, day, hour, minute, second, microsecond)

start_date = DateTime(2022, 8, 14, 14, 37, 46, 524 * 1000)
end_date = DateTime(2022, 8, 14, 14, 39, 0, 32 * 1000)


for line in find(log_file_like, start_date, end_date):
    print(line, end="")

print()
Output:
[andre@andre-Fujitsu-i5 ~]$ python xxxxxxxx.py
2022-08-14 14:37:46.524,17784 ,Warning
Condition: NO Condition
Allowed Stages: Any Stage
Set Type: Hard
2022-08-14 14:39:00.032,15060 ,Information, Available network interfaces :
[Bac Setup] Ethernet Connection -2
[Bac Setup] USB 3.0 to GB Ethernet
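The `parse_date` helper above relies on `fromisoformat`, and how many fractional digits that method accepts may depend on the Python version. A possible alternative, sketched here under that assumption, is `strptime` with `%f`, which accepts one to six fractional digits. The names `TIMESTAMP` and `parse_timestamp` are hypothetical and not from the post, and the pattern is anchored to the start of the line, which is slightly stricter than `DATE_REGEX`.

```python
import re
from datetime import datetime

# Hypothetical pattern for this sketch: a timestamp at the start of the line,
# with 1 to 6 fractional digits (the range that %f accepts).
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{1,6}")


def parse_timestamp(line):
    """Return the leading timestamp as a datetime, or None if the line has none."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S.%f")


print(parse_timestamp("2022-08-14 14:37:46.524,17784 ,Warning"))
print(parse_timestamp("Condition: NO Condition"))
```

Returning `None` for undated lines keeps the same contract as `parse_date`, so it could be swapped into `find` without other changes.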
#4
I really appreciate the snippets you shared!
Both examples look great.

But I'm not sure how to use it.
The script I'm using has about 300 lines already.
The problem with the "last line" is at the end of it.
The file I'm parsing is already open and a lot of things have happened to it by the time I come to the "last string".
I need to make changes here:
    if not re.search('^\d+', last_line) :
        next
    else :
        print(f" Line with the DateTime -> {last_line}")
        #break
Sorry about that!

Tester_v
#5
Without any information about what kind of processing your 300-line script performs, it is difficult to tell whether either example will work for your problem. This is a rough guess at how to use my solution.
for line in data:
    if re.search(date_pattern, line):
        last = line
        if first is None:
            first = line
        # Additional processing for lines that start with date/time goes here
    else:
        # Processing for lines that don't start with date/time goes here
        pass  # placeholder so the empty branch is valid Python
Or maybe it would work like this:
for line in data:
    if re.search(date_pattern, line):
        last = line
        if first is None:
            first = line
    # Additional processing for any line goes here
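One way to fit this into an existing long script, sketched under the assumption that the script already loops over the file line by line, is to keep the first/last bookkeeping in a small helper object and call it once per line. The class name `DatedLineTracker` and its `feed` method are invented for this illustration.

```python
import re

DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}")


class DatedLineTracker:
    """Remember the first and last dated lines seen so far."""

    def __init__(self):
        self.first = None
        self.last = None

    def feed(self, line):
        """Record the line if it starts with a date; return True if it did."""
        if DATE_PATTERN.match(line) is None:
            return False
        if self.first is None:
            self.first = line
        self.last = line
        return True


tracker = DatedLineTracker()
for line in [
    "2022-08-14 14:37:46.523,17784 ,Information\n",
    "[DS_DK] Bac Test Result : Passed\n",
    "2022-08-14 14:39:00.032,15060 ,Information\n",
    "[Bac Setup] USB 3.0 to GB Ethernet\n",
]:
    tracker.feed(line)  # the only extra call inside the existing loop
    # ... the rest of the existing per-line processing stays here ...

print(tracker.first, end="")
print(tracker.last, end="")
```

After the loop, `tracker.last` holds the last dated line even when the file ends with undated lines, so no second pass over the file is needed.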
#6
Thank you guys!
I appreciate your help!
I'm really not that good at programming.
I came up with a simple (for me) solution that seems to work fine.
I probably reinvented the wheel...

Here is the code:

import re
import linecache

fl = r"C:/01/last_line.txt"
with open(fl) as mfiler:
    lnc = 1  # line counter; starts at 1 because the first line is read right below
    frt_ln = mfiler.readline()  # first line
    print(f" Fl -> {frt_ln}")
    la_line = frt_ln

    for rn_l in mfiler:
        lnc += 1
        la_line = rn_l  # always the line whose number is lnc
        if 'Start' in rn_l:
            continue
        # do something with the lines

print(f" Number of Ln in the File = {lnc}")  # number of lines in the file

# Walk backwards from the last line until one starts with digits.
# linecache.getline() uses 1-based line numbers, matching lnc.
while lnc > 0:
    if re.search(r'^\d+', la_line):
        print(f"  LN has DT --> {la_line} ")
        break
    lnc -= 1
    la_line = linecache.getline(fl, lnc)
Thank you again!
I love this forum!