First line with digits before last line
Python Forum (https://python-forum.io), Forum: General Coding Help (https://python-forum.io/forum-8.html), Thread: /thread-38017.html
First line with digits before last line - tester_V - Aug-21-2022

Greetings!
I’m parsing log files. 95% of the files have the lines I’m looking for. 5% do not, and I must get the first line (a line that starts with a date time, “2022-08-14 14:37:46”) and the last line (a line that starts with a date time, “2022-08-14 14:39:00”).
The problem is that not all last lines have a date time in the string. I need to read the lines before the last line until I find one that starts with the date time...
Here is what I got so far; it does not work as I wanted:

    import re

    with open(r"C:/01/last_line.txt") as mfiler:
        frt_ln = mfiler.readline()
        print(f" Fl -> {frt_ln}")
        for rn_l in mfiler:
            if 'Start' in rn_l :
                continue
            # do something with the lines
            last_line = rn_l
        print(f" Last Line ->{last_line}")
        if not re.search('^\d+', last_line) :
            next
        else :
            print(f" Line with the DateTime -> {last_line}")
            #break

Here is a short example of the file:

    2022-08-14 14:37:46.523,17784 ,Information,"==================== Bac Start Run ===================="
    2022-08-14 14:37:46.523,17784 ,Information,"Bac Info:
    [DS_DK] Bac Test Result : Passed
    [DS_DK] Bac Iteration Result : Passed
    2022-08-14 14:37:46.524,17784 ,Warning
    Condition: NO Condition
    Allowed Stages: Any Stage
    Set Type: Hard
    2022-08-14 14:39:00.032,15060 ,Information,
    Available network interfaces :
    [Bac Setup] Ethernet Connection -2
    [Bac Setup] USB 3.0 to GB Ethernet

Could you help me with this?
Thank you.
RE: First line with digits before last line - deanhystad - Aug-22-2022

    import io
    import re

    data = io.StringIO("""
    This is not the first line
    2022-08-14 14:37:46.523,17784 ,Information,"==================== Bac Start Run ===================="
    2022-08-14 14:37:46.523,17784 ,Information,"Bac Info:
    [DS_DK] Bac Test Result : Passed
    [DS_DK] Bac Iteration Result : Passed
    2022-08-14 14:37:46.524,17784 ,Warning
    Condition: NO Condition
    Allowed Stages: Any Stage
    Set Type: Hard
    2022-08-14 14:39:00.032,15060 ,Information,
    Available network interfaces :
    [Bac Setup] Ethernet Connection -2
    [Bac Setup] USB 3.0 to GB Ethernet
    """)

    # Raw string so that \d is passed to the regex engine unchanged
    date_pattern = re.compile(r"^\d{4}-\d{2}-\d{2}")

    first = last = None
    for line in data:
        if re.search(date_pattern, line):
            last = line  # keeps being overwritten, so it ends up as the last dated line
            if first is None:
                first = line  # set only once, on the first dated line

    print(first)
    print(last)
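Since the real file is opened elsewhere in a longer script, the same idea can be wrapped in a function that accepts any iterable of lines. This is only a minimal sketch: the name `first_and_last_dated` and the shortened sample text are made up for illustration, and it assumes a "dated" line starts with `YYYY-MM-DD` as in the sample log.

```python
import io
import re

# Assumption: a "dated" line starts with YYYY-MM-DD, as in the sample log.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}")


def first_and_last_dated(lines):
    """Return (first, last) lines that start with a date, or (None, None)."""
    first = last = None
    for line in lines:
        if DATE_PATTERN.match(line):
            last = line          # overwritten on every dated line
            if first is None:
                first = line     # set once, on the first dated line
    return first, last


# Shortened, hypothetical sample in the same shape as the log in the thread
sample = io.StringIO(
    "This is not the first line\n"
    '2022-08-14 14:37:46.523,17784 ,Information,"Bac Info:\n'
    "[DS_DK] Bac Test Result : Passed\n"
    "2022-08-14 14:39:00.032,15060 ,Information,\n"
    "[Bac Setup] USB 3.0 to GB Ethernet\n"
)
first, last = first_and_last_dated(sample)
print(first, end="")
print(last, end="")
```

An open file object can be passed in place of `sample`, because iterating a file yields its lines one at a time.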
RE: First line with digits before last line - DeaD_EyE - Aug-22-2022

Example with start and end date.
It's unclear what should happen with trailing lines, for example if the end date was detected but the following lines have no date.
This example prints the remaining lines without a date until a line with a date is detected.

    import io
    import re
    from datetime import datetime as DateTime

    log_file_like = io.StringIO(
        """2022-08-14 14:37:46.523,17784 ,Information,"==================== Bac Start Run ===================="
    2022-08-14 14:37:46.523,17784 ,Information,"Bac Info:
    [DS_DK] Bac Test Result : Passed
    [DS_DK] Bac Iteration Result : Passed
    2022-08-14 14:37:46.524,17784 ,Warning
    Condition: NO Condition
    Allowed Stages: Any Stage
    Set Type: Hard
    2022-08-14 14:39:00.032,15060 ,Information,
    Available network interfaces :
    [Bac Setup] Ethernet Connection -2
    [Bac Setup] USB 3.0 to GB Ethernet
    2022-08-14 14:39:00.033,15060 <-- this line should be excluded from results
    """
    )

    DATE_REGEX = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{2,8})")


    def parse_date(line):
        # Returns None implicitly when the line contains no timestamp
        if match := DATE_REGEX.search(line):
            return DateTime.fromisoformat(match.group(1))


    def find(fd, start_date: DateTime, end_date: DateTime):
        """
        Read the file line by line and check each line for a date.
        If a date is found and it is not earlier than start_date, the lines are yielded.
        Once the date reaches end_date, the remaining lines up to the next dated line are yielded.
        """
        start_found = end_found = False
        for line in fd:
            date = parse_date(line)

            if end_found and date:
                return

            # date could be None
            # assigning date to last_date
            # for comparison
            if date is not None:
                start_found = True
                last_date = date

            if start_found and last_date >= end_date:
                end_found = True
                yield line
            elif start_found and last_date >= start_date:
                yield line


    #                     int,   int, int,  int,    int,    int,         int
    # datetime(year, month, day, hour, minute, second, microsecond)
    start_date = DateTime(2022, 8, 14, 14, 37, 46, 524 * 1000)
    end_date = DateTime(2022, 8, 14, 14, 39, 0, 32 * 1000)

    for line in find(log_file_like, start_date, end_date):
        print(line, end="")
    print()
RE: First line with digits before last line - tester_V - Aug-22-2022

I really appreciate the snippets you shared! Both examples look great, but I'm not sure how to use them.
The script I'm using has about 300 lines already. The problem with the "last line" is at the end of it.
The file I'm parsing is already open and a lot of things have happened to it by the time I come to the "last string".
I need to make changes here:

    if not re.search('^\d+', last_line) :
        next
    else :
        print(f" Line with the DateTime -> {last_line}")
        #break

Sorry about that!
Tester_v

RE: First line with digits before last line - deanhystad - Aug-22-2022

Without any information about what kind of processing is performed in your 300-line script, it is difficult to determine if either example will work for your problem. This is a WAG for how to use my solution.

    for line in data:
        if re.search(date_pattern, line):
            last = line
            if first is None:
                first = line
            # Additional processing for lines that start with date/time goes here
        else:
            # Processing for lines that don't start with date/time goes here
            pass

Or maybe it would work like this:

    for line in data:
        if re.search(date_pattern, line):
            last = line
            if first is None:
                first = line
        # Additional processing for any line goes here

RE: First line with digits before last line - tester_V - Aug-22-2022

Thank you guys! I appreciate your help!
I'm really not that good with programming. I came up with a simple (for me) solution that seems to be working fine. I probably invented a bicycle...
Here is the code:

    import re
    import linecache

    fl = 'C:/01/last_line.txt'
    with open(r"C:/01/last_line.txt") as mfiler:
        lnc = 0  # <-- Line Numbers
        frt_ln = mfiler.readline()  # <------------------ First Line
        print(f" Fl -> {frt_ln}")
        for rn_l in mfiler:
            lnc+=1
            if 'Start' in rn_l :
                continue
            # do something with the lines
            la_line = rn_l
        while (lnc) > 0 :
            print(f" Number of Ln in the File = {str(lnc)}")  # < --- Number of lines in the file
            if re.search('^\d+', la_line) :
                print(f" LN has DT --> {la_line} ")
                lnc = 0
                break
            else :
                lnc = lnc -1
                la_line = linecache.getline('C:/01/last_line.txt', lnc)

Thank you again! I love this forum
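One thing to watch in the loop above, if I read it correctly: `lnc` is incremented only after the first line has been read, so it ends up one less than the number of lines in the file. The first `linecache.getline` call therefore seems to skip the line just before the last one. With the sample file that happens not to matter, but a dated line in the second-to-last position would be missed. A possible alternative, sketched below, walks the lines backwards with `reversed`. The function name `last_dated_line` is made up for illustration, and the sketch assumes the file is small enough to read into a list.

```python
import re

# Assumption: a dated line starts with digits, the same test used in the thread.
STARTS_WITH_DIGITS = re.compile(r"^\d+")


def last_dated_line(lines):
    """Walk the lines from the end and return the first one that starts with digits."""
    for line in reversed(lines):
        if STARTS_WITH_DIGITS.match(line):
            return line
    return None  # no line starts with digits


# Hypothetical usage with the path from the thread:
# with open(r"C:/01/last_line.txt") as mfiler:
#     lines = mfiler.readlines()
# print(last_dated_line(lines))
```

Because the search starts at the end and stops at the first match, no line counter is needed. If the lines are already being processed in a loop, keeping a `last` variable updated inside that loop, as suggested earlier in the thread, avoids reading the file a second time.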