Jun-24-2021, 09:17 AM
Hello,
I have a log file that grows to about 6 GB of text by the end of the day.
I want to be able to cut a certain time window out of it,
for example
from 08:00:00 until 08:15:00.
I have checked, and 15 minutes is around 1.5 million lines (1,500,000).
When I run the code in the morning, when the log file is less than 1 GB, everything works.
When I run the code at the end of the day (when the log is more than 5 GB),
it gets stuck, and sometimes I get a MemoryError on my computer.
And when I try to search for a later window (7:00 pm to 7:20 pm), it can take more than 3 minutes before it gets stuck.
My question is:
what can I do to make this run better and faster?
Can Python handle this amount of data?
This is the function:
```python
import datetime

# OutputFile is defined elsewhere in my script


def FilterLogFile(StartDate, EndDate):
    StartDate = datetime.datetime.strptime(StartDate, '%d/%m/%Y-%H:%M:%S')
    EndDate = datetime.datetime.strptime(EndDate, '%d/%m/%Y-%H:%M:%S')
    EndDate = EndDate.strftime('%d/%m/%Y-%H:%M:%S')
    StartDate = StartDate.strftime('%d/%m/%Y-%H:%M:%S')
    StartDate = str(StartDate)
    EndDate = str(EndDate)
    print(StartDate)
    print(EndDate)
    count = 0
    StartLine = 0
    EndLine = 0
    FullLogFile = open('/home/pi/logs/java.txt', 'r')
    Lines = FullLogFile.readlines()  # ------->>>> this part takes too much time, when it doesn't fail with "Memory Error"
    FullLogFile.close()
    for line in Lines:
        count += 1
        if StartDate in line and StartLine == 0:
            print("Start Line {}: {}".format(count, line.strip()))
            StartLine = count
        if EndDate in line and EndLine == 0:
            print("End Line {}: {}".format(count, line.strip()))
            EndLine = count
        if StartLine != 0 and EndLine != 0:
            break  # stop the scan at the wanted end time, no need to scan after it
    count = 0
    print('start line is %d , end line is %d' % (StartLine, EndLine))
    print('total number of line is %d' % (EndLine - StartLine))
    with open(OutputFile, 'w') as f:
        for line in Lines:
            count += 1
            if StartLine <= count <= EndLine:
                f.write(line.strip() + "\r\n")
    return OutputFile
```
maybe to read
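One direction that might help, as a rough sketch only: iterate over the file object instead of calling `readlines()`, so that only one line is held in memory at a time, and write matching lines out during the same pass. The function name `filter_log_window` and its parameters are hypothetical. It assumes, like the function above, that the start and end timestamps appear literally in the log lines and that the lines are in time order.

```python
def filter_log_window(log_path, output_path, start_stamp, end_stamp):
    """Copy the lines from the first one containing start_stamp through the
    first following one containing end_stamp, without loading the whole file.

    All names here are illustrative; the timestamp strings are assumed to be
    in the same format as they appear in the log.
    """
    written = 0
    copying = False
    # errors='replace' is an assumption: it avoids a crash on odd bytes in the log
    with open(log_path, 'r', errors='replace') as src, \
            open(output_path, 'w') as dst:
        for line in src:  # the file object yields one line at a time
            if not copying and start_stamp in line:
                copying = True  # first line of the wanted window
            if copying:
                dst.write(line)
                written += 1
                if end_stamp in line:
                    break  # end of the window reached, stop reading
    return written
```

It could be called as `filter_log_window('/home/pi/logs/java.txt', OutputFile, StartDate, EndDate)` with the already formatted timestamp strings. This keeps memory use roughly constant, but a late window such as 7:00 pm still means reading through everything before it, so it may remain slow on a 5 GB file.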