Python Forum
"Cut" big log file according to wanted dates?




"Cut" big log file according to wanted dates? - korenron - Apr-14-2021

Hello,
I created a small program to cut lines from a log file:
import sys

OutputFile = open('C:\\Users\\David\\Desktop\\OutPutLog.txt', "w+")
file1 = open('C:\\Users\\David\\Desktop\\Log.txt', 'r')
Lines = file1.readlines()
file1.close()

StartDate = input('Enter Start Date\n')
EndDate = input('Enter End Date\n')

count = 0
StartLine = 0
EndLine = 0
# Strips the newline character
for line in Lines:
    count += 1
    if StartDate in line and StartLine == 0:
        print("Start Line {}: {}".format(count, line.strip()))
        StartLine = count
    if EndDate in line and EndLine == 0:
        print("End Line{}: {}".format(count, line.strip()))
        EndLine = count
count = 0
print('start line is %d , end line is %d' % (StartLine, EndLine))
print('total number of line is  %d' % (EndLine-StartLine))
for line in Lines:
    count += 1
    if StartLine <= count <= EndLine:
        OutputFile.write(line.strip() + "\r\n")
It works, but I feel it could be written better.
Can someone help me make the code better, shorter, and prettier?

Also, I have noticed that if the file is bigger than 4 GB, I get a memory error.
Is there any way to overcome this problem?

Thanks,


RE: "Cut" big log file according to wanted dates? - buran - Apr-14-2021

You can iterate over the file line by line instead of reading it all into memory. For every line, check the date and process it accordingly; there is no need to count lines.
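
A minimal sketch of the line-by-line approach, assuming (as in the first post) that a date is matched as plain text inside the line. The name cut_log and its parameters are made up for illustration. Unlike the original script, it only looks for the end date after the start date has been found, and it stops reading once the end line is written, so only one line is held in memory at a time.

```python
def cut_log(src_path, dst_path, start_marker, end_marker):
    """Copy lines from the first one containing start_marker through the
    first following one containing end_marker (both inclusive)."""
    copying = False
    written = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:  # a file object yields one line at a time
            if not copying and start_marker in line:
                copying = True  # first line of the wanted range
            if copying:
                dst.write(line)
                written += 1
                if end_marker in line:
                    break  # nothing after the end line is needed
    return written
```

It could be called as cut_log('Log.txt', 'OutPutLog.txt', StartDate, EndDate); the return value is the number of lines written.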

Also, your original post had a sample log file. Now it is less clear what your data looks like.


RE: "Cut" big log file according to wanted dates? - korenron - Apr-18-2021

How do I do this?
Like this?
with open("sample.txt", "r") as a_file:
  for line in a_file:
    stripped_line = line.strip()
    print(stripped_line)
Then what?
Don't I end up with the same problem?
What do I need to change in the way I think about the problem?
I think I'm missing something here.
Thanks,


RE: "Cut" big log file according to wanted dates? - buran - Apr-18-2021

You need to parse the line. In your original post there was sample data. You said a line looks like this:
Output:
14/04/2021-08:45:09:110 can0 18F106A7 [8] 7C 00 00 00 00 00 00 A3
so
from datetime import datetime
line = "14/04/2021-08:45:09:110 can0 18F106A7 [8] 7C 00 00 00 00 00 00 A3\n"
my_date, *rest = line.split()
my_date = datetime.strptime(my_date, '%d/%m/%Y-%H:%M:%S:%f')
print(my_date)
if True:  # replace True with the condition you want
    pass  # do something with the line
Output:
2021-04-14 08:45:09.110000
from datetime import datetime
with open("sample.txt", "r") as a_file:
    for line in a_file:
        my_date, *rest = line.split()
        my_date = datetime.strptime(my_date, '%d/%m/%Y-%H:%M:%S:%f')
        print(my_date)
        if True:  # replace True with the condition you want
            pass  # do something with the line
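
One possible way to fill in the condition is to compare the parsed timestamp against a start and an end datetime. This is only a sketch: lines_between, cut_by_date and LOG_FORMAT are illustrative names, and it assumes every log line starts with a timestamp in the format of the sample line. Anything else is skipped.

```python
from datetime import datetime

LOG_FORMAT = "%d/%m/%Y-%H:%M:%S:%f"  # timestamp layout taken from the sample line


def lines_between(lines, start, end):
    """Yield the lines whose leading timestamp lies within [start, end]."""
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # blank line
        try:
            stamp = datetime.strptime(fields[0], LOG_FORMAT)
        except ValueError:
            continue  # line does not start with a timestamp (assumption: skip it)
        if start <= stamp <= end:
            yield line


def cut_by_date(src_path, dst_path, start, end):
    """Stream src_path into dst_path, keeping only lines in the date range."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        dst.writelines(lines_between(src, start, end))
```

The bounds could be built from user input with e.g. datetime.strptime(text, '%d/%m/%Y'). Note that a bare date parses to midnight, so an end date entered that way excludes the rest of that day.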



RE: "Cut" big log file according to wanted dates? - korenron - Apr-18-2021

Can you explain this line to me, please?

my_date, *rest = line.split()
Thanks,


RE: "Cut" big log file according to wanted dates? - buran - Apr-18-2021

It's called extended iterable unpacking (PEP 3132).
line = "14/04/2021-08:45:09:110 can0 18F106A7 [8] 7C 00 00 00 00 00 00 A3\n"
my_date, *rest = line.split()
print(my_date)
print(rest)
Output:
14/04/2021-08:45:09:110
['can0', '18F106A7', '[8]', '7C', '00', '00', '00', '00', '00', '00', 'A3']



RE: "Cut" big log file according to wanted dates? - korenron - Apr-18-2021

OK.
If I understand correctly, it will "cut" after the first space?
Is this what it does?

I will try to write the code using this.
If I have any problems, I will continue this thread.

Thank you for the help (for now :-) )


RE: "Cut" big log file according to wanted dates? - buran - Apr-18-2021

(Apr-18-2021, 07:16 AM)korenron Wrote: If I understand correctly, it will "cut" after the first space?
str.split() (i.e. without an argument) splits at any whitespace and returns a list. That's on the right-hand side (RHS). Then, on the left-hand side (LHS), extended iterable unpacking assigns the first element to my_date and the remaining elements to rest.
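
To make the difference concrete, here is a small sketch comparing split() without arguments against split(maxsplit=1), which is the variant that really does cut only once, at the first whitespace. The names stamp and remainder are just for illustration.

```python
line = "14/04/2021-08:45:09:110 can0 18F106A7 [8] 7C 00 00 00 00 00 00 A3\n"

# No argument: split on every run of whitespace; the trailing newline is dropped.
my_date, *rest = line.split()
print(my_date)  # the timestamp string
print(rest)     # list of the 11 remaining fields

# maxsplit=1: split once, at the first whitespace; everything after it
# stays together as one string (including the trailing newline).
stamp, remainder = line.split(maxsplit=1)
print(stamp)
print(repr(remainder))
```

Either form gives the same first element, so both work for parsing the date. The second keeps the rest of the line as a single string.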

If you are unfamiliar with unpacking in general, see e.g. https://stackoverflow.com/q/2322355/4046632