Python Forum
"Cut" big log file according to wanted dates?
#1
Hello,
I wrote a small program that cuts a range of lines out of a log file:
import sys

OutputFile = open('C:\\Users\\David\\Desktop\\OutPutLog.txt', "w+")
file1 = open('C:\\Users\\David\\Desktop\\Log.txt', 'r')
Lines = file1.readlines()
file1.close()

StartDate = input('Enter Start Date\n')
EndDate = input('Enter End Date\n')

count = 0
StartLine = 0
EndLine = 0
# Find the first line that contains each date
for line in Lines:
    count += 1
    if StartDate in line and StartLine == 0:
        print("Start Line {}: {}".format(count, line.strip()))
        StartLine = count
    if EndDate in line and EndLine == 0:
        print("End Line{}: {}".format(count, line.strip()))
        EndLine = count
count = 0
print('start line is %d , end line is %d' % (StartLine, EndLine))
print('total number of line is  %d' % (EndLine-StartLine))
for line in Lines:
    count += 1
    if StartLine <= count <= EndLine:
        # In text mode "\n" is already translated to the platform line ending
        OutputFile.write(line.strip() + "\n")
OutputFile.close()
It works, but I feel it could be written better.
Can someone help me make the code better / shorter / prettier?

Also, I have noticed that if the file is bigger than 4 GB, I get a memory error.
Is there any way to overcome this problem?

Thanks,
#2
You can iterate over the file line by line instead of reading it all into memory. For every line, check the date and process it accordingly; there is no need to count lines.
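
A minimal sketch of that idea, keeping the substring matching from the first post. The names `cut_between` and `cut_file` are made up for illustration, and it assumes the end date should only be looked for once the start date has been seen:

```python
def cut_between(lines, start_marker, end_marker):
    """Yield lines from the first one containing start_marker through
    the first later (or same) line containing end_marker, inclusive."""
    copying = False
    for line in lines:
        if not copying and start_marker in line:
            copying = True
        if copying:
            yield line
            if end_marker in line:
                break  # nothing after this line is wanted


def cut_file(src_path, dst_path, start_marker, end_marker):
    """Stream src_path into dst_path, keeping only the wanted block."""
    # A file object yields one line at a time, so memory use stays small
    # no matter how big the log is.
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        dst.writelines(cut_between(src, start_marker, end_marker))
```

Only one line is held in memory at a time, which is why this should not hit the memory error that `readlines()` causes on a multi-gigabyte file.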

Also, your original post had a sample log file. Now it is less clear what your data looks like.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

#3
How do I do this?
Like this?
with open("sample.txt", "r") as a_file:
  for line in a_file:
    stripped_line = line.strip()
    print(stripped_line)
Then what?
Don't I run into the same problem?
What do I need to change in how I think about the problem?
I think I'm missing something here.
Thanks,
#4
You need to parse the line. Your original post included sample data; you said a line looks like this:
Output:
14/04/2021-08:45:09:110 can0 18F106A7 [8] 7C 00 00 00 00 00 00 A3
so
from datetime import datetime
line = "14/04/2021-08:45:09:110 can0 18F106A7 [8] 7C 00 00 00 00 00 00 A3\n"
my_date, *rest = line.split()
my_date = datetime.strptime(my_date, '%d/%m/%Y-%H:%M:%S:%f')
print(my_date)
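start, end = datetime(2021, 4, 14), datetime(2021, 4, 15)  # example bounds, replace with your own
if start <= my_date <= end:  # replace with the condition you want
    pass  # do something with the line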
Output:
2021-04-14 08:45:09.110000
from datetime import datetime
start, end = datetime(2021, 4, 14), datetime(2021, 4, 15)  # example bounds, replace with your own
with open("sample.txt", "r") as a_file:
    for line in a_file:
        my_date, *rest = line.split()
        my_date = datetime.strptime(my_date, '%d/%m/%Y-%H:%M:%S:%f')
        print(my_date)
        if start <= my_date <= end:  # replace with the condition you want
            pass  # do something with the line
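
One way the loop above could be turned into a reusable date-range filter is sketched below. The names `lines_in_range` and `LOG_FORMAT` are hypothetical, and skipping lines that do not start with a valid timestamp is an assumption about what you want, not something the sample data requires:

```python
from datetime import datetime

# Timestamp layout taken from the sample line, e.g. 14/04/2021-08:45:09:110
LOG_FORMAT = '%d/%m/%Y-%H:%M:%S:%f'


def lines_in_range(lines, start, end):
    """Yield lines whose leading timestamp satisfies start <= timestamp <= end."""
    for line in lines:
        fields = line.split(maxsplit=1)
        if not fields:
            continue  # blank line
        try:
            stamp = datetime.strptime(fields[0], LOG_FORMAT)
        except ValueError:
            continue  # assumption: ignore lines without a valid timestamp
        if start <= stamp <= end:
            yield line
```

Passing an open file object as `lines` keeps memory use low, because the file is read one line at a time. If the log is known to be sorted by time, you could also stop at the first timestamp later than `end`, but this sketch does not assume that.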
#5
Can you explain this line to me, please?

my_date, *rest = line.split()
Thanks,
#6
It's called extended iterable unpacking (PEP 3132).
line = "14/04/2021-08:45:09:110 can0 18F106A7 [8] 7C 00 00 00 00 00 00 A3\n"
my_date, *rest = line.split()
print(my_date)
print(rest)
Output:
14/04/2021-08:45:09:110
['can0', '18F106A7', '[8]', '7C', '00', '00', '00', '00', '00', '00', 'A3']
#7
OK.
If I understand correctly, it will "cut" after the first space?
Is this what it does?

I will try to write the code using this.
If I have any problems, I will continue this thread.

Thank you for the help (for now :-) )
#8
(Apr-18-2021, 07:16 AM)korenron Wrote: If I understand correctly, it will "cut" after the first space?
str.split() (i.e. without an argument) splits at any whitespace and returns a list. That's the right-hand side. Then, on the left-hand side, extended iterable unpacking assigns the first element to my_date and the remaining elements to rest.

If you are unfamiliar with unpacking in general, see e.g. https://stackoverflow.com/q/2322355/4046632
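
To make the difference concrete, here is a small comparison on the sample line. The `maxsplit=1` variant is not used elsewhere in this thread; it is shown only as the closest match to "cut after the first space", and the names `first` and `remainder` are made up:

```python
line = "14/04/2021-08:45:09:110 can0 18F106A7 [8] 7C 00 00 00 00 00 00 A3\n"

# No argument: split on every run of whitespace, so the trailing newline
# disappears and everything after the date becomes a list of fields.
my_date, *rest = line.split()

# maxsplit=1: cut only once, after the first run of whitespace. The
# remainder stays one string and keeps its trailing newline.
first, remainder = line.split(maxsplit=1)

print(my_date)
print(rest)
print(repr(remainder))
```

Both variants give the same date string. They differ only in whether the rest of the line comes back as a list of fields or as one untouched string.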