Python Forum
Parsing a syslog file
#1
Hi,

I'm trying to extract data from a system log file, but I cannot get the syntax right.
In particular, I'm trying to get each username and its timestamp value so I can save them into a DB.

I would appreciate some help.
TIA

filename:
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [info] Started flows
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [info] [mqtt-broker:server] Connected to broker: mqtt://192.168.1.230:1883
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [audit] {"event":"comms.open","level":98,"timestamp":1633848664512}
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [audit] {"event":"comms.auth","user":{"username":"admin","permissions":"*"},"level":98,"timestamp":1633848664540}
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [info] [remote-access:Remote access] Using nodered02.remote-red.com on port 59153
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [info] [remote-access:Remote access] starting ssh process
Oct 10 08:55:14 washup20 Node-RED[17201]: 10 Oct 08:55:14 - [audit] {"event":"auth.login.revoke","level":98,"user":{"username":"admin","permissions":"*"},"path":"/auth/revoke","ip":"192.168.1.28","timestamp":1633848914062}
import re
import json

filename = "systemfile.log"

# strip unneeded text from json format and save audit lines only
re_line= re.compile("audit")
data = []
with open(filename, "r") as in_file:
    # Loop over each log line
    for line in in_file:
        if re_line.search(line):
            data.append(line)

print(data)  # so far so good

for i in range(len(data)):
    print('user= {}'.format(int(data[0][i])))  #  <-- syntax error 


# for i in range(len(data)):
#     username = user[username]
#     timestamp =

# with open(data) as audits:
#     for line in audits:
#         audit = json.loads(line)
#         # process event dictionary
# print(audit['username']['timestamp'])
#2
(Oct-10-2021, 08:48 AM)ebolisa Wrote:
print(data)  # so far so good
Using data from here, use literal_eval() (the safe eval) to recover the dictionary.
>>> res = data[1].strip().split('[audit] ')[1]
>>> res
'{"event":"comms.auth","user":{"username":"admin","permissions":"*"},"level":98,"timestamp":1633848664540}'
>>> 
>>> from ast import literal_eval
>>> 
>>> result = literal_eval(res)
>>> result
{'event': 'comms.auth',
 'level': 98,
 'timestamp': 1633848664540,
 'user': {'permissions': '*', 'username': 'admin'}}
>>> 
>>> result['timestamp']
1633848664540
>>> result['user']['username']
'admin' 
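Since the text after `[audit] ` is JSON, `json.loads()` (already imported in the original script) is another option. `literal_eval()` happens to work on these lines because they contain only strings, numbers and nested objects, but it would reject JSON literals such as `true`, `false` or `null`. A minimal sketch using the sample line from the first post; the helper name `parse_audit_payload` is made up for illustration:

```python
import json

def parse_audit_payload(line):
    """Return the JSON object that follows '[audit] ' in a log line, or None."""
    # partition() never raises; 'found' is an empty string when the marker is absent
    _, found, payload = line.strip().partition('[audit] ')
    if not found:
        return None
    return json.loads(payload)

sample = ('Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [audit] '
          '{"event":"comms.auth","user":{"username":"admin","permissions":"*"},'
          '"level":98,"timestamp":1633848664540}')

result = parse_audit_payload(sample)
print(result['timestamp'], result['user']['username'])
```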
#3
Thank you, that works, but why won't it iterate in this case?

for i in range(len(data)):
    res = data[i].strip().split('[audit] ')[i]
    result = literal_eval(res)
    print(result)
    timestamp = result['timestamp']
    user = result['user']['username']
    print(user, timestamp)
EDIT: added results
Traceback (most recent call last):
  File "C:\SharedFiles\Python\practice\temp.py", line 19, in <module>
    result = literal_eval(res)
  File "C:\Python39\lib\ast.py", line 62, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "C:\Python39\lib\ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 1
    Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - 
        ^
SyntaxError: invalid syntax
#4
Wouldn't it help if you showed whatever error you're getting?
#5
(Oct-10-2021, 10:53 AM)ndc85430 Wrote: Wouldn't it help if you showed whatever error you're getting?
Thanks, I edited my post.
#6
Add a test to make sure username is in the line, as your regex does not check for that.
Use enumerate rather than range(len(data)).
Then it will look like this.
import re
import json
from ast import literal_eval

filename = "systemfile.log"
# strip unneeded text from json format and save audit lines only
re_line= re.compile("audit")
data = []
with open(filename, "r") as in_file:
    # Loop over each log line
    for line in in_file:
        if re_line.search(line):
            data.append(line)

#print(data)

for index,line in enumerate(data):
    if 'username' in line:
        res = data[index].strip().split('[audit] ')[1]
        result = literal_eval(res)
        #print(result)
        timestamp = result['timestamp']
        user = result['user']['username']
        print(timestamp)
        print(user)
Output:
1633848664540
admin
1633848914062
admin
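As for why the loop in post #3 fails: `split('[audit] ')` returns a two-item list for every audit line (the syslog prefix at index 0, the JSON text at index 1), so the index on the split result has to stay `1`; only the index into `data` should change. With `[i]`, the first pass (`i = 0`) hands the syslog prefix to `literal_eval()`, which gives the `SyntaxError` in the traceback, and from `i = 2` onward the index would be out of range. A small sketch with one of the sample lines:

```python
from ast import literal_eval

line = ('Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [audit] '
        '{"event":"comms.open","level":98,"timestamp":1633848664512}\n')

parts = line.strip().split('[audit] ')
print(len(parts))   # number of pieces: the prefix and the JSON text
print(parts[0])     # the syslog prefix, which is not a Python literal

try:
    literal_eval(parts[0])   # what the loop in post #3 does when i == 0
except SyntaxError as exc:
    print('prefix cannot be evaluated:', exc.msg)

event = literal_eval(parts[1])   # index 1 is always the payload
print(event['timestamp'])
```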
#7
(Oct-10-2021, 11:25 AM)snippsat Wrote:
Output:
1633848664540
admin
1633848914062
admin
Thank you much!!
#8
This is strange. The code works on Windows, but when I run it on Linux I get the following error. Why is that?

Error:
Traceback (most recent call last):
  File "test.py", line 23, in <module>
    user = result['user']['username']
KeyError: 'user'
#9
It will give a KeyError if a line passes through for which the dictionary lookup fails.
To make it more robust, change it to this.
for index,line in enumerate(data):
    try:
        res = data[index].strip().split('[audit] ')[1]
        result = literal_eval(res)
        #print(result)
        timestamp = result['timestamp']
        user = result['user']['username']
        print(timestamp)
        print(user)
    except KeyError:
        pass
        #print(res) # lines that fail
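The `timestamp` values printed by this loop look like milliseconds since the Unix epoch: 1633848664540 works out to 06:51:04 UTC on 10 Oct 2021, which lines up with the 08:51:04 in the log if the machine runs on UTC+2. That is an inference from the sample data, not something stated in the thread. If the values are to be stored in a DB as date/time values, a conversion could look like this; `ms_to_datetime` is a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone

def ms_to_datetime(ms):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    # Split into whole seconds and leftover milliseconds to avoid float rounding
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)

stamp = ms_to_datetime(1633848664540)
print(stamp.isoformat())
```

Keeping the value in UTC and converting to local time only for display avoids ambiguity around daylight-saving changes.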
#10
(Oct-10-2021, 01:46 PM)snippsat Wrote: It will give a KeyError if a line passes through for which the dictionary lookup fails.
To make it more robust, change it to this.
for index,line in enumerate(data):
    try:
        res = data[index].strip().split('[audit] ')[1]
        result = literal_eval(res)
        #print(result)
        timestamp = result['timestamp']
        user = result['user']['username']
        print(timestamp)
        print(user)
    except KeyError:
        pass
        #print(res) # lines that fail

Now fails with

Error:
Traceback (most recent call last):
  File "test.py", line 20, in <module>
    res = data[index].strip().split('[audit] ')[1]
IndexError: list index out of range
This is the content of data[]:
['Oct 10 11:42:42 washup20 kernel: [    0.044121] audit: initializing netlink subsys (disabled)\n', 'Oct 10 11:42:42 washup20 kernel: [    0.044434] audit: type=2000 audit(0.040:1): state=initialized audit_enabled=0 res=1\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"auth.login","username":"admin","client":"node-red-editor","scope":"*","level":98,"timestamp":1633862007571}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"comms.open","level":98,"timestamp":1633862007836}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"plugins.list.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/plugins","ip":"192.168.1.28","timestamp":1633862007849}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"comms.auth","user":{"username":"admin","permissions":"*"},"level":98,"timestamp":1633862007899}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"plugins.configs.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/plugins","ip":"192.168.1.28","timestamp":1633862007938}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"nodes.list.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/nodes","ip":"192.168.1.28","timestamp":1633862007952}\n', 
'Oct 10 12:33:28 washup20 Node-RED[334]: 10 Oct 12:33:28 - [audit] {"event":"nodes.icons.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/icons","ip":"192.168.1.28","timestamp":1633862008083}\n', 
'Oct 10 12:33:28 washup20 Node-RED[334]: 10 Oct 12:33:28 - [audit] {"event":"nodes.configs.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/nodes","ip":"192.168.1.28","timestamp":1633862008097}\n', 
'Oct 10 12:33:28 washup20 Node-RED[334]: 10 Oct 12:33:28 - [audit] {"event":"flows.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633862008930}\n', 
'Oct 10 13:04:30 washup20 Node-RED[334]: 10 Oct 13:04:30 - [audit] {"event":"library.get","library":"local","type":"flows","path":"","level":98,"timestamp":1633863870620}\n', 
'Oct 10 13:04:30 washup20 Node-RED[334]: 10 Oct 13:04:30 - [audit] {"event":"library.get","library":"examples","type":"flows","path":"","level":98,"timestamp":1633863870627}\n', 
'Oct 10 13:04:46 washup20 Node-RED[334]: 10 Oct 13:04:46 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633863886938}\n', 
'Oct 10 13:06:55 washup20 Node-RED[334]: 10 Oct 13:06:55 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864015713}\n', 
'Oct 10 13:08:15 washup20 Node-RED[334]: 10 Oct 13:08:15 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864095264}\n', 
'Oct 10 13:09:50 washup20 Node-RED[334]: 10 Oct 13:09:50 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864190630}\n', 
'Oct 10 13:10:09 washup20 Node-RED[334]: 10 Oct 13:10:09 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"demo","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864209888}\n', 
'Oct 10 13:24:17 washup20 Node-RED[334]: 10 Oct 13:24:17 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633865057510}\n', 
'Oct 10 13:24:31 washup20 Node-RED[334]: 10 Oct 13:24:31 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633865071172}\n', 
'Oct 10 13:25:05 washup20 Node-RED[334]: 10 Oct 13:25:05 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633865105808}\n', 
'Oct 10 13:30:34 washup20 Node-RED[334]: 10 Oct 13:30:34 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633865434864}\n']
EDIT: Again, it works on Windows. For some reason, it fails on Linux, where the '/var/log/syslog' file is.
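Looking at the `data` list above, the first two entries are kernel messages that contain the word `audit` but not the `[audit] ` marker, so `split('[audit] ')` returns a single-item list and `[1]` raises the `IndexError`. The third entry also shows that `auth.login` events carry `username` at the top level rather than under `user`, which would explain the earlier `KeyError: 'user'`. The Windows/Linux difference is presumably down to the input file (the copy tested on Windows may not have contained such lines) rather than Python itself. A sketch that filters on the full marker and handles both shapes, assuming the payload after the marker is a JSON object; `extract_users` is a made-up name:

```python
import json

MARKER = '[audit] '

def extract_users(lines):
    """Yield (username, timestamp) for audit lines that name a user."""
    for line in lines:
        # Require the full marker, so kernel 'audit:' lines are skipped
        _, found, payload = line.strip().partition(MARKER)
        if not found:
            continue
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # marker present, but the rest is not valid JSON
        # 'auth.login' has a top-level username; other events nest it under 'user'
        username = event.get('username') or event.get('user', {}).get('username')
        if username is not None and 'timestamp' in event:
            yield username, event['timestamp']

sample = [
    'Oct 10 11:42:42 washup20 kernel: [    0.044121] audit: initializing netlink subsys (disabled)\n',
    'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"auth.login","username":"admin","client":"node-red-editor","scope":"*","level":98,"timestamp":1633862007571}\n',
    'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"comms.open","level":98,"timestamp":1633862007836}\n',
    'Oct 10 13:10:09 washup20 Node-RED[334]: 10 Oct 13:10:09 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"demo","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864209888}\n',
]

rows = list(extract_users(sample))
for username, timestamp in rows:
    print(username, timestamp)
```

Because the function accepts any iterable of lines, an open file object can be passed in directly in place of `sample`, without building the intermediate `data` list.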