Python Forum

Full Version: Parsing a syslog file
Hi,

I'm trying to extract data from a system log file, but I cannot get the syntax right.
In particular, I'm trying to get the username and its timestamp value so I can save them into a DB.

I would appreciate some help.
TIA

filename:
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [info] Started flows
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [info] [mqtt-broker:server] Connected to broker: mqtt://192.168.1.230:1883
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [audit] {"event":"comms.open","level":98,"timestamp":1633848664512}
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [audit] {"event":"comms.auth","user":{"username":"admin","permissions":"*"},"level":98,"timestamp":1633848664540}
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [info] [remote-access:Remote access] Using nodered02.remote-red.com on port 59153
Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [info] [remote-access:Remote access] starting ssh process
Oct 10 08:55:14 washup20 Node-RED[17201]: 10 Oct 08:55:14 - [audit] {"event":"auth.login.revoke","level":98,"user":{"username":"admin","permissions":"*"},"path":"/auth/revoke","ip":"192.168.1.28","timestamp":1633848914062}
import re
import json

filename = "systemfile.log"

# strip unneeded text from json format and save audit lines only
re_line= re.compile("audit")
data = []
with open(filename, "r") as in_file:
    # Loop over each log line
    for line in in_file:
        if re_line.search(line):
            data.append(line)

print(data)  # so far so good

for i in range(len(data)):
    print('user= {}'.format(int(data[0][i])))  #  <-- syntax error 


# for i in range(len(data)):
#     username = user[username]
#     timestamp =

# with open(data) as audits:
#     for line in audits:
#         audit = json.loads(line)
#         # process event dictionary
# print(audit['username']['timestamp'])
(Oct-10-2021, 08:48 AM)ebolisa Wrote: [ -> ]
print(data)  # so far so good
Starting from data as printed here, use literal_eval(), the safe eval, to recover the dictionary.
>>> res = data[1].strip().split('[audit] ')[1]
>>> res
'{"event":"comms.auth","user":{"username":"admin","permissions":"*"},"level":98,"timestamp":1633848664540}'
>>> 
>>> from ast import literal_eval
>>> 
>>> result = literal_eval(res)
>>> result
{'event': 'comms.auth',
 'level': 98,
 'timestamp': 1633848664540,
 'user': {'permissions': '*', 'username': 'admin'}}
>>> 
>>> result['timestamp']
1633848664540
>>> result['user']['username']
'admin' 
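Since the text after `[audit] ` is JSON, the standard `json` module (already imported in the original script) can parse it as well. `literal_eval` only works here because the payload happens to be a valid Python literal too; JSON values such as `true`, `false`, or `null` would make it fail. A minimal sketch, using one of the sample lines above:

```python
import json

# One audit line copied from the sample log above.
line = ('Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [audit] '
        '{"event":"comms.auth","user":{"username":"admin","permissions":"*"},'
        '"level":98,"timestamp":1633848664540}')

# Everything after the "[audit] " marker is the JSON payload.
payload = line.strip().split('[audit] ', 1)[1]
result = json.loads(payload)

print(result['timestamp'])
print(result['user']['username'])
```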
Thank you, that works, but why won't it iterate in this case?

for i in range(len(data)):
    res = data[i].strip().split('[audit] ')[i]
    result = literal_eval(res)
    print(result)
    timestamp = result['timestamp']
    user = result['user']['username']
    print(user, timestamp)
EDIT: added results
Traceback (most recent call last):
  File "C:\SharedFiles\Python\practice\temp.py", line 19, in <module>
    result = literal_eval(res)
  File "C:\Python39\lib\ast.py", line 62, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "C:\Python39\lib\ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 1
    Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - 
        ^
SyntaxError: invalid syntax
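The traceback points at the second `[i]`: for these lines `split('[audit] ')` returns a two-element list (the syslog prefix and the JSON payload), so that index has to stay `1` no matter which line is being processed. With `i == 0` the prefix is handed to `literal_eval`, which is exactly the text shown in the error. A small sketch with one sample line:

```python
# First audit line from the sample log above.
line = ('Oct 10 08:51:04 washup20 Node-RED[17201]: 10 Oct 08:51:04 - [audit] '
        '{"event":"comms.open","level":98,"timestamp":1633848664512}\n')

parts = line.strip().split('[audit] ')

print(len(parts))  # how many pieces the split produced
print(parts[0])    # the syslog prefix (not a Python literal)
print(parts[1])    # the JSON payload (what literal_eval needs)
```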
Wouldn't it help if you showed whatever error you're getting?
(Oct-10-2021, 10:53 AM)ndc85430 Wrote: [ -> ]Wouldn't it help if you showed whatever error you're getting?
Thanks, I edited my post.
Test that username is in the line, as your regex does not check for that.
Use enumerate rather than range(len(data)).
Then it looks like this:
import re
import json
from ast import literal_eval

filename = "systemfile.log"
# strip unneeded text from json format and save audit lines only
re_line= re.compile("audit")
data = []
with open(filename, "r") as in_file:
    # Loop over each log line
    for line in in_file:
        if re_line.search(line):
            data.append(line)

#print(data)

for index,line in enumerate(data):
    if 'username' in line:
        res = data[index].strip().split('[audit] ')[1]
        result = literal_eval(res)
        #print(result)
        timestamp = result['timestamp']
        user = result['user']['username']
        print(timestamp)
        print(user)
Output:
1633848664540
admin
1633848914062
admin
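The `timestamp` values look like Unix epoch milliseconds. That is an assumption, but it fits the log: 1633848664540 corresponds to 06:51:04 UTC, which matches the 08:51:04 in the syslog prefix if the host runs at UTC+2. If a readable time is wanted before saving to a DB, a sketch could be:

```python
from datetime import datetime, timezone

timestamp_ms = 1633848664540  # value printed in the output above

# Assumed to be milliseconds since the Unix epoch, so divide by 1000.
moment = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)

print(moment.isoformat())
```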
(Oct-10-2021, 11:25 AM)snippsat Wrote: [ -> ]
Output:
1633848664540
admin
1633848914062
admin
Thank you very much!!
This is strange. The code works on Windows, but when I run it on Linux I get the following error. Why is that?

Error:
Traceback (most recent call last):
  File "test.py", line 23, in <module>
    user = result['user']['username']
KeyError: 'user'
It raises a KeyError when a line passes the filter but the parsed dictionary does not have the keys being looked up.
To make it more robust, change it to this:
for index,line in enumerate(data):
    try:
        res = data[index].strip().split('[audit] ')[1]
        result = literal_eval(res)
        #print(result)
        timestamp = result['timestamp']
        user = result['user']['username']
        print(timestamp)
        print(user)
    except KeyError:
        pass
        #print(res) # lines that fail
(Oct-10-2021, 01:46 PM)snippsat Wrote: [ -> ]It raises a KeyError when a line passes the filter but the parsed dictionary does not have the keys being looked up.
To make it more robust, change it to this:
for index,line in enumerate(data):
    try:
        res = data[index].strip().split('[audit] ')[1]
        result = literal_eval(res)
        #print(result)
        timestamp = result['timestamp']
        user = result['user']['username']
        print(timestamp)
        print(user)
    except KeyError:
        pass
        #print(res) # lines that fail

Now it fails with:

Error:
Traceback (most recent call last):
  File "test.py", line 20, in <module>
    res = data[index].strip().split('[audit] ')[1]
IndexError: list index out of range
This is the content of data:
['Oct 10 11:42:42 washup20 kernel: [    0.044121] audit: initializing netlink subsys (disabled)\n', 'Oct 10 11:42:42 washup20 kernel: [    0.044434] audit: type=2000 audit(0.040:1): state=initialized audit_enabled=0 res=1\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"auth.login","username":"admin","client":"node-red-editor","scope":"*","level":98,"timestamp":1633862007571}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"comms.open","level":98,"timestamp":1633862007836}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"plugins.list.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/plugins","ip":"192.168.1.28","timestamp":1633862007849}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"comms.auth","user":{"username":"admin","permissions":"*"},"level":98,"timestamp":1633862007899}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"plugins.configs.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/plugins","ip":"192.168.1.28","timestamp":1633862007938}\n', 
'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"nodes.list.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/nodes","ip":"192.168.1.28","timestamp":1633862007952}\n', 
'Oct 10 12:33:28 washup20 Node-RED[334]: 10 Oct 12:33:28 - [audit] {"event":"nodes.icons.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/icons","ip":"192.168.1.28","timestamp":1633862008083}\n', 
'Oct 10 12:33:28 washup20 Node-RED[334]: 10 Oct 12:33:28 - [audit] {"event":"nodes.configs.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/nodes","ip":"192.168.1.28","timestamp":1633862008097}\n', 
'Oct 10 12:33:28 washup20 Node-RED[334]: 10 Oct 12:33:28 - [audit] {"event":"flows.get","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633862008930}\n', 
'Oct 10 13:04:30 washup20 Node-RED[334]: 10 Oct 13:04:30 - [audit] {"event":"library.get","library":"local","type":"flows","path":"","level":98,"timestamp":1633863870620}\n', 
'Oct 10 13:04:30 washup20 Node-RED[334]: 10 Oct 13:04:30 - [audit] {"event":"library.get","library":"examples","type":"flows","path":"","level":98,"timestamp":1633863870627}\n', 
'Oct 10 13:04:46 washup20 Node-RED[334]: 10 Oct 13:04:46 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633863886938}\n', 
'Oct 10 13:06:55 washup20 Node-RED[334]: 10 Oct 13:06:55 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864015713}\n', 
'Oct 10 13:08:15 washup20 Node-RED[334]: 10 Oct 13:08:15 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864095264}\n', 
'Oct 10 13:09:50 washup20 Node-RED[334]: 10 Oct 13:09:50 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864190630}\n', 
'Oct 10 13:10:09 washup20 Node-RED[334]: 10 Oct 13:10:09 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"demo","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864209888}\n', 
'Oct 10 13:24:17 washup20 Node-RED[334]: 10 Oct 13:24:17 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633865057510}\n', 
'Oct 10 13:24:31 washup20 Node-RED[334]: 10 Oct 13:24:31 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633865071172}\n', 
'Oct 10 13:25:05 washup20 Node-RED[334]: 10 Oct 13:25:05 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633865105808}\n', 
'Oct 10 13:30:34 washup20 Node-RED[334]: 10 Oct 13:30:34 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"admin","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633865434864}\n']
EDIT: Again, it works on Windows. For some reason, it fails on Linux, where the '/var/log/syslog' file is.
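Judging from the `data` dump above, two things differ in the Linux syslog. The kernel lines contain the word `audit` but no `[audit] ` marker, so `split('[audit] ')[1]` raises the IndexError. The `auth.login` event also stores `username` at the top level rather than under `user`, which would explain the earlier KeyError. A hedged sketch that filters on the marker itself and accepts both shapes; the helper name `extract_user_events` is made up for illustration:

```python
import json

MARKER = '[audit] '

def extract_user_events(lines):
    """Yield (username, timestamp) pairs from Node-RED audit lines."""
    for line in lines:
        # Skip lines without the marker, e.g. kernel "audit:" messages.
        _, found, payload = line.partition(MARKER)
        if not found:
            continue
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # text after the marker is not valid JSON
        if not isinstance(event, dict):
            continue
        # The username is either nested under "user" or at the top level.
        user = event.get('user')
        if isinstance(user, dict):
            username = user.get('username')
        else:
            username = event.get('username')
        if username is not None and 'timestamp' in event:
            yield username, event['timestamp']

# Four lines copied from the data dump above.
sample = [
    'Oct 10 11:42:42 washup20 kernel: [    0.044121] audit: initializing netlink subsys (disabled)\n',
    'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"auth.login","username":"admin","client":"node-red-editor","scope":"*","level":98,"timestamp":1633862007571}\n',
    'Oct 10 12:33:27 washup20 Node-RED[334]: 10 Oct 12:33:27 - [audit] {"event":"comms.open","level":98,"timestamp":1633862007836}\n',
    'Oct 10 13:10:09 washup20 Node-RED[334]: 10 Oct 13:10:09 - [audit] {"event":"flows.set","type":"full","level":98,"user":{"username":"demo","permissions":"*"},"path":"/flows","ip":"192.168.1.28","timestamp":1633864209888}\n',
]

for username, timestamp in extract_user_events(sample):
    print(username, timestamp)
```

Because the function accepts any iterable of lines, an open file object can be passed in place of `sample`.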