 python one line file processing
#1
Hi coders,

I have a file that stores alerts. It only holds alerts generated today; alerts from before today have been archived to other files with a datestamp.

In this file, each line is one alert. First I need to find alerts of type A; a command like grep gives me a lot of rows that belong to type A.
Then I need to check whether a row contains the string "srcip". If it doesn't, I just move on to the next row. If it does, I need to look for the strings "srcport" and "dstip" as well, and store these three values.
Next I need to search for alerts of type B. Type B also has a lot of rows, and there is a field called "timestamp". A type A alert's "timestamp" should be within a few seconds of the type B alert's; if they are too far apart, the two alerts are not the same event and shouldn't be correlated.
If a type A alert's srcip, srcport and dstip are the same as a type B alert's, then it's a bingo, and I need to extract "dstport" from the type B alert.

My main question is: how do I know which rows have already been processed, so that I only search the new rows?
#2
Your question isn't very clear. Say you find A(n), a type A error. You want to find B(n), the matching type B error. Is the file such that B(n) is going to appear in the file after A(n) but before A(n+1), the next type A error? If so, this is easy: Keep track of the last type A you found, and check it against any type B's you find.
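
A rough sketch of that simple case, in case it helps. It assumes each alert has already been parsed into a dict; the "type" key, the numeric "timestamp" in seconds, and the 5-second window are my assumptions, since the thread doesn't show the real line format.

```python
# Sketch only: the dict layout, the "type" key and MAX_GAP are assumptions,
# not something taken from the real alert file.
KEYS = ("srcip", "srcport", "dstip")
MAX_GAP = 5  # assumed meaning of "a few seconds"


def correlate_sequential(alerts, max_gap=MAX_GAP):
    """Yield (type_a, type_b) pairs, assuming B(n) appears before A(n+1)."""
    last_a = None
    for alert in alerts:
        if alert["type"] == "A":
            # Remember this type A only if it carries all three fields.
            last_a = alert if all(k in alert for k in KEYS) else None
        elif alert["type"] == "B" and last_a is not None:
            same_flow = all(alert.get(k) == last_a[k] for k in KEYS)
            close = abs(alert["timestamp"] - last_a["timestamp"]) <= max_gap
            if same_flow and close:
                yield last_a, alert
```

The "dstport" you want is then `b["dstport"]` for each `(a, b)` pair it yields.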

If that is not the case, you need to keep track of all the type A's you find (that match your other criteria, of course). I would put them in a list, probably ordered by timestamp. If there's a match, put the match in the output, and remove the matching type A.

Depending on the data, I might use a dictionary. The key would be a tuple of (srcip, srcport, dstip), and the value would be a list of matching type A errors. Then for a given type B, you could find all of the potentially matching type A's, and check their time stamps.
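
Something along these lines is what I have in mind for the dictionary version. Treat it as a sketch: it assumes the alerts are already parsed into dicts, that each type B shows up after its type A, and that "a few seconds" means 5. The "type" key and the numeric "timestamp" are made up for the example.

```python
from collections import defaultdict

# Sketch only: field layout and MAX_GAP are assumptions.
KEYS = ("srcip", "srcport", "dstip")
MAX_GAP = 5  # assumed meaning of "a few seconds"


def correlate(alerts, max_gap=MAX_GAP):
    """Return the dstport of every type B that matches a pending type A."""
    pending = defaultdict(list)  # (srcip, srcport, dstip) -> unmatched type A's
    dstports = []
    for alert in alerts:
        if not all(k in alert for k in KEYS):
            continue  # no srcip/srcport/dstip, nothing to correlate
        key = tuple(alert[k] for k in KEYS)
        if alert["type"] == "A":
            pending[key].append(alert)
        elif alert["type"] == "B":
            for a in pending.get(key, []):
                if abs(alert["timestamp"] - a["timestamp"]) <= max_gap:
                    dstports.append(alert.get("dstport"))
                    pending[key].remove(a)  # each type A is matched at most once
                    break
    return dstports
```

The lookup by key means a type B is only compared against type A's with the same three fields, not against every type A seen so far.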
Craig "Ichabod" O'Brien - xenomind.com

#3
Aha. My main question is how to know which rows have already been processed, so that I only search the new rows. Does Python have a library for this?
#4
The first time, you can process the whole file. After iterating over the lines, call the tell() method of the file object, which tells you where you are (at which byte). Convert that integer to a str and write it to a file. The next time the script runs, it looks for this file; if the file is present, the script loads its content, converts it back to an int, and calls seek(position) on the file object before it starts iterating over the lines. That puts you at the position where your script finished last time.

In [20]: with open('birds.txt') as fd:
    ...:     for line in fd:
    ...:         print(line.strip())
    ...:     print(fd.tell())
    ...: # calling fd.tell() here would fail: the file is already closed
    ...:
2010-01-01 01:01:00.0000 left
2010-11-01 01:01:00.0000 right
2010-10-01 01:01:00.0000 right
91
So if a program now appends to birds.txt, the new data starts at byte position 91.
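
Put together, the save/restore step could look roughly like this. It is only a sketch: the function name and the separate offset file are my own choices, the file is assumed to be UTF-8, and the reset when the file has shrunk is an assumption about how your daily archiving works.

```python
import os


def read_new_lines(log_path, offset_path):
    """Return the lines added to log_path since the previous call."""
    try:
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    except FileNotFoundError:
        offset = 0  # first run: start at the beginning

    # Assumption: if the file is now smaller than the saved offset, it was
    # archived and recreated, so the saved position is stale.
    if offset > os.path.getsize(log_path):
        offset = 0

    with open(log_path, "rb") as fd:  # binary mode: offsets are plain byte counts
        fd.seek(offset)
        lines = [raw.decode("utf-8").rstrip("\r\n") for raw in fd]
        offset = fd.tell()  # position after the last line read

    with open(offset_path, "w") as f:
        f.write(str(offset))
    return lines
```

Each call returns only the rows that were not there last time. If the writer can be in the middle of a line while you read, the last row may be incomplete, so you may want to handle that case separately.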
My code examples are always for Python >=3.6.0
#5
Amazing! Thank you DeaD_EyE, this is what I need.