Extract parts of a log-file and put it in a dataframe

hasiro · (This post was last modified: Apr-06-2022, 09:14 AM by buran.)

Hi

I'm a Python beginner and would like to extract parts of a log file and put them in a dataframe.
I tried something, but it is not what I want.

The log-file content looks like this:

Output:----------------------------------------------------------
Model:				Hamilton-C1
S/N:				25576
Export timestamp:	2020-09-17_11-03-40
SW-Version:			2.2.9

I want to extract only Hamilton-C1, 25576, 2020-09-17_11-03-40, 2.2.9

        
# list
result = []

# function
def appendlines(line, result, word):
    if line.startswith(word):
        del result[:]
    result.append(line)
    return line, result

with open(file, "r") as lines:
    for line in lines:
        appendlines(line, result, ":")
new_result = [line.split() for line in result[1:5]]

print(new_result)

Output:
[['Model:', 'Hamilton-C1'], ['S/N:', '25455'], ['Export', 'timestamp:', '2020-09-16_21-12-40'], ['SW-Version:', '2.2.9']]

I want only this output:

Output:
[['Hamilton-C1'], ['25455'], ['2020-09-16_21-12-40'], ['2.2.9']]

How do I have to change my code?

Thanks for help!

buran wrote Apr-06-2022, 09:13 AM:
Please use proper tags when posting code, tracebacks, output, etc. This time I have added the tags for you.
See BBcode help for more info.

**deanhystad** · Apr-05-2022, 08:32 PM

This makes a dictionary where "Model" is a key and "Hamilton-C1" the value. You can use dictionary operations to get the keys or the values, or get the value associated with a key.

        
with open("data.txt", "r") as file:
    items = {}
    for line in file:
        if ":" in line:
            a, b = map(str.strip, line.split(":"))
            items[a] = b

print(items)
print(*items.keys())
print(*items.values())

Output:{'Model': 'Hamilton-C1', 'S/N': '25576', 'Export timestamp': '2020-09-17_11-03-40', 'SW-Version': '2.2.9'}
Model S/N Export timestamp SW-Version
Hamilton-C1 25576 2020-09-17_11-03-40 2.2.9
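
As a small sketch of the dictionary operations mentioned above: the dictionary is written out literally here so the snippet runs on its own, and the key "Operator" is a made-up name used only to show what happens when a key is missing.

```python
# Assumed: the dictionary produced by the loop above, written out literally.
items = {
    "Model": "Hamilton-C1",
    "S/N": "25576",
    "Export timestamp": "2020-09-17_11-03-40",
    "SW-Version": "2.2.9",
}

model = items["Model"]           # value for a key that exists
missing = items.get("Operator")  # hypothetical key; get() returns None instead of raising KeyError
values = list(items.values())    # values only, in insertion (file) order

print(model)
print(missing)
print(values)
```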

If you have no interest in a dictionary, make items a list and append to it.

        
with open("data.txt", "r") as file:
    items = []
    for line in file:
        if ":" in line:
            a, b = map(str.strip, line.split(":"))
            items.append(a)

print(items)

Output:
['Model', 'S/N', 'Export timestamp', 'SW-Version']

If you want each item in the list to be a list, do this.

        
with open("data.txt", "r") as file:
    items = []
    for line in file:
        if ":" in line:
            a, b = map(str.strip, line.split(":"))
            items.append([b])

print(items)

Output:
[['Hamilton-C1'], ['25576'], ['2020-09-17_11-03-40'], ['2.2.9']]

If every line in your log file is in the form "name: value", you could open it as a CSV file using ":" as the delimiter, or you could read it using pandas.
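
A rough sketch of the CSV idea, using the standard `csv` module with ":" as the delimiter. The sample text is copied from the question and `io.StringIO` stands in for an open file. Note that this sketch simply skips any line that does not split into exactly two fields, so it only suits files where the values contain no ":" themselves.

```python
import csv
import io

# Sample text copied from the question; io.StringIO stands in for an open file.
sample = (
    "Model:\t\t\t\tHamilton-C1\n"
    "S/N:\t\t\t\t25576\n"
    "Export timestamp:\t2020-09-17_11-03-40\n"
    "SW-Version:\t\t\t2.2.9\n"
)

values = []
for row in csv.reader(io.StringIO(sample), delimiter=":"):
    # Keep only rows that split into exactly a name and a value.
    if len(row) == 2:
        values.append(row[1].strip())

print(values)
```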

hasiro · Apr-06-2022, 08:18 PM

Thank you for helping. For me, the output of only the values (the last solution) is the best.
But when I execute the code, I get the following error:

Output:---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [167], in <module>
      5     for line in file:
      6         if ":" in line:
----> 7             a,b = map(str.strip, line.split(":"))
      8             items.append([b])
     10 print(items)

ValueError: too many values to unpack (expected 2)

How can I deal with this?

**deanhystad** · Apr-06-2022, 09:14 PM

There must be lines with more than one ":" in them. Maybe a time?

You can tell split when to stop splitting. This tells split to stop after the first split.

        
a, b = map(str.strip, line.split(":", maxsplit=1))
items.append([b])
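
To see the difference, here is a short sketch with a made-up log line whose value is a clock time (the line itself is an assumption, not taken from the actual file). It also shows `str.partition`, an alternative that always returns exactly three parts.

```python
# Hypothetical log line whose value contains colons (a clock time).
line = "Export time:\t11:03:40\n"

pieces = line.split(":")
print(pieces)  # more than two pieces, so "a, b = ..." raises ValueError

# maxsplit=1 stops after the first ":", so there are always exactly two pieces.
a, b = map(str.strip, line.split(":", maxsplit=1))
print(a, b)

# str.partition returns (before, separator, after) and never raises.
name, sep, value = line.partition(":")
print(name.strip(), value.strip())
```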

hasiro · Apr-08-2022, 01:18 PM

Thank you, it is working now with the following code:

        
with open(data, "r") as file:
    items = []
    for line in file:
        if ":" in line:
            a, b = map(str.strip, line.split(":", maxsplit=1))
            items.append(b)

new_result = items[0:4]

print(new_result)

Output:
['Hamilton-C1', '25455', '2020-09-16_21-12-40', '2.2.9']

After building the list, I put it in a dataframe:

        
import pandas as pd

df = pd.DataFrame([new_result], columns=['Model', 'S/N', 'timestamp', 'SW'])

df

Output:
         Model    S/N            timestamp     SW
0  Hamilton-C1  25455  2020-09-16_21-12-40  2.2.9

Now I have a folder with thousands of log files, and I want to get this information from each of them and put it in a dataframe as well.
How can I do this as simply as possible?

Thanks for helping me again.
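
One possible approach, as a minimal sketch rather than a definitive answer: parse each file into a dictionary with the same `split(":", maxsplit=1)` technique, collect the dictionaries in a list, and build a single dataframe from that list. The `*.log` pattern, the `FIELDS` mapping and the function names are assumptions made for illustration. Matching on the field names instead of taking the first four lines is a deliberate change, so the result does not depend on line positions.

```python
from pathlib import Path

# Assumed mapping from the names in the log file to the column names used above.
FIELDS = {
    "Model": "Model",
    "S/N": "S/N",
    "Export timestamp": "timestamp",
    "SW-Version": "SW",
}


def parse_log(path):
    """Return a dict with the wanted fields of one log file."""
    row = {}
    with open(path, "r") as file:
        for line in file:
            if ":" not in line:
                continue
            name, value = map(str.strip, line.split(":", maxsplit=1))
            # Keep only the first occurrence of each wanted name.
            if name in FIELDS and FIELDS[name] not in row:
                row[FIELDS[name]] = value
            if len(row) == len(FIELDS):
                break  # all fields found, no need to read the rest
    return row


def collect_rows(folder, pattern="*.log"):
    """Parse every file in folder that matches pattern (assumed extension)."""
    return [parse_log(path) for path in sorted(Path(folder).glob(pattern))]


def to_dataframe(rows):
    """Build one dataframe with one row per log file."""
    import pandas as pd  # imported here so the parsing part runs without pandas

    return pd.DataFrame(rows, columns=list(FIELDS.values()))
```

It could then be used as `df = to_dataframe(collect_rows("logs"))`, where `"logs"` is a placeholder for the real folder name. Files that lack one of the fields end up with a missing value in that column.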
