Posts: 3
Threads: 1
Joined: Jun 2023
File.txt:
...
Step.1 = Task1
Step.1.Name = My first Task
Step.1.Result = good
Step.1.Progress = finished
Step.2 = Task2
Step.2.Name = My second Task
Step.2.Result = good
Step.2.Progress = finished
Step.3
...
What is a good way to read/parse File.txt in Python so that the data can be accessed in a simple way?
The keys are separated by dots.
All data for Step.x and Step.x.* belongs together.
For example, I would like to loop over the entries and print every *.Name, or structure the data in some other way.
Posts: 4,779
Threads: 76
Joined: Jan 2018
Jun-23-2023, 01:49 PM
(This post was last modified: Jun-23-2023, 01:49 PM by Gribouillis.)
You could parse it by using a tree of nested dicts
import io

class Node(dict):
    data = None

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        n = type(self)()
        self[key] = n
        return n

    def __repr__(self):
        return f'{type(self).__name__}<data={self.data!r}, {super().__repr__()}>'

def parse(file):
    root = Node()
    for line in file:
        lvalue, rest = line.split('=', maxsplit=1)
        n = root
        for word in (x.strip() for x in lvalue.split('.')):
            n = n[word]
        n.data = rest.strip()
    return root

file = io.StringIO('''\
Step.1 = Task1
Step.1.Name = My first Task
Step.1.Result = good
Step.1.Progress = finished
Step.2 = Task2
Step.2.Name = My second Task
Step.2.Result = good
Step.2.Progress = finished
''')

if __name__ == '__main__':
    root = parse(file)
    print(root)
    print(root['Step']['1']['Progress'].data)
    print(root['Step']['2'].keys())
Output:
Node<data=None, {'Step': Node<data=None, {'1': Node<data='Task1', {'Name': Node<data='My first Task', {}>, 'Result': Node<data='good', {}>, 'Progress': Node<data='finished', {}>}>, '2': Node<data='Task2', {'Name': Node<data='My second Task', {}>, 'Result': Node<data='good', {}>, 'Progress': Node<data='finished', {}>}>}>}>
finished
dict_keys(['Name', 'Result', 'Progress'])
Posts: 3
Threads: 1
Joined: Jun 2023
Thank you very much, that is what I was looking for.
Now I have to learn the details of how you implemented it.
I tested looping over the "Steps", and it works well:
print(len(root['Step']))
for item in root['Step']:
    print(item + " ---> " + root['Step'][item]['Progress'].data + " ---> " + root['Step'][item]['Result'].data)
Output:
2
1 ---> finished ---> good
2 ---> finished ---> error
One question: I'm struggling to read the data from an actual file.
You embedded the data in the code:
file = io.StringIO('''\
Step.1 = Tas...
How do I get it from the separate file "File.txt"?
If I use:
file = open('File.txt')
I get:
Output:
Traceback (most recent call last):
  File "f:\python_test\parse.py", line 30, in <module>
    root = parse(file)
  File "f:\python_test\parse.py", line 20, in parse
    lvalue, rest = line.split('=', maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)
Posts: 4,779
Threads: 76
Joined: Jan 2018
You are getting this error message because the file contains some lines that don't have the = sign. I assumed that all the lines had the form
Output: spam.ham.eggs = value
For example, you could modify the code to skip the lines that don't contain =.
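Skipping the lines without = could look roughly like the sketch below. It repeats a trimmed-down Node class so that it runs on its own, and it uses a small in-memory sample in place of File.txt; the sample content, including the "..." lines and the bare "Step.3" line, is only an assumption about what the real file looks like.

```python
import io


class Node(dict):
    """Nested dict node: a missing key creates a new child node."""
    data = None

    def __missing__(self, key):
        node = type(self)()
        self[key] = node
        return node


def parse(file):
    root = Node()
    for line in file:
        # Skip blank lines, "..." placeholders and anything else without '='.
        if '=' not in line:
            continue
        lvalue, rest = line.split('=', maxsplit=1)
        node = root
        for word in lvalue.split('.'):
            node = node[word.strip()]
        node.data = rest.strip()
    return root


# Stand-in for the real file (assumed content) so the sketch is runnable.
sample = io.StringIO("...\nStep.1 = Task1\nStep.1.Name = My first Task\nStep.3\n...\n")
root = parse(sample)
print(root['Step']['1']['Name'].data)

# With the real file, something like this should work (the encoding is a guess):
# with open('File.txt', encoding='utf-8') as file:
#     root = parse(file)
```

Because parse() only iterates over lines, it accepts an open file object just as well as an io.StringIO. The with block also closes the file when parsing is done.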
Posts: 3
Threads: 1
Joined: Jun 2023
That is right, there are lines that don't contain an = sign.
I will modify the code.
Thank you very much for the support.
Posts: 2,120
Threads: 10
Joined: May 2017
Jun-26-2023, 12:26 PM
(This post was last modified: Jun-26-2023, 12:26 PM by DeaD_EyE.)
Here is a different approach.
Structural pattern matching requires Python 3.10 or later.
A dataclass is a convenient way to create classes that hold data.
from __future__ import annotations

from collections.abc import Generator
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any


@dataclass(order=True)
class Step:
    ident: str = field(compare=False)
    number: int = field(compare=True)
    name: str = field(compare=False)
    result: str = field(compare=False)
    progress: str = field(compare=False)


def parse(file: str | Path | None = None, text: str | None = None) -> Generator[Step, None, None]:
    data: dict[str, Any] = {}
    if file and text:
        raise TypeError("text and file are mutually exclusive")
    if not (file or text):
        raise TypeError("text or file must be given")
    if file:
        text = Path(file).read_text()
    if not text:
        return
    for line in text.splitlines():
        try:
            step, assignment = map(str.strip, line.split("=", maxsplit=1))
        except ValueError:
            continue
        elements = step.split(".", maxsplit=3)
        match len(elements):
            case 2:
                if data:
                    try:
                        yield Step(**data)
                    except TypeError:
                        pass
                    data.clear()
                data["number"] = int(elements[1])
                data["ident"] = assignment
            case 3:
                data[elements[2].lower()] = assignment
    if data:
        try:
            yield Step(**data)
        except TypeError:
            pass
text = """
garbage90832.rtzwrtzwrt.ztzwrtz = gg
Step.0 = Foo
!!!!§$252346"$%/
Step.1 = Task1
Step.1.Name = My first Task
Step.1.Result = good
Step.1.Progress = finished
Step.2 = Task2
Step.2.Name = My second Task
Step.2.Result = good
Step.2.Progress = finished
Step.3 = FOO
Step.3.Name = My third Task
Step.3.Result = xx
Step.3.Progress = waiting
garbage $%&)(/=
"""

result = sorted(parse(text=text))
for step in result:
    print(step)
Result from for-loop:
Step(ident='Task1', number=1, name='My first Task', result='good', progress='finished')
Step(ident='Task2', number=2, name='My second Task', result='good', progress='finished')
Step(ident='FOO', number=3, name='My third Task', result='xx', progress='waiting')
Type hints are optional (apart from the dataclass fields, which need an annotation to be recognised as fields). Nothing is checked at runtime.
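To illustrate that the annotations are not enforced, here is a small sketch that reuses the Step dataclass from above and deliberately passes a str where int is annotated. The sample values are made up for the demonstration.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Step:
    ident: str = field(compare=False)
    number: int = field(compare=True)
    name: str = field(compare=False)
    result: str = field(compare=False)
    progress: str = field(compare=False)


# number is annotated as int, but a str is accepted without any error.
step = Step(ident="Task1", number="1", name="My first Task",
            result="good", progress="finished")
print(type(step.number).__name__)
```

This is why the parser above converts the number with int() itself. A static type checker such as mypy would likely flag the call, but the interpreter does not.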