parse/read from file separated by dots

giovanne · Jun-23-2023, 12:49 PM

File.txt:
...
Step.1 = Task1
Step.1.Name = My first Task
Step.1.Result = good
Step.1.Progress = finished
Step.2 = Task2
Step.2.Name = My second Task
Step.2.Result = good
Step.2.Progress = finished
Step.3
...

What is a good way to read/parse File.txt into Python and access the data in a simple way?
The keys are separated by dots.
All data under Step.x and Step.x.x belongs together.

E.g. a for loop over the entries that prints all *.Name values, or some other way to give the data structure.

**Gribouillis** · (This post was last modified: Jun-23-2023, 01:49 PM by Gribouillis.)

You could parse it into a tree of nested dicts:

import io

class Node(dict):
    data = None

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        n = type(self)()
        self[key] = n
        return n

    def __repr__(self):
        return f'{type(self).__name__}<data={self.data!r}, {super().__repr__()}>'

def parse(file):
    root = Node()
    for line in file:
        lvalue, rest = line.split('=', maxsplit=1)
        n = root
        for word in (x.strip() for x in lvalue.split('.')):
            n = n[word]
        n.data = rest.strip()
    return root

file = io.StringIO('''\
Step.1 = Task1
Step.1.Name = My first Task
Step.1.Result = good
Step.1.Progress = finished
Step.2 = Task2
Step.2.Name = My second Task
Step.2.Result = good
Step.2.Progress = finished
''')

if __name__ == '__main__':
    root = parse(file)
    print(root)
    print(root['Step']['1']['Progress'].data)
    print(root['Step']['2'].keys())

Output:
Node<data=None, {'Step': Node<data=None, {'1': Node<data='Task1', {'Name': Node<data='My first Task', {}>, 'Result': Node<data='good', {}>, 'Progress': Node<data='finished', {}>}>, '2': Node<data='Task2', {'Name': Node<data='My second Task', {}>, 'Result': Node<data='good', {}>, 'Progress': Node<data='finished', {}>}>}>}>
finished
dict_keys(['Name', 'Result', 'Progress'])
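
The nesting relies on `dict.__missing__`, which Python calls when a key lookup with square brackets fails. A minimal sketch of just that mechanism follows; the class name `AutoDict` is made up for illustration and is not part of the code above.

```python
class AutoDict(dict):
    """Hypothetical minimal dict that creates nested children on first access."""

    def __missing__(self, key):
        # dict.__getitem__ calls this only when `key` is absent.
        child = type(self)()
        self[key] = child
        return child


tree = AutoDict()
tree['Step']['1']['Name']   # each failed lookup creates the missing level
print(tree)                 # {'Step': {'1': {'Name': {}}}}

# .get() and `in` do not trigger __missing__, so they never create keys.
print(tree.get('Other'))    # None
print('Other' in tree)      # False
```

This is why `parse` can simply walk down `n = n[word]` without checking whether each level already exists.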

giovanne · Jun-23-2023, 07:59 PM

Thank you very much, that was what I was looking for.

Now I have to learn the details of how you implemented it.

I have tested looping over the "Steps", and it works well:

    print(len(root['Step']))    
    for item in root['Step']:
        print(item + " ---> " + root['Step'][item]['Progress'].data + " ---> " + root['Step'][item]['Result'].data)

Output:
2
1 ---> finished ---> good
2 ---> finished ---> error

One question: I'm struggling to read the data from an actual file.
You embedded the text in the code:

file = io.StringIO('''\
Step.1 = Tas...

How do I read it from the separate file "File.txt"?
If I use:

file = open('File.txt')

I get:

Output:
Traceback (most recent call last):
  File "f:\python_test\parse.py", line 30, in <module>
    root = parse(file)
  File "f:\python_test\parse.py", line 20, in parse
    lvalue, rest = line.split('=', maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)

**Gribouillis** · Jun-23-2023, 08:04 PM

You are getting this error message because the file contains some lines that don't have the = sign. I assumed that all the lines had the form

Output:
spam.ham.eggs = value

For example, you could modify the code to skip the lines that don't contain =.
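
A minimal sketch of that change, assuming lines without = carry no data and can be ignored. The function name `parse_lenient` and the sample text are made up for illustration; the `Node` class is the same idea as in the earlier post.

```python
import io


class Node(dict):
    data = None

    def __missing__(self, key):
        n = type(self)()
        self[key] = n
        return n


def parse_lenient(file):
    """Like parse(), but ignores lines that have no '=' sign."""
    root = Node()
    for line in file:
        if '=' not in line:
            continue  # skip "...", "Step.3", blank lines, etc.
        lvalue, rest = line.split('=', maxsplit=1)
        n = root
        for word in (x.strip() for x in lvalue.split('.')):
            n = n[word]
        n.data = rest.strip()
    return root


# Sample text standing in for File.txt, including lines without '='.
sample = io.StringIO('''\
...
Step.1 = Task1
Step.1.Name = My first Task
Step.3
...
''')

root = parse_lenient(sample)
print(root['Step']['1']['Name'].data)

# With the real file, a with-block closes it automatically:
# with open('File.txt') as f:
#     root = parse_lenient(f)
```

Because `parse_lenient` only iterates over lines, it accepts an `io.StringIO` object and an opened file alike.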

giovanne · Jun-25-2023, 06:48 PM

That is right, there are lines that don't contain an = sign.
I will modify the code.

Thank you very much for the support.

DeaD_EyE · (This post was last modified: Jun-26-2023, 12:26 PM by DeaD_EyE.)

Here is a different approach.
Structural pattern matching requires Python 3.10 or later.
A dataclass is a convenient way to create classes that contain data.

from __future__ import annotations

from collections.abc import Generator
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any


@dataclass(order=True)
class Step:
    ident: str = field(compare=False)
    number: int = field(compare=True)
    name: str = field(compare=False)
    result: str = field(compare=False)
    progress: str = field(compare=False)


def parse(file: str | Path | None = None, text: str | None = None) -> Generator[Step, None, None]:
    data: dict[str, Any] = {}

    if file and text:
        raise TypeError("text and file are mutually exclusive")

    if not (file or text):
        raise TypeError("text or file must be given")

    if file:
        text = Path(file).read_text()

    if not text:
        return

    for line in text.splitlines():
        try:
            step, assignment = map(str.strip, line.split("=", maxsplit=1))
        except ValueError:
            continue

        elements = step.split(".", maxsplit=3)

        match len(elements):
            case 2:
                if data:
                    try:
                        yield Step(**data)
                    except TypeError:
                        pass
                    data.clear()
                data["number"] = int(elements[1])
                data["ident"] = assignment
            case 3:
                data[elements[2].lower()] = assignment

    if data:
        try:
            yield Step(**data)
        except TypeError:
            pass


text = """
garbage90832.rtzwrtzwrt.ztzwrtz = gg

Step.0 = Foo

!!!!§$252346"$%/
Step.1 = Task1
Step.1.Name = My first Task
Step.1.Result = good
Step.1.Progress = finished
Step.2 = Task2
Step.2.Name = My second Task
Step.2.Result = good
Step.2.Progress = finished
Step.3 = FOO
Step.3.Name = My thrid Task
Step.3.Result = xx
Step.3.Progress = waiting
garbage $%&)(/=
"""

result = sorted(parse(text=text))

for step in result:
    print(step)

Result from for-loop:

Step(ident='Task1', number=1, name='My first Task', result='good', progress='finished')
Step(ident='Task2', number=2, name='My second Task', result='good', progress='finished')
Step(ident='FOO', number=3, name='My thrid Task', result='xx', progress='waiting')
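
The `sorted()` call works because the dataclass is declared with `order=True`, and only fields with `compare=True` take part in the comparison. Here is a small sketch of that behaviour on its own; the class `Job` and its values are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    label: str = field(compare=False)   # ignored by ==, <, >, ...
    number: int = field(compare=True)   # the only field used for ordering


jobs = [Job('b', 2), Job('z', 1), Job('a', 3)]
print([job.number for job in sorted(jobs)])  # [1, 2, 3]

# Equality also ignores fields with compare=False.
print(Job('x', 1) == Job('y', 1))  # True
```

In the `Step` class above, this means steps are ordered by `number` alone, whatever their name or result.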

Type hints are optional. Nothing is checked at runtime.
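
A short illustration of that point, using a made-up class `Item`: the annotation says `int`, but Python stores whatever is passed in without complaint.

```python
from dataclasses import dataclass


@dataclass
class Item:
    number: int  # annotation only; not enforced when the object is created


item = Item(number="not an int")   # no error is raised here
print(type(item.number).__name__)  # str
```

This is why the parser above converts explicitly with `int(elements[1])` rather than relying on the `number: int` annotation. A static type checker can flag such mismatches before the code runs.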
