Python Forum
parse/read from file separated by dots
#1
File.txt:
...
Step.1 = Task1
Step.1.Name = My first Task
Step.1.Result = good
Step.1.Progress = finished
Step.2 = Task2
Step.2.Name = My second Task
Step.2.Result = good
Step.2.Progress = finished
Step.3
...

What is a good way to read/parse File.txt into Python so that the data can be accessed in a simple way?
The keys are separated by dots.
All data under one step (Step.x itself and its Step.x.* sub-keys) belongs together.

For example, I would like to loop over the entries and print every *.Name, or structure the data in some other way.
Reply
#2
You could parse it into a tree of nested dicts:
import io

class Node(dict):
    data = None

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        n = type(self)()
        self[key] = n
        return n

    def __repr__(self):
        return f'{type(self).__name__}<data={self.data!r}, {super().__repr__()}>'

def parse(file):
    root = Node()
    for line in file:
        lvalue, rest = line.split('=', maxsplit=1)
        n = root
        for word in (x.strip() for x in lvalue.split('.')):
            n = n[word]
        n.data = rest.strip()
    return root

file = io.StringIO('''\
Step.1 = Task1
Step.1.Name = My first Task
Step.1.Result = good
Step.1.Progress = finished
Step.2 = Task2
Step.2.Name = My second Task
Step.2.Result = good
Step.2.Progress = finished
''')

if __name__ == '__main__':
    root = parse(file)
    print(root)
    print(root['Step']['1']['Progress'].data)
    print(root['Step']['2'].keys())
Output:
Node<data=None, {'Step': Node<data=None, {'1': Node<data='Task1', {'Name': Node<data='My first Task', {}>, 'Result': Node<data='good', {}>, 'Progress': Node<data='finished', {}>}>, '2': Node<data='Task2', {'Name': Node<data='My second Task', {}>, 'Result': Node<data='good', {}>, 'Progress': Node<data='finished', {}>}>}>}>
finished
dict_keys(['Name', 'Result', 'Progress'])
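The loop asked for in the first post (print every *.Name) could look roughly like the sketch below. It repeats a trimmed copy of the Node class and parse function so that it runs on its own; `collect` is a hypothetical helper name that is not part of the code above.

```python
import io


class Node(dict):
    """A dict that creates a missing child node on first access."""
    data = None

    def __missing__(self, key):
        child = type(self)()
        self[key] = child
        return child


def parse(file):
    root = Node()
    for line in file:
        lvalue, rest = line.split('=', maxsplit=1)
        n = root
        for word in lvalue.split('.'):
            n = n[word.strip()]
        n.data = rest.strip()
    return root


def collect(root, field):
    """Return the `field` value of every step, in file order (hypothetical helper)."""
    # Test with `in` first: indexing step[field] on a missing key would
    # silently create an empty child node through __missing__.
    return [step[field].data for step in root['Step'].values() if field in step]


sample = io.StringIO('''\
Step.1 = Task1
Step.1.Name = My first Task
Step.2 = Task2
Step.2.Name = My second Task
''')

root = parse(sample)
names = collect(root, 'Name')
for name in names:
    print(name)
```

The `field in step` check matters because plain indexing on a Node never raises KeyError; it adds an empty node instead, which would then show up in later loops.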
Reply
#3
Thank you very much, that was what I was looking for :)
Now I have to learn the details of how you implemented it.

I tested looping over the steps, and it works well:
    print(len(root['Step']))    
    for item in root['Step']:
        print(item + " ---> " + root['Step'][item]['Progress'].data + " ---> " + root['Step'][item]['Result'].data)
Output:
2
1 ---> finished ---> good
2 ---> finished ---> error
One question: I'm struggling to read the data from an actual file.
You embedded it in the code:
file = io.StringIO('''\
Step.1 = Tas...
How do I read it from the separate file "File.txt"?
If I use:
file = open('File.txt')
I get:
Output:
Traceback (most recent call last):
  File "f:\python_test\parse.py", line 30, in <module>
    root = parse(file)
  File "f:\python_test\parse.py", line 20, in parse
    lvalue, rest = line.split('=', maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)
Reply
#4
You are getting this error message because the file contains some lines that don't have the = sign. I assumed that all the lines had the form
Output:
spam.ham.eggs = value
You could modify the code to skip the lines that don't contain = for example.
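One possible way to skip such lines is sketched below. It uses `str.partition`, which returns an empty separator when `=` is not found, so no exception is raised. `iter_assignments` is a hypothetical helper name; in the original `parse` function an equivalent change would be an `if '=' not in line: continue` check before the split.

```python
import io


def iter_assignments(lines):
    """Yield (key_parts, value) for each line of the form 'a.b.c = value'."""
    for line in lines:
        key, sep, value = line.partition('=')
        if not sep:
            # No '=' on this line ('...', blank lines, a bare 'Step.3'): skip it.
            continue
        yield [part.strip() for part in key.split('.')], value.strip()


# Stand-in for the real file, including lines without '='.
sample = io.StringIO('''\
...
Step.1 = Task1
Step.1.Name = My first Task
Step.3
...
''')

pairs = list(iter_assignments(sample))
for key_parts, value in pairs:
    print(key_parts, value)
```

With the real file, the `io.StringIO` object would be replaced by something like `with open('File.txt') as f:` and the loop run inside that block.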
Reply
#5
That is right, there are lines that don't contain the = sign.
I will modify the code.

Thank you very much for the support.
Reply
#6
Here is a different approach.
Structural pattern matching (the match statement) requires Python 3.10 or newer.
A dataclass is a convenient way to create classes that hold data.


from __future__ import annotations

from collections.abc import Generator
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any


@dataclass(order=True)
class Step:
    ident: str = field(compare=False)
    number: int = field(compare=True)
    name: str = field(compare=False)
    result: str = field(compare=False)
    progress: str = field(compare=False)


def parse(file: str | Path | None = None, text: str | None = None) -> Generator[Step, None, None]:
    data : dict[str, Any] = {}

    if file and text:
        raise TypeError("text and file are mutually exclusive")

    if not (file or text):
        raise TypeError("text or file must be given")

    if file:
        text = Path(file).read_text()

    if not text:
        return

    for line in text.splitlines():
        try:
            step, assignment = map(str.strip, line.split("=", maxsplit=1))
        except ValueError:
            continue

        elements = step.split(".", maxsplit=3)

        match len(elements):
            case 2:
                if data:
                    try:
                        yield Step(**data)
                    except TypeError:
                        pass
                    data.clear()
                data["number"] = int(elements[1])
                data["ident"] = assignment
            case 3:
                data[elements[2].lower()] = assignment

    if data:
        try:
            yield Step(**data)
        except TypeError:
            pass


text = """
garbage90832.rtzwrtzwrt.ztzwrtz = gg

Step.0 = Foo

!!!!§$252346"$%/
Step.1 = Task1
Step.1.Name = My first Task
Step.1.Result = good
Step.1.Progress = finished
Step.2 = Task2
Step.2.Name = My second Task
Step.2.Result = good
Step.2.Progress = finished
Step.3 = FOO
Step.3.Name = My thrid Task
Step.3.Result = xx
Step.3.Progress = waiting
garbage $%&)(/=
"""

result = sorted(parse(text=text))

for step in result:
    print(step)
Result from for-loop:
Step(ident='Task1', number=1, name='My first Task', result='good', progress='finished')
Step(ident='Task2', number=2, name='My second Task', result='good', progress='finished')
Step(ident='FOO', number=3, name='My thrid Task', result='xx', progress='waiting')

Type hints are optional; nothing is checked at runtime.
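As a small illustration of why `sorted(parse(text=text))` orders the steps by number: with `order=True`, a dataclass compares only the fields that are not marked `compare=False`. The sketch below uses a shortened version of the Step class with made-up values.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Step:
    ident: str = field(compare=False)  # ignored by ==, <, sorted(), ...
    number: int                        # the only field used for comparison
    name: str = field(compare=False)


steps = [Step('Task2', 2, 'b'), Step('Task1', 1, 'z'), Step('FOO', 3, 'a')]

# Sorting only looks at `number`, not at ident or name.
ordered = [s.ident for s in sorted(steps)]
print(ordered)

# Equality is affected too: two steps with the same number compare equal.
same = Step('x', 1, 'y') == Step('other', 1, 'different')
print(same)
```

A side effect worth knowing: because the excluded fields are ignored for `==` as well, two Step objects with the same number count as equal even if everything else differs.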
Reply