Using TTP to extract an unknown number of words from a line

Calab

I'm using Python v 3.10.11 and ttp v 0.9.5. I am trying to build a TTP template that can extract an unknown number of words from a line into a group.

The data could look like this:

Output:
link-aggregate 0         AdminState:Up      OperState:IS
Member Ports:   6/1 6/5 6/6 6/7 6/8 6/9 6/10 6/11
                6/12 
Description:  << The port description >>

or, the data could look like this:

Output:
link-aggregate 0         AdminState:Up      OperState:IS
Member Ports:   6/1
Description:  << The port description >>

or anything in between.

The closest template that I can come up with is the following, but it misses the Member Port 6/12 in the first example, and returns the rest as a single string instead of a list of ports:

Output:
link-aggregate {{ interfaceId }}          AdminState:{{ AdminState }}       OperState:{{ OperState }}
Member Ports: {{ MemberPorts | PHRASE }}
Description:  {{ Description | ORPHRASE }}

How can I get my Member Ports into a list, and include any that may follow on the next line?

**deanhystad** · May-23-2024, 08:36 PM

I think the problem is that TTP is built on re, and a single re match cannot return a variable number of captures: a repeated group keeps only its last repetition. You can write a pattern where a match has multiple groups, but it is still just 1 match.
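A minimal sketch of that re behaviour, using only the standard library. The sample line is taken from the question; the variable names are illustrative only.

```python
import re

line = "Member Ports:   6/1 6/5 6/6"

# One match with a repeated group: the group is overwritten on each
# repetition, so only the last port survives in group(1).
match = re.search(r"Member Ports:(?:\s+(\d+/\d+))+", line)
last_port = match.group(1)

# findall() scans the text for every non-overlapping match instead,
# which is how a variable number of values is usually collected.
all_ports = re.findall(r"\d+/\d+", line)

print(last_port)
print(all_ports)
```
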

What's wrong with getting the ports as a string? Use split() to get a list of ports.
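As a rough illustration of the split() suggestion: called with no arguments, str.split() breaks on any run of whitespace, including newlines, so a wrapped port list comes out as one flat list. The variable ports_text is hypothetical and stands in for whatever string the parser returned.

```python
# Hypothetical captured text: the port list from the first example,
# including the wrapped continuation line and its trailing space.
ports_text = "6/1 6/5 6/6 6/7 6/8 6/9 6/10 6/11\n                6/12 "

# No-argument split() treats spaces and newlines alike and drops empties.
member_ports = ports_text.split()

print(member_ports)
```
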

Calab · May-23-2024, 08:40 PM

After a bit of work I think I have a template that I can use...

Output:
link-aggregate {{ interfaceId }} AdminState:{{ AdminState }} OperState:{{ OperState }}
Member Ports: {{ MemberPorts | ORPHRASE | split() }}
               {{ MemberPorts | ORPHRASE | split() | joinmatches }}
Description: {{ Description | ORPHRASE }}

**deanhystad** · May-23-2024, 08:44 PM

You don't need the joinmatches.

from ttp import ttp


data = """
link-aggregate 0 AdminState:Up OperState:IS
Member Ports: 6/1 6/2 6/3
Description: << The port description >>
"""

template = """
link-aggregate {{ interfaceId }} AdminState:{{ AdminState }} OperState:{{ OperState }}
Member Ports: {{ MemberPorts | PHRASE | split() }}
Description: {{ Description | ORPHRASE }}
"""

parser = ttp(data=data, template=template)
parser.parse()
print(parser.result(format="json")[0])

Output:
[
    {
        "AdminState": "Up",
        "Description": "<< The port description >>",
        "MemberPorts": [
            "6/1",
            "6/2",
            "6/3"
        ],
        "OperState": "IS",
        "interfaceId": "0"
    }
]
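The example above uses a single-line port list. One possible way to cope with the wrapped list from the original question is to fold continuation lines into the previous line before parsing. This is only a sketch: fold_continuations is a hypothetical helper, and it assumes continuation lines always start with whitespace, as they do in the posted sample.

```python
def fold_continuations(text: str) -> str:
    """Append each indented, non-blank line to the line before it."""
    folded = []
    for line in text.splitlines():
        is_continuation = bool(line.strip()) and line[0].isspace() and bool(folded)
        if is_continuation:
            # Join the wrapped values onto the previous line with one space.
            folded[-1] = folded[-1].rstrip() + " " + line.strip()
        else:
            folded.append(line.rstrip())
    return "\n".join(folded)


raw = (
    "link-aggregate 0         AdminState:Up      OperState:IS\n"
    "Member Ports:   6/1 6/5 6/6 6/7 6/8 6/9 6/10 6/11\n"
    "                6/12 \n"
    "Description:  << The port description >>\n"
)

folded = fold_continuations(raw)
print(folded)
```

The folded text has all ports on one "Member Ports:" line, so it could then be handed to a single-line template like the one above in place of the raw data.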

Pedroski55 · May-25-2024, 05:57 AM

If ttp uses re, why not just use re and have more control?

I assume the keywords are fixed: keys = ['link-aggregate', 'AdminState', 'OperState', 'Member Ports', 'Description']. If not, find them with a pattern.

import re
import json

data = """
link-aggregate 0 AdminState:Up OperState:IS
Member Ports: 6/1 6/2 6/3 6/4 6/5
6/6 6/7 6/8
7/1 7/2 7/3
Description: << The port description >>
"""
#keys = ['link-aggregate', 'AdminState', 'OperState', 'Member Ports', 'Description']

p = re.compile(r'(link-aggregate) (\d+)') # get link-aggregate 0
q = re.compile(r'(AdminState):([A-Za-z]+)') # get AdminState Up
r = re.compile(r'(OperState):([A-Za-z]+)') # get OperState IS
s = re.compile(r'(Member Ports):') # get Member Ports
t = re.compile(r'([0-9]+/[0-9]+)') # get Member Ports values
u = re.compile(r'(Description:) <<([\s\S]*?)>>') # get Description (raw string avoids the invalid-escape warning for \s)

def getbits(data):
    d = {}
    resp = p.search(data) # resp.group(2) = '0'
    d[resp.group(1)] = resp.group(2)
    resq = q.search(data) # resq.group(2) = 'Up'
    d[resq.group(1)] = resq.group(2)
    resr = r.search(data) # resr.group(2) = 'IS'
    d[resr.group(1)] = resr.group(2)
    # different because t uses .findall() not .search()
    ress = s.search(data) # ress.group(1) = 'Member Ports'
    # findall scans the whole text, not just the Member Ports lines
    rest = t.findall(data) # rest = ['6/1', '6/2', '6/3', '6/4', '6/5', '6/6', '6/7', '6/8', '7/1', '7/2', '7/3']
    d[ress.group(1)] = rest
    resu = u.search(data) # resu.group(2) = ' The port description '
    d[resu.group(1)] = resu.group(2)
    return d

# if you have many sets of data, loop through them
# here there is only 1 data
data_dict = getbits(data)
for item in data_dict.items():
    print(item)

# save as json ('w' overwrites; appending a second dump would leave invalid JSON in the file)
mypath = '/home/pedro/temp/myjson.json'
with open(mypath, 'w') as json_file:
    json.dump(data_dict, json_file)
    print('data saved to', mypath)

Output:
('link-aggregate', '0')
('AdminState', 'Up')
('OperState', 'IS')
('Member Ports', ['6/1', '6/2', '6/3', '6/4', '6/5', '6/6', '6/7', '6/8', '7/1', '7/2', '7/3'])
('Description:', ' The port description ')

I never had much luck with re.VERBOSE and a complex, multiline re expression, so I just keep it simple!
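One caveat with running findall() over the whole text is that it also picks up anything port-shaped elsewhere, for example inside the description. A possible refinement, sketched below, is to isolate the text between "Member Ports:" and "Description:" first and search only that. The description text here is made-up sample data chosen to show the difference.

```python
import re

# Made-up sample: the description deliberately contains "7/1".
data = """\
link-aggregate 0 AdminState:Up OperState:IS
Member Ports: 6/1 6/2 6/3 6/4 6/5
6/6 6/7 6/8
Description: << Uplink to rack 7/1 >>
"""

port_pattern = re.compile(r"\d+/\d+")

# Step 1: grab everything between the two labels (DOTALL lets "." cross
# newlines, MULTILINE lets "^" match at the start of the Description line).
section = re.search(
    r"Member Ports:(.*?)^Description:", data, re.DOTALL | re.MULTILINE
)

# Step 2: look for ports only inside that section.
scoped_ports = port_pattern.findall(section.group(1)) if section else []

# For comparison: searching the whole text also returns the "7/1"
# that appears in the description.
unscoped_ports = port_pattern.findall(data)

print(scoped_ports)
print(unscoped_ports)
```
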

**deanhystad** · May-26-2024, 10:25 PM

Writing new code when you can use a popular, supported package is not pythonic. TTP works great for this problem.
