Python Forum

Full Version: python regex: get rid of double dot
hi,

I'm parsing ifconfig output and can't get rid of the double dot (the colon) after the interface name (en0: for example). Anyone?

>>> print(ifconfig_output)
en0: flags=1e084863,18c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.16.84.106 netmask 0xffffc000 broadcast 172.16.127.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
en1: flags=1e084863,18c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.17.8.4 netmask 0xfffff800 broadcast 172.17.15.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/64
         tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1

>>> re.findall(r'^(\S+).*?[\s]+inet ([\d\.]+) netmask ([\w.]+) broadcast ([\d\.]+)', ifconfig_output, re.S | re.M)
[('en0:', '172.16.84.106', '0xffffc000', '172.16.127.255'), ('en1:', '172.17.8.4', '0xfffff800', '172.17.15.255'), ('lo0:', '127.0.0.1', '0xff000000', '127.255.255.255')]
import re

ifconfig_output = '''\
en0: flags=1e084863,18c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.16.84.106 netmask 0xffffc000 broadcast 172.16.127.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
en1: flags=1e084863,18c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.17.8.4 netmask 0xfffff800 broadcast 172.17.15.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/64
         tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1'''

r = re.findall(r'^(\S+):.*?[\s]+inet ([\d\.]+) netmask ([\w.]+) broadcast ([\d\.]+)', ifconfig_output, re.S | re.M)
print(r)
Output:
[('en0', '172.16.84.106', '0xffffc000', '172.16.127.255'), ('en1', '172.17.8.4', '0xfffff800', '172.17.15.255'), ('lo0', '127.0.0.1', '0xff000000', '127.255.255.255')]
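The only change from the original pattern is the literal ":" placed right after the first group. \S+ is greedy and first grabs "en0:", then backtracks one character so the ":" outside the group can match, which leaves just "en0" in the capture. A possible variant, only a sketch: it uses named groups, excludes the colon from the name explicitly with [^\s:]+, and converts the hex netmask to dotted form with ipaddress. The names sample, pattern and parse_ifconfig are made up for illustration, and the sample is a trimmed copy of the output above.

```python
import ipaddress
import re

# Trimmed sample in the same shape as the AIX ifconfig output above.
sample = """\
en0: flags=1e084863,18c0<UP,BROADCAST,RUNNING>
        inet 172.16.84.106 netmask 0xffffc000 broadcast 172.16.127.255
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/64
"""

# [^\s:]+ can never swallow the colon, so no backtracking is needed.
pattern = re.compile(
    r"^(?P<name>[^\s:]+):.*?\s+inet (?P<addr>[\d.]+)"
    r" netmask (?P<mask>0x[0-9a-fA-F]+) broadcast (?P<bcast>[\d.]+)",
    re.S | re.M,
)


def parse_ifconfig(text):
    """Return (name, address, dotted netmask, broadcast) tuples."""
    rows = []
    for m in pattern.finditer(text):
        # int(..., 16) accepts the "0x" prefix; IPv4Address renders it dotted.
        netmask = ipaddress.IPv4Address(int(m["mask"], 16))
        rows.append((m["name"], m["addr"], str(netmask), m["bcast"]))
    return rows


parsed = parse_ifconfig(sample)
print(parsed)
```

Note that with re.S the lazy .*? can run across line boundaries, so an interface block without an IPv4 inet line would pick up the inet line of the next interface. That does not happen with the output shown here.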
perfect, thank you!
On Linux, ifconfig is deprecated and should not be used.
If you use iproute2, which is the current standard there, you can get JSON output and avoid regex entirely.

To show ip addresses of all interfaces:
ip addr
To show ip addresses of all interfaces as json:
ip -j addr
To show ip addresses of all interfaces as json and only IPv4:
ip -j -4 addr
And the code, which can process the output of ip:
import json
import subprocess
import ipaddress


def get_networks():
    results = []
    stdout = subprocess.check_output(["ip", "-j", "addr"])
    # stdout is json data as raw bytes, no encoding here
    # json.loads can read str and bytes
    # and return a Python data type
    # in case of ip -j addr the outer structure is a list
    # and the elements in the list are Python dictionaries
    
    for interface in json.loads(stdout):
        for addr in interface["addr_info"]:
            label = addr.get("label", interface["ifname"])
            address = addr["local"]
            prefixlen = addr["prefixlen"]
            # broadcast = addr.get("broadcast", "")
            # create an ip_interface object, which takes ip_address/prefixlen
            interface_addr = ipaddress.ip_interface(f"{address}/{prefixlen}")
            results.append((label, interface_addr))
    return results


result = get_networks()

print(result)
Output:
[('lo', IPv4Interface('127.0.0.1/8')), ('lo', IPv6Interface('::1/128')), ('eno1', IPv4Interface('192.168.0.2/24')), ('eno1', IPv6Interface('fe80::3e7c:3fff:fec2:76ce/64'))]
As you can see, the IP addresses are not str. They are IPv4Interface and IPv6Interface objects.
An IPvXInterface has an IP address and a netmask. An IPvXAddress is only the address, without a netmask.

If you just print the IPvXInterface object, it will show the text form, but in a list you always see the representation (repr) of the objects. The benefit is that you can do calculations with these objects. For example, you can add 1 and you'll get the next possible address of an IP-Interface/IP-Address.
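A small sketch of what that looks like with the standard-library ipaddress module. The address 192.168.0.2/24 is taken from the output above. The addition is done on the .ip attribute here, because that gives a plain address and leaves no question about what happens to the prefix.

```python
import ipaddress

iface = ipaddress.ip_interface("192.168.0.2/24")

print(iface)    # text form: 192.168.0.2/24
print([iface])  # inside a list you see the repr: [IPv4Interface('192.168.0.2/24')]

# Arithmetic on the address part of the interface.
next_ip = iface.ip + 1

# The interface also knows its network, netmask and broadcast address.
network = iface.network
netmask = iface.netmask
broadcast = network.broadcast_address

print(next_ip, network, netmask, broadcast)
```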


If you want to be independent of the operating system, use the package netifaces.
Then you don't have to use regex and subprocess.
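A rough sketch of how that could look. netifaces is a third-party package (pip install netifaces), and whether it builds on a given platform such as AIX is not confirmed anywhere in this thread. The function name list_ipv4 and its optional nif parameter are made up for illustration; the parameter only exists so the function can be tried with a stand-in object when netifaces is not installed.

```python
import ipaddress


def list_ipv4(nif=None):
    """Return (interface name, IPv4Interface) pairs using netifaces."""
    if nif is None:
        import netifaces as nif  # third-party, imported lazily

    results = []
    for name in nif.interfaces():
        # ifaddresses() maps an address family to a list of address dicts;
        # interfaces without IPv4 have no AF_INET key.
        for entry in nif.ifaddresses(name).get(nif.AF_INET, []):
            addr = entry["addr"]
            netmask = entry.get("netmask")
            # ip_interface accepts "address/dotted-netmask" for IPv4.
            text = f"{addr}/{netmask}" if netmask else addr
            results.append((name, ipaddress.ip_interface(text)))
    return results
```

Calling list_ipv4() with no arguments uses the real netifaces module and returns the same kind of list as get_networks() above, restricted to IPv4.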
This is AIX, not Linux. There is still a world besides Linux.