Python Forum
python regex: get rid of double dot
#1
Hi,

I'm parsing ifconfig output and can't get rid of the colon ("double dot") after the interface name (en0: for example). Any ideas?

>>> print(ifconfig_output)
en0: flags=1e084863,18c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.16.84.106 netmask 0xffffc000 broadcast 172.16.127.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
en1: flags=1e084863,18c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.17.8.4 netmask 0xfffff800 broadcast 172.17.15.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/64
         tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1

>>> re.findall(r'^(\S+).*?[\s]+inet ([\d\.]+) netmask ([\w.]+) broadcast ([\d\.]+)', ifconfig_output, re.S | re.M)
[('en0:', '172.16.84.106', '0xffffc000', '172.16.127.255'), ('en1:', '172.17.8.4', '0xfffff800', '172.17.15.255'), ('lo0:', '127.0.0.1', '0xff000000', '127.255.255.255')]
#2
import re

ifconfig_output = '''\
en0: flags=1e084863,18c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.16.84.106 netmask 0xffffc000 broadcast 172.16.127.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
en1: flags=1e084863,18c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 172.17.8.4 netmask 0xfffff800 broadcast 172.17.15.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/64
         tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1'''

# The literal ':' sits outside the first group, so it is matched but not captured.
r = re.findall(r'^(\S+):.*?[\s]+inet ([\d\.]+) netmask ([\w.]+) broadcast ([\d\.]+)', ifconfig_output, re.S | re.M)
print(r)
Output:
[('en0', '172.16.84.106', '0xffffc000', '172.16.127.255'), ('en1', '172.17.8.4', '0xfffff800', '172.17.15.255'), ('lo0', '127.0.0.1', '0xff000000', '127.255.255.255')]
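One thing to watch with the pattern above: because of re.S, the lazy `.*?` can run past the end of a stanza, so an interface without an `inet` line could end up paired with the next interface's address. Below is a minimal sketch that walks the output line by line instead. The name `parse_ifconfig`, the shortened sample, and the made-up `en2` interface are my own assumptions for illustration, not part of the original post. It also turns the hex netmask into an `ipaddress` object.

```python
import ipaddress
import re

# Shortened sample in the same layout as above. "en2" is a made-up interface
# without an IPv4 address, added only to show the edge case.
SAMPLE = """\
en0: flags=1e084863,18c0<UP,BROADCAST,RUNNING>
        inet 172.16.84.106 netmask 0xffffc000 broadcast 172.16.127.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
en2: flags=1e084863,18c0<BROADCAST,SIMPLEX>
lo0: flags=e08084b,c0<UP,LOOPBACK,RUNNING>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/64
"""

# "inet " (with the space) does not match "inet6 ", so IPv6 lines are skipped.
INET_RE = re.compile(
    r"\binet ([\d.]+) netmask (0x[0-9a-fA-F]+)(?: broadcast ([\d.]+))?"
)


def parse_ifconfig(text):
    """Return (name, IPv4Interface, broadcast) tuples, one per inet line."""
    results = []
    name = None
    for line in text.splitlines():
        # A stanza header starts in column 0, e.g. "en0: flags=...".
        if line and not line[0].isspace():
            name = line.split(":", 1)[0]
            continue
        match = INET_RE.search(line)
        if match and name:
            address, hexmask, broadcast = match.groups()
            # 0xffffc000 -> 255.255.192.0
            netmask = ipaddress.ip_address(int(hexmask, 16))
            iface = ipaddress.ip_interface(f"{address}/{netmask}")
            results.append((name, iface, broadcast))
    return results


for row in parse_ifconfig(SAMPLE):
    print(row)
```

The interface name is taken from the header line with a plain `split(":", 1)`, so the colon never reaches the result, and `en2` simply produces no row.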
#3
perfect, thank you!
#4
On Linux, ifconfig is deprecated and should not be used.
If you use iproute2, which is the current standard there, you can get JSON output and avoid regex entirely.

To show ip addresses of all interfaces:
ip addr
To show ip addresses of all interfaces as json:
ip -j addr
To show ip addresses of all interfaces as json and only IPv4:
ip -j -4 addr
And here is code that can process the output of ip:
import json
import subprocess
import ipaddress


def get_networks():
    results = []
    stdout = subprocess.check_output(["ip", "-j", "addr"])
    # stdout is json data as raw bytes, no encoding here
    # json.loads can read str and bytes
    # and return a Python data type
    # in case of ip -j addr the outer structure is a list
    # and the elements in the list are Python dictionaries
    
    for interface in json.loads(stdout):
        for addr in interface["addr_info"]:
            label = addr.get("label", interface["ifname"])
            address = addr["local"]
            prefixlen = addr["prefixlen"]
            # broadcast = addr.get("broadcast", "")
            # create an ip_interface object, which takes "ip_address/prefixlen"
            interface_addr = ipaddress.ip_interface(f"{address}/{prefixlen}")
            results.append((label, interface_addr))
    return results


result = get_networks()

print(result)
Output:
[('lo', IPv4Interface('127.0.0.1/8')), ('lo', IPv6Interface('::1/128')), ('eno1', IPv4Interface('192.168.0.2/24')), ('eno1', IPv6Interface('fe80::3e7c:3fff:fec2:76ce/64'))]
As you can see, the IP addresses are not str. They are IPv4Interface and IPv6Interface objects.
An IPvXInterface has an IP address and a netmask. An IPvXAddress is only the address, without a netmask.

If you print an IPvXInterface object on its own, it shows the text form, but inside a list you always see the repr of the objects. The benefit is that you can do calculations with these objects. For example, you can add 1 to an address to get the next possible address.
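A small sketch of what that looks like with the standard library `ipaddress` module. The address 192.168.0.2/24 is just the example value from the output above. I do the arithmetic on `.ip` (the bare address) to keep it explicit that the result is an address, since adding to the interface object directly may not keep the original prefix length.

```python
import ipaddress

iface = ipaddress.ip_interface("192.168.0.2/24")

print(iface)    # text form of the interface
print([iface])  # inside a list you see the repr instead

# The pieces of an interface object
address = iface.ip        # the address without the netmask
netmask = iface.netmask   # the netmask as an address object
network = iface.network   # the network the address belongs to

# Arithmetic on the bare address gives the next address
next_address = address + 1

print(address, netmask, network, next_address)
print(network.broadcast_address)
```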


If you want to be independent of the operating system, use the package netifaces.
Then you don't have to use regex and subprocess.
#5
This is AIX, not Linux. There is still a world beyond Linux.

