Python Forum
IP string manipulation problem - Printable Version


IP string manipulation problem - TheRealNoob - Jan-29-2019

Dear all,

I'm a complete beginner to Python and I have a question about which direction to take to solve a string manipulation problem.

I have a device that gives me lists of IPs(v4) in this form:

10.1.1.0
10.1.1.2
10.1.1.4-10.1.1.5 <-- 2 contiguous numbers
10.1.1.7
10.1.1.11-10.1.1.13 <-- N variable contiguous numbers
10.1.1.21
...

What I need to achieve is a CSV list of all the actual IPs. (10.1.1.0, 10.1.1.2, 10.1.1.4, 10.1.1.5, 10.1.1.7, 10.1.1.11, 10.1.1.12, 10.1.1.13, 10.1.1.21)

I understand this is somewhat complex (certainly for me it is): each dot-delimited octet can be 1 to 3 characters long (with values ranging from 0 to 255), so relying on fixed character positions will not work.
Also, the ranges (e.g. 10.1.1.3-10.1.1.4) are not fixed; they can vary from 2 to N contiguous IPs.

I don't expect anyone to submit a solution, but would you be able to point me to the best way to address this problem, say in high-level code or a general direction to pursue?
Is it better to go in a string manipulation direction? If so, I have no idea how to handle the logic of dealing with octets.

Is there a way to deal with this list using the ipaddress.IPv4Address primitives? Any suggestions?

I will be happy to update this post with my developments/stages.

Thank you for your help!


RE: IP string manipulation problem - buran - Jan-29-2019

Have a look at the ipaddress module.
Also, some string manipulation will be required.
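
As a rough illustration of what the ipaddress module offers for this problem (a sketch, not code from the post above): IPv4Address validates a string, and address objects support integer arithmetic. The helper name is_ipv4 is made up for this example.

```python
import ipaddress

def is_ipv4(text):
    """Return True if text parses as a single IPv4 address (hypothetical helper)."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ipaddress.AddressValueError:
        return False

print(is_ipv4('10.1.1.7'))            # a plain address
print(is_ipv4('10.1.1.4-10.1.1.5'))   # a range, not a single address

# Adding an integer to an address moves to a later address,
# carrying over into the next octet when needed.
addr = ipaddress.IPv4Address('10.1.1.255')
next_addr = addr + 1
print(next_addr)
```

The string handling that remains is splitting a range such as '10.1.1.4-10.1.1.5' on '-' before handing each half to the module.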


RE: IP string manipulation problem - perfringo - Jan-29-2019

The following code tries to construct an IPv4 address from each entry. If that fails, it extracts the last number of each address in the entry and the first three groups, then constructs the addresses for that range:

import ipaddress

addresses = ['10.1.1.0', '10.1.1.2', '10.1.1.4-10.1.1.5',
             '10.1.1.7', '10.1.1.11-10.1.1.13', '10.1.1.21']

new = list()

for addr in addresses:
    try:
        # A single valid address is kept as-is.
        ipaddress.IPv4Address(addr)
        new.append(addr)
    except ipaddress.AddressValueError:
        # Not a single address, so treat it as 'start-end' within one /24.
        first, last = [int(el.split('.')[-1]) for el in addr.split('-')]
        triplet = addr[:addr.split('-')[0].rindex('.')]
        for i in range(first, last + 1):
            new.append(f'{triplet}.{i}')

Result is (list 'new'):
Output:
['10.1.1.0', '10.1.1.2', '10.1.1.4', '10.1.1.5', '10.1.1.7', '10.1.1.11', '10.1.1.12', '10.1.1.13', '10.1.1.21']
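
The original question asked for a CSV list rather than a Python list. A minimal sketch of that last step, assuming a comma followed by a space is the wanted separator (the sample list here is shortened for illustration):

```python
# Shortened sample of the expanded list from the code above.
new = ['10.1.1.0', '10.1.1.2', '10.1.1.4', '10.1.1.5']

# str.join puts the separator between the items, not after the last one.
csv_line = ', '.join(new)
print(csv_line)  # 10.1.1.0, 10.1.1.2, 10.1.1.4, 10.1.1.5
```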



RE: IP string manipulation problem - TheRealNoob - Jan-30-2019

(Jan-29-2019, 01:40 PM)perfringo Wrote: Following code tries to construct IPv4 address, if it fails then extracts ip addresses last numbers and first three groups and constructs addresses for that range:

Thank you very much! This would have taken me quite a while to do (and it will take a while to understand). I was not expecting that much!!

Really appreciated!


RE: IP string manipulation problem - perfringo - Jan-30-2019

(Jan-30-2019, 07:01 AM)TheRealNoob Wrote: Thank you a lot!
Really appreciated!

You are welcome :-)

If your source data is 'clean', so there is no need to be 'defensive', it is possible to get the same result without the ipaddress module (in the first iteration it was used as a convenient way to determine whether a string is an IPv4 address or not). If all strings are guaranteed to be either valid addresses or ranges containing '-', the code can be simplified by moving the range expansion into a separate function (how it's done) and using a simple if clause (what is done):

def ip_from_range(ip_range):
    """Return list of ip addresses in ip_range.

    Separate ip addresses on '-', get value after last '.',
    convert to int and assign to names first, last.

    Separate ip addresses on '-', get first address, get index
    of '.' from right, construct slice from ip_range, get string
    consisting of the first three groups of the ip address (without
    the trailing dot) and assign to name triplet.

    With list comprehension create list of ip addresses in range
    from first to last by joining triplet and range values.

    :param ip_range: first and last ip address separated by '-'
    :type ip_range: str
    :return: list of ip addresses in range
    :rtype: list
    """
    first, last = [int(el.split('.')[-1]) for el in ip_range.split('-')]
    # Slice the parameter itself, not a variable from the calling loop.
    triplet = ip_range[:ip_range.split('-')[0].rindex('.')]
    return [f'{triplet}.{i}' for i in range(first, last + 1)]

Now the code can be written:

new = []

for addr in addresses:
    if '-' in addr:
        new.extend(ip_from_range(addr))
    else:
        new.append(addr)



RE: IP string manipulation problem - TheRealNoob - Jan-30-2019

(Jan-30-2019, 07:40 AM)perfringo Wrote: If your source data is 'clean' and therefore there is no need to be 'defensive' it is possible to get same result without using ipaddress module (in first iteration it was used as convenient way to determine whether string is IPv4 or not). If all strings are guaranteed to be valid addresses or ranges containing '-' code can be simplified: moving getting addresses in range to separate function (how it's done) and using simple if clause (what is done):



To respond to your first question about the "cleanness" of the data: yes, it will only be IPs, I can be sure of that. However, I had not considered the case of a range spanning more than a /24 network.

I cannot exclude my input containing something like 10.1.1.1-10.1.2.31 or, worse, 10.1.1.1-11.255.1.1, etc. (but it will never contain 11.255.1.1-10.1.1.1, that is also certain; the first IP will always be the beginning of the range).

I have not entirely understood how the following parts of the first version work; I don't yet know all the keywords:

        first, last = [int(el.split('.')[-1]) for el in addr.split('-')]
        triplet = addr[:addr.split('-')[0].rindex('.')]


and

  new.append(f'{triplet}.{i}')

In any case, I have slightly modified your first version so that I can enter a comma-separated list as input and get the results as output:

import ipaddress

input_string = input("Enter list elements separated by commas: ")

# Strip whitespace so that input like "10.1.1.0, 10.1.1.2" also works.
addresses = [item.strip() for item in input_string.split(',')]
print(addresses)  # to double check what was ingested

new = list()

for addr in addresses:
    try:
        ipaddress.IPv4Address(addr)
        new.append(addr)
    except ipaddress.AddressValueError:
        first, last = [int(el.split('.')[-1]) for el in addr.split('-')]
        triplet = addr[:addr.split('-')[0].rindex('.')]
        for i in range(first, last + 1):
            new.append(f'{triplet}.{i}')
print(new)

I will have to read the last code sample you added more carefully; that one is also not very clear to me yet. I am learning :-)
In any case, it is very cool to learn the Python language, and thank you again!!!


RE: IP string manipulation problem - perfringo - Jan-30-2019

Some explanations:

first, last = [int(el.split('.')[-1]) for el in addr.split('-')]
To the right of = there is a list comprehension. It returns a list, which is unpacked and its values assigned to the names first and last.

This is roughly equivalent to:

lst = list()                       

for el in addr.split('-'):        
    lst.append(int(el.split('.')[-1]))

first, last = lst
Explanations by row numbers:


# 1 - we create the list where we will accumulate results
# 3 - addr.split('-') - using the split method on the string we convert '10.1.1.4-10.1.1.5' to ['10.1.1.4', '10.1.1.5']. The for-loop goes through all elements in the list, one at a time (actually it is an iterator which Python creates automagically for the for-loop from the result of the .split() method, but in this context that is not important).
# 4 - each element, i.e. '10.1.1.4' and '10.1.1.5', is split again using el.split('.'), resulting in ['10', '1', '1', '4'] and ['10', '1', '1', '5'] respectively. From each list the last element is retrieved using indexing ([-1]) and converted to int (int(..)). This int is appended (lst.append(..)) to the list we created for accumulating results. When the loop has finished, the value of lst will be [4, 5]
# 6 - the list is unpacked and its values assigned to the names first and last (4 and 5 respectively)

As you can see, list comprehension makes it more compact :-)

triplet = addr[:addr.split('-')[0].rindex('.')]
Here we use indexing again. To the right of = there is a slice made from addr (good reading material from StackOverflow: How slicing works). Basically, with addr.split('-')[0] we are performing the following steps: '10.1.1.4-10.1.1.5' --> ['10.1.1.4', '10.1.1.5'] --> '10.1.1.4'. Now we have a string, and with rindex we get the index of the first '.' counting from the right. In this specific case it will be the integer 6, so the code evaluates to addr[:6]. The result of this slice, '10.1.1', is assigned to the name triplet.
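
The same one-liner can be unrolled into named steps, which may make it easier to follow. This is only a sketch; the intermediate names first_ip and dot_pos are made up for illustration:

```python
addr = '10.1.1.4-10.1.1.5'

first_ip = addr.split('-')[0]    # '10.1.1.4', the part before the '-'
dot_pos = first_ip.rindex('.')   # 6, index of the last '.' in first_ip
triplet = addr[:dot_pos]         # '10.1.1', everything before that index

print(first_ip, dot_pos, triplet)
```

Slicing addr rather than first_ip works because first_ip is the beginning of addr, so both strings share the same first characters.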

f'{triplet}.{i}'
This is Literal String Interpolation, a.k.a. an f-string. It is a fairly new addition to Python (available in Python 3.6 and later). This code constructs a new string from two values (triplet and i). Something like this: {triplet}.{i} --> {'10.1.1'}.{4} and {'10.1.1'}.{5} respectively --> '10.1.1.4' and '10.1.1.5' respectively.
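
For comparison, here is a small sketch of the f-string next to str.format, which produces the same strings and also runs on Python versions older than 3.6. The values are taken from the example range above:

```python
triplet = '10.1.1'

# f-string version (Python 3.6 and later)
built = [f'{triplet}.{i}' for i in range(4, 6)]

# str.format version, same result
old_style = ['{}.{}'.format(triplet, i) for i in range(4, 6)]

print(built)      # ['10.1.1.4', '10.1.1.5']
print(old_style)  # ['10.1.1.4', '10.1.1.5']
```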

Hopefully it helps to understand :-)

Does 10.1.1.1-11.255.1.1 mean that there are all ip addresses in that space a la all combinations from:

10.1-255.1-255.1-255 + 11.1-255.1.1


RE: IP string manipulation problem - TheRealNoob - Jan-31-2019

Perfringo, you are pure gold! Thank you very much!


(Jan-30-2019, 01:48 PM)perfringo Wrote: Does 10.1.1.1-11.255.1.1 mean that there are all ip addresses in that space a la all combinations from:

10.1-255.1-255.1-255 + 11.1-255.1.1

To answer your question: yes. 10.1.1.1-11.255.1.1 means the whole range of IPs between the two values.

To make it easier, 10.1.1.254-10.1.2.2 equals:

10.1.1.254
10.1.1.255
10.1.2.1
10.1.2.2


RE: IP string manipulation problem - perfringo - Jan-31-2019

(Jan-31-2019, 07:03 AM)TheRealNoob Wrote: To make it easier: 10.1.1.254-10.1.2.2 equals to:

10.1.1.254
10.1.1.255
10.1.2.1
10.1.2.2

An IP address ending with .0 is perfectly legal these days. Is it a typo, or is there no 10.1.2.0 due to some firewall policy, etc.?


RE: IP string manipulation problem - TheRealNoob - Feb-01-2019

(Jan-31-2019, 10:42 AM)perfringo Wrote: IP address ending with .0 is perfectly legal these days. Is it typo or there is no 10.1.2.0 due to some firewall policy etc?

Sorry, my typo indeed. 10.1.2.0 is included.
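
The thread stops before showing code for ranges that cross an octet boundary. One possible approach, sketched here under the assumptions stated in the thread (clean input, first address never larger than the last), is to convert both ends to integers with the ipaddress module and count between them. The function name expand is hypothetical and not from any post above:

```python
import ipaddress

def expand(entry):
    """Expand 'a.b.c.d' or 'a.b.c.d-e.f.g.h' into a list of address strings.

    Hypothetical helper; assumes the first address of a range is never
    larger than the second.
    """
    start_text, _, end_text = entry.strip().partition('-')
    start = ipaddress.IPv4Address(start_text)
    # A single address has no '-' part, so the range ends where it starts.
    end = ipaddress.IPv4Address(end_text) if end_text else start
    return [str(ipaddress.IPv4Address(n))
            for n in range(int(start), int(end) + 1)]

addresses = ['10.1.1.7', '10.1.1.254-10.1.2.2']
new = []
for addr in addresses:
    new.extend(expand(addr))

print(', '.join(new))
```

With this input the output includes 10.1.2.0, matching the correction above. A range as wide as 10.1.1.1-11.255.1.1 covers tens of millions of addresses, so building the full list in memory may not be practical for such input.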