 len() of ipaddress.ip_network()
#1
a network has a size. IMHO, a network object should have that same size, obtained via len(). but this seems to be missing in my Python version 3.5.2 (in Ubuntu 16.04.6). can someone who has Python version 3.7.X check it and see if it works? here is how to try it:
import ipaddress
len(ipaddress.ip_network('1::/32'))==2**96
it should be True. interface objects should work, too, giving the same size.
What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
With Python 3.7.3 (Debian 10 Buster/Testing):

Traceback (most recent call last):
  File "./test.py", line 3, in <module>
    len(ipaddress.ip_network('1::/32'))==2**96
TypeError: object of type 'IPv6Network' has no len()
#3
I am not familiar with IP addresses or networking, but I am not sure it would be a good idea to create a sequence with 79228162514264337593543950336 elements (you would need one to determine its len).

Skimming the documentation, I found iterators:

hosts() (Returns an iterator over the usable hosts in the network.)

ipaddress.summarize_address_range(first, last) (Return an iterator of the summarized network range given the first and last IP addresses.)

One can always do simple math:

>>> int(ipaddress.IPv4Address('127.0.0.5')) - int(ipaddress.IPv4Address('127.0.0.1'))
4
>>> int(ipaddress.IPv6Address('1::'))
5192296858534827628530496329220096
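The same integer arithmetic can give the size of a whole network. This is only a sketch using the documented network_address and broadcast_address attributes; the variable names are made up for illustration:

```python
import ipaddress

net = ipaddress.ip_network('1::/32')

# Size of the inclusive range from the first to the last address.
first = int(net.network_address)
last = int(net.broadcast_address)
size = last - first + 1

print(size == 2**96)
print(size == net.num_addresses)
```

No sequence is built here; only two integers are subtracted.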
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Life of Brian: Conjugate the verb, "to go" !
#4
no, it is not necessary to create a sequence to determine len(). the object can implement the .__len__() method to achieve that. i could create a subclass of these various network objects and include that method. but it is easy enough to determine the length from the network prefix length that i would not bother to make such subclasses for this purpose.
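A minimal sketch of determining the length from the prefix length, assuming only the documented prefixlen and max_prefixlen attributes. The helper name network_size is hypothetical, not part of ipaddress:

```python
import ipaddress

def network_size(net):
    """Number of addresses in net, derived from the prefix length alone."""
    return 2 ** (net.max_prefixlen - net.prefixlen)

v6 = ipaddress.ip_network('1::/32')
v4 = ipaddress.ip_network('192.0.2.0/24')
iface = ipaddress.ip_interface('192.0.2.5/24')

print(network_size(v6) == 2**96)
print(network_size(v4))
# An interface object carries its network, so the same helper applies.
print(network_size(iface.network))
```

Because the result is an ordinary Python int, it is not subject to any word-size limit.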

the new range class can do this, up to a point. apparently, it is implemented in C, in which numbers 2**64 and up require more code (BTDT in C).
Output:
>>> len(range(6,2**60,3))
384307168202282324
>>> len(range(6,2**66,3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
do you think it created a 341 petabyte sequence to determine the length in the first one??
#5
I have no idea how len of range is implemented, but I wouldn't be surprised if it were something along these lines:

>>> len(range(6,2**60,3))
384307168202282324
>>> (2**60 - 6) / 3
3.843071682022823e+17
However, the documentation of range says:

Quote: The range type represents an immutable sequence of numbers and is commonly used for looping a specific number of times in for loops.

Ranges containing absolute values larger than sys.maxsize are permitted but some features (such as len()) may raise OverflowError.

The advantage of the range type over a regular list or tuple is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed).

EDIT: This question intrigued me, so I went to GitHub to check in rangeobject.c how the len of a range is determined. My knowledge of C is next to none, but:

/* Return number of items in range (lo, hi, step).  step != 0
 * required.  The result always fits in an unsigned long.
 */
static unsigned long
get_len_of_range(long lo, long hi, long step)
{
    /* -------------------------------------------------------------
    If step > 0 and lo >= hi, or step < 0 and lo <= hi, the range is empty.
    Else for step > 0, if n values are in the range, the last one is
    lo + (n-1)*step, which must be <= hi-1.  Rearranging,
    n <= (hi - lo - 1)/step + 1, so taking the floor of the RHS gives
    the proper value.  Since lo < hi in this case, hi-lo-1 >= 0, so
    the RHS is non-negative and so truncation is the same as the
    floor.  Letting M be the largest positive long, the worst case
    for the RHS numerator is hi=M, lo=-M-1, and then
    hi-lo-1 = M-(-M-1)-1 = 2*M.  Therefore unsigned long has enough
    precision to compute the RHS exactly.  The analysis for step < 0
    is similar.
    ---------------------------------------------------------------*/
    assert(step != 0);
    if (step > 0 && lo < hi)
        return 1UL + (hi - 1UL - lo) / step;
    else if (step < 0 && lo > hi)
        return 1UL + (lo - 1UL - hi) / (0UL - step);
    else
        return 0UL;
}
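For comparison, here is a rough Python transcription of that arithmetic. It is a sketch only (the name len_of_range is made up), not how CPython actually implements range. Python ints have arbitrary precision, so the same formula keeps working past 2**63:

```python
def len_of_range(lo, hi, step):
    """Number of items in range(lo, hi, step), computed arithmetically."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0 and lo < hi:
        return 1 + (hi - 1 - lo) // step
    if step < 0 and lo > hi:
        return 1 + (lo - 1 - hi) // (-step)
    return 0  # empty range

# Agrees with the built-in where len() still works ...
print(len_of_range(6, 2**60, 3) == len(range(6, 2**60, 3)))
# ... and keeps working where len(range(...)) raises OverflowError.
print(len_of_range(6, 2**66, 3))
```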
#6
import ipaddress

net = ipaddress.ip_network('1::/32')
print('Number of addresses matches expected size:', net.num_addresses == 2**(128-32))
The number of addresses should be 2**(128-32).
The class has no __len__ method. Let's add one:

ipaddress.IPv6Network.__len__ = lambda self: self.num_addresses
net = ipaddress.ip_network('1::/32')
print(net.__len__())
len(net)
# BOOOM
Error:
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-52-fefbadd730e7> in <module>
----> 1 len(net)

OverflowError: cannot fit 'int' into an index-sized integer
class Foo:
    def __len__(self):
        return 79228162514264337593543950336

len(Foo())
Error:
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-64-955ff12672b5> in <module>
----> 1 len(Foo())

OverflowError: cannot fit 'int' into an index-sized integer
Now let's find out which number is too big:
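import itertools

class Foo:
    """Helper whose __len__ returns whatever value it was given."""
    def __init__(self, length):
        self.length = length
    def __len__(self):
        return self.length

# count upward from 2**62 until len() refuses the value
for n in itertools.count(2**62):
    try:
        len(Foo(n))
    except OverflowError:
        print('Last possible value for len is:', n-1)
        break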
After some tests, I know that the following does not fail:
len(Foo(2**62))

But this will fail:
len(Foo(2**63))

The test above needs 4611686018427387904 iterations to get from 2**62 to 2**63.
That is far too many to iterate over, so it is better to reason about the limit
and test a candidate directly.

So let's try 2**63-1:
len(Foo(2**63-1))
Output:
9223372036854775807
Where does this number come from?

import sys
len(Foo(2**63-1)) == sys.maxsize
Output:
True
Now you can understand why they have not implemented __len__.
There is little point in implementing something when you know from the start
that IPv6 addresses have 128 bits, so a network can hold far more than 2**63-1 addresses.

By the way, the iteration is still running. That approach was too naive :-D
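A bisection finds the same limit in a few hundred steps instead of 2**62. This is a sketch that assumes len() accepts every value up to some limit and rejects everything above it, which is what CPython does. The helper len_ok is made up for illustration:

```python
import sys

class Foo:
    def __init__(self, length):
        self.length = length
    def __len__(self):
        return self.length

def len_ok(n):
    """True if len() accepts n as a return value of __len__."""
    try:
        len(Foo(n))
    except OverflowError:
        return False
    return True

lo, hi = 2**10, 2**200          # lo is known to work, hi is known to fail
assert len_ok(lo) and not len_ok(hi)

while hi - lo > 1:
    mid = (lo + hi) // 2
    if len_ok(mid):
        lo = mid                # limit is at mid or above
    else:
        hi = mid                # limit is below mid

print('Last possible value for len is:', lo)
print(lo == sys.maxsize)
```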

I did not know that len() has an upper limit. I guess it comes from the implementation and can depend on the architecture (32/64 bit).
Guessing: maybe it comes from the limits of memory addressing. On 64-bit systems the limit is usually 64 bits.
Some 32-bit CPUs can address more than 32 bits with PAE if they have more than 32 physical address lines connected to memory.
Maybe you can find out exactly where this limitation comes from and why.
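One way to check the architecture guess on a given interpreter is to compare sys.maxsize with the width of the native ssize_t, which the struct module exposes as format code 'n'. This is only a sketch for CPython; other implementations may differ:

```python
import struct
import sys

# Width in bits of the platform's ssize_t, the C type named in the
# OverflowError from len(range(...)).
ssize_bits = 8 * struct.calcsize('n')

print('ssize_t width in bits:', ssize_bits)
print('sys.maxsize          :', sys.maxsize)
# The largest signed value of that width is the limit len() enforces.
print(sys.maxsize == 2**(ssize_bits - 1) - 1)
```

On a 64-bit build this reports 64 bits and 2**63-1; on a 32-bit build the limit drops to 2**31-1.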

Slice notation doesn't have this problem, so we are still able to address positions beyond 2**63 in a file or iterable.
But if the object implements the __len__ method, len() on it will fail. It will also fail if you rely on that length for iteration.
So if you wanted to implement an iterator for IPv6Network, you could just write an __iter__ method and use a generator:

# definition
def my_iter(self):
    yield from self.hosts()


# monkey patching on
ipaddress.IPv6Network.__iter__ = my_iter
# monkey patching off
del my_iter # we don't need it

net = ipaddress.ip_network('1::/32')
# using it directly requires the iter() function
# the network itself is not an iterator, because it has no __next__ method
net_iter = iter(net)

# then manually do next
next(net_iter)
next(net_iter)
next(net_iter)
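The same lazy stepping also works without any monkey patching, because hosts() already returns a generator. A small sketch with itertools.islice; the variable names are made up:

```python
import ipaddress
from itertools import islice

net = ipaddress.ip_network('1::/32')

# Take only the first three hosts; the other 2**96 - 3 are never created.
first_hosts = list(islice(net.hosts(), 3))
for host in first_hosts:
    print(host)

# The stock network object is iterable too, starting at the network address.
print(next(iter(net)))
```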
Let's make an IPv6Network iterator just for fun:
def my_iter(self):
    return self

def my_next(self):
    if not hasattr(self, 'hosts_generator'):
        self.hosts_generator = self.hosts()
    return next(self.hosts_generator)
    # if StopIteration is raised by the generator,
    # it bubbles up to the caller
    # if the caller is a for-loop, the loop stops

# monkey patching on
ipaddress.IPv6Network.__iter__ = my_iter
ipaddress.IPv6Network.__next__ = my_next
# monkey patching off
del my_next, my_iter # we don't need it

net = ipaddress.ip_network('1::/32')
next(net)
next(net)
I think the original implementation is good as it is, given how carefully it handles the corner cases.
My code examples are always for Python >=3.6.0
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#7
(May-18-2019, 06:25 AM)perfringo Wrote: EDIT: This question intrigued me, so I went to github to check from rangeobject.c how len of range determined. My knowledge of C is next to none, but:

the point is that it was implemented with C types. to make it work even higher, it needed to be calling the extended precision int code, like it would have to be doing for lots of other things like actual arithmetic -or- implement just the len() part in Python and have the C code call that. i doubt len() would be called too often to consider it a critical point.

slicing a network object would be the next thing. applying [:16] or [-32:] would get a list of one network and applying [8:24] would get a list of two networks since CIDR cannot span that range. or, they implement a network range type object. they did implement the "in" operation, at least for addresses in a network. i should test if network "in" network works.
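the slicing idea can be sketched with what the module already has: integer indexing on a network plus ipaddress.summarize_address_range(). the helper slice_network is hypothetical and does not handle empty slices or steps; subnet_of() assumes Python 3.7 or later:

```python
import ipaddress

def slice_network(net, start=None, stop=None):
    """Return the CIDR networks covering net[start:stop] (non-empty slices only)."""
    first = net[0] if start is None else net[start]
    last = net[-1] if stop is None else net[stop - 1]
    return list(ipaddress.summarize_address_range(first, last))

net = ipaddress.ip_network('192.0.2.0/24')

head = slice_network(net, stop=16)      # like net[:16]
tail = slice_network(net, start=-32)    # like net[-32:]
middle = slice_network(net, 8, 24)      # like net[8:24]

print(head)
print(tail)
print(middle)
# network-in-network containment (Python 3.7+)
print(all(sub.subnet_of(net) for sub in middle))
```

the [8:24] case comes back as two /29 networks, since no single CIDR block starts at .8 and ends at .23.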

i am thoroughly familiar with networks and have implemented some of this stuff in C. maybe i should implement a subclass with these many added features (just in Python, not C).
#8
i meant that len(network) is the number of addresses in the network the object describes, not the size of address. there is already an attribute ("max_prefixlen") for that.

@DeaD_EyE your 3rd code box shows that the core implementation of len() or __len__() is trying to do something wrong with the value being returned. IMHO, since Python version 3 has theoretically made the implementation of int opaque (e.g. we are to see no effect whether the value fits in the native word size or not) they should do that everywhere, but have failed to do that here. IMHO, this needs to be fixed in 3.8. what the implementation should do is pass the object it gets along, only checking to be sure it is type int. this kind of brokenness is something i would expect in version 2.