Loop Details

***ichabod801*** · Sep-13-2019, 06:41 PM

Often in computer programming you want to do the same thing over and over again. You may want to do it a certain number of times, or for certain items, or until something specific happens. But sooner or later, you will want to do something over and over again. Programmers call this 'looping' (or 'iteration' when they're being fancy), and the code that does it is called a 'loop'. This tutorial is about loops, but I will try not to get too loopy while doing it.

Python implements two loop statements: for and while. Other languages often implement other loop statements, such as until or for each. Python goes for simple syntax, but for and while can handle all of the loop types commonly used in other languages. We'll start with the while loop, which looks like this:

n = 9
while n != 1:
    print(n)
    if n % 2:
        n = 3 * n + 1
    else:
        n = n // 2
print(n)

If you run this code, you will see that it prints out the Collatz Conjecture sequence for n = 9. How does it do that? Let's step through the code. First, we set n to 9. Then the while loop checks the expression n != 1. That returns True, since 9 is not 1. So the indented code under the while statement is executed. This indentation is very important. Often new Python programmers don't indent everything they want to be run each time through the loop. If it's not indented, it won't be run as part of the loop.

So the code prints n (9), and then checks to see if it is odd (if n % 2:). If n is odd, it is multiplied by three and one is added to the result. Otherwise it is divided by two. In the case of 9, n is odd, and n becomes 28. That's it for the indented code.

Up to this point the while loop has just acted like an if statement: it checked an expression, and since that expression was True, it ran the indented code. In fact, some newbies use while loops as if statements, which rarely works out well. This is because after the while loop finishes executing the indented code, it goes back to the while statement and checks the condition again. So now n is 28, which is still not 1, so the indented code executes again. This time n is even, so it is divided by 2, and you get 14. Fourteen is still not one, so the while loop keeps repeating this calculation through 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, and finally 1. Once n == 1, the while loop goes back and checks the condition, which is finally False. Then the while loop skips the indented block of code, and the final print statement is executed, printing 1.

If we had set n to 1 at the beginning, the condition would have been False on the first check, and the indented code would never have been executed. All that the program would have done is print '1'.
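To check the walk-through above, here is a minimal sketch that collects the sequence in a list instead of printing it. The function name collatz_sequence is made up for illustration; it is not something built into Python.

```python
def collatz_sequence(n):
    """Return the Collatz sequence starting at n as a list (hypothetical helper)."""
    values = [n]
    while n != 1:
        if n % 2:            # n is odd
            n = 3 * n + 1
        else:                # n is even
            n = n // 2
        values.append(n)
    return values

print(collatz_sequence(9))
print(collatz_sequence(1))   # the loop body never runs, so only the starting value is stored
```

Starting from 9 gives the same twenty numbers listed above, and starting from 1 gives a list holding just 1.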

The condition in the while statement can be any condition you would see in an if statement. However, there is one condition you often see in while loops that I don't think I've ever seen in an if statement:

while True:
    food = input('What would you like for breakfast? ')
    if food.lower() == 'spam':
        break
print(f'Okay, {food} for breakfast.')

The key part here is the break statement. The break statement exits the loop and goes to the end of the indented code. It doesn't matter what the loop statement is, it stops executing the loop statement and moves on. This is handy for verifying input, like the above code does. If we had to use the condition in the while statement to end this loop, it would look like this:

food = input('What would you like for breakfast? ')
while food.lower() != 'spam':
    food = input('What would you like for breakfast? ')
print(f'Okay, {food} for breakfast.')

Now, you might think this is actually better. It is one line shorter. However, you now have a repeated line of code. That's generally a bad idea. The problem comes when you have to do any processing on the input before validating it. If you do, then you have to write all of that processing twice (once before the loop and once in the loop). So while True: can be a useful construct. There can be a problem, however. Say we don't think things through correctly, and our while loop looks like this:

while True:
    food = input('What would you like for breakfast? ')
    if food.lower() == 'spam':
        print(f'Okay, {food} for breakfast.')

Notice that we left out the break statement. Now the loop never stops asking the question, even if you enter exactly the right answer. The condition is always True, so the indented code always executes. This is called an "infinite loop," and it is the bane of programmers in any language (although we used to have fun messing up the computer lab with them in middle school). It is for this reason that some programmers say you should never use while True:. I disagree with that. Let's go back to our Collatz loop, but let's say we want to try a variation on the Collatz Conjecture:

n = 9
while n != 1:
    print(n)
    if n % 2:
        n = 2 * n + 1
    else:
        n = n // 2
print(n)

The problem here is that 2n+1 is always odd. So if we start with an odd number greater than one, or divide an even number down to an odd number greater than one, the code will go into an infinite loop making larger and larger odd numbers. Indeed, that will happen with any positive starting number except a power of two. So even a while loop with a real condition can become an infinite loop if we're not careful. So always be sure to check your while loops to make sure the condition can be met, or a break statement can be executed.
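One common safety net while experimenting is to cap the number of passes through the loop. Here is a rough sketch of that idea applied to the 2n+1 variation. The names variant_sequence and max_steps are invented for this example, and the cap of 10 is an arbitrary assumption.

```python
def variant_sequence(n, max_steps=10):
    """Run the 2n + 1 variation, giving up after max_steps passes (hypothetical helper)."""
    values = [n]
    steps = 0
    while n != 1 and steps < max_steps:   # the second test guarantees the loop ends
        if n % 2:
            n = 2 * n + 1
        else:
            n = n // 2
        values.append(n)
        steps += 1
    return values, n == 1                 # the flag says whether we actually reached 1

print(variant_sequence(8))                # a power of two does reach 1
print(variant_sequence(9, max_steps=3))   # an odd start just keeps growing
```

The cap does not fix the underlying logic, it only turns a hang into a result you can inspect.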

***ichabod801*** · Sep-13-2019, 06:42 PM

Next let's try a for loop. A for loop works a bit like a while loop in that it repeats, but it is generally used when you know how many times you want to repeat something. Here's a simple example:

for bottle in range(99):
    print('{0} bottles of beer on the wall, {0} bottles of beer.'.format(99 - bottle))
    print('Take one down, pass it around, {} bottles of beer on the wall.'.format(98 - bottle))

If you run this code, you will see that it recreates (almost) the repetitive song we teach to children to turn them into alcoholics. That is certainly an easier way to do it than writing each of those lines 99 times in your code.

So what exactly happened here? The format here is 'for <variable> in <iterable>'. Don't worry about what an iterable is right now, that's more of an advanced topic. It includes lists, strings, dictionaries, sets, and certain functions. The range function is one of those functions. It repeatedly returns numbers starting at 0, increasing by one, and stopping when it gets to the number specified.

So, when the for statement is first reached in the program, it gets a number from range, in this case 0. Then it assigns that number to the variable (bottle). Then the for loop runs the block of code indented beneath it, as with the while loop.

Our indented block has two print function calls, each with a format call. The bottle variable is used in the format calls. 99 minus 0 is 99, so the song starts with 99 bottles. 98 minus 0 is 98, so after we take one down on the third line, there are 98 bottles left.

After the indented block of code (the two print calls), the program returns to the for statement. It gets the next number from range, which is 1. Then it goes and runs the indented block of code again, this time with bottle equal to 1.

This continues for some time. Ninety-nine times, in fact, because that's what we told the range function to do. However, on the 100th time through, range stops giving out numbers. (Instead it gives a special StopIteration error, but that's an advanced detail you don't need to worry about.) Note that the last number it gives out is 98, not 99. When range gets to the stop point you gave it, it stops without returning that value. When range stops giving out numbers, the loop stops executing. The program skips from the for statement, past the indented block of code, and starts executing any code that comes after that block of code.

Now, this is a pretty basic and common for loop, using the range function. But there is more we can do with the range function than just count up from zero. Say you get really tired of the 99 Bottles of Beer song rather quickly. Who doesn't? So you want to start with only 19 bottles of beer on the wall. One way to do this is to give two parameters to the range function:

for bottle in range(80, 99):
    print('{0} bottles of beer on the wall, {0} bottles of beer.'.format(99 - bottle))
    print('Take one down, pass it around, {} bottles of beer on the wall.'.format(98 - bottle))

With two parameters, the range function starts with the first parameter, and keeps increasing by one, stopping when it gets to the second parameter (and again, not giving out that last number). So now, the first time through the loop bottle is equal to 80, and the program says there are 19 bottles of beer on the wall.

But wait! There's more!

The range function can take a third parameter: the increment used from one number to the next. So we could change our loop to only sing the verses for odd numbers:

for bottle in range(0, 99, 2):
    print('{0} bottles of beer on the wall, {0} bottles of beer.'.format(99 - bottle))
    print('Take one down, pass it around, {} bottles of beer on the wall.'.format(98 - bottle))

Note that the three numbers of the range function work just like the three numbers in a slice. So you can use what you know about slices whenever you are confused about the range function. If you are confused about slices, you can just remember that they work like the range function.
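A quick way to see what range will hand out is to wrap it in list(). The sketch below shows the one, two, and three parameter forms next to the matching slice. The letters string and the three-bottle countdown are made-up examples.

```python
letters = 'abcdefghij'

print(list(range(5)))          # one parameter: stop
print(list(range(2, 5)))       # two parameters: start, stop
print(list(range(0, 10, 3)))   # three parameters: start, stop, step
print(letters[0:10:3])         # the same three numbers used as a slice

# A negative step counts down, which is another way to write the song.
for bottles in range(3, 0, -1):
    print(f'{bottles} bottles of beer on the wall.')
```

Just as with counting up, the stop value (0 in the countdown) is never given out.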

***ichabod801*** · Sep-13-2019, 06:43 PM

There are some more statements that you can use with loops, either for loops or while loops. The simplest one is the continue statement, which is sort of the opposite of the break statement. While the break statement goes to the end of the loop, the continue statement goes back to the top of the loop. If the continue statement is in a while loop, the condition is checked again. If it is in a for loop, the next value from the iterator is assigned to the loop variable.

The continue statement is generally used to skip the processing of certain loop items. Say for example that we suffer from triskaidekaphobia (fear of the number 13). We certainly don't want any singing about 13 bottles of beer on the wall. Note that this happens twice: with 13 after one is taken down and before one is taken down. To skip these lines we could use an if and a continue:

for bottle in range(99):
    if bottle in (85, 86):
        continue
    print('{0} bottles of beer on the wall, {0} bottles of beer.'.format(99 - bottle))
    print('Take one down, pass it around, {} bottles of beer on the wall.'.format(98 - bottle))

Now there are no thirteens to worry about.
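If it is hard to see which lines continue skips, it can help to record what the loop actually does. This is a small sketch with a made-up countdown from 15 to 11, where the sung list is just an illustration device.

```python
sung = []
for bottles in range(15, 10, -1):
    if bottles == 13:
        continue           # jump straight back to the for statement
    sung.append(bottles)   # never reached on the pass where continue runs
print(sung)
```

The loop still makes five passes, but only four of them get as far as the append.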

A bit more confusing is the use of the else statement with loops. First, consider that loops are often used to search for things. Say we want to see if there is a multiple of seven between two numbers:

for num in range(start, end + 1):   # Note that we use end + 1 so that the last value is end.
    if not num % 7:
        break
print(f'The multiple of 7 is {num}.')

This works fine if start is 45 and end is 50. It prints 'The multiple of 7 is 49.' But if end is 48, it prints 'The multiple of 7 is 48.' But 48 isn't a multiple of seven. We could put in a check at the end to be sure we found one:

for num in range(start, end + 1):   # Note that we use end + 1 so that the last value is end.
    if not num % 7:
        break
if not num % 7:
    print(f'The multiple of 7 is {num}.')
else:
    print('No multiple of seven was found.')

That works, but there are two problems. First, it's repeating code again. Whenever you repeat code, try to think of a good way not to repeat it. There might not be a good way, but think about it. Second, if the test is complicated, not only are we repeating more code, but the test might not be doable any more. But Python gives us a simpler way to do this (keeping it simple is always a good idea):

for num in range(start, end + 1):   # Note that we use end + 1 so that the last value is end.
    if not num % 7:
        print(f'The multiple of 7 is {num}.')
        break
else:
    print('No multiple of seven was found.')

Now we print the multiple of seven when we find it, right before we break out of the loop. Our warning about no multiple of seven is now in the else. If you test it, it will work. Why? Because an else statement triggers if a loop ends without a break statement. Another way to say that is the else statement triggers if the loop ends normally. So if we don't find a multiple of seven, there is no break statement, and we print our warning.

This use of else can be confusing, but remember the example of looking for something. Finding that thing, and using the break statement, is kind of like the True value in a conditional. Not finding that thing is therefore kind of like the False value of a conditional, which triggers the else statement.
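The same search pattern works for any kind of item, not just numbers. As a sketch with a different, made-up search (the function name find_long_word and the word lists are assumptions for illustration), here both outcomes are shown side by side:

```python
def find_long_word(words, min_length):
    """Report the first word with at least min_length characters (hypothetical helper)."""
    for word in words:
        if len(word) >= min_length:
            result = f'Found {word}.'
            break                       # found it, so the else is skipped
    else:
        result = 'No word was long enough.'   # the loop ended without a break
    return result

print(find_long_word(['spam', 'eggs', 'sausage'], 6))
print(find_long_word(['spam', 'eggs'], 6))
print(find_long_word([], 6))            # an empty loop also ends without a break
```

Note the last call: with nothing to loop over, the loop still "ends normally", so the else runs.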

***ichabod801*** · Sep-13-2019, 06:44 PM

Now, I noted the connection between the range function and list slicing. Lots of people notice that, and that leads them to iterate over lists like this:

primes = [2, 3, 5, 7, 11, 13, 17]
for prime_index in range(len(primes)):
    square = primes[prime_index] ** 2
    print(f'The square of {primes[prime_index]} is {square}.')

This is also common among programmers familiar with other languages, as it is a common way to loop over lists or arrays or whatever other languages call them. However, in Python it is wasteful. We don't care what index the prime is at in the list, we just care about the prime and its square. Python gives us a way to get the primes without the indexes:

primes = [2, 3, 5, 7, 11, 13, 17]
for prime in primes:
    square = prime ** 2
    print(f'The square of {prime} is {square}.')

Now we have simpler code with the exact same result. Simple is good. But how does it work? Lists are iterables, just like what the range function gives us. The for loop converts the list to an iterator, which assigns one item at a time to the looping variable from the list, in order. So we get the primes without having to worry about their index.

But maybe we do want the index. Should we go back to using range? No, we should use the enumerate function:

primes = [2, 3, 5, 7, 11, 13, 17]
for prime_index, prime in enumerate(primes):
    square = prime ** 2
    print(f'The square of prime #{prime_index + 1} is {square}.')

The enumerate function gives us an iterator, but it works on another iterable, in this case the primes list. It returns tuples of the index of the item and the item. That's why we have two loop variables in the above for loop. (If you are not familiar with tuple assignment in Python, look it up.)

We had to add one above to change the index starting at zero to an index starting at one. We can get around that with the start parameter to enumerate:

primes = [2, 3, 5, 7, 11, 13, 17]
for prime_index, prime in enumerate(primes, start=1):
    square = prime ** 2
    print(f'The square of prime #{prime_index} is {square}.')

It's more typing in this case, but it can be less typing, and I think it's clearer. Clear is even more important than simple, although simple also helps clarity.
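To see exactly what enumerate hands to the loop, you can wrap it in list(). This is only a sketch, with a made-up colors list:

```python
colors = ['red', 'green', 'blue']

print(list(enumerate(colors)))            # indexes start at 0 by default
print(list(enumerate(colors, start=1)))   # start only changes the numbering, not the items

for position, color in enumerate(colors, start=1):
    print(f'Color #{position} is {color}.')
```

Each tuple is unpacked into the two loop variables, which is why the for statement names two of them.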

Another common use of looping with indexes is to loop over two lists at the same time, like so:

primes = [2, 3, 5, 7, 11, 13, 17]
fibs = [1, 1, 2, 3, 5, 8, 13]
for index in range(len(primes)):
    print(primes[index] + fibs[index])

Note that if fibs is shorter than primes, you will get an index error here. We could correct for that with a call to min(), but there is an easier way that avoids indexing: the zip function. The zip function takes two or more lists and "zips" them together. First it returns a tuple of the first item in each list, then a tuple of the second item in each list, and so on. So if you have nums = [1, 2, 3] and chars = ['A', 'B', 'C'], then zip(nums, chars) gives you (1, 'A'), then (2, 'B'), then (3, 'C'). As a bonus, it automatically stops when the shortest list runs out. So our loop becomes:

primes = [2, 3, 5, 7, 11, 13, 17]
fibs = [1, 1, 2, 3, 5, 8, 13]
for prime, fib in zip(primes, fibs):
    print(prime + fib)

And what do you know: it's simpler.
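Here is a short sketch of the "stops at the shortest list" behavior, using made-up lists of different lengths. The flags list is only there to show that zip accepts more than two arguments.

```python
nums = [1, 2, 3]
chars = ['A', 'B']             # one item shorter than nums
flags = [True, False, True]

print(list(zip(nums, chars)))          # the 3 is silently dropped
print(list(zip(nums, chars, flags)))   # three lists give three-item tuples
```

Keep in mind that the dropped items disappear without any warning, so this is only a bonus when you actually want the shorter length.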

Yet another common use of index loops is to loop over sequential pairs of items in a list:

primes = [2, 3, 5, 7, 11, 13, 17]
for prime_index in range(len(primes) - 1):
    print(primes[prime_index] * primes[prime_index + 1])

This gives us the products of the consecutive primes. This gets a bit more complicated than going over two lists at once. We have to subtract one to avoid going over the end of the list with the second item, and add one to get to the second item. We can do this with zip by zipping the list to itself without the first item:

primes = [2, 3, 5, 7, 11, 13, 17]
for first, second in zip(primes, primes[1:]):
    print(first * second)

Here we take advantage of zip stopping when the shortest list runs out. That way we don't have to subtract an item from the range to account for an addition elsewhere.

Just as Python gives a clean, simple way to loop over lists, we can also loop over other containers:

Looping over a string gives you the characters of the string.
Looping over tuples is just like looping over a list.
Looping over dictionaries loops over the keys of the dictionary.
- You can loop over the values of a dictionary with some_dict.values()
- You can loop over the (key, value) pairs of a dictionary with some_dict.items()
Looping over files loops over the lines of the files.

I often find myself wanting to loop over the words in a string rather than the characters, but that is easy to do with for word in text.split():.
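The container loops listed above can be sketched as follows. The menu dictionary and its prices are invented for the example.

```python
menu = {'spam': 3, 'eggs': 2}

for item in menu:                    # plain loop: the keys
    print(item)
for price in menu.values():          # the values
    print(price)
for item, price in menu.items():     # (key, value) pairs, unpacked into two variables
    print(f'{item} costs {price}')

for char in 'spam':                  # a string gives its characters
    print(char)
for word in 'spam and eggs'.split(): # split() gives the words instead
    print(word)
```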

***ichabod801*** · Sep-13-2019, 06:47 PM

Loops can be combined together. For example, say we want to make a list that represents a deck of cards. We will abbreviate the card names, so the Five of Clubs is 5C and the Queen of Hearts is QH. Rather than type all of these abbreviations out by hand, we can have Python do it for us:

deck = []
for rank in 'A23456789TJQK':
    for suit in 'CDHS':
        deck.append(rank + suit)

That will give a list of all 52 card abbreviations. Let's step through this, so it is clear what is going on. First we have our empty deck, and then we have our for loop over the ranks of the cards. The first time through this loop (often called the "outer loop"), rank is set to 'A'. Remember that looping over a string gives us the characters of the string. Then we start the loop over suits (the "inner loop"), where suit is set to 'C'. Then we append our first card abbreviation to the deck, 'AC' for the Ace of Clubs.

Now the code indented under the inner loop is finished, so we go back to the top of the inner loop. Some new programmers expect the code to go back to the top of the outer loop. However, the code indented under the outer loop is not done until the inner loop is completely finished. So we go back to the inner loop over suits, and suit is set to 'D'. The rank variable has not been changed, so we append 'AD' to deck.

This continues through hearts and spades, until our deck list is ['AC', 'AD', 'AH', 'AS']. After 'AS' is appended to deck, the code indented under the inner loop is done. We go back to the top of the inner loop, and find there are no more suits to get. Then we go to just after the end of the inner loop. But that is the end of the code indented under the outer loop. So the outer loop triggers again and assigns '2' to rank. Now we go back to the top of the inner loop, which starts over again, and assigns 'C' to suit. Then we append our fifth card to the deck list, '2C'. The inner loop cycles through, appending all of the 2's to the deck, and then we get the next iteration of the outer loop and rank is set to '3'. This repeats until we get to the final card in the deck, 'KS'.

Let's say that we are in the inner loop, and we want to break out of the loop and stop building the deck. We might try this:

deck = []
for rank in 'A23456789TJQK':
    for suit in 'CDHS':
        if (rank, suit) == ('J', 'H'):
            break
        deck.append(rank + suit)

The above code will not stop building the deck when it gets to the Jack of Hearts. It will merely skip the Jack of Hearts and the Jack of Spades, but will include all of the queens and kings. What happens is that the break statement stops the inner loop, but we are still in the outer loop. So the outer loop goes to the next iteration, and assigns 'Q' to rank, and then the inner loop starts up again. This is the same for the continue statement as well. A continue in the inner loop will only affect the inner loop, and won't skip to the next pass through the outer loop.

How might we deal with this situation? One method is to use a flag variable to do a second break:

deck = []
done = False
for rank in 'A23456789TJQK':
    for suit in 'CDHS':
        if (rank, suit) == ('J', 'H'):
            done = True
            break
        deck.append(rank + suit)
    if done:
        break

So now we are sending a message to the outer loop using the done variable, so that the outer loop knows to break as well. If you are familiar with try/except blocks, you can use an exception to stop both loops:

deck = []
try:
    for rank in 'A23456789TJQK':
        for suit in 'CDHS':
            if (rank, suit) == ('J', 'H'):
                raise RuntimeError('Loop finished.')
            deck.append(rank + suit)
except RuntimeError:
    pass

Another possibility is to use an else to continue past a break. While an else statement is attached to a particular loop, and is only used by that loop, the code indented under the else statement is executed after the loop is finished. So breaks and continues in that block of code will affect the next loop out.

deck = []
for rank in 'A23456789TJQK':
    for suit in 'CDHS':
        if (rank, suit) == ('J', 'H'):
            break
        deck.append(rank + suit)
    else:
        continue
    break

So think through how this works. The first several times through the outer loop, the inner loop does not execute the break. Since no break occurs, the code indented under the else statement for the inner loop is executed. That is a continue statement, which takes us to the next iteration of the outer loop. Going to the top like that skips that final break statement.

Eventually we get to the Jack of Hearts, at which point we break out of the inner loop. Since we had a break statement, the else statement does not trigger, and the continue is not executed. That takes us to the final break statement, taking us out of the outer loop as well.

If that last example was rather confusing to you, that's why I don't like that way of breaking out of an inner loop. It's a bit simpler than using a flag variable, but it's also a bit harder to follow how it skips all over the place. Personally, I find using an exception to get out of multiple loops overkill, but it's really up to you.
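One more option, not covered above, is to put the nested loops inside a function and use return, since return leaves every loop in the function at once. This is only a sketch, and the function name build_deck_until and its stop_card parameter are invented for illustration.

```python
def build_deck_until(stop_card):
    """Build the deck in rank-then-suit order, stopping just before stop_card."""
    deck = []
    for rank in 'A23456789TJQK':
        for suit in 'CDHS':
            if rank + suit == stop_card:
                return deck      # leaves both loops (and the function) at once
            deck.append(rank + suit)
    return deck                  # stop_card never came up, so this is the full deck

partial = build_deck_until('JH')
print(len(partial), partial[-1])
```

The trade-off is that the loops have to live in their own function, which is not always convenient.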

If we go back to our original nested loop example, we see that it puts together every possible combination of rank and suit. If you are familiar with set theory or SQL joins, you will recognize this as the Cartesian product of the ranks and the suits. This is a very common thing you run into when programming. That's why Python has a product function to do this for us in the incredibly useful itertools package.

import itertools
deck = []
for rank, suit in itertools.product('A23456789TJQK', 'CDHS'):
    deck.append(rank + suit)

Now we generate the whole deck with just one loop, avoiding any issues with breaking out of multiple loops. It is still good to know how to break out of multiple loops. While the itertools package is awesome, and you should definitely check it out for other ways to simplify loops, it can't do everything.
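If you want to convince yourself that product visits the pairs in the same order as the nested loops, a quick comparison like this sketch will do it. The names nested and flat are just labels for the two results.

```python
import itertools

nested = []
for rank in 'A23456789TJQK':
    for suit in 'CDHS':
        nested.append(rank + suit)

flat = []
for rank, suit in itertools.product('A23456789TJQK', 'CDHS'):
    flat.append(rank + suit)

print(flat == nested)   # same cards in the same order
print(len(flat))
```

The last argument to product changes fastest, just like the innermost loop.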

***ichabod801*** · Sep-13-2019, 06:49 PM

Our last example builds a list of values. This is another very common use of loops. Once again, Python provides us with a simpler way to do it: list comprehensions. These are not functions, like enumerate, zip, and itertools.product, but syntax for building lists using loop statements. The classic loop form of building a list is this:

some_list = []
for value in another_list:
    some_list.append(some_function(value))

The list comprehension form of this is just one line:

some_list = [some_function(value) for value in another_list]

So first we take whatever we are appending, and we put that at the beginning of the list comprehension. Then we take the for statement, and we put that at the end of the list comprehension. Note that we drop the colon from the for statement, since we are not starting a new block of code.

So our deck of cards example could become a list comprehension this way:

from itertools import product
deck = [rank + suit for rank, suit in product('A23456789TJQK', 'CDHS')]

That's the one loop version of the deck example, but you can also have nested loops in a list comprehension. Here's how you would do the nested loop version of the deck example:

deck = [rank + suit for rank in 'A23456789TJQK' for suit in 'CDHS']

Note that this gives you the exact same sequence of cards that our earlier example gave. That means that the inner loop in a list comprehension is the one that is listed second.
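The ordering is easier to see with a cut-down example. This sketch uses just two ranks and two suits (an arbitrary choice) and compares the comprehension against the equivalent nested loops:

```python
pairs = [rank + suit for rank in 'AK' for suit in 'CD']
print(pairs)

looped = []
for rank in 'AK':          # listed first in the comprehension: the outer loop
    for suit in 'CD':      # listed second: the inner loop
        looped.append(rank + suit)
print(pairs == looped)
```

In other words, the for clauses read left to right in the same order you would write them top to bottom.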

But wait! There's more! Often you have a conditional in your loop. Take generating a list of primes:

def is_prime(number):
    # The + 1 makes sure number // 2 itself is tested, so 4 is not reported as prime.
    for factor in range(2, number // 2 + 1):
        if not number % factor:
            return False
    return True

primes = []
for number in range(2, 18):
    if is_prime(number):
        primes.append(number)

Yeah, it could be more efficient, but this is just a toy example. The real question is: how do we put this in a list comprehension?

def is_prime(number):
    # The + 1 makes sure number // 2 itself is tested, so 4 is not reported as prime.
    for factor in range(2, number // 2 + 1):
        if not number % factor:
            return False
    return True

primes = [number for number in range(2, 18) if is_prime(number)]

As you can see, we can put the conditional in the list expression as well, right after the for statement. That's nice, but take a look at that is_prime function. It looks a lot like the loops we've been converting into list comprehensions, it's just not appending the values to a list. But if we did make a list of those values, we could put it into the all() function, which returns True if all the values in its list are True. That would turn our is_prime function into a list comprehension:

def is_prime(number):
    return all([number % factor for factor in range(2, number // 2 + 1)])

primes = [number for number in range(2, 18) if is_prime(number)]

While I may have pooh-poohed efficiency earlier with this toy example, this does make it even worse. Our old function returned False after finding the first factor of the number. This function checks all of the potential factors before returning a value. It turns out there is a subtle way to fix this:

def is_prime(number):
    return all(number % factor for factor in range(2, number // 2 + 1))

primes = [number for number in range(2, 18) if is_prime(number)]

Do you see the difference? All we did was take out the brackets for the list comprehension in the function. Now it's not a list comprehension, it's a generator expression. What is a generator? It works a lot like what we've been using all along: range, enumerate, zip, dict.values(), dict.items() and product are not technically generators, but they are lazy in the same way. The key point with all of these is that they give out their values one at a time, rather than creating all of the values at once and returning a list of them.

This is of use here because the all() function is smart. It looks at the values one at a time, and if it finds a False value, it stops looking and returns False. Just like the original version of our is_prime function did.
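You can watch all() stop early by recording which factors actually get tested. The sketch below uses a made-up helper, remainder, that notes each factor in a checked list before doing the calculation. For simplicity it tries every factor below 15 rather than stopping at half the number.

```python
checked = []   # records every factor that actually gets tested

def remainder(number, factor):
    """Hypothetical helper: note the factor, then return number % factor."""
    checked.append(factor)
    return number % factor

# Generator expression: all() stops at the first zero remainder.
lazy_result = all(remainder(15, factor) for factor in range(2, 15))
lazy_checks = list(checked)

checked.clear()

# List comprehension: every remainder is computed before all() sees any of them.
eager_result = all([remainder(15, factor) for factor in range(2, 15)])
eager_checks = list(checked)

print(lazy_result, lazy_checks)
print(eager_result, len(eager_checks))
```

Both versions agree that 15 is not prime, but the generator version only tests 2 and 3 while the list version tests all thirteen candidates.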

So before we dealt with two loops at the same time, nesting one within the other. Can we do that with list comprehensions, or with a list comprehension and a generator? It turns out we can. Here's the prime number generator in one line:

primes = [number for number in range(2, 18) if all(number % factor for factor in range(2, number // 2 + 1))]

While this may seem cool, it's actually a pet peeve I have with list comprehensions. Some people seem to think that every list that can be built with a list comprehension should be built with a list comprehension. They end up with huge list comprehensions that are just a confusing hot mess. My advice is that if your list comprehension doesn't fit on one line, put it back in a standard loop structure. The indentation structure of a standard loop will make your code more clear to whoever is trying to understand it.

Note that lists don't have all the fun. Dictionaries have comprehensions as well. The difference is, of course, that you need two values each time through the "loop," a key and a value. For example, take this loop:

rot13 = {}
letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
for plain, cipher in zip(letters, letters[13:] + letters[:13]):
    rot13[plain] = cipher

With a list comprehension, we started with the part that was being appended. With a dictionary comprehension we start with the key and value of the assignment. But instead of an assignment (which can't be used in a comprehension), we use a colon as in a dictionary literal. As before, the for statement goes in the second part of the comprehension. So the dictionary comprehension for the above loop is:

letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
rot13 = {plain: cipher for plain, cipher in zip(letters, letters[13:] + letters[:13])}

As with the list comprehension, we could add a conditional at the end, and both the key and value can be any valid expression.
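As a sketch of those two points, here is the same rot13 table built three ways: plain, with a conditional that keeps only the vowels, and with expressions (lower-cased letters) for the key and value. The names shifted, vowels, lower, and encoded are made up for this example.

```python
letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
shifted = letters[13:] + letters[:13]

rot13 = {plain: cipher for plain, cipher in zip(letters, shifted)}

# A conditional at the end filters which pairs make it into the dictionary.
vowels = {plain: cipher for plain, cipher in zip(letters, shifted) if plain in 'AEIOU'}

# The key and the value can each be any expression, not just a bare variable.
lower = {plain.lower(): cipher.lower() for plain, cipher in zip(letters, shifted)}

encoded = ''.join(rot13[char] for char in 'SPAM')
print(encoded)
print(vowels)
```

Because rot13 shifts by half the alphabet, running the encoded text through the same dictionary gets the original back.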
