Efficiency Crash Course
[Migrated from old forum, originally posted 20 January 2015.]

I'm writing this post because people occasionally ask about efficiency on the forum, and it's hard to start that discussion with someone who doesn't yet share a common basis of jargon and ideas. I intend to provide that basis here. I assume that the person is already somewhat familiar with programming in Python, and that they have high-school level algebra knowledge (I promise you don't need to freak out here).

I must start off by saying efficiency is rarely a concern, and in all but extreme cases it is sacrificed in favor of code clarity or cleanliness. There is also a wiki which gives general tips, such as the fact that attribute lookups (the dots) slow things down. Here is an example of code that removes a dot to "speed things up" (in a sense):
upper = str.upper          # bind the method to a local name once...
newlist = []
append = newlist.append    # ...so the loop body avoids repeated attribute lookups
for word in oldlist:
    append(upper(word))
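If you're curious whether a micro-optimization like this actually pays off on your machine, the standard library's timeit module gives a quick measurement. Here's a rough sketch of how you might compare the two versions (the word list and repeat count are made up for illustration):

import timeit

setup = "oldlist = ['hello', 'world'] * 1000"

dotted = """
newlist = []
for word in oldlist:
    newlist.append(str.upper(word))
"""

dotless = """
upper = str.upper
newlist = []
append = newlist.append
for word in oldlist:
    append(upper(word))
"""

# Each call returns the total time (in seconds) to run the snippet `number` times.
print(timeit.timeit(dotted, setup=setup, number=1000))
print(timeit.timeit(dotless, setup=setup, number=1000))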
This kind of "optimization" can matter if code that needs to run faster has no other way to do so, and profiling (running the code and measuring how long each part takes) has revealed that this is the biggest cost you can actually change; but frankly it's boring and not very effective. If your code is for coursework, you're probably doing the wrong thing if you're removing dots, and if removing them produces no noticeable speedup then you've just cluttered your code for no reason. How dare you. Instead I'll be discussing code like the following,

s = ""
for string in strings:
   s += string
being discouraged in favor of

s = "".join(strings)
The wiki mentions this, but it does not go into detail. Why exactly is one preferred over the other? That's what I will illuminate here.

We can objectively measure the kind of performance I'm discussing here. I'll come back to the string example momentarily, but first let's take the following two functions, each defined in terms of an integer n:
def f(n):
    for x in xrange(n):
        print x
and
def g(n):
    for x in xrange(n):
        for y in xrange(n):
            print x, y
f() will print something n times, and g() will print something n*n times, also known as n**2 times. Exactly how fast they run will depend on the hardware, the version of Python, and various other things, but we can tell that the second block of code will take roughly a factor of n more time regardless of those other details.

As an example, if we have n set to 10, then g() will take ten times as long as f(), because it prints n**2 = 100 times versus f()'s 10. What if we double n and then run both functions? Will g() still take ten times as long? No. With n=20, n**2 = 400, so it'll take twenty times as long! Any time we double n, the n**2 code takes 4x longer rather than 2x longer (since 2**2 = 4). And if we make n 10x greater, the second one takes 100x longer.
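If you'd rather see those numbers come out of the interpreter than take my word for it, here's a small sketch (not part of the original post, and written with Python 3's range/print) that counts iterations instead of printing, so console output doesn't obscure the growth pattern:

def count_f(n):
    # one unit of work per value of x: grows linearly with n
    count = 0
    for x in range(n):
        count += 1
    return count

def count_g(n):
    # one unit of work per (x, y) pair: grows with n**2
    count = 0
    for x in range(n):
        for y in range(n):
            count += 1
    return count

for n in (10, 20, 100):
    print(n, count_f(n), count_g(n))
# 10 10 100
# 20 20 400
# 100 100 10000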

What if we had both functions print five times instead of once? Now they'll both take 5x as long, but that is true regardless of the value of n. This constant factor is usually ignored, partly because you often can't do much about it, but also because whatever speed you gain by shaving it is usually negligible compared to the readability problems it causes. Usually any gains on this front come for free with code cleanup, and it's ok to stop there. From here on I'll only discuss the part of the cost that depends on the size of the input (n), with part of the motivation being that small values of n will run fast enough anyway.
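To make "constant factor" concrete, here's a quick sketch of the five-print version (my addition, not from the original post; the name is made up). The inner 5 is fixed and doesn't grow with n, so the function is still linear in n:

def f_five(n):
    for x in range(n):
        for _ in range(5):   # 5 repetitions regardless of n: only the constant changes
            print(x)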

The language around this, where we consider how the code "scales" with respect to the size of the input (n) and ignore the constant factor (e.g. the number of prints), is called Big O notation. We'd say that f() is O(n) and g() is O(n**2). The first is faster than the second, and if they solved the same problem then the first would be the more efficient solution.

Now back to the string example above, which I'll show again here.

Bad:
s = ""
for string in strings:
   s += string
Good:
s = "".join(strings)
The reason is that the bad code is O(n**2) (with n being the length of strings), while the join version is O(n). The discrepancy is due to the fact that Python strings are immutable, meaning you can't modify them:
>>> my_string = "Pot."
>>> my_string[0] = "H"
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    my_string[0] = "H"
TypeError: 'str' object does not support item assignment
Since you can't modify them, a new string has to be created whose value is the two strings concatenated together. That means each iteration creates a fresh string, and each one is bigger than the previous. Creating the very last string is an O(n) operation, and if you only did it once you'd be fine. But the second-to-last iteration does nearly that much work too. So even though you do close to zero work on the first iteration, your average is about n/2 per iteration, across n iterations, which means the whole loop does about n * (n/2) = (1/2) * n**2 work; we ignore that 1/2 much like we ignored the 5x before (since doubling the size of the input still quadruples the time it takes).
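To put numbers on that, here's a sketch (my own addition; the list size is made up) that counts how many characters end up being copied by the concatenation loop versus by join:

strings = ["a" * 10] * 1000   # 1000 strings of 10 characters each

# Concatenation in a loop: each iteration builds a brand-new string holding
# everything accumulated so far plus the new piece.
copied = 0
s = ""
for string in strings:
    copied += len(s) + len(string)   # size of the new string created this iteration
    s += string
print(copied)        # 5,005,000 characters copied in total

# join builds the result once, so each character is copied a single time.
join_copied = sum(len(string) for string in strings)
print(join_copied)   # 10,000 characters copied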

That's really the basis I want everyone to have.

I'm very empirical, and this was relatively academic, so if you're like me you'll want to prove to yourself that the string concatenation version really is that slow. If you time the code above, though, you might be disappointed, and I'd like to explain why. I intend to follow up with that if no one else beats me to it, but for now I find this adequate.
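For anyone who wants to run that experiment, here's a minimal timing sketch using the standard library's timeit module (the list size and repeat count are arbitrary); interpreting the numbers is left for the follow-up:

import timeit

setup = "strings = ['hello'] * 10000"

concat = """
s = ""
for string in strings:
    s += string
"""

join = 's = "".join(strings)'

# Total seconds to run each snippet 100 times.
print(timeit.timeit(concat, setup=setup, number=100))
print(timeit.timeit(join, setup=setup, number=100))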

Feedback to improve this tutorial is welcome.

