Introduction

I want to keep this post really short, simple, and to the point. With that being said, here’s the toy problem that we’ll use to illustrate how we can leverage pleasingly parallel problem types with small I/O requirements to speed up python programs. The speeds recorded in this article were obtained using an i7-6800K processor.

We are going to deploy the absolute worst method for determining the primality of a number (i.e. brute force), and then abuse all of our CPU cores. One could argue the only way to abuse a CPU worse than this is to mine cryptocurrency with it.

Required tech

Code Review

Here’s the bit of code that deals with the imports and prime numbers.

from concurrent.futures import ProcessPoolExecutor
import time

# a list of numbers we want to check for primality
numbers = [2, 7, 13, 28, 99991, 188877, 1616161, 4441939, 90870847,
           92525533, 94939291, 98776551, 99999999, 100030001]

def primeCheck(number):
    """ return True if the number is prime, else return False """
    if number % 2 == 0 and number != 2:
        return False
    else:
        for i in range(3, number):
            if number % i == 0:
                return False
        return True

We’ll be using ProcessPoolExecutor to perform the parallel processing and time to compute the time. We’ve also been given a list of numbers that may or may not be prime, and we’ll use the function primeCheck to brute force whether a number is prime or not. Let’s check out the script that does the processing now.

if __name__ == "__main__":

    # no speedup
    print("Sequential processing...")
    start = time.time()
    results1 = list(map(primeCheck, numbers))
    print(results1)
    finish = time.time()
    print("Finished in %.3f seconds." % (finish - start))

    # speedup
    print("Use all CPU cores...")
    start = time.time()
    pool = ProcessPoolExecutor(max_workers=None)
    results2 = list(pool.map(primeCheck, numbers))
    print(results2)
    finish = time.time()
    print("Finished in %.3f seconds." % (finish - start))

Visually the code in # no speedup is very similar to # speedup. The main difference is the usage of pool, and it’s application to map. Note that setting max_workers=None will utilize all possible cores. ProcessPoolExecutor does a lot of work behind the scenes to allow for speedup to occur. In fact, so much work goes on behind the scenes that it is very possible to experience no speedup (or worse a decrease of performance) when using parellelization if the overhead costs exceed the parallelization gains. The moral of the story is intense mathematical computations that don’t depend on each other and have low I/O requirements have a good chance of being sped up.

Speedup Results

Here is the output from the script:

>> Sequential processing...
>> [True, True, True, False, True, False, True, True, True, True, True, False, False, True]
>> Finished in 22.667 seconds.
>> Use all CPU cores...
>> [True, True, True, False, True, False, True, True, True, True, True, False, False, True]
>> Finished in 6.319 seconds.

Comparing numbers we see a 72.92% decrease in runtime (roughly 3.5 times faster) when using parallelization. My CPU has 6 cores and 12 threads, so clearly it didn’t scale linearly. Why might that be? Here are a couple possibilities:

  • There isn’t enough work to benefit from the additional processing power.
  • Due to the serialization performed by the multiprocessing module.

Conclusion

We can conclude by saying it is quite possible to speed up certain python scripts by utilizing what multiprocessing has to offer. It requires very specific conditions to achieve, but those conditions appear quite often in programming applications so it is a tool worth having in our tool belts!


Further Reading

[1] https://pymotw.com/3/concurrent.futures/
[2] https://docs.python.org/2/library/multiprocessing.html

Project Code

from concurrent.futures import ProcessPoolExecutor
import time

# a list of numbers we want to check for primality
numbers = [2, 7, 13, 28, 99991, 188877, 1616161, 4441939, 90870847,
           92525533, 94939291, 98776551, 99999999, 100030001]

def primeCheck(number):
    """ return True if the number is prime, else return False """
    if number % 2 == 0 and number != 2:
        return False
    else:
        for i in range(2, number):
            if number % i == 0:
                return False
        return True


if __name__ == "__main__":

    # no speedup
    print("Sequential processing...")
    start = time.time()
    results1 = list(map(primeCheck, numbers))
    print(results1)
    finish = time.time()
    print("Done in %.3f seconds." % (finish - start))

    # speedup
    print("Use all CPU cores...")
    start = time.time()
    pool = ProcessPoolExecutor(max_workers=None)
    results2 = list(pool.map(primeCheck, numbers))
    print(results2)
    finish = time.time()
    print("Done in %.3f seconds." % (finish - start))