Concurrency¶
Concurrent programming means that two or more sub-programs are running at the same time. This potentially allows you to use all your processors at once. That sounds like an enticing idea, but there are good reasons to be very cautious with it.
Pros and Cons of Concurrency¶
Often concurrency is a bad idea. The devil is lurking in the details:
Coordinating parallel sub-programs is hard, and the resulting bugs are notoriously difficult to reproduce and debug (look up the terms “race condition” and “heisenbug”).
Python has a strange thing called the GIL (Global Interpreter Lock). It means that only one thread can execute Python bytecode at any given moment, so threads do not speed up CPU-bound code.
There are great existing solutions for many typical applications (web scraping, web servers).
On the other hand, concurrency can be a good idea:
if your tasks are waiting for some I/O anyway, the speed of Python does not matter.
starting multiple separate Python processes is rather easy (with the multiprocessing module).
if you are looking for a challenge.
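The GIL limitation from the list above can be demonstrated with a small benchmark sketch. The numbers are machine-dependent, but on CPython the threaded run is typically no faster than the sequential one (the function name `count_down` and the workload size are chosen for illustration only):

```python
"""Sketch: CPU-bound work does not get faster with threads on CPython."""
import threading
import time


def count_down(n):
    # pure Python loop: CPU-bound, no I/O
    while n > 0:
        n -= 1


N = 5_000_000

# run the work twice, one call after the other
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# run the same work in two threads at once
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# on CPython both figures are usually close, because the GIL lets
# only one thread execute Python bytecode at a time
print(f"sequential: {sequential:.2f}s  threaded: {threaded:.2f}s")
```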
There are three noteworthy approaches to concurrency in Python: threads, coroutines and multiple processes.
Multithreading¶
Multithreading is the oldest way to implement parallel execution in Python. It has its flaws, but the basic idea is easy to grasp:
"""
Factorial with threads
# adopted from
http://www.devshed.com/c/a/Python/Basic-Threading-in-Python/1/
"""
import threading
import time
import random
class FactorialThread(threading.Thread):
def __init__(self, number):
super().__init__()
self.number = number
@staticmethod
def factorial(n):
return (
1 if n == 0
else n * FactorialThread.factorial(n - 1)
)
def run(self):
result = self.factorial(self.number)
time.sleep(random.randint(5, 20))
print(f"{self.number}! = {result}")
for number in range(10):
FactorialThread(number).start()
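Subclassing `threading.Thread` is not the only option. A more concise sketch of the same idea uses a thread pool from `concurrent.futures` (the helper `fact` is mine, not part of the example above):

```python
"""Sketch: factorials computed in a thread pool with concurrent.futures."""
from concurrent.futures import ThreadPoolExecutor
from math import factorial


def fact(n):
    # return the input together with the result, so the output is labeled
    return n, factorial(n)


# the pool starts the threads and joins them when the block ends
with ThreadPoolExecutor(max_workers=4) as pool:
    for number, result in pool.map(fact, range(10)):
        print(f"{number}! = {result}")
```

Note that `pool.map` returns the results in input order, regardless of which thread finishes first.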
Async Coroutines¶
The async interface was added to Python more recently (the `async`/`await` syntax exists since Python 3.5). It avoids many of the pitfalls of threads.
"""
Example of parallel execution with asyncio
see:
https://docs.python.org/3/library/asyncio-task.html
"""
import asyncio
import random
from functools import reduce
def multiply(a, b):
return a * b
async def factorial(number):
"""delayed calculation of factorial numbers"""
result = reduce(multiply, range(1, number + 1), 1)
delay = random.randint(5, 20)
await asyncio.sleep(delay)
print(f"after {delay:2} seconds: {number}! = {result}")
async def main():
# create concurrent tasks
tasks = []
for i in range(10):
tasks.append(asyncio.create_task(factorial(i)))
# wait for tasks to finish
# (all have to be awaited)
for t in tasks:
await t
# run the toplevel async function
asyncio.run(main())
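The explicit loop over tasks can also be written with `asyncio.gather`, which schedules and awaits many coroutines in one call. A minimal sketch with an invented coroutine `greet` (not part of the example above):

```python
"""Sketch: awaiting several coroutines at once with asyncio.gather."""
import asyncio


async def greet(n):
    # simulate a short I/O wait
    await asyncio.sleep(0.1)
    return f"task {n} done"


async def main():
    # gather runs the coroutines concurrently and
    # returns their results in input order
    results = await asyncio.gather(*(greet(i) for i in range(3)))
    for line in results:
        print(line)


asyncio.run(main())
```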
Subprocesses¶
The subprocess module allows you to launch extra processes through the operating system. Subprocesses are not restricted to Python programs. This makes them the most flexible approach, but also the one with the highest overhead.
"""
Launch processes with the subprocess module
https://docs.python.org/3/library/subprocess.html
"""
import subprocess
# launch a single external process
# r = subprocess.run(["python", "factorial.py", str(5)])
# try some other command than Python
procs = []
for i in range(10):
cmd = ["python", "factorial.py", str(i)]
p = subprocess.Popen(cmd) # , stdout=subprocess.PIPE)
# add stdout argument to see results immediately
procs.append(p)
for p in procs:
p.wait()
# read output from pipe
#print(p.stdout.read().encode())
"""factorial.py -- the script started by the subprocess example above"""
import sys
import time
import random

n = int(sys.argv[1])
result = 1
while n > 0:
    result *= n
    n -= 1

delay = random.randint(5, 15)
time.sleep(delay)
print(f"factorial of {sys.argv[1]} = {result} after {delay} sec")
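The `multiprocessing` module mentioned earlier offers a Python-only alternative: it also starts separate interpreter processes (and thereby sidesteps the GIL), but without launching an external script. A minimal sketch:

```python
"""Sketch: factorials computed in separate processes with multiprocessing."""
from math import factorial
from multiprocessing import Pool

if __name__ == "__main__":
    # a pool of worker processes; each one is a separate Python interpreter
    with Pool(processes=4) as pool:
        results = pool.map(factorial, range(10))

    # pool.map returns the results in input order
    for n, result in enumerate(results):
        print(f"{n}! = {result}")
```

The `if __name__ == "__main__":` guard is required on platforms where child processes re-import the main module (e.g. Windows).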
Challenge: Gaussian Elimination¶
In gauss_elim.py you find an implementation of Gaussian elimination for solving systems of linear equations.
The algorithm has cubic time complexity.
Parallelize the execution of the algorithm and check whether it actually gets faster.
In test_gauss_elim.py you find unit tests for the module.
Note
The linear equation solver is written in plain Python. Of course, NumPy would also speed up the execution considerably.
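To check whether your parallel version pays off, you need wall-clock measurements. A small timing helper sketch with `time.perf_counter` (the helper name `timed` and the dummy workload are mine; plug in your own solver call):

```python
"""Sketch: timing helper for comparing serial and parallel versions."""
import time


def timed(func, *args):
    """Return the result of func(*args) together with the elapsed wall time."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


# example with a dummy workload; replace by your solver function
result, elapsed = timed(sum, range(1_000_000))
print(f"result={result}, took {elapsed:.3f} s")
```

Run the serial and the parallel version on the same input several times and compare the smallest elapsed times; single measurements are noisy.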