Python's
Parallel
Processing
Possibilities
Samuel Colvin
Who Am I?
Agenda
I'll try to:
- Talk about 4 levels of concurrency
- Demonstrate them using Python
- Explain why you might (not) use them
- Make it fun
I won't:
- Prepare you for a CS exam on distributed computing
- Go into detail on protocols
- Give an exhaustive description of the technology
source: www.spec.org, github.com/samuelcolvin/analyze-spec-benchmarks
The rationale for Parallel Processing
The Metaphor
1. Multiple Machines
Machine = host/computer/virtual machine/container
RQ
Example
worker.py:

import requests

def count_words(year: int):
    resp = requests.get(f'https://ep{year}.europython.eu/en/')
    print(f'{year}: {len(resp.text.split())}')

rq_example.py:

from redis import Redis
from rq import Queue
from worker import count_words

q = Queue(connection=Redis())
for year in range(2016, 2020):
    print(q.enqueue(count_words, year))
Multiple Machines - Advantages
- Scaling is easy
- Linear cost increase
- Isolation!
Multiple Machines - Disadvantages
- Need to take care of networking between the machines
- Harder to set up in a dev environment
- No standard library implementation
2. Multiple Processes
- Processes are an Operating System concept
- Exist (with a little variation) on all OSes
- Often used as a stopgap for multiple machines during testing
Processes
Example
from multiprocessing import Process, JoinableQueue
import requests

def count_words(year: int):
    resp = requests.get(f'https://ep{year}.europython.eu/en/')
    print(f'{year}: {len(resp.text.split())} words')

def worker(id):
    while True:
        item = q.get()
        if item is None:
            print('quitting worker', id)
            break
        count_words(item)
        q.task_done()

q = JoinableQueue()
processes = []
for id in range(2):
    p = Process(target=worker, args=(id,))
    p.start()
    processes.append(p)

for year in range(2016, 2020):
    q.put(year)
q.join()

for _ in processes:
    q.put(None)
for p in processes:
    p.join()
➤ python multiprocessing_example.py
2017: 4123 words
2016: 3794 words
2019: 1953 words
2018: 4334 words
quitting worker 0
quitting worker 1
Multiple Processes - Advantages
- Easy to run
- OS guarantees memory separation between processes
- Fast to communicate
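The memory-separation guarantee is easy to demonstrate: a child process mutating a module-level global leaves the parent's copy untouched. A minimal sketch (to actually share state you'd need e.g. `multiprocessing.Value` or a `Queue`):

```python
from multiprocessing import Process

counter = 0

def increment():
    global counter
    counter += 1  # mutates this process's copy of the global only

if __name__ == '__main__':
    p = Process(target=increment)
    p.start()
    p.join()
    print('parent counter:', counter)  # still 0: the child's write is invisible here
```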
Multiple Processes - Disadvantages
- Limits to scaling
- Fixed capacity
3. Multiple Threads
- Threads allow concurrent execution from within a single process
- Thus multiple threads can access the same memory
- 2 varieties: kernel threads, user/green threads
- "Threading" in Python generally refers to kernel threads
Threading
Example
from queue import Queue
from threading import Thread
import requests

def count_words(year: int):
    resp = requests.get(f'https://ep{year}.europython.eu/en/')
    print(f'{year}: {len(resp.text.split())} words')

def worker(id):
    while True:
        item = q.get()
        if item is None:
            print('quitting thread', id)
            break
        count_words(item)
        q.task_done()

q = Queue()
threads = []
for id in range(2):
    t = Thread(target=worker, args=(id,))
    t.start()
    threads.append(t)

for year in range(2016, 2020):
    q.put(year)
q.join()

for _ in threads:
    q.put(None)
for t in threads:
    t.join()
➤ python threading_example.py
2017: 4123 words
2016: 3794 words
2019: 1953 words
2018: 4334 words
quitting thread 0
quitting thread 1
Multiple Threads - Advantages
- Lighter than processes
- Faster to create and switch than processes
- Share memory (if you dare!)
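Shared memory cuts both ways: even in CPython, two threads updating the same global can lose updates unless you lock around the read-modify-write. A minimal sketch using `threading.Lock`:

```python
from threading import Thread, Lock

counter = 0
lock = Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:  # without the lock, concurrent += can lose updates
            counter += 1

threads = [Thread(target=increment, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # reliably 200000 thanks to the lock
```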
Multiple Threads - Disadvantages
- Memory locking is horrid
- The GIL limits the usefulness of threading in Python:
Do not communicate by sharing memory; instead, share memory by communicating.
- Go Proverb
GIL ... protects access to Python objects, preventing multiple threads from executing Python bytecodes at once
- Python Wiki
The Global Interpreter Lock
from queue import Queue
from threading import Thread
from time import time

def do_calcs(year: int):
    print(sum(range(year * int(1e5))))

t1 = time()
for year in range(2016, 2020):
    do_calcs(year)
t2 = time()
print(f'Time taken without threads: {t2 - t1:0.2f}s')

def worker(id):
    while True:
        item = q.get()
        if item is None:
            print('quitting thread', id)
            break
        do_calcs(item)
        q.task_done()

t3 = time()
...
for year in range(2016, 2020):
    q.put(year)
...
t4 = time()
print(f'Time taken with 2 threads: {t4 - t3:0.2f}s')
➤ python gil.py
20321279899200000
20341444899150000
20361619899100000
20381804899050000
Time taken without threads: 7.63s
20321279899200000
20341444899150000
20361619899100000
20381804899050000
quitting thread 1
quitting thread 0
Time taken with 2 threads: 7.65s
from queue import Queue
from threading import Thread
from time import time
import numpy as np

def do_calcs(year: int):
    print(np.sum(np.arange(year * int(1e5))))

t1 = time()
for year in range(2016, 2020):
    do_calcs(year)
t2 = time()
print(f'Time taken without threads: {t2 - t1:0.2f}s')

def worker(id):
    while True:
        item = q.get()
        if item is None:
            print('quitting thread', id)
            break
        do_calcs(item)
        q.task_done()

t3 = time()
...
for year in range(2016, 2020):
    q.put(year)
...
t4 = time()
print(f'Time taken with 2 threads: {t4 - t3:0.2f}s')
➤ python gil_numpy.py
20321279899200000
20341444899150000
20361619899100000
20381804899050000
Time taken without threads: 2.36s
20321279899200000
20341444899150000
20381804899050000
20361619899100000
quitting thread 1
quitting thread 0
Time taken with 2 threads: 1.34s
4. Asynchronous I/O
- AKA coroutines/green threads/fibers
- "Asyncio" in Python
- Cooperative scheduling
- Mostly (but not always) used for networking tasks
- Based on an event loop which schedules coroutines
- 1 kernel thread - only one piece of code is running at any time
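Cooperative scheduling means a coroutine runs until it explicitly yields with `await`; only then can the event loop run another. A minimal sketch showing the interleaving (`asyncio.sleep(0)` yields control without actually waiting):

```python
import asyncio

order = []

async def task(name):
    order.append(f'{name} start')
    await asyncio.sleep(0)  # explicitly hand control back to the event loop
    order.append(f'{name} end')

async def main():
    await asyncio.gather(task('a'), task('b'))

asyncio.run(main())
print(order)  # ['a start', 'b start', 'a end', 'b end']
```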
Without
Asyncio
With
Asyncio
Asyncio Example
from aiohttp import ClientSession
import asyncio

async def count_words(year: int):
    async with ClientSession() as session:
        async with session.get(f'https://ep{year}.europython.eu/en/') as resp:
            text = await resp.text()
            print(f'{year}: {len(text.split())} words')

async def main():
    coroutines = []
    for year in range(2016, 2020):
        coroutines.append(count_words(year))
    await asyncio.gather(*coroutines)

asyncio.run(main())
➤ python asyncio_example.py
2019: 1953 words
2017: 4123 words
2016: 3782 words
2018: 4334 words
Asyncio - Advantages
- Even lighter - easily run thousands of concurrent tasks
- Easier to reason about
- Less risk of memory corruption
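The "thousands of concurrent tasks" claim is cheap to verify: since a coroutine is just a Python object, not an OS thread, 10,000 sleeping tasks complete in roughly the length of one sleep, not 10,000 of them. A sketch (exact timing varies):

```python
import asyncio
from time import time

async def nap(i):
    await asyncio.sleep(0.1)  # all 10,000 naps overlap on one thread
    return i

async def main():
    return await asyncio.gather(*(nap(i) for i in range(10_000)))

t1 = time()
results = asyncio.run(main())
elapsed = time() - t1
print(f'{len(results)} tasks in {elapsed:0.2f}s')  # far less than 10_000 * 0.1s
```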
Asyncio - Disadvantages
- By default asyncio provides no speedup for CPU-bound tasks
- Whole new way of thinking
- Applications have to be entirely rewritten
explicit cooperative scheduling is awesome, but it can't be implicit
- me
This is where it gets tricky
Machines
Processes
Threads
Asyncio
rq forks the main process to run the worker
ThreadPoolExecutor
ProcessPoolExecutor
aiohttp, arq
multiprocessing.Queue
Using Asyncio for Processes and Threading
- Performance of processes or threads from the comfort of asyncio
- ThreadPoolExecutor - for file operations
- ProcessPoolExecutor - for CPU-bound tasks
ThreadPoolExecutor Example
from concurrent.futures import ThreadPoolExecutor
import asyncio
from time import time
import numpy as np

def do_calcs(year: int):
    print(np.sum(np.arange(year * int(1e5))))

async def main():
    loop = asyncio.get_event_loop()
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            loop.run_in_executor(pool, do_calcs, v)
            for v in range(2016, 2020)
        ]
        await asyncio.gather(*futures)

t1 = time()
asyncio.run(main())
print(f'Time taken with 2 threads: {time() - t1:0.2f}s')
➤ python asyncio_numpy.py
20321279899200000
20341444899150000
20381804899050000
20361619899100000
Time taken with 2 threads: 1.27s
Summary
- 4 levels of concurrency: machines, processes, threads, asyncio
- All possible with (but not limited to) Python
- All have strengths, weaknesses and pitfalls
- They often interact with each other
It's easy to read the docs; the tricky part (and what I've tried to do today) is understanding the big picture
Thank you
Questions?
this presentation: tiny.cc/pythonsppp