The ability to achieve high performance is crucial in modern software development. Developers frequently encounter the challenge of selecting the most effective technique for tackling tasks demanding optimal performance.

In this blog post, we'll explore three different approaches to concurrency and parallelism in Python: threads, processes, and coroutines. Each approach has its own advantages and use cases, and understanding them can help you choose the right tool for your specific problem.

Coroutines, threads, and processes can all be used to achieve concurrency, which allows tasks to appear to run simultaneously and potentially improve execution time.

Threads -> Concurrency

What is Concurrency?

Concurrency is a system's ability to make progress on multiple tasks over the same period of time. This doesn't always imply simultaneous execution. Instead, the system can alternate between tasks, advancing each one a little at a time.

Concurrency is beneficial for tasks that spend time waiting on external resources, such as I/O operations or network requests.

By interleaving task execution, the system can make progress on other tasks while waiting for slower I/O operations to complete.

Threads provide a means to achieve concurrency within a single process. The operating system scheduler manages multiple execution threads, allocating CPU time to each. Threads share the process's memory space and resources, allowing for direct access to shared data structures and variables.
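
Because threads share memory, they can update the same variable directly, but those updates usually need some coordination. Here is a minimal sketch of that idea, assuming a shared counter guarded by a lock. The names `counter`, `add_many`, and `run_workers` are made up for illustration.

```python
import threading

counter = 0                      # shared state, visible to every thread in the process
counter_lock = threading.Lock()  # guards updates to the shared counter


def add_many(times):
    """Increment the shared counter `times` times."""
    global counter
    for _ in range(times):
        with counter_lock:       # only one thread updates the counter at a time
            counter += 1


def run_workers(num_threads=4, times=10_000):
    """Start the worker threads, wait for them, and return the final count."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=add_many, args=(times,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run_workers())  # 4 threads x 10,000 increments each
```

The lock is the design choice worth noticing here: without it, two threads could read and write `counter` at overlapping moments and lose updates.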

To enable concurrency on a single CPU core, the operating system scheduler uses time slicing. This involves rapidly switching between threads, creating the illusion of simultaneous execution. Thread management is handled by the operating system kernel, which assigns CPU time based on priority and its scheduling algorithms.

Use Cases:

  • Concurrency excels at handling I/O-bound tasks: This includes network requests, file I/O, and database queries.
  • I/O operations often involve waiting: These tasks frequently wait for external resources like network responses or disk access.
  • Concurrency keeps the CPU busy: The processor can work on other jobs while one task waits for I/O, which often shortens the overall run time.
  • No idle CPU during I/O waits: Concurrency ensures the CPU isn't sitting idle while waiting for slow I/O operations.
import threading
import time
def task1():
    for _ in range(5):
        print("Task 1 executing")
        time.sleep(1)
def task2():
    for _ in range(5):
        print("Task 2 executing")
        time.sleep(1)
# Create threads for each task
thread1 = threading.Thread(target=task1)
thread2 = threading.Thread(target=task2)
# Start the threads
thread1.start()
thread2.start()
# Wait for threads to finish
thread1.join()
thread2.join()
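
To see the benefit for I/O-bound work in numbers, you can time the same waits run one after another and then run on threads. This is a rough sketch, assuming `time.sleep` as a stand-in for a real I/O call. The helper names `fake_request`, `run_sequential`, and `run_threaded` are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(delay):
    """Stand-in for an I/O call: the thread just waits for `delay` seconds."""
    time.sleep(delay)
    return delay


def run_sequential(delays):
    """Run the waits one after another and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [fake_request(d) for d in delays]
    return results, time.perf_counter() - start


def run_threaded(delays):
    """Run each wait on its own thread and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_request, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.2, 0.2, 0.2, 0.2]
    _, sequential_time = run_sequential(delays)
    _, threaded_time = run_threaded(delays)
    print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

The sequential version should take roughly the sum of the delays, while the threaded version should take closer to the longest single delay, since the waits overlap.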

Process -> Parallelism

What is Parallelism?

Parallelism takes concurrency a step further by achieving the actual execution of multiple tasks simultaneously. This typically requires a system with multiple CPU cores or processors. Each task runs independently, with its execution potentially overlapping with others.

If you have a dual-core processor and create two separate processes using multiprocessing, the operating system's scheduler is likely to assign each process to one of the CPU cores. As a result, both processes can run in parallel, with each core executing its assigned task independently.

Parallelism using processes involves the OS spawning multiple independent processes, each with its own memory space and resources.

Unlike threads, processes do not share memory space by default, so inter-process communication (IPC) mechanisms such as pipes, shared memory, or message passing are used to exchange data between processes.
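
As one possible illustration of message passing, the sketch below sends numbers to a worker process over a `multiprocessing.Queue` and reads the squared values back from a second queue. The function name `square_worker` and the `None` stop signal are assumptions chosen for this example.

```python
from multiprocessing import Process, Queue


def square_worker(inbox, outbox):
    """Read numbers from inbox until a None sentinel arrives; send back squares."""
    while True:
        item = inbox.get()
        if item is None:         # sentinel: no more work
            break
        outbox.put(item * item)


if __name__ == "__main__":
    inbox, outbox = Queue(), Queue()
    worker = Process(target=square_worker, args=(inbox, outbox))
    worker.start()

    numbers = [1, 2, 3]
    for n in numbers:
        inbox.put(n)
    inbox.put(None)              # tell the worker to stop

    # Collect the results before joining so the worker can flush its queue
    results = [outbox.get() for _ in numbers]
    worker.join()
    print(results)
```

Nothing is shared between the two processes here. Every value is pickled, sent through the queue, and rebuilt on the other side.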

Use Cases:

  • Running multiple tasks at once: Parallelism lets your computer truly run multiple things at the same time, unlike concurrency which just makes it seem that way.
  • Multiple cores = more power: This works best on computers with multiple cores or processors.
  • Heavy duty jobs: Parallelism is great for tasks that take a lot of processing power, like complex math problems and image processing.
import time
from multiprocessing import Process
def task1():
    for _ in range(5):
        print("Task 1 executing")
        time.sleep(1)
def task2():
    for _ in range(5):
        print("Task 2 executing")
        time.sleep(1)
if __name__ == "__main__":
    # Create processes for each task
    process1 = Process(target=task1)
    process2 = Process(target=task2)
    # Start the processes
    process1.start()
    process2.start()
    # Wait for processes to finish
    process1.join()
    process2.join()
    print("All tasks completed")
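
Processes pay off most when the work is CPU-bound rather than sleeping. Below is a small sketch of that case, assuming a deliberately naive prime counter as the heavy job and `ProcessPoolExecutor` from the standard library to spread it across worker processes. The names `count_primes` and `count_primes_parallel` are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, max_workers=2):
    """Run count_primes for each limit in a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    print(count_primes_parallel([20_000, 20_000]))
```

On a machine with at least two free cores, the two calls can run at the same time. How much faster that is in practice depends on the hardware and on the cost of starting the processes.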

Coroutines -> Asyncio

What are Coroutines?

Coroutines are functions that can pause and resume their execution. In Python they are typically managed by an event loop, such as the one provided by asyncio, which can efficiently schedule and execute multiple tasks within a single thread.

In CPython, the Global Interpreter Lock (GIL) limits the execution of Python bytecode to a single thread at a time, preventing true parallelism with threads.

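Coroutines also run within a single thread, so they do not provide parallelism either and are not a good fit for CPU-bound work. Their advantage is low overhead: switching between coroutines happens at await points inside the event loop, without OS-level context switches, which makes them an efficient choice for large numbers of I/O-bound tasks.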

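Threads, by contrast, are managed by the OS kernel, which allocates CPU time to each thread based on priority and scheduling algorithms and can preempt a thread at any point. A coroutine gives up control only when it reaches an await.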
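
The sketch below tries to make that cooperative hand-off visible. Two coroutines append to a shared list and yield to the event loop after every step with `asyncio.sleep(0)`. The names `worker` and `run_interleaved` are made up for this example.

```python
import asyncio


async def worker(name, steps, log):
    """Record each step, then hand control back to the event loop."""
    for i in range(steps):
        log.append(f"{name}{i}")
        await asyncio.sleep(0)  # yield so other scheduled tasks can run


async def run_interleaved(steps=3):
    """Run two workers concurrently and return the order in which steps ran."""
    log = []
    await asyncio.gather(worker("A", steps, log), worker("B", steps, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(run_interleaved()))
```

Since both workers run on one thread and switch only at the `await`, the list needs no lock. If you remove the `await asyncio.sleep(0)` line, each worker runs to completion before the other one starts.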

Use Cases:

  • Low overhead: For I/O-bound workloads, asyncio tasks are typically cheaper to create and switch between than threads or processes.
  • Ideal use cases: It excels in use cases involving high-concurrency I/O operations.
  • Application benefits: Applications such as web scraping, file processing, and real-time streaming systems benefit from asyncio's capabilities.
  • Event-driven architecture: Asyncio's event-driven architecture and non-blocking I/O operations make it particularly efficient for handling numerous I/O-bound tasks concurrently.

import asyncio
async def task1():
    for _ in range(5):
        print("Task 1 executing")
        await asyncio.sleep(1)
async def task2():
    for _ in range(5):
        print("Task 2 executing")
        await asyncio.sleep(1)
async def main():
    # Create coroutines for each task
    coro1 = task1()
    coro2 = task2()
    # Schedule coroutines concurrently
    await asyncio.gather(coro1, coro2)
asyncio.run(main())
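
The same pattern scales to many tasks that return values. Here is a hedged sketch, assuming a hypothetical `fetch` coroutine that stands in for a network call, with `asyncio.sleep` simulating the wait. `fetch_all` is likewise a name made up for this example.

```python
import asyncio
import time


async def fetch(item_id, delay=0.1):
    """Hypothetical stand-in for a network call; the sleep simulates the wait."""
    await asyncio.sleep(delay)
    return item_id * 2


async def fetch_all(item_ids):
    """Run one fetch per id concurrently and return the results as a list."""
    # gather returns results in the same order as the awaitables passed in
    return await asyncio.gather(*(fetch(i) for i in item_ids))


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(fetch_all(range(100)))
    elapsed = time.perf_counter() - start
    print(f"{len(results)} results in {elapsed:.2f}s")
```

All one hundred simulated requests wait at the same time on a single thread, so the total time should stay close to one delay rather than one hundred of them.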

That concludes our guide on Concurrency, Parallelism, and Asyncio.

Dheeraj

Professional over-thinker and part-time wizard, turning caffeine into questionable life choices 😜