Python guide: Using multiprocessing versus multithreading

An overview of the fundamental differences between multithreading and multiprocessing in Python

Multiprocessing and multithreading are two strategies for increasing an application’s speed, performance and throughput. This article explains the two paradigms, how they differ and when each one is appropriate. The words “concurrency” and “parallelism” are often used interchangeably when describing multithreaded or multiprocess programs, so this article also clarifies the differences and relationships between these commonly misused terms.

What is multiprocessing?

Multiprocessing is a technique that allows a program to run multiple tasks simultaneously. Each task is executed in its own process, which is a separate instance of the program with its own memory space. Multiprocessing takes advantage of multicore processor architectures, in which multiple cores execute instructions at the same time and communicate through shared hardware caches. Multiprocessing makes computing more effective by exploiting parallelism and running programs across multiple physical cores.

Multiprocessing: A technique to speed up CPU-bound tasks

Multiprocessing is most useful for tasks that are CPU-bound. A CPU-bound task is limited by the speed of the processor: it would finish sooner only if the CPU itself were faster, and CPU speeds have practical limits. For example, a computer program that sums many numbers together would be considered CPU-bound. Multiprocessing harnesses the power of modern multi-core computing systems by running many CPU-bound tasks in parallel across multiple cores.

What is multithreading?

Multithreading is a technique that allows a program to run multiple tasks concurrently within a single process. Each task is executed in its own thread, which is a lightweight unit of execution that shares the same memory space as other threads in the process.
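
As a minimal sketch of what sharing a memory space means in practice, the snippet below has several threads update one dictionary. The names add_many, run_threads and state are illustrative, not from any library, and the lock is included on the assumption that an unsynchronized read-modify-write on shared state could lose updates.

```python
import threading


def add_many(state, lock, times):
    """Increment state["count"] `times` times."""
    for _ in range(times):
        # The lock makes each read-modify-write step safe across threads.
        with lock:
            state["count"] += 1


def run_threads(n_threads=4, times=10_000):
    """Start n_threads workers on one shared dict and return the final count."""
    state = {"count": 0}  # a single object that every thread can see
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_many, args=(state, lock, times))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]


if __name__ == "__main__":
    print(run_threads())
```

Every thread modifies the same dictionary because threads live inside one process. If the same workers were started as separate processes, each would receive its own copy of state and the parent's dictionary would stay unchanged.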

Multithreading: A technique to speed up I/O-bound tasks

Multithreading is useful for tasks that are I/O bound. An I/O (input/output) bound task is limited by the speed of the input and output subsystem rather than by the CPU. Reads and writes to a hard drive, or requests to a network API, are examples of operations that depend on such a subsystem. For example, a computer program that reads from a file is I/O bound because the speed of the hard drive limits the speed of the program. Multithreading attempts to get around the limitations of I/O bound tasks by running multiple threads of a program concurrently, so that one thread can make progress while another waits on I/O. A major characteristic of Python (specifically CPython, the reference implementation) is its Global Interpreter Lock (GIL), which allows only one thread to hold control of the Python interpreter at once and so prevents multiple threads from executing Python code in parallel. The GIL is released while a thread waits on blocking I/O, which is why multithreading still helps I/O bound tasks but does little for CPU-bound ones.
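
To make the effect of overlapping waits concrete, here is a small sketch that simulates I/O with time.sleep() instead of touching a disk or network. The function fake_io_task and its 0.2 second delay are assumptions chosen for illustration; ThreadPoolExecutor comes from the standard concurrent.futures module.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id, delay=0.2):
    """Stand-in for a blocking read or network call (hypothetical)."""
    time.sleep(delay)  # the GIL is released while the thread sleeps
    return task_id


def run_sequentially(n_tasks):
    """Run the tasks one after another and return (results, seconds)."""
    start = time.perf_counter()
    results = [fake_io_task(i) for i in range(n_tasks)]
    return results, time.perf_counter() - start


def run_with_threads(n_tasks):
    """Run the tasks in a thread pool and return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(fake_io_task, range(n_tasks)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, sequential = run_sequentially(5)
    _, threaded = run_with_threads(5)
    print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The sequential version waits for each simulated call in turn, while the threaded version lets all five waits overlap, so its elapsed time should be close to a single delay rather than the sum of all of them.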

Parallelism and concurrency: What's the connection?

Parallelism and concurrency are more general terms than multiprocessing and multithreading. Parallelism means that multiple tasks are literally executing at the same instant. Concurrency means that multiple tasks are in progress over the same period, with their execution interleaved rather than each task running to completion before the next one begins. Tasks suited to concurrent execution can be interleaved in any order without changing the final outcome.

The dictionary definitions of “concurrent” and “parallel” are quite similar (both refer to things happening at the same time), but the computer science definitions are more nuanced. To make the distinction clearer, imagine a group of ten people standing in two lines at a grocery store checkout, with five people in each line. If only one cashier is serving both lines, alternating between them, this scenario is concurrent execution. However, if another cashier showed up and each of the two cashiers checked out one line of customers, this scenario would be parallel execution.
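
The single-cashier scenario can be sketched in a few lines of single-threaded Python. This is only an illustration of interleaving, not how threads are implemented: the names checkout_line and one_cashier are made up for the example, and each generator stands in for one checkout line.

```python
from collections import deque


def checkout_line(line_name, customers):
    """Generator that serves one customer per step, then yields control."""
    for customer in customers:
        yield f"{line_name}: served {customer}"


def one_cashier(lines):
    """Alternate between the lines until every customer has been served."""
    pending = deque(lines)
    log = []
    while pending:
        line = pending.popleft()
        try:
            log.append(next(line))
            pending.append(line)  # back of the queue; another line gets a turn
        except StopIteration:
            pass  # this line is empty, so the cashier stops visiting it
    return log


if __name__ == "__main__":
    lines = [
        checkout_line("line A", ["a1", "a2"]),
        checkout_line("line B", ["b1", "b2"]),
    ]
    for entry in one_cashier(lines):
        print(entry)
```

Both lines make progress over the same period, yet only one customer is being served at any instant. That is concurrency without parallelism; adding a second cashier would correspond to running each line on its own core.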

Multiprocessing more closely resembles parallelism, whereas multithreading more closely resembles concurrency. Multiprocessing achieves parallelism by running tasks in separate processes that can be scheduled on separate cores, while multithreading in Python achieves concurrency by interleaving tasks in separate threads within a single process.

Figure: An example concurrency diagram showing a single CPU core divided between two different tasks, and an example parallelism diagram showing two CPU cores, each running an individual task.

Demonstrating Python code capabilities

Multiprocessing example code

This Python program demonstrates how multiprocessing can boost the performance of a CPU-bound task. The objective of this program is to sum all numbers from 0 to 100,000,000, five times. This is a CPU-bound task: on a single core, it could only be completed faster by speeding up the CPU itself. With multiprocessing, however, the program can take advantage of the multiple computing cores available on the local machine. The program uses the standard Python multiprocessing library to run the sum_all_numbers() function in five separate processes, in parallel. On my local machine, the multiprocessing version runs nearly five times faster; the speedup on other machines depends on the number of cores available.

import multiprocessing
import time

NUMBER_OF_PROCESSES = 5
NUMBER_OF_ITERATIONS = 5
N = 100000000  # 100 million


def sum_all_numbers(n):
    """
    Sums all the numbers from zero to n.

    :param n: The upper bound of numbers to be summed
    :return: The sum of all the numbers from 0 to n
    """

    total_sum = sum(range(n + 1))
    print("Sum: " + str(total_sum))
    return total_sum


def without_multiprocessing():
    print("Starting function without multiprocessing.")
    for i in range(NUMBER_OF_ITERATIONS):
        sum_all_numbers(N)


def with_multiprocessing():
    print("Starting function with multiprocessing.")
    jobs = []

    for i in range(NUMBER_OF_PROCESSES):
        process = multiprocessing.Process(
            target=sum_all_numbers,
            args=(N,)
        )
        jobs.append(process)

    for j in jobs:
        j.start()

    for j in jobs:
        j.join()


def main():
    print("Summing all numbers between 0 and " + str(N) + ".\n")

    start_time = time.time()
    without_multiprocessing()
    print("--- Function without multiprocessing took %s seconds ---\n" % (
            time.time() - start_time))

    start_time = time.time()
    with_multiprocessing()
    print("--- Function with multiprocessing took %s seconds ---" % (
            time.time() - start_time))


if __name__ == "__main__":
    main()
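
In the program above, each multiprocessing.Process reports its result only by printing it; whatever the target function returns is not handed back to the parent process. As a hedged sketch of one alternative, multiprocessing.Pool from the standard library can collect return values. Unlike the program above, this version splits a single sum into chunks; the helpers sum_range, make_chunks and parallel_sum are illustrative names and not part of the original program.

```python
import multiprocessing


def sum_range(bounds):
    """Sum the integers in the half-open interval [start, stop)."""
    start, stop = bounds
    return sum(range(start, stop))


def make_chunks(n, parts):
    """Split 0..n (inclusive) into `parts` contiguous (start, stop) chunks."""
    step = (n + 1) // parts
    chunks = [(i * step, (i + 1) * step) for i in range(parts - 1)]
    chunks.append(((parts - 1) * step, n + 1))  # last chunk takes the remainder
    return chunks


def parallel_sum(n, processes=4):
    """Sum 0..n by giving each worker process one chunk."""
    with multiprocessing.Pool(processes=processes) as pool:
        partial_sums = pool.map(sum_range, make_chunks(n, processes))
    return sum(partial_sums)


if __name__ == "__main__":
    n = 10_000_000
    total = parallel_sum(n)
    # Compare against the closed-form formula n * (n + 1) / 2.
    print(total == n * (n + 1) // 2)
```

Pool.map() sends one chunk to each worker and returns the partial sums in order, so the parent can combine them itself instead of relying on printed output.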
  

Multithreading example code

This Python program demonstrates how multithreading can boost the performance of an I/O bound task. The objective of this program is to download five images from the internet. This is an I/O bound task because the program’s speed is limited by the network connection to the web server that hosts the images. With the standard Python threading library, the program downloads each of the five images in its own thread. This allows the multithreaded function to make progress on every download concurrently, rather than waiting for each image to download completely before starting the next one. The program relies on the third-party requests library for the HTTP calls. On my local machine, the multithreaded version runs nearly three times faster.

import os
import time
from queue import Queue
from threading import Thread

import requests

NUMBER_OF_THREADS = 5
q = Queue()


def save_image(image_url, download_location):
    """
    Download the image at image_url into the download_location directory.
    """
    # Create the target directory if it does not exist yet.
    os.makedirs(download_location, exist_ok=True)

    res = requests.get(image_url, stream=True)
    res.raise_for_status()
    filename = f"{download_location}/{image_url.split('/')[-1]}.jpg"

    # Stream the response body to disk in 1 KB blocks.
    with open(filename, 'wb') as f:
        for block in res.iter_content(1024):
            f.write(block)

    print("Image downloaded.")


def download_image(download_location):
    """
    Worker loop: take image URLs off the queue and download them.
    """
    while True:
        image_url = q.get()
        try:
            save_image(image_url, download_location)
        except (requests.RequestException, OSError) as exc:
            print(f"Failed to download {image_url}: {exc}")
        finally:
            # Always mark the item done so q.join() cannot hang on an error.
            q.task_done()


def download_images_with_multithreading(images):
    print("Starting function with multithreading.")
    for image_url in images:
        q.put(image_url)

    for _ in range(NUMBER_OF_THREADS):
        worker = Thread(target=download_image, args=(
            "with_multithreading_photos",))
        # Daemon threads let the program exit although workers loop forever.
        worker.daemon = True
        print("Starting " + worker.name)
        worker.start()

    # Block until every queued URL has been processed.
    q.join()


def download_images_without_multithreading(images):
    print("Starting function without multithreading or multiprocessing.")
    for image_url in images:
        save_image(image_url, "without_multithreading_photos")


def main():
    images = [
        'https://images.unsplash.com/photo-1428366890462-dd4baecf492b',
        'https://images.unsplash.com/photo-1541447271487-09612b3f49f7',
        'https://images.unsplash.com/photo-1560840067-ddcaeb7831d2',
        'https://images.unsplash.com/photo-1522069365959-25716fb5001a',
        'https://images.unsplash.com/photo-1533752125192-ae59c3f8c403',
    ]

    print("Downloading images from Internet.\n")

    start_time = time.time()
    download_images_with_multithreading(images)
    print("--- Function with multithreading took %s seconds ---\n" % (
            time.time() - start_time))

    start_time = time.time()
    download_images_without_multithreading(images)
    print("--- Function without multithreading took %s seconds ---\n" % (
            time.time() - start_time))


if __name__ == "__main__":
    main()
  

Multiprocessing in serverless: A case-by-case analysis

Multiprocessing can also be implemented in serverless architectures, like Amazon Web Services (AWS) Lambda. According to AWS, “Lambda allocates CPU power proportional to the amount of memory provisioned, customers now have access to up to 6 vCPUs. This helps compute-intensive applications like machine learning, modeling, genomics and high-performance computing (HPC) applications perform faster.” However, developers must weigh the tradeoffs of multiprocessing within an individual Lambda function against moving the parallelism one level up to multiple Lambda functions. For example, with Lambda a developer has the option to run a program in a single function across multiple processes, or in multiple functions with one process each. This choice has cost, efficiency and architectural implications that must be considered on a case-by-case basis.
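
As a rough sketch of the single-function option, the handler below fans work out to child processes and gathers the partial results. Several things here are assumptions rather than confirmed details: the event shape {"chunks": [[start, stop], ...]} is hypothetical, the helper names are made up, and the choice of Process and Pipe rests on reports that multiprocessing.Pool and multiprocessing.Queue are not usable in the Lambda execution environment, which should be verified against current AWS documentation.

```python
from multiprocessing import Pipe, Process


def sum_range(start, stop):
    """Sum the integers in the half-open interval [start, stop)."""
    return sum(range(start, stop))


def worker(conn, start, stop):
    """Compute one partial sum and send it back through the pipe."""
    conn.send(sum_range(start, stop))
    conn.close()


def lambda_handler(event, context):
    """Hypothetical handler; the event shape is an assumption for this sketch."""
    jobs = []
    for start, stop in event["chunks"]:
        parent_conn, child_conn = Pipe()
        process = Process(target=worker, args=(child_conn, start, stop))
        process.start()
        jobs.append((process, parent_conn))

    total = 0
    for process, parent_conn in jobs:
        total += parent_conn.recv()  # read the result before joining the child
        process.join()
    return {"total": total}
```

The multi-function alternative would move the loop out of the handler entirely: an orchestrating component would invoke one function per chunk, and each invocation would run sum_range() in a single process.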

Multithreading and multiprocessing are valuable strategies for any Python developer. Incorporate these techniques in your next Python project to increase speed and performance.


Carlton Marshall II, Senior Associate Software Engineer, Cyber Engineering

Carlton Marshall II is a dedicated software engineer working on a backend software engineering team that develops and maintains an internal serverless data pipeline for vulnerability data. Carlton graduated from Northeastern University in 2019 with a cross-discipline degree in Computer Science and Business Administration. Carlton is an associate in the Cyber Engineering organization, and is a former tech intern and Technology Development Program member. Carlton is very committed to living the values at Capital One and participates in the TDP Alumni Council and the Blacks in Tech Business Resource Group. You can connect with Carlton on LinkedIn: linkedin.com/in/carltonmarshall.
