MLSYS ENGINEERING

3.4. Async call

Now, let's see an example of an async call. We define task_one and task_two as async functions and schedule them as tasks. Scheduling returns immediately, before the tasks have finished. The tasks then run concurrently in the background: whenever one of them is waiting, the event loop switches to other work, so the main coroutine is not blocked. When the main coroutine awaits gather(), which collects the results from the tasks, it suspends until both tasks are finished.

Code 9. Asynchronous function call.
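import asyncio

async def task_one():
    print("Task 1 Start")
    await asyncio.sleep(1)  # Non-blocking wait for 1 sec; yields to the event loop.
    print("Task 1 Complete")

async def task_two():
    print("Task 2 Start")
    await asyncio.sleep(1)  # Non-blocking wait for 1 sec; yields to the event loop.
    print("Task 2 Complete")

async def main():
    # create_task schedules each coroutine on the event loop and returns a Task (a future).
    future_one = asyncio.create_task(task_one())
    future_two = asyncio.create_task(task_two())
    await asyncio.sleep(0)  # Yield once so both tasks start before Main continues.

    print("Main Start")
    await asyncio.sleep(0.5)  # Main's own wait overlaps with both tasks.
    print("Main Complete")
    await asyncio.gather(future_one, future_two)  # Suspend until both tasks finish.

asyncio.run(main())  # In a notebook, which already runs an event loop, use: await main()

# Output:
# Task 1 Start
# Task 2 Start
# Main Start
# Main Complete
# Task 1 Complete
# Task 2 Complete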

When an async function is called, its body does not execute immediately. The call returns a coroutine object, and scheduling that coroutine on the event loop, for example with asyncio.create_task, yields a task. A task is a kind of future: a placeholder object representing a computation that has been scheduled but not yet completed. In Code 9, future_one and future_two are the handles for the two tasks. They hold no result until the corresponding tasks actually finish running. They let us await the background work and retrieve the results later, which is exactly what asyncio.gather does when it collects the results from all the futures passed to it.
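To make the idea of a future concrete, here is a minimal sketch that inspects one before and after it completes. It is not part of Code 9: the helper square and its return values are hypothetical, chosen only so that there is a result to retrieve.

```python
import asyncio

async def square(x):
    # Hypothetical helper: wait briefly, then return a value.
    await asyncio.sleep(0.1)
    return x * x

async def main():
    coro = square(3)                          # Creates a coroutine object; nothing runs yet.
    is_coroutine = asyncio.iscoroutine(coro)

    task_a = asyncio.create_task(coro)        # Schedules the coroutine and returns a Task.
    task_b = asyncio.create_task(square(4))
    is_future = isinstance(task_a, asyncio.Future)
    done_before = task_a.done()               # False: no result is available yet.

    results = await asyncio.gather(task_a, task_b)  # Suspends main until both finish.
    done_after = task_a.done()                # True: the result has been filled in.
    return is_coroutine, is_future, done_before, done_after, results

if __name__ == "__main__":
    print(asyncio.run(main()))
    # (True, True, False, True, [9, 16])
```

Note that gather returns the results in the order the futures were passed in, regardless of which task happens to finish first.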

The contrast with the synchronous diagram is striking. Main fires off both task_one() and task_two() back-to-back without waiting for either to finish. As a result, all three participants, Main, Task 1, and Task 2, have active rectangles that overlap in time. This is the defining visual signature of concurrent execution: instead of a single active rectangle at any given moment, several rectangles are active simultaneously. Main completes its 0.5-second sleep while the two tasks are still running, and only at the very end does it call gather to wait for both to finish.

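[Figure: sequence diagram with three participants, Main, Task 1, and Task 2. Main calls task_one() and task_two(), each active for 1 s; Main sleeps for 0.5 s and then calls gather.]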
Figure 5. Asynchronous function call sequence diagram.
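The overlap shown in the diagram can also be checked numerically. The following is a rough sketch under assumptions of our own: the helper names run and overlaps, the log dictionary, and the scale parameter are all hypothetical and only serve to record when each participant is active.

```python
import asyncio
import time

async def run(name, seconds, log):
    # Record the interval during which this participant is active.
    start = time.perf_counter()
    await asyncio.sleep(seconds)
    log[name] = (start, time.perf_counter())

async def main(scale=1.0):
    log = {}
    t1 = asyncio.create_task(run("Task 1", 1.0 * scale, log))
    t2 = asyncio.create_task(run("Task 2", 1.0 * scale, log))
    await run("Main", 0.5 * scale, log)   # Main's sleep, recorded the same way.
    await asyncio.gather(t1, t2)
    return log

def overlaps(a, b):
    # Two intervals overlap if each one starts before the other ends.
    return a[0] < b[1] and b[0] < a[1]

if __name__ == "__main__":
    log = asyncio.run(main())
    for first, second in [("Main", "Task 1"), ("Main", "Task 2"), ("Task 1", "Task 2")]:
        print(first, "overlaps", second, ":", overlaps(log[first], log[second]))
```

Each pair of intervals should be reported as overlapping, which corresponds to the simultaneously active rectangles in the figure.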

Comparing the two approaches makes the performance difference concrete. In the synchronous version, the total wall-clock time is 2.5 seconds: one second for Task 1, one second for Task 2, and 0.5 seconds for Main's sleep, all running back-to-back. In the asynchronous version, Task 1 and Task 2 run concurrently, and Main's 0.5-second sleep overlaps with both of them. The total wall-clock time collapses to roughly 1 second, the duration of the longest single task. This is the fundamental advantage of concurrent execution: instead of waiting for every task to finish one after another, you wait only as long as the slowest task takes. The overlap is possible here because each task spends its time waiting rather than computing. Of course, this is the ideal case, as there are other factors to consider that we will introduce later.
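The arithmetic above can be verified with a small timing experiment. This is a sketch, not a benchmark: the names SCALE, run_sync, work, and run_async are hypothetical, and the durations are shrunk tenfold (an assumption made only so that the script finishes quickly), so the expected totals are about 0.25 s and 0.1 s instead of 2.5 s and 1 s.

```python
import asyncio
import time

SCALE = 0.1  # Assumption: 1 s / 1 s / 0.5 s become 0.1 s / 0.1 s / 0.05 s.

def run_sync():
    start = time.perf_counter()
    time.sleep(1.0 * SCALE)   # Task 1
    time.sleep(1.0 * SCALE)   # Task 2
    time.sleep(0.5 * SCALE)   # Main
    return time.perf_counter() - start

async def work(seconds):
    await asyncio.sleep(seconds)

async def run_async():
    start = time.perf_counter()
    t1 = asyncio.create_task(work(1.0 * SCALE))
    t2 = asyncio.create_task(work(1.0 * SCALE))
    await asyncio.sleep(0.5 * SCALE)   # Main's sleep overlaps with both tasks.
    await asyncio.gather(t1, t2)
    return time.perf_counter() - start

if __name__ == "__main__":
    sync_time = run_sync()
    async_time = asyncio.run(run_async())
    print(f"sync:  {sync_time:.2f} s")
    print(f"async: {async_time:.2f} s")
```

Exact figures vary with scheduling overhead, but the synchronous total should be close to the sum of the three durations and the asynchronous total close to the longest one.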