When using asyncio, how do you allow all running tasks to finish before shutting down the event loop

Python, Python-Asyncio

Python Problem Overview


I have the following code:

@asyncio.coroutine
def do_something_periodically():
    while True:
        asyncio.async(my_expensive_operation())
        yield from asyncio.sleep(my_interval)
        if shutdown_flag_is_set:
            print("Shutting down")
            break

I run this function with run_until_complete. The problem occurs when the shutdown flag is set: the function completes, and any pending tasks are never run.

This is the error:

task: <Task pending coro=<report() running at script.py:33> wait_for=<Future pending cb=[Task._wakeup()]>>

How do I schedule a shutdown correctly?

To give some context, I'm writing a system monitor which reads from /proc/stat every 5 seconds, computes the CPU usage in that period, and then sends the result to a server. I want to keep scheduling these monitoring jobs until I receive SIGTERM, at which point I stop scheduling, wait for all current jobs to finish, and exit gracefully.
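
For what it's worth, here is a minimal sketch of that kind of SIGTERM-driven loop on modern asyncio. monitor_once is a placeholder for the real read-/proc/stat-and-report job, not code from the question:

import asyncio
import signal


async def monitor_once():
    # Placeholder: read /proc/stat, compute CPU usage, send it to the server.
    await asyncio.sleep(1)


async def main(interval=5):
    stop = asyncio.Event()
    # Unix only: flip the flag when SIGTERM arrives.
    asyncio.get_running_loop().add_signal_handler(signal.SIGTERM, stop.set)

    jobs = set()
    while not stop.is_set():
        task = asyncio.create_task(monitor_once())
        jobs.add(task)
        task.add_done_callback(jobs.discard)  # keep the set from growing forever
        await asyncio.sleep(interval)

    print("Shutting down")
    # Stop scheduling and wait for the jobs still in flight.
    await asyncio.gather(*jobs)


asyncio.run(main())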

Python Solutions


Solution 1 - Python

You can retrieve the unfinished tasks and run the loop again until they have finished, then close the loop or exit your program.

pending = asyncio.all_tasks(loop)
loop.run_until_complete(asyncio.gather(*pending))

  • pending is the set of tasks that have not finished yet; pass the loop explicitly, since there is no running loop at this point.
  • asyncio.gather() lets you wait on several tasks at once (a fuller end-to-end sketch follows below).
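
Putting those pieces together outside of any coroutine, a minimal shutdown sequence might look like this (a sketch only; do_something_periodically is the question's coroutine):

import asyncio

loop = asyncio.new_event_loop()
try:
    # Run the main coroutine until the shutdown flag breaks its while loop.
    loop.run_until_complete(do_something_periodically())

    # Anything scheduled with create_task()/ensure_future() may still be pending.
    pending = asyncio.all_tasks(loop)
    if pending:
        loop.run_until_complete(asyncio.gather(*pending))
finally:
    loop.close()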

If you want to ensure all the tasks are completed inside a coroutine (maybe you have a "main" coroutine), you can do it this way, for instance:

async def do_something_periodically():
    while True:
        asyncio.create_task(my_expensive_operation())
        await asyncio.sleep(my_interval)
        if shutdown_flag_is_set:
            print("Shutting down")
            break

    # Exclude the current task, otherwise gather() would wait on itself forever.
    await asyncio.gather(*(asyncio.all_tasks() - {asyncio.current_task()}))

Also, in this case, since all the tasks are created in the same coroutine, you already have access to the tasks:

async def do_something_periodically():
    tasks = []
    while True:
        tasks.append(asyncio.create_task(my_expensive_operation()))
        await asyncio.sleep(my_interval)
        if shutdown_flag_is_set:
            print("Shutting down")
            break

    await asyncio.gather(*tasks)
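
For completeness, a small harness to try that coroutine (a sketch only: my_expensive_operation, my_interval, the shutdown flag and the helper names below are stand-ins, with the coroutine above assumed to be defined in the same module):

import asyncio

my_interval = 1
shutdown_flag_is_set = False


async def my_expensive_operation():
    await asyncio.sleep(3)
    print("expensive operation finished")


async def request_shutdown_later():
    global shutdown_flag_is_set
    await asyncio.sleep(2.5)
    shutdown_flag_is_set = True


async def run():
    asyncio.create_task(request_shutdown_later())
    await do_something_periodically()


asyncio.run(run())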

Solution 2 - Python

As of Python 3.7, the code in the question and parts of the older answers rely on deprecated APIs (asyncio.async, @asyncio.coroutine, yield from, Task.all_tasks(), etc.), so you should rather write it like this:

import asyncio


async def my_expensive_operation(expense):
    print(await asyncio.sleep(expense, result="Expensive operation finished."))


async def do_something_periodically(expense, interval):
    while True:
        asyncio.create_task(my_expensive_operation(expense))
        await asyncio.sleep(interval)


loop = asyncio.get_event_loop()
coro = do_something_periodically(1, 1)

try:
    loop.run_until_complete(coro)
except KeyboardInterrupt:
    # Stop the periodic scheduler itself...
    coro.close()
    # ...then collect the expensive tasks it spawned, filtering out the scheduler
    # by coroutine name (note that _coro is a private Task attribute).
    tasks = asyncio.all_tasks(loop)
    expensive_tasks = {task for task in tasks if task._coro.__name__ != coro.__name__}
    loop.run_until_complete(asyncio.gather(*expensive_tasks))
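
If you would rather not rely on the private _coro attribute, a variation (my own sketch, not part of the original answer) is to wrap the periodic coroutine in a task up front, so everything can simply be gathered on shutdown:

loop = asyncio.get_event_loop()
main_task = loop.create_task(do_something_periodically(1, 1))

try:
    loop.run_until_complete(main_task)
except KeyboardInterrupt:
    # Cancel the scheduler, then wait for it and the expensive tasks to wind down.
    main_task.cancel()
    loop.run_until_complete(
        asyncio.gather(*asyncio.all_tasks(loop), return_exceptions=True)
    )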

Solution 3 - Python

Some answers suggest await asyncio.gather(*asyncio.all_tasks()), but that can hang forever: all_tasks() includes asyncio.current_task(), so the gather ends up waiting for the very coroutine that is awaiting it. Other answers work around this with checks on coroutine names or on len(asyncio.all_tasks()), but it is much simpler to use set operations:

async def main():
    # Create some tasks.
    for _ in range(10):
        asyncio.create_task(asyncio.sleep(10))
    # Wait for every task other than the current one, i.e. main() itself.
    await asyncio.gather(*asyncio.all_tasks() - {asyncio.current_task()})

Solution 4 - Python

Use a wrapper coroutine that waits until the pending task count is 1 before returning.

async def loop_job():
    asyncio.create_task(do_something_periodically())
    while len(asyncio.all_tasks()) > 1:  # Any task besides loop_job() itself?
        await asyncio.sleep(0.2)

asyncio.run(loop_job())

Solution 5 - Python

I'm not sure if this is exactly what you asked for, but I had a similar problem, and here is the solution I came up with.

The code is Python 3 compatible and uses only public asyncio APIs (no hacky _coro and no deprecated calls).

import asyncio

async def fn():
    await asyncio.sleep(1.5)
    print('fn')

async def main():
    print('main start')
    asyncio.create_task(fn()) # run in parallel
    await asyncio.sleep(0.2)
    print('main end')


def async_run_and_await_all_tasks(main):
    def get_pending_tasks():
        tasks = asyncio.all_tasks(loop)
        pending = [task for task in tasks if task != run_main_task and not task.done()]
        return pending

    async def run_main():
        await main()

        # Keep gathering until nothing but run_main itself is left pending.
        while True:
            pending_tasks = get_pending_tasks()
            if len(pending_tasks) == 0:
                return
            await asyncio.gather(*pending_tasks)

    loop = asyncio.new_event_loop()
    run_main_coro = run_main()
    run_main_task = loop.create_task(run_main_coro)
    loop.run_until_complete(run_main_task)

# asyncio.run(main()) # doesn't print from fn task, because main finishes earlier
async_run_and_await_all_tasks(main)

output (as expected):

main start
main end
fn

That async_run_and_await_all_tasks function makes Python behave in a Node.js-like manner: it exits only when there are no unfinished tasks left.

Solution 6 - Python

You might also consider using asyncio.shield, although this way you won't wait for ALL running tasks to finish, only the shielded ones. Still, it can be useful in some scenarios.
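
As a quick illustration of what shield actually does (a standalone sketch, not part of the original answer): cancelling the outer await does not cancel the shielded task itself.

import asyncio


async def inner():
    await asyncio.sleep(1)
    print("inner finished despite the timeout")


async def main():
    task = asyncio.create_task(inner())
    try:
        # The timeout cancels the shield wrapper, not the task inside it.
        await asyncio.wait_for(asyncio.shield(task), timeout=0.1)
    except asyncio.TimeoutError:
        print("timed out, but the shielded task keeps running")
    await task  # let it finish


asyncio.run(main())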

Besides that, as of Python 3.7 we can also use the high-level API asyncio.run() here, as Python core developer Yury Selivanov suggests: https://youtu.be/ReXxO_azV-w?t=636
Note: the asyncio.run() function was added to asyncio in Python 3.7 on a provisional basis.

Hope that helps!

import asyncio


async def my_expensive_operation(expense):
    print(await asyncio.sleep(expense, result="Expensive operation finished."))


async def do_something_periodically(expense, interval):
    while True:
        asyncio.create_task(my_expensive_operation(expense))
        # using asyncio.shield
        await asyncio.shield(asyncio.sleep(interval))


coro = do_something_periodically(1, 1)

if __name__ == "__main__":
    try:
        # using asyncio.run
        asyncio.run(coro)
    except KeyboardInterrupt:
        print('Cancelled!')

Solution 7 - Python

If you want a clean way to await all running tasks created within some local scope, without leaking memory and without tasks being garbage-collected mid-flight (the event loop only keeps weak references to tasks), you can maintain a set of running tasks and use task.add_done_callback(...) to remove each task from the set once it finishes. Here is a class that handles this for you:

import asyncio
from asyncio import Task
from typing import Coroutine


class TaskSet:
    def __init__(self):
        self.tasks = set()

    def add(self, coroutine: Coroutine) -> Task:
        task = asyncio.create_task(coroutine)
        self.tasks.add(task)
        task.add_done_callback(lambda _: self.tasks.remove(task))
        return task

    def __await__(self):
        return asyncio.gather(*self.tasks).__await__()

Which can be used like this:

async def my_function():
    await asyncio.sleep(0.5)


async def go():
    tasks = TaskSet()
    for i in range(10):
        tasks.add(my_function())
    await tasks
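
On Python 3.11+, asyncio.TaskGroup gives you roughly the same "wait for everything I started here" behaviour out of the box; a minimal equivalent of go() might look like this:

import asyncio


async def my_function():
    await asyncio.sleep(0.5)


async def go():
    # The async with block does not exit until every task in the group is done.
    async with asyncio.TaskGroup() as tg:
        for _ in range(10):
            tg.create_task(my_function())


asyncio.run(go())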

Attributions

All content on this page is sourced from the original question and answers on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | derekdreery | View Question on Stackoverflow
Solution 1 - Python | Martin Richard | View Answer on Stackoverflow
Solution 2 - Python | throws_exceptions_at_you | View Answer on Stackoverflow
Solution 3 - Python | Simply Beautiful Art | View Answer on Stackoverflow
Solution 4 - Python | gilch | View Answer on Stackoverflow
Solution 5 - Python | grabantot | View Answer on Stackoverflow
Solution 6 - Python | Ramil Aglyautdinov | View Answer on Stackoverflow
Solution 7 - Python | Matthew D. Scholefield | View Answer on Stackoverflow