Can you perform multi-threaded tasks within Django?

Python, Django, Multithreading

Python Problem Overview


The sequence I would like to accomplish:

  1. A user clicks a button on a web page
  2. Some functions in models.py start to run, e.g. gathering some data by crawling the internet
  3. When the functions are finished, the results are returned to the user.

Should I open a new thread inside of models.py to execute my functions? If so, how do I do this?

Python Solutions


Solution 1 - Python

As shown in this answer, you can use the threading package to perform an asynchronous task. Everyone seems to recommend Celery, but it is often overkill for simple but long-running tasks. I think it's actually easier and more transparent to use threading.

Here's a simple example for asyncing a crawler:

#views.py
import threading

from django.http import JsonResponse

from .models import Crawl

def startCrawl(request):
    task = Crawl()
    task.save()
    t = threading.Thread(target=doCrawl, args=[task.id])
    t.setDaemon(True)
    t.start()
    return JsonResponse({'id': task.id})

def checkCrawl(request, id):
    task = Crawl.objects.get(pk=id)
    return JsonResponse({'is_done': task.is_done, 'result': task.result})

def doCrawl(id):
    task = Crawl.objects.get(pk=id)
    result = ...  # do the crawling here and collect the data to return

    task.result = result
    task.is_done = True
    task.save()

Your front end can make a request to startCrawl to kick off the crawl, then poll with Ajax requests to checkCrawl, which will return is_done as true along with the result once the crawl has finished.
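
For completeness, the two views need URL routes; here is a minimal urls.py sketch (the paths are illustrative, not part of the original answer):

#urls.py
from django.urls import path

from . import views

urlpatterns = [
    path('crawl/start/', views.startCrawl),
    path('crawl/<int:id>/status/', views.checkCrawl),
]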


Update for Python 3:

The documentation for the threading library recommends passing daemon as a keyword argument rather than using the setter:

t = threading.Thread(target=doCrawl, args=[task.id], daemon=True)
t.start()

Update for Python <3.7:

As discussed here, a bug in Python versions before 3.7 can cause a slow memory leak that can overflow a long-running server. The bug was fixed in Python 3.7 and above.
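
If you are stuck on an older interpreter, one workaround (my suggestion, not part of the original answer) is to reuse a small thread pool instead of creating a fresh Thread per request, since the pool's worker threads are created once and reused:

#views.py
from concurrent.futures import ThreadPoolExecutor

from django.http import JsonResponse

from .models import Crawl

# A single module-level executor reuses its worker threads, so no new
# Thread object is created (or leaked) per request.
executor = ThreadPoolExecutor(max_workers=4)

def startCrawl(request):
    task = Crawl()
    task.save()
    executor.submit(doCrawl, task.id)
    return JsonResponse({'id': task.id})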

Solution 2 - Python

  1. Yes, it can multi-thread, but generally one uses Celery to do the equivalent. You can read about how in the celery-django tutorial.
  2. It is rare that you actually want to force the user to wait for the website; holding the request open while the work runs risks a timeout.

Here's an example of what you're describing (a minimal code sketch follows the flow below).

User sends request
Django receives => spawns a thread to do something else.
main thread finishes && other thread finishes 
... (later upon completion of both tasks)
response is sent to user as a package.
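
A minimal sketch of that first flow, assuming the view can afford to block until both pieces of work finish (do_main_work and do_other_work are hypothetical placeholders, not from the original answer):

import threading

from django.http import JsonResponse

def my_view(request):
    results = {}

    def side_job():
        results['other'] = do_other_work()  # hypothetical helper

    t = threading.Thread(target=side_job)
    t.start()
    main = do_main_work()  # hypothetical helper; runs on the request thread
    t.join()  # wait for the side thread before responding
    return JsonResponse({'main': main, 'other': results['other']})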

Better way:

User sends request
Django receives => lets Celery know "hey! do this!"
main thread finishes
response is sent to user
...(later)
user receives balance of transaction 
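
A minimal sketch of the Celery version, assuming a Celery app is already configured for the project (the task name crawl_site and the Crawl model are illustrative):

#tasks.py
from celery import shared_task

from .models import Crawl

@shared_task
def crawl_site(task_id):
    task = Crawl.objects.get(pk=task_id)
    task.result = ...  # do the crawling here
    task.is_done = True
    task.save()

#views.py
from django.http import JsonResponse

from .models import Crawl
from .tasks import crawl_site

def startCrawl(request):
    task = Crawl()
    task.save()
    crawl_site.delay(task.id)  # enqueue and return immediately
    return JsonResponse({'id': task.id})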

Solution 3 - Python

If you don't want to add an overkill framework to your project, you can simply use subprocess.Popen:

import subprocess

from django.http import HttpResponse

def my_command(request):
    command = '/my/command/to/run'  # Can even be 'python manage.py somecommand'
    subprocess.Popen(command, shell=True)
    command = '/other/command/to/run'
    subprocess.Popen(command, shell=True)
    return HttpResponse(status=204)

[edit] As mentioned in the comments, subprocess.Popen does not wait for the command to exit: both commands are spawned as background processes and the HttpResponse is returned right away, before they complete. If you need the view to wait for them, call .wait() on each Popen object, as in the sketch below.
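
If you do want the view to block until both commands finish, here is a sketch of that variant, keeping this answer's shell=True style (the command paths are placeholders):

import subprocess

from django.http import HttpResponse

def my_blocking_command(request):
    p1 = subprocess.Popen('/my/command/to/run', shell=True)
    p2 = subprocess.Popen('/other/command/to/run', shell=True)
    p1.wait()  # block until the first command exits
    p2.wait()  # block until the second command exits
    return HttpResponse(status=204)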

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type         Original Author   Original Content on Stackoverflow
Question             Robert            View Question on Stackoverflow
Solution 1 - Python  nbwoodward        View Answer on Stackoverflow
Solution 2 - Python  cwallenpoole      View Answer on Stackoverflow
Solution 3 - Python  Thierry J.        View Answer on Stackoverflow