background function in Python

Python, Multithreading

Python Problem Overview


I've got a Python script that sometimes displays images to the user. The images can, at times, be quite large, and they are reused often. Displaying them is not critical, but displaying the message associated with them is. I've got a function that downloads the image needed and saves it locally. Right now it's run inline with the code that displays a message to the user, but that can sometimes take over 10 seconds for non-local images. Is there a way I could call this function when it's needed, but run it in the background while the code continues to execute? I would just use a default image until the correct one becomes available.

Python Solutions


Solution 1 - Python

Do something like this:

def function_that_downloads(my_args):
    # do some long download here
    pass

Then inline, do something like this:

import threading

def my_inline_function(some_args):
    # do some stuff
    # args must be a tuple, even when there is only one argument
    download_thread = threading.Thread(target=function_that_downloads, name="Downloader", args=(some_args,))
    download_thread.start()
    # continue doing stuff

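You may want to check whether the thread has finished before going on to other things by calling download_thread.is_alive(), which returns False once the download is done. (The older isAlive() spelling was removed in Python 3.9.)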
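To tie this back to the question, here is a minimal Python 3 sketch of the "default image until the real one arrives" pattern. The names fetch_image, current_image and DEFAULT_IMAGE are hypothetical and not from the original code, and the download is only simulated with time.sleep, so treat it as an illustration rather than a finished implementation.

```python
import threading
import time

DEFAULT_IMAGE = "default.png"  # hypothetical placeholder image


def fetch_image(url, result):
    """Stand-in for the slow download; stores the local path in `result`."""
    time.sleep(0.5)  # simulated network latency (assumption, no real I/O)
    result["path"] = "cache/" + url.rsplit("/", 1)[-1]


def current_image(thread, result):
    """Return the downloaded image once the thread is done, else the default."""
    if thread.is_alive():
        return DEFAULT_IMAGE
    return result.get("path", DEFAULT_IMAGE)


result = {}
download_thread = threading.Thread(
    target=fetch_image,
    name="Downloader",
    args=("http://example.com/big.png", result),  # args is a tuple
)
download_thread.start()

# The message can be shown right away, while the download is still running.
image_while_downloading = current_image(download_thread, result)
print("Showing message with", image_while_downloading)

download_thread.join()  # only for this demo; real code would keep working
image_after_download = current_image(download_thread, result)
print("Image now available:", image_after_download)
```

The shared dict is just one simple way to hand the result back to the main thread; a queue or a future would work as well.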

Solution 2 - Python

Typically, the way to do this is to use a thread pool and a queue of downloads, where each task issues a signal (a.k.a. an event) when it has finished processing. You can do this with the threading module that Python provides.

To do that, I would use event objects (threading.Event) and the Queue module (named queue in Python 3).
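As a rough Python 3 sketch of that idea (not the answerer's own code), a small pool of worker threads can pull downloads off a queue.Queue and set a threading.Event when each one finishes. The names fake_download, worker and NUM_WORKERS are hypothetical, and the download is simulated so the example runs without network access.

```python
import queue
import threading
import time

NUM_WORKERS = 2  # hypothetical pool size


def fake_download(url):
    """Stand-in for a real download (assumption: no network access here)."""
    time.sleep(0.1)
    return "cache/" + url.rsplit("/", 1)[-1]


def worker(tasks, results):
    """Process (url, done_event) pairs until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        url, done = item
        try:
            results[url] = fake_download(url)
        finally:
            done.set()  # signal that this task has finished
            tasks.task_done()


tasks = queue.Queue()
results = {}
workers = [threading.Thread(target=worker, args=(tasks, results), daemon=True)
           for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()

# Queue one download and keep its event so the caller can check on it later.
url = "http://example.com/photo.png"
done = threading.Event()
tasks.put((url, done))

print("message shown; image ready yet?", done.is_set())
done.wait(timeout=5)  # block (with a timeout) only when the image is needed
print("image path:", results.get(url))

# Shut the pool down cleanly: one sentinel per worker.
for _ in workers:
    tasks.put(None)
for w in workers:
    w.join()
```

The caller decides when, or whether, to wait on the event, so showing the message never has to block on the download.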

However, a quick and dirty demonstration of what you can do using a simple threading.Thread implementation can be seen below:

import threading
import time
import urllib.error
import urllib.request


class ImageDownloader(threading.Thread):

    def __init__(self, function_that_downloads):
        threading.Thread.__init__(self)
        self.runnable = function_that_downloads
        # daemon thread: it will not keep the program alive on exit
        self.daemon = True

    def run(self):
        self.runnable()


def downloads():
    # binary mode, because urlopen(...).read() returns bytes in Python 3
    with open('somefile.html', 'wb') as f:
        try:
            f.write(urllib.request.urlopen('http://google.com').read())
        except urllib.error.HTTPError:
            f.write(b'sorry no dice')


print('hi there user')
print('how are you today?')
thread = ImageDownloader(downloads)
thread.start()
# poll the thread itself: the file exists as soon as open() creates it,
# so checking os.path.exists() would not show that the download is done
while thread.is_alive():
    print('i am executing but the thread has started to download')
    time.sleep(1)

print('look ma, thread is not alive: ', thread.is_alive())

It would probably make sense to not poll like I'm doing above. In which case, I would change the code to this:

import threading
import urllib.error
import urllib.request


class ImageDownloader(threading.Thread):

    def __init__(self, function_that_downloads):
        threading.Thread.__init__(self)
        self.runnable = function_that_downloads

    def run(self):
        self.runnable()


def downloads():
    # binary mode, because urlopen(...).read() returns bytes in Python 3
    with open('somefile.html', 'wb') as f:
        try:
            f.write(urllib.request.urlopen('http://google.com').read())
        except urllib.error.HTTPError:
            f.write(b'sorry no dice')


print('hi there user')
print('how are you today?')
thread = ImageDownloader(downloads)
thread.start()
# show message
thread.join()  # blocks until the download has finished
# display image

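Notice that there's no daemon flag set here: the main thread waits for the download with join(), so the thread does not need to be a daemon that gets killed when the program exits.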
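If waiting indefinitely is not acceptable, Thread.join() also accepts a timeout, after which is_alive() tells you whether the download actually finished. The sketch below uses that to fall back to a default image; slow_download and DEFAULT_IMAGE are hypothetical names, and the download is simulated with time.sleep.

```python
import threading
import time

DEFAULT_IMAGE = "default.png"  # hypothetical fallback image


def slow_download(result):
    """Stand-in for a slow download (assumption: simulated with sleep)."""
    time.sleep(0.5)
    result["path"] = "downloaded.png"


result = {}
thread = threading.Thread(target=slow_download, args=(result,))
thread.start()

print("message shown to the user")

# join() with a timeout returns after at most that long; it never raises
# on timeout, so is_alive() is the way to find out what happened.
thread.join(timeout=0.05)
if thread.is_alive():
    image = DEFAULT_IMAGE  # download still running, use the fallback
else:
    image = result["path"]
print("displaying", image)

# Later, once the download has had time to finish, the real image is there.
thread.join()
final_image = result["path"]
print("now displaying", final_image)
```

Because the thread is not a daemon, the interpreter still waits for it before exiting, so the file is not left half-written.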

Solution 3 - Python

I prefer to use gevent (http://www.gevent.org/) for this sort of thing:

from gevent import monkey
monkey.patch_all()  # patch blocking stdlib calls so they cooperate with gevent

import gevent

# function_to_download_image and display_message are placeholders for your own code
greenlet = gevent.spawn(function_to_download_image)
display_message()
# ... perhaps interaction with the user here

# this will wait for the operation to complete (optional)
greenlet.join()
# alternatively, if the image display is no longer important, this will abort it:
# greenlet.kill()

Everything runs in one thread, but whenever a blocking operation (such as network I/O) would stall, gevent switches to another runnable "greenlet". Worries about locking, etc. are much reduced, as there is only one thing running at a time, yet the image continues to download whenever the "main" context is itself waiting on a blocking operation.

Depending on how much, and what kind of thing, you want to do in the background, this can be either better or worse than threading-based solutions. It is certainly much more scalable (i.e., you can do many more things in the background), but that might not be a concern in the current situation.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type         Original Author      Original Content on Stackoverflow
Question             Dan Hlavenka         View Question on Stackoverflow
Solution 1 - Python  TorelTwiddler        View Answer on Stackoverflow
Solution 2 - Python  Mahmoud Abdelkader   View Answer on Stackoverflow
Solution 3 - Python  shaunc               View Answer on Stackoverflow