What is the use of join() in Python threading?

PythonMultithreadingPython Multithreading

Python Problem Overview


I was studying the python threading and came across join().

The author told that if thread is in daemon mode then i need to use join() so that thread can finish itself before main thread terminates.

but I have also seen him using t.join() even though t was not daemon

example code is this

import threading
import time
import logging

logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s',
                    )

def daemon():
    logging.debug('Starting')
    time.sleep(2)
    logging.debug('Exiting')

d = threading.Thread(name='daemon', target=daemon)
d.setDaemon(True)

def non_daemon():
    logging.debug('Starting')
    logging.debug('Exiting')

t = threading.Thread(name='non-daemon', target=non_daemon)

d.start()
t.start()

d.join()
t.join()

i don't know what is use of t.join() as it is not daemon and i can see no change even if i remove it

Python Solutions


Solution 1 - Python

A somewhat clumsy ascii-art to demonstrate the mechanism: The join() is presumably called by the main-thread. It could also be called by another thread, but would needlessly complicate the diagram.

join-calling should be placed in the track of the main-thread, but to express thread-relation and keep it as simple as possible, I choose to place it in the child-thread instead.

without join:
+---+---+------------------                     main-thread
    |   |
    |   +...........                            child-thread(short)
    +..................................         child-thread(long)

with join
+---+---+------------------***********+###      main-thread
    |   |                             |
    |   +...........join()            |         child-thread(short)
    +......................join()......         child-thread(long)

with join and daemon thread
+-+--+---+------------------***********+###     parent-thread
  |  |   |                             |
  |  |   +...........join()            |        child-thread(short)
  |  +......................join()......        child-thread(long)
  +,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,     child-thread(long + daemonized)

'-' main-thread/parent-thread/main-program execution
'.' child-thread execution
'#' optional parent-thread execution after join()-blocked parent-thread could 
    continue
'*' main-thread 'sleeping' in join-method, waiting for child-thread to finish
',' daemonized thread - 'ignores' lifetime of other threads;
    terminates when main-programs exits; is normally meant for 
    join-independent tasks

So the reason you don't see any changes is because your main-thread does nothing after your join. You could say join is (only) relevant for the execution-flow of the main-thread.

If, for example, you want to concurrently download a bunch of pages to concatenate them into a single large page, you may start concurrent downloads using threads, but need to wait until the last page/thread is finished before you start assembling a single page out of many. That's when you use join().

Solution 2 - Python

Straight from the docs

> join([timeout]) Wait until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception – or until the optional timeout occurs.

This means that the main thread which spawns t and d, waits for t to finish until it finishes.

Depending on the logic your program employs, you may want to wait until a thread finishes before your main thread continues.

Also from the docs: > A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left.

A simple example, say we have this:

def non_daemon():
    time.sleep(5)
    print 'Test non-daemon'

t = threading.Thread(name='non-daemon', target=non_daemon)

t.start()

Which finishes with:

print 'Test one'
t.join()
print 'Test two'

This will output:

Test one
Test non-daemon
Test two

Here the master thread explicitly waits for the t thread to finish until it calls print the second time.

Alternatively if we had this:

print 'Test one'
print 'Test two'
t.join()

We'll get this output:

Test one
Test two
Test non-daemon

Here we do our job in the main thread and then we wait for the t thread to finish. In this case we might even remove the explicit joining t.join() and the program will implicitly wait for t to finish.

Solution 3 - Python

Thanks for this thread -- it helped me a lot too.

I learned something about .join() today.

These threads run in parallel:

d.start()
t.start()
d.join()
t.join()

and these run sequentially (not what I wanted):

d.start()
d.join()
t.start()
t.join()

In particular, I was trying to clever and tidy:

class Kiki(threading.Thread):
    def __init__(self, time):
        super(Kiki, self).__init__()
        self.time = time
        self.start()
        self.join()

This works! But it runs sequentially. I can put the self.start() in __ init __, but not the self.join(). That has to be done after every thread has been started.

join() is what causes the main thread to wait for your thread to finish. Otherwise, your thread runs all by itself.

So one way to think of join() as a "hold" on the main thread -- it sort of de-threads your thread and executes sequentially in the main thread, before the main thread can continue. It assures that your thread is complete before the main thread moves forward. Note that this means it's ok if your thread is already finished before you call the join() -- the main thread is simply released immediately when join() is called.

In fact, it just now occurs to me that the main thread waits at d.join() until thread d finishes before it moves on to t.join().

In fact, to be very clear, consider this code:

import threading
import time

class Kiki(threading.Thread):
    def __init__(self, time):
        super(Kiki, self).__init__()
        self.time = time
        self.start()

    def run(self):
        print self.time, " seconds start!"
        for i in range(0,self.time):
            time.sleep(1)
            print "1 sec of ", self.time
        print self.time, " seconds finished!"


t1 = Kiki(3)
t2 = Kiki(2)
t3 = Kiki(1)
t1.join()
print "t1.join() finished"
t2.join()
print "t2.join() finished"
t3.join()
print "t3.join() finished"

It produces this output (note how the print statements are threaded into each other.)

$ python test_thread.py
32   seconds start! seconds start!1

 seconds start!
1 sec of  1
 1 sec of 1  seconds finished!
 21 sec of
3
1 sec of  3
1 sec of  2
2  seconds finished!
1 sec of  3
3  seconds finished!
t1.join() finished
t2.join() finished
t3.join() finished
$ 

The t1.join() is holding up the main thread. All three threads complete before the t1.join() finishes and the main thread moves on to execute the print then t2.join() then print then t3.join() then print.

Corrections welcome. I'm also new to threading.

(Note: in case you're interested, I'm writing code for a DrinkBot, and I need threading to run the ingredient pumps concurrently rather than sequentially -- less time to wait for each drink.)

Solution 4 - Python

The method join()

> blocks the calling thread until the thread whose join() method is called is terminated.

Source : http://docs.python.org/2/library/threading.html

Solution 5 - Python

With join - interpreter will wait until your process get completed or terminated

>>> from threading import Thread
>>> import time
>>> def sam():
...   print 'started'
...   time.sleep(10)
...   print 'waiting for 10sec'
... 
>>> t = Thread(target=sam)
>>> t.start()
started

>>> t.join() # with join interpreter will wait until your process get completed or terminated
done?   # this line printed after thread execution stopped i.e after 10sec
waiting for 10sec
>>> done?

without join - interpreter wont wait until process get terminated,

>>> t = Thread(target=sam)
>>> t.start()
started
>>> print 'yes done' #without join interpreter wont wait until process get terminated
yes done
>>> waiting for 10sec

Solution 6 - Python

In python 3.x join() is used to join a thread with the main thread i.e. when join() is used for a particular thread the main thread will stop executing until the execution of joined thread is complete.

#1 - Without Join():
import threading
import time
def loiter():
    print('You are loitering!')
    time.sleep(5)
    print('You are not loitering anymore!')

t1 = threading.Thread(target = loiter)
t1.start()
print('Hey, I do not want to loiter!')
'''
Output without join()--> 
You are loitering!
Hey, I do not want to loiter!
You are not loitering anymore! #After 5 seconds --> This statement will be printed

'''
#2 - With Join():
import threading
import time
def loiter():
    print('You are loitering!')
    time.sleep(5)
    print('You are not loitering anymore!')

t1 = threading.Thread(target = loiter)
t1.start()
t1.join()
print('Hey, I do not want to loiter!')

'''
Output with join() -->
You are loitering!
You are not loitering anymore! #After 5 seconds --> This statement will be printed
Hey, I do not want to loiter! 

'''

Solution 7 - Python

This example demonstrate the .join() action:

import threading
import time

def threaded_worker():
    for r in range(10):
        print('Other: ', r)
        time.sleep(2)

thread_ = threading.Timer(1, threaded_worker)
thread_.daemon = True  # If the main thread is killed, this thread will be killed as well. 
thread_.start()

flag = True

for i in range(10):
    print('Main: ', i)
    time.sleep(2)
    if flag and i > 4:
        print(
            '''
            Threaded_worker() joined to the main thread. 
            Now we have a sequential behavior instead of concurrency.
            ''')
        thread_.join()
        flag = False

Out:

Main:  0
Other:  0
Main:  1
Other:  1
Main:  2
Other:  2
Main:  3
Other:  3
Main:  4
Other:  4
Main:  5
Other:  5

            Threaded_worker() joined to the main thread. 
            Now we have a sequential behavior instead of concurrency.
            
Other:  6
Other:  7
Other:  8
Other:  9
Main:  6
Main:  7
Main:  8
Main:  9

Solution 8 - Python

When making join(t) function for both non-daemon thread and daemon thread, the main thread (or main process) should wait t seconds, then can go further to work on its own process. During the t seconds waiting time, both of the children threads should do what they can do, such as printing out some text. After the t seconds, if non-daemon thread still didn't finish its job, and it still can finish it after the main process finishes its job, but for daemon thread, it just missed its opportunity window. However, it will eventually die after the python program exits. Please correct me if there is something wrong.

Solution 9 - Python

There are a few reasons for the main thread (or any other thread) to join other threads

  1. A thread may have created or holding (locking) some resources. The join-calling thread may be able to clear the resources on its behalf

  2. join() is a natural blocking call for the join-calling thread to continue after the called thread has terminated.

If a python program does not join other threads, the python interpreter will still join non-daemon threads on its behalf.

Solution 10 - Python

"What's the use of using join()?" you say. Really, it's the same answer as "what's the use of closing files, since python and the OS will close my file for me when my program exits?".

It's simply a matter of good programming. You should join() your threads at the point in the code that the thread should not be running anymore, either because you positively have to ensure the thread is not running to interfere with your own code, or that you want to behave correctly in a larger system.

You might say "I don't want my code to delay giving an answer" just because of the additional time that the join() might require. This may be perfectly valid in some scenarios, but you now need to take into account that your code is "leaving cruft around for python and the OS to clean up". If you do this for performance reasons, I strongly encourage you to document that behavior. This is especially true if you're building a library/package that others are expected to utilize.

There's no reason to not join(), other than performance reasons, and I would argue that your code does not need to perform that well.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser192362127View Question on Stackoverflow
Solution 1 - PythonDon QuestionView Answer on Stackoverflow
Solution 2 - PythondmgView Answer on Stackoverflow
Solution 3 - PythonKiki JewellView Answer on Stackoverflow
Solution 4 - PythonKetouemView Answer on Stackoverflow
Solution 5 - PythonMohideen bin MohammedView Answer on Stackoverflow
Solution 6 - PythonShishir NanotyView Answer on Stackoverflow
Solution 7 - PythonBenyamin JafariView Answer on Stackoverflow
Solution 8 - Pythonuser1342336View Answer on Stackoverflow
Solution 9 - PythonyoonghmView Answer on Stackoverflow
Solution 10 - PythonChris CogdonView Answer on Stackoverflow