Python threads all executing on a single core

Python, Multithreading, Performance

Python Problem Overview


I have a Python program that spawns many threads and runs 4 at a time, each performing an expensive operation. Pseudocode:

from threading import Thread

for item in items:
    # args must be a tuple, hence the trailing comma
    t = Thread(target=process, args=(item,))
    # if fewer than 4 threads are currently running, t.start(). Otherwise, add t to queue

But when I run the program, Activity Monitor in OS X shows that 1 of the 4 logical cores is at 100% and the others are at nearly 0. Obviously I can't force the OS to do anything, but I've never had to pay attention to performance in multi-threaded code like this before, so I'm wondering whether I'm missing or misunderstanding something.

Thanks.

Python Solutions


Solution 1 - Python

Note that in many cases (and virtually all cases where your "expensive operation" is a calculation implemented in Python), multiple threads will not actually run concurrently due to Python's Global Interpreter Lock (GIL).

> The GIL is an interpreter-level lock. This lock prevents execution of multiple threads at once in the Python interpreter. Each thread that wants to run must wait for the GIL to be released by the other thread, which means your multi-threaded Python application is essentially single threaded, right? Yes. Not exactly. Sort of.
>
> CPython uses what’s called “operating system” threads under the covers, which is to say each time a request to make a new thread is made, the interpreter actually calls into the operating system’s libraries and kernel to generate a new thread. This is the same as Java, for example. So in memory you really do have multiple threads and normally the operating system controls which thread is scheduled to run. On a multiple processor machine, this means you could have many threads spread across multiple processors, all happily chugging away doing work.
>
> However, while CPython does use operating system threads (in theory allowing multiple threads to execute within the interpreter simultaneously), the interpreter also forces the GIL to be acquired by a thread before it can access the interpreter and stack and can modify Python objects in memory all willy-nilly. The latter point is why the GIL exists: The GIL prevents simultaneous access to Python objects by multiple threads. But this does not save you (as illustrated by the Bank example) from being a lock-sensitive creature; you don’t get a free ride. The GIL is there to protect the interpreters memory, not your sanity.

See the Global Interpreter Lock section of Jesse Noller's post for more details.
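To see the effect described in the quote, a rough timing sketch like the one below may help. It assumes a standard CPython build with the GIL enabled; `count_down`, `run_serial`, and `run_threaded` are hypothetical names, and the loop is only a stand-in for the "expensive operation" in the question. On such a build the threaded run is typically no faster than the serial one, though exact timings vary by machine.

```python
import threading
import time


def count_down(n):
    """CPU-bound work in pure Python (hypothetical stand-in for the real task)."""
    while n > 0:
        n -= 1
    return n


def run_serial(n, repeats):
    """Run the task `repeats` times back to back and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, repeats):
    """Run the task in `repeats` threads at once and return elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 5_000_000
    print(f"serial:   {run_serial(n, 4):.2f}s")
    print(f"threaded: {run_threaded(n, 4):.2f}s")
```

If both numbers come out roughly equal, the threads are taking turns holding the GIL rather than running in parallel, which matches the single busy core seen in Activity Monitor.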

To get around this problem, check out Python's multiprocessing module.

> multiple processes (with judicious use of IPC) are[...] a much better approach to writing apps for multi-CPU boxes than threads.

-- Guido van Rossum (creator of Python)
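As a minimal sketch of what that switch can look like, `multiprocessing.Process` accepts the same `target` and `args` arguments as `Thread`, and a `multiprocessing.Queue` is one simple form of IPC for returning results. The `expensive` and `process` functions here are hypothetical placeholders, not the asker's code.

```python
import multiprocessing


def expensive(n):
    """Hypothetical CPU-bound calculation: sum of squares below n."""
    return sum(i * i for i in range(n))


def process(item, results):
    """Worker run in a child process; sends its result back through the queue."""
    results.put((item, expensive(item)))


if __name__ == "__main__":
    items = [200_000, 300_000, 400_000, 500_000]
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=process, args=(item, results))
        for item in items
    ]
    for w in workers:
        w.start()
    # Read one result per worker before joining, so no child is left
    # blocked while flushing its data to the queue.
    collected = dict(results.get() for _ in workers)
    for w in workers:
        w.join()
    print(collected)
```

Each worker is a separate interpreter process with its own GIL, so the four calculations can occupy four cores. The `if __name__ == "__main__":` guard matters on platforms that start children by re-importing the module.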

Solution 2 - Python

Python has a Global Interpreter Lock, which can prevent threads of interpreted code from being processed concurrently.

http://en.wikipedia.org/wiki/Global_Interpreter_Lock

http://wiki.python.org/moin/GlobalInterpreterLock

For ways to get around this, try the multiprocessing module, as advised here:

https://stackoverflow.com/questions/992136/Running-Separate-Python-Processes-Avoid-GIL
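For the "4 at a time" pattern in the question, a `multiprocessing.Pool` with four workers is one possible fit, since it handles the queueing itself. The sketch below assumes the work can be written as a function of one picklable argument; `process` and its prime-counting body are hypothetical stand-ins for the real expensive operation.

```python
from multiprocessing import Pool


def process(n):
    """Hypothetical CPU-bound task: count primes below n by trial division."""
    count = 0
    for candidate in range(2, n):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    items = [20_000, 30_000, 40_000, 50_000, 60_000, 70_000]
    # At most 4 worker processes run at once; remaining items wait their turn.
    with Pool(processes=4) as pool:
        results = pool.map(process, items)
    print(dict(zip(items, results)))
```

`pool.map` returns results in the same order as `items`, so no manual bookkeeping of running workers is needed.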

Solution 3 - Python

AFAIK, in CPython the Global Interpreter Lock means that no more than one thread can be executing Python code at any one time. Although this does not really affect anything on a single-processor/single-core machine, on a multicore machine it means that effectively only one thread is running at any one time, leaving all the other cores idle.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Rob Lourens | View Question on Stackoverflow |
| Solution 1 - Python | Gabriel Grant | View Answer on Stackoverflow |
| Solution 2 - Python | T.R. | View Answer on Stackoverflow |
| Solution 3 - Python | MAK | View Answer on Stackoverflow |