When interviewing engineers for a C++ programming position the question of deadlocks often comes up (ask me how I know this 😉 ). What is it? And how to avoid it?
Often times a deadlock occurs due to a wrong order of acquiring locks: multiple threads need to access 2 or more shared resources; each resource requires mutual exclusion. It goes something like this: thread 1 has successfully acquires the lock for resource 1, then tries to acquire the lock for resource 2. While on another CPU core, around the same time, thread 2 has successfully acquired the lock for resource 2, and now tries to acquire the lock for resource 1. Both threads are now stuck! Thread 1 holds resource 1 and waits for resource 2. Thread 2 holds resource 2 and waits for resource 1. Nasty business! A partial implementation illustrating this bug looks something like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
mutex m1, m2; thread t1([&]() { while(true) { m1.lock(); DO_SOME_WORK("Thread 1"); m2.lock(); DO_SOME_WORK("Thread 1"); m1.unlock(); m2.unlock(); } }); thread t2([&]() { while(true) { m2.lock(); DO_SOME_WORK("Thread 2"); m1.lock(); DO_SOME_WORK("Thread 2"); m1.unlock(); m2.unlock(); } }); |
Notice that thread t1 locks mutex m1 first, m2 second. Thread t2 does the opposite. Another thing I would like to point out in the above example is the explicit calls to lock() and unlock(). This is dangerous because 1) you may forget to call unlock() on a mutex you previously locked, and 2) in the presence of exceptions emitted from DO_SOME_WORK(...) the locks you acquired will not be automatically released. A perfect solution to both issues already exists: the RAII technique.
The way to improve all that is wrong with the above code is to always lock the mutex’es in the same order, and have a mutex owning local object handle the unlocking, whether exiting the function normally or due to an exception. But locking not just by explicitly writing the lock() calls in the right order; rather a more elegant, automatic solution is desired here. C++ has just the thing for you: std::lock (see here) and std::scoped_lock (and here). In short: std::lock will perform deadlock resolution magic, even if thread 1 calls std::lock(mutex1, mutex2);, while thread 2 calls std::lock(mutex2, mutex1);, but you will still need to call unlock() explicitly on the mutex’es if that is what you desire. Alternatively (and preferably) you will pass the mutex’es to std::scoped_lock which will use std::lock internally to guarantee no deadlocks take place: std::scoped_lock guard(mutex1, mutex2);. Deadlock free and exception safe (in terms of properly unlocking the mutex’es) partial implementation looks something like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
mutex m1, m2; thread t1([&]() { while(true) { scoped_lock guard(m1, m2); DO_SOME_WORK("Thread 1"); } }); thread t2([&]() { while(true) { scoped_lock guard(m2, m1); DO_SOME_WORK("Thread 2"); } }); |
The order in which the mutex’es are passed to std::scoped_lock is irrelevant. Internally std::lock will do the right thing. In presence of exceptions (which I am not catching in the code above, but you should 😉 ) the destructors of local guard objects will release the locks held.
Complete listing below (and on the web at GitHub: deadlock.cpp):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 |
#include <iostream> #include <mutex> #include <thread> #include <chrono> #include <cstdlib> #include <ctime> using namespace std; using namespace chrono; void DO_SOME_WORK(const char* msg) { { static mutex cout_lock; auto t = system_clock::to_time_t(system_clock::now()); lock_guard guard(cout_lock); cout << msg << " @ " << ctime(&t); } this_thread::sleep_for(milliseconds(rand() % 1)); } void BAD() { mutex m1, m2; thread t1([&]() { while(true) { m1.lock(); DO_SOME_WORK("Thread 1"); m2.lock(); DO_SOME_WORK("Thread 1"); m1.unlock(); m2.unlock(); } }); thread t2([&]() { while(true) { m2.lock(); DO_SOME_WORK("Thread 2"); m1.lock(); DO_SOME_WORK("Thread 2"); m1.unlock(); m2.unlock(); } }); t1.join(); t2.join(); } void GOOD() { mutex m1, m2; thread t1([&]() { while(true) { scoped_lock guard(m1, m2); DO_SOME_WORK("Thread 1"); } }); thread t2([&]() { while(true) { scoped_lock guard(m2, m1); DO_SOME_WORK("Thread 2"); } }); t1.join(); t2.join(); } int main() { srand(time(NULL)); //BAD(); GOOD(); } |
What about starvation?
What about it would you like me to write about? Causes? Prevention? Both? 😉
Line 16: lock_guard guard(cout_lock);
Shouldn’t be lock_guard guard(cout_lock);
Edit: lock_guard guard(cout_lock);
Why not?