The following program, despite looking correct, is a race condition and has unpredictable behaviour:
#include#include #include using namespace std; const int COUNT = 10'000'000; mutex aLock; mutex bLock; struct S { unsigned int a:9; unsigned int b:7; } __attribute__((packed)); int main(int argc, char** argv) { S s{}; thread t1([&]() { scoped_lock lock(aLock); for(int i = 0; i < COUNT; ++i) { s.a = 0; s.a = 0b111111111; } }); thread t2([&]() { scoped_lock lock(bLock); for(int i = 0; i < COUNT; ++i) { s.b = 0; s.b = 0b1111111; } }); t1.join(); t2.join(); cout << "sizeof(S) = " << sizeof(S) << ", " << s.a << ", " << s.b << endl; return 1; }
The proof is in the output of running this code several times:
sizeof(S) = 2, 511, 127
sizeof(S) = 2, 511, 127
sizeof(S) = 2, 0, 127
sizeof(S) = 2, 511, 0
sizeof(S) = 2, 511, 127
sizeof(S) = 2, 511, 127
sizeof(S) = 2, 511, 127
sizeof(S) = 2, 0, 127
sizeof(S) = 2, 0, 127
sizeof(S) = 2, 511, 127
The race condition and the problem here is that the C++ standard states that adjacent bit fields are a single object in memory and may be packed to share the same byte. The CPU cannot operate on single bits, it must load at least 1 byte at a time, and because of the overlap, loading bits
a
b
a, even though protected with its own mutex, also overwrites the value
b
The fix is to use one mutex to protect all adjacent bit fields. I say all because you have no guarantee that the CPU will be able to load 1 byte at a time. It may be limited to working on 32-bit values at a time; depending on the architecture.
Corrected program:
#include#include #include using namespace std; const int COUNT = 10'000'000; mutex abLock; struct S { unsigned int a:9; unsigned int b:7; } __attribute__((packed)); int main(int argc, char** argv) { S s{}; thread t1([&]() { scoped_lock lock(abLock); for(int i = 0; i < COUNT; ++i) { s.a = 0; s.a = 0b111111111; } }); thread t2([&]() { scoped_lock lock(abLock); for(int i = 0; i < COUNT; ++i) { s.b = 0; s.b = 0b1111111; } }); t1.join(); t2.join(); cout << "sizeof(S) = " << sizeof(S) << ", " << s.a << ", " << s.b << endl; return 1; }
good point
Nice