The following program, despite looking correct, is a race condition and has unpredictable behaviour:

#include 
#include 
#include 
using namespace std;

const int COUNT = 10'000'000;

mutex aLock;
mutex bLock;

struct S
{
	unsigned int a:9;
	unsigned int b:7;
} __attribute__((packed));

int main(int argc, char** argv)
{
	S s{};

	thread t1([&]() {
		scoped_lock lock(aLock);
		for(int i = 0; i < COUNT; ++i)
		{
			s.a = 0;
			s.a = 0b111111111;
		}
	});

	thread t2([&]() {
		scoped_lock lock(bLock);
		for(int i = 0; i < COUNT; ++i)
		{
			s.b = 0;
			s.b = 0b1111111;
		}
	});

	t1.join();
	t2.join();

	cout << "sizeof(S) = " << sizeof(S) << ", " << s.a << ", " << s.b << endl;

	return 1;
}

The proof is in the output of running this code several times:

sizeof(S) = 2, 511, 127
sizeof(S) = 2, 511, 127
sizeof(S) = 2, 0, 127
sizeof(S) = 2, 511, 0
sizeof(S) = 2, 511, 127
sizeof(S) = 2, 511, 127
sizeof(S) = 2, 511, 127
sizeof(S) = 2, 0, 127
sizeof(S) = 2, 0, 127
sizeof(S) = 2, 511, 127

The race condition and the problem here is that the C++ standard states that adjacent bit fields are a single object in memory and may be packed to share the same byte. The CPU cannot operate on single bits, it must load at least 1 byte at a time, and because of the overlap, loading bits of

a
also loads the bits of
b
. So the write to bit field
a
, even though protected with its own mutex, also overwrites the value of
b
. And vice versa.

The fix is to use one mutex to protect all adjacent bit fields. I say all because you have no guarantee that the CPU will be able to load 1 byte at a time. It may be limited to working on 32-bit values at a time; depending on the architecture.

Corrected program:

#include 
#include 
#include 
using namespace std;

const int COUNT = 10'000'000;

mutex abLock;

struct S
{
	unsigned int a:9;
	unsigned int b:7;
} __attribute__((packed));

int main(int argc, char** argv)
{
	S s{};

	thread t1([&]() {
		scoped_lock lock(abLock);
		for(int i = 0; i < COUNT; ++i)
		{
			s.a = 0;
			s.a = 0b111111111;
		}
	});

	thread t2([&]() {
		scoped_lock lock(abLock);
		for(int i = 0; i < COUNT; ++i)
		{
			s.b = 0;
			s.b = 0b1111111;
		}
	});

	t1.join();
	t2.join();

	cout << "sizeof(S) = " << sizeof(S) << ", " << s.a << ", " << s.b << endl;

	return 1;
}

2 Replies to “Bit fields and race conditions”

Leave a Reply to kobicaCancel reply