Fun with unordered containers

I’ve been playing around with unordered_map and unordered_set to see how they grow when elements are inserted. Here’s what I learned so far: the default load factor is 1; this means the container will grow to accommodate at most 1 element per bucket. You can change the load factor with max_load_factor call followed by rehash call. I also wanted to see how different STL implementations behave, so I tested with GCC, Apple’s CLAND, and LLVM on my Mac. The test program creates an unordered_map and unordered_set. Loads it with 10'000'000 entries and prints out the container’s properties. I then clear the container and do the same test but reserve space for the entries. I print the container’s properties: size, bucket count, and load factor. Next I change the max_load_factor to 10 (allowing for up to 10 entries per bucket) and rehash with maximum 1/10th the bucket count.
The GCC implementation is more aggressive in terms of container growth; it creates more buckets than Apple’s CLANG and LLVM’s implementation which had identical results.

The takeaway here is that when using unordered containers it is good to either reserve the appropriate element count, or rehash after inserting the elements to reduce the bucket count and optimize memory use.

****************************************
* GCC -Ofast -march=native -lstdc++ *
****************************************
unordered_map
initial
size = 10000000
bucket count = 15345007
load factor = 0.651678

reserved
size = 10000000
bucket count = 10352717
load factor = 0.96593

re-hashed
size = 10000000
bucket count = 1056323
load factor = 9.4668

unordered_set
initial
size = 10000000
bucket count = 15345007
load factor = 0.651678

reserved
size = 10000000
bucket count = 10352717
load factor = 0.96593

re-hashed
size = 10000000
bucket count = 1056323
load factor = 9.4668

**********************************************
* Apple CLANG -Ofast -march=native -lc++ *
**********************************************
unordered_map
initial
size = 10000000
bucket count = 13169977
load factor = 0.759303

reserved
size = 10000000
bucket count = 10000019
load factor = 0.999998

re-hashed
size = 10000000
bucket count = 1000003
load factor = 9.99997

unordered_set
initial
size = 10000000
bucket count = 13169977
load factor = 0.759303

reserved
size = 10000000
bucket count = 10000019
load factor = 0.999998

re-hashed
size = 10000000
bucket count = 1000003
load factor = 9.99997
Program output.

Complete listing:

#include <unordered_map>
#include <unordered_set>
#include "trace.h"
using namespace std;

const int COUNT = 10'000'000;

int main(int argc, char** argv)
{
	unordered_map<int, int> m;
	for(int i = 0; i < COUNT; ++i)
		m[i] = i;

	trace("unordered_map");
	trace("initial");
	trace("size         = ", m.size());
	trace("bucket count = ", m.bucket_count());
	trace("load factor  = ", m.load_factor(), "\n");

	m.clear();
	m.reserve(COUNT);
	for(int i = 0; i < COUNT; ++i)
		m[i] = i;

	trace("reserved");
	trace("size         = ", m.size());
	trace("bucket count = ", m.bucket_count());
	trace("load factor  = ", m.load_factor(), "\n");

	m.max_load_factor(10);
	m.rehash(COUNT / 10);

	trace("re-hashed");
	trace("size         = ", m.size());
	trace("bucket count = ", m.bucket_count());
	trace("load factor  = ", m.load_factor(), "\n");

	unordered_set<int> s;
	for(int i = 0; i < COUNT; ++i)
		s.insert(i);

	trace("unordered_set");
	trace("initial");
	trace("size         = ", s.size());
	trace("bucket count = ", s.bucket_count());
	trace("load factor  = ", s.load_factor(), "\n");

	s.clear();
	s.reserve(COUNT);
	for(int i = 0; i < COUNT; ++i)
		s.insert(i);

	trace("reserved");
	trace("size         = ", s.size());
	trace("bucket count = ", s.bucket_count());
	trace("load factor  = ", s.load_factor(), "\n");

	s.max_load_factor(10);
	s.rehash(COUNT / 10);

	trace("re-hashed");
	trace("size         = ", s.size());
	trace("bucket count = ", s.bucket_count());
	trace("load factor  = ", s.load_factor(), "\n");

	return 1;
}

#include <unordered_map>

#include <unordered_set>

#include "trace.h"

using namespace std;

const int COUNT = 10'000'000;

int main(int argc, char** argv)

{

unordered_map<int, int> m;

for(int i = 0; i < COUNT; ++i)

m[i] = i;

trace("unordered_map");

trace("initial");

trace("size = ", m.size());

trace("bucket count = ", m.bucket_count());

trace("load factor = ", m.load_factor(), "\n");

m.clear();

m.reserve(COUNT);

for(int i = 0; i < COUNT; ++i)

m[i] = i;

trace("reserved");

trace("size = ", m.size());

trace("bucket count = ", m.bucket_count());

trace("load factor = ", m.load_factor(), "\n");

m.max_load_factor(10);

m.rehash(COUNT / 10);

trace("re-hashed");

trace("size = ", m.size());

trace("bucket count = ", m.bucket_count());

trace("load factor = ", m.load_factor(), "\n");

unordered_set<int> s;

for(int i = 0; i < COUNT; ++i)

s.insert(i);

trace("unordered_set");

trace("initial");

trace("size = ", s.size());

trace("bucket count = ", s.bucket_count());

trace("load factor = ", s.load_factor(), "\n");

s.clear();

s.reserve(COUNT);

for(int i = 0; i < COUNT; ++i)

s.insert(i);

trace("reserved");

trace("size = ", s.size());

trace("bucket count = ", s.bucket_count());

trace("load factor = ", s.load_factor(), "\n");

s.max_load_factor(10);

s.rehash(COUNT / 10);

trace("re-hashed");

trace("size = ", s.size());

trace("bucket count = ", s.bucket_count());

trace("load factor = ", s.load_factor(), "\n");

return 1;

}

trace.h:

#pragma once

#include <iostream>
#include <mutex>
#include "mutex.h"

namespace { static inline fast_mutex kStdOutLock; }

template<typename... Ts>
inline void trace(Ts&&... args)
{
	std::scoped_lock lock(kStdOutLock);
	(std::cout << ... << args) << std::endl;
}

#pragma once

#include <iostream>

#include <mutex>

#include "mutex.h"

namespace { static inline fast_mutex kStdOutLock; }

template<typename... Ts>

inline void trace(Ts&&... args)

{

std::scoped_lock lock(kStdOutLock);

(std::cout << ... << args) << std::endl;

}

3 Replies to “Fun with unordered containers”

kobica says:

February 16, 2019 at 7:54 pm

I’ve never used fast_mutex. Why did you pick it over a plain mutex?

Loading...

1. Martin Vorbrodt says:
  
  February 17, 2019 at 10:09 am
  
  it’s an implementation from my previous post; check it out.
  
  Loading...
  
kobica says:

February 17, 2019 at 11:42 am

thank you

Loading...

Practical Modern C++

Fun with unordered containers

Like this:

Related

3 Replies to “Fun with unordered containers”

Leave a ReplyCancel reply

Practical Modern C++

Fun with unordered containers

Share this:

Like this:

Related

3 Replies to “Fun with unordered containers”

Leave a ReplyCancel reply