std::unique_ptr, or how to express explicit ownership

When we talk about std::unique_ptr we must mention the idea of explicit resource ownership as well as the concept of a resource source and sink. By wrapping a pointer inside std::unique_ptr we state that whoever holds the std::unique_ptr owns the resource explicitly: has complete control over its lifetime. Prior to C++11 this was expressed using std::auto_ptr. Modern C++ deprecated std::auto_ptr and addressed its shortcomings by introducing the std::unique_ptr.

A source is a function that creates a resource then relinquishes its ownership; in other words, it gives the ownership to whoever called it, and from then on, the caller is responsible for releasing that resource.

A sink is a function which accepts a resource as a parameter and assumes ownership of it; in other words, it promises to release said resource once it’s done executing.

Transfer of ownership must be stated explicitly for named instances (variables of type std::unique_ptr) with std::move. Unnamed temporaries can be passed into a variable or as a function parameter without invoking std::move (this will be clearly illustrated in the code sample below).
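A minimal sketch of the source and sink idea, assuming a hypothetical Widget resource and made-up function names (make_widget, consume_widget); it is not the unique.cpp listing, just one way the rules above could look in code:

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Widget { int value = 0; }; // hypothetical resource type

// Source: creates the resource and hands ownership to the caller.
std::unique_ptr<Widget> make_widget(int v)
{
    auto w = std::make_unique<Widget>();
    w->value = v;
    return w; // a returned local is moved implicitly
}

// Sink: takes the unique_ptr by value, so it owns the resource
// and releases it when the parameter goes out of scope.
int consume_widget(std::unique_ptr<Widget> w)
{
    return w ? w->value : -1;
}

void example()
{
    // Unnamed temporary: no std::move needed.
    assert(consume_widget(make_widget(1)) == 1);

    // Named instance: the transfer must be spelled out with std::move.
    auto w = make_widget(2);
    assert(consume_widget(std::move(w)) == 2);
    assert(w == nullptr); // the moved-from unique_ptr is empty
}
```

Passing w without std::move would not compile, which is exactly the point: the compiler forces the hand-off to be visible at the call site.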

Explicit ownership of a resource can be “upgraded” to shared ownership by std::move’ing the std::unique_ptr into a std::shared_ptr. It resets the std::unique_ptr to point back at nullptr and initializes the std::shared_ptr with reference count of 1 (also illustrated in the code below).
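As a small sketch of that upgrade (the helper name upgrade is hypothetical, chosen only for illustration):

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper: turns exclusive ownership into shared ownership.
std::shared_ptr<int> upgrade(std::unique_ptr<int> p)
{
    return std::shared_ptr<int>(std::move(p));
}

void example()
{
    auto u = std::make_unique<int>(42);
    std::shared_ptr<int> s = std::move(u); // converting constructor of shared_ptr
    assert(u == nullptr);       // the unique_ptr no longer owns anything
    assert(s.use_count() == 1); // a single shared owner so far
    assert(*s == 42);
}
```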

Worth mentioning is the fact that std::unique_ptr can own arrays of objects allocated with operator new[]. A partial specialization of the std::unique_ptr template for array types also overloads operator[] for easy array element access; it mirrors the syntax of native arrays. However, there are limitations to storing arrays inside std::unique_ptr: when creating an array of primitive types there is no way to give every element the same custom initial value (plain new T[n] leaves the elements indeterminate; new T[n]() and std::make_unique<T[]>(n) zero them); when creating an array of user-defined types (classes and structs), said type must have a default constructor. Moreover, the size of the array is lost: there is no way to query the std::unique_ptr and find out how many elements the array holds. For those reasons I recommend you stick with std::vector: its elements can be initialized with a default or custom value and you can always call size() on it.
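A short side-by-side sketch of the two options (the element count and values are arbitrary, picked only for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

void example()
{
    const std::size_t n = 4;

    // Array form: operator[] works, but the size has to be tracked separately.
    std::unique_ptr<int[]> a(new int[n]); // elements are uninitialized here
    for (std::size_t i = 0; i < n; ++i)
        a[i] = static_cast<int>(i);
    assert(a[3] == 3);

    // std::vector: custom initial value for every element plus a size() query.
    std::vector<int> v(n, 7);
    assert(v.size() == n);
    assert(v[0] == 7 && v[3] == 7);
}
```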

The example program below, using inline comments, further details how to transfer the resource ownership and how to express the idea of source and sink functions.

Complete listing on GitHub: unique.cpp and T.hpp. Merry Christmas and Happy New Year!

When exactly does the std::shared_ptr take ownership?

In a post I wrote a few days ago titled “A word on std::shared_ptr” I was mistakenly arguing that, should the std::shared_ptr fail to allocate its control block (where the reference count and other information is stored), the passed raw pointer would not be deleted. Example:

auto p = std::shared_ptr<T>(new T);

Here, if the allocation and construction of T succeeded, the code would then enter shared pointer’s constructor. Inside this constructor the allocation of the control block would take place. If that failed, I argued, the newly allocated instance of T would leak. That is not what actually happens!

After re-reading the C++ specification, and looking over the shared pointer’s implementation supplied with Visual Studio 2017, it is now clear to me that the allocated instance of T will in fact be deleted.

Since learning of this mistake, I have taken down that post and redirected the original URL here, where I take a more in-depth look at the std::shared_ptr.

I would like to thank the kind people on reddit for taking the time to set me straight! It was both a humbling and an embarrassing experience, but I am a better C++ programmer because of it.

In-depth look at C++ shared pointers

First, you should use std::make_shared (most of the time) to create shared pointers. It is optimized to perform only one allocation for both the reference count / control block and the object it holds.

It is also safer to use std::make_shared in the presence of exceptions:

some_function(std::shared_ptr<T>(new T), std::shared_ptr<T>(new T));

Prior to C++17, one of the possible orders of events could have been:
1) new T
2) new T
3) std::shared_ptr<T>(...)
4) std::shared_ptr<T>(...)

So in the worst case the code could leak an instance of T if the second allocation in step #2 failed. That is not the case with C++17. It has more stringent requirements on the order of function parameter evaluation. C++17 dictates that each parameter must be evaluated completely before the next one is handled. So the new order of events becomes:
1) 1st new T
2) 1st std::shared_ptr<T>(...)
3) 2nd new T
4) 2nd std::shared_ptr<T>(...)
or:
1) 2nd new T
2) 2nd std::shared_ptr<T>(...)
3) 1st new T
4) 1st std::shared_ptr<T>(...)

Shared pointers can also be used with arrays by providing a custom deleter, like this:

auto p = std::shared_ptr<T>(new T[N], [](T* ptr) { delete [] ptr; });

Unfortunately you can not use std::make_shared for that (array support was only added to it in C++20), and it is much easier to use arrays with std::unique_ptr since std::shared_ptr<T> does not provide an overloaded operator [] (C++17 added one for the std::shared_ptr<T[]> form), but about that in my next post 😉

Disadvantages of using std::make_shared:

std::make_shared can only construct instances of T using T’s public constructors. If you need to create an instance using private or protected constructor you have to do it from within a member or static-member function of T. It is not possible to declare std::make_shared to be a friend of T (technically you can declare it to be a friend, but during invocation of std::make_shared it will fail a static_assert inside std::is_constructible type trait class, so for all intents and purposes friendship is not possible here).
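One way this can look in practice is a static factory that calls new itself. This is only a sketch; the class name Secret and the factory name create are hypothetical:

```cpp
#include <cassert>
#include <memory>

class Secret // hypothetical class with a private constructor
{
public:
    static std::shared_ptr<Secret> create(int v)
    {
        // std::make_shared<Secret>(v) would not compile here because the
        // constructor is not accessible from inside make_shared.
        // Calling new from a static member works, at the cost of a
        // separate allocation for the control block.
        return std::shared_ptr<Secret>(new Secret(v));
    }

    int value() const { return value_; }

private:
    explicit Secret(int v) : value_(v) {}
    int value_;
};
```

Usage is then simply auto s = Secret::create(5); and direct construction of Secret from outside the class is rejected by the compiler.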

std::weak_ptr makes things even more complicated:

Finally, let’s take a look at when instances of objects held by std::shared_ptr are destroyed and their memory released. Under normal circumstances the destructor of T held by a shared pointer will be called when the last shared pointer is destroyed, and the memory where T lived will be released.

That is not the case if the shared pointer of T was constructed using std::make_shared and std::weak_ptr was made to point at it.

In order to function properly std::weak_ptr must hold a reference to the shared pointer’s control block so that it can: 1) answer the call to use_count() and 2) return a nullptr when lock() is called on it after the last shared pointer went out of scope. If the shared pointer’s control block and the instance of T lived in the same memory block, that memory can not be freed until all shared and weak pointers referencing it go away. Now, the destructor of T will be called when the last shared pointer goes away, but the memory will linger until the remaining weak pointers are gone.
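The observable part of this behavior can be sketched as follows (the Tracked type is hypothetical; the lingering memory itself is not visible from standard C++, so the sketch only shows when the destructor runs and what the weak pointer reports):

```cpp
#include <cassert>
#include <memory>

struct Tracked // hypothetical type that reports whether it is alive
{
    explicit Tracked(bool& alive) : alive_(alive) { alive_ = true; }
    ~Tracked() { alive_ = false; }
    bool& alive_;
};

void example()
{
    bool alive = false;
    auto s = std::make_shared<Tracked>(alive);
    std::weak_ptr<Tracked> w = s;
    assert(alive);
    assert(w.use_count() == 1);
    assert(w.lock() != nullptr);

    s.reset();       // last shared pointer gone: ~Tracked() runs right here
    assert(!alive);
    assert(w.expired());
    assert(w.lock() == nullptr);
    // The control block stays allocated until w is destroyed too; with
    // std::make_shared the storage that held Tracked usually shares that block.
}
```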

I didn’t mention anything about passing shared pointers as function arguments or return values… but that’s a topic about higher level design of object ownership and lifetime; maybe I’ll write about it after I cover unique pointers.

Avoiding deadlocks the C++ way

When interviewing engineers for a C++ programming position the question of deadlocks often comes up (ask me how I know this 😉 ). What is a deadlock, and how do you avoid it?

Oftentimes a deadlock occurs due to a wrong order of acquiring locks: multiple threads need to access 2 or more shared resources; each resource requires mutual exclusion. It goes something like this: thread 1 has successfully acquired the lock for resource 1, then tries to acquire the lock for resource 2. Meanwhile on another CPU core, around the same time, thread 2 has successfully acquired the lock for resource 2, and now tries to acquire the lock for resource 1. Both threads are now stuck! Thread 1 holds resource 1 and waits for resource 2. Thread 2 holds resource 2 and waits for resource 1. Nasty business! A partial implementation illustrating this bug looks something like this:

Notice that thread t1 locks mutex m1 first, m2 second. Thread t2 does the opposite. Another thing I would like to point out in the above example is the explicit calls to lock() and unlock(). This is dangerous because 1) you may forget to call unlock() on a mutex you previously locked, and 2) in the presence of exceptions emitted from DO_SOME_WORK(...) the locks you acquired will not be automatically released. A perfect solution to both issues already exists: the RAII technique.

The way to fix all that is wrong with the above code is to always lock the mutex’es in the same order, and to have a mutex-owning local object handle the unlocking, whether exiting the function normally or due to an exception. Explicitly writing the lock() calls in the right order is not enough though; a more elegant, automatic solution is desired here. C++ has just the thing for you: std::lock (see here) and std::scoped_lock (and here). In short: std::lock will perform deadlock avoidance magic, even if thread 1 calls std::lock(mutex1, mutex2);, while thread 2 calls std::lock(mutex2, mutex1);, but you will still need to call unlock() explicitly on the mutex’es if that is what you desire. Alternatively (and preferably) you will pass the mutex’es to std::scoped_lock which will use std::lock internally to guarantee no deadlocks take place: std::scoped_lock guard(mutex1, mutex2);. A deadlock-free and exception-safe (in terms of properly unlocking the mutex’es) partial implementation looks something like this:

The order in which the mutex’es are passed to std::scoped_lock is irrelevant. Internally std::lock will do the right thing. In presence of exceptions (which I am not catching in the code above, but you should 😉 ) the destructors of local guard objects will release the locks held.
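To make that concrete, here is a minimal sketch that assumes a hypothetical Account type guarded by its own mutex and a hypothetical transfer function; it is not the deadlock.cpp listing and needs C++17 for std::scoped_lock:

```cpp
#include <cassert>
#include <mutex>

struct Account // hypothetical shared resource guarded by its own mutex
{
    std::mutex m;
    int balance = 0;
};

void transfer(Account& from, Account& to, int amount)
{
    if (&from == &to)
        return; // passing the same mutex twice to scoped_lock would be an error

    std::scoped_lock guard(from.m, to.m); // acquires both without deadlocking
    from.balance -= amount;
    to.balance += amount;
} // guard releases both mutexes here, also when an exception propagates

void example()
{
    Account a, b;
    a.balance = 100;
    transfer(a, b, 30); // locks a.m then b.m (as written)
    transfer(b, a, 10); // opposite argument order is fine too
    assert(a.balance == 80 && b.balance == 20);
}
```

The example calls transfer from a single thread to keep it short; the interesting case is one thread calling transfer(a, b, ...) while another calls transfer(b, a, ...), which is the lock ordering that deadlocks with hand-written lock() calls.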

Complete listing below (and on the web at GitHub: deadlock.cpp):

Multi-hashing

Yes I totally invented this term 😛 What I mean by it is producing multiple hashes from a single key. Like this (if the syntax is unfamiliar to you read this):

Or like this (for non-template version which returns a vector):

Why? One place where I needed such sorcery was my bloom filter implementation. The idea is simple: one key, multiple hashes, repeatable (multiple calls with the same key produce the same hashes). But how? STL only comes with one hashing function. True, but it comes with multiple random number generators which can be seeded with a hash!

The solution then is to hash once, seed the random number generator, and make multiple calls to the RNG, like this (hash.hpp):
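A minimal sketch of that idea, not the hash.hpp listing itself: the function name hash_n and the choice of std::mt19937_64 are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: produce n hashes from a single key.
std::vector<std::size_t> hash_n(const std::string& key, std::size_t n)
{
    // Hash once and use the result as the seed of the generator.
    std::mt19937_64 rng(std::hash<std::string>{}(key));

    std::vector<std::size_t> hashes;
    hashes.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        hashes.push_back(static_cast<std::size_t>(rng())); // each draw is one "hash"
    return hashes;
}

void example()
{
    // Same key, same sequence, no matter how many times it is asked for.
    assert(hash_n("Vorbrodt's C++ Blog", 3) == hash_n("Vorbrodt's C++ Blog", 3));
}
```

Keep in mind that std::hash is only required to be consistent within a single run of a program, so values produced this way should not be persisted and compared across runs or platforms.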

You can use it like this (multi_hash.cpp):

HashN(‘Vorbrodt’s C++ Blog’):
1977331388
699200791
437177953

HashN(‘Vorbrodt’s C++ Blog’):
1977331388
699200791
437177953

HashN(‘https://vorbrodt.blog’):
1924360287
1619619789
1594567998

HashNT(‘Vorbrodt’s C++ Blog’):
1977331388
699200791
437177953

HashNT(‘Vorbrodt’s C++ Blog’):
1977331388
699200791
437177953

HashNT(‘https://vorbrodt.blog’):
1924360287
1619619789
1594567998

Program output.

#pragma

There are two useful #pragma directives I like to use in my code: one lets the preprocessor know that you want to include a header file only once, and another deals with structure packing.

Instead of using the header include guards, which are ugly as sin, use #pragma once at the beginning of your header files, like this:

For structure packing, use #pragma pack directive. It tells the compiler about the default field alignment. On Clang 8.0.0 the sizeof of this structure is 32 bytes:

We can pack it down to 27 bytes by using this directive (it tells the compiler to align all member fields on a one-byte boundary; this is useful when designing efficient network protocols or data serialization):

You can also show, at compile time, what the current packing alignment is with #pragma pack(show). Current alignment can be pushed onto a stack, then reverted back, with #pragma pack(push, 1) followed by #pragma pack(pop).
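A small sketch of push and pop in action. The structs Loose and Packed are hypothetical and much smaller than the 32/27 byte structure discussed above, and #pragma pack itself is a compiler extension (supported by GCC, Clang and MSVC), so treat this as an illustration rather than portable C++:

```cpp
#include <cassert>
#include <cstddef>

struct Loose { char c; int i; double d; }; // default alignment, padding allowed

#pragma pack(push, 1) // remember the current alignment, then pack to 1 byte
struct Packed { char c; int i; double d; };
#pragma pack(pop)     // restore the alignment that was in effect before

// With 1-byte packing the size is just the sum of the member sizes.
static_assert(sizeof(Packed) == sizeof(char) + sizeof(int) + sizeof(double),
              "no padding expected in the packed struct");
static_assert(sizeof(Loose) >= sizeof(Packed),
              "padding can only make the unpacked struct bigger");

void example()
{
    // In the packed struct the int starts right after the char.
    assert(offsetof(Packed, i) == 1);
}
```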

Complete listing (pragma.cpp):

32, 27

Program output.

Exception safe assignment

Longer title: exception safe assignment operator of resource owning objects. Uff. Because the object owns a resource, how do we write an exception safe assignment operator which has to free the old resource and allocate the new one? By exception safe I don’t mean that it will never throw; that’s not possible. Instead, I mean safe in the sense that it either succeeds OR, in case of an exception, the state of the assigned-to object is exactly as it was prior to the assignment. Like this:

If assignment operator s1 = s2 throws an exception, we want the state of s1 and s2 to be as it was in line #3.

The trick is twofold: 1) a copy constructor is needed, and 2) a noexcept swap function. Like this:

Here the copy constructor allocates the new resource first, then copies its content; the swap function just swaps pointers to the resources, which is always a noexcept operation. Having implemented a copy constructor and swap function we can now implement every assignment operator to have a strong exception guarantee like this:

Here’s how it works: we first make a temporary copy, which does the resource allocation. At this stage exceptions can be thrown, but we have not yet modified the assigned to object. Only after the resource allocation succeeds do we perform the noexcept swap. The destructor of your temporary object will take care of cleaning up the currently owned resource (that’s RAII at its best).
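Putting the pieces together, a compact sketch of the copy-and-swap approach could look like this (the Buffer class is hypothetical and stands in for any resource-owning type; it is not the S struct from assignment.cpp):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class Buffer // hypothetical resource-owning type
{
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Copy constructor: allocate first, then copy the content.
    Buffer(const Buffer& other) : size_(other.size_), data_(new int[other.size_])
    {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    ~Buffer() { delete[] data_; }

    // Swapping a size and a pointer can never throw.
    void swap(Buffer& other) noexcept
    {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    Buffer& operator=(const Buffer& other)
    {
        Buffer temp(other); // may throw, but *this is still untouched
        swap(temp);         // cannot throw
        return *this;       // temp's destructor frees the old resource
    }

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};
```

If the allocation inside the copy constructor throws std::bad_alloc, the assignment operator exits before the swap, so the left-hand side keeps its old size and content.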

Complete listing (assignment.cpp):

S()
S()
operator = (const S&)
S(const S&)
~S()
~S()
~S()

Program output.

Hashing the C++ way

Modern C++ brought us std::hash template (read more about it here). In short: it’s a stateless function object that implements operator() which takes an instance of a type as parameter and returns its hash as size_t. It has specializations for all primitive types as well as some library types. You can also specialize it yourself for your own data types (don’t forget to put your specialization in namespace std). Let’s see how it works by hashing some ints, chars, floats, pointers, strings, and our own custom data type. Pay close attention to the hash values of ints and chars…
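For the custom data type part, a specialization might look like the sketch below. The Site struct and the way the two member hashes are combined are assumptions for illustration; the combining formula is the widely used boost-style one, not necessarily what hash.cpp does.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

struct Site // hypothetical custom data type
{
    std::string name;
    std::string url;
};

namespace std
{
    template<>
    struct hash<Site>
    {
        size_t operator()(const Site& s) const noexcept
        {
            size_t h1 = hash<string>{}(s.name);
            size_t h2 = hash<string>{}(s.url);
            // Order-sensitive combine, so swapping name and url
            // is very unlikely to give the same result.
            return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
        }
    };
}

void example()
{
    Site a{"Vorbrodt's C++ Blog", "https://vorbrodt.blog"};
    Site b{"Vorbrodt's C++ Blog", "https://vorbrodt.blog"};
    assert(std::hash<Site>{}(a) == std::hash<Site>{}(b)); // equal values, equal hashes
}
```

With the specialization in place, Site can be used as a key in std::unordered_map or std::unordered_set once an operator== is provided as well.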

hash.cpp:

Hash of ‘1’: 1
Hash of ‘2’: 2
Hash of ‘3’: 3


Hash of ‘A’: 65
Hash of ‘B’: 66
Hash of ‘C’: 67


Hash of ‘1.1’: 1066192077
Hash of ‘1.2’: 1067030938
Hash of ‘1.3’: 1067869799


Hash of ‘0x7f95fdd000a0’: 6424303057458324486
Hash of ‘0x7f95fdd000a1’: 6736290418105006831
Hash of ‘0x7f95fdd000a2’: 13890240933949840298


Hash of ‘Vorbrodt’s C++ Blog’: 435643587581864924
Hash of ‘Vorbrodt’s C++ Blog’: 435643587581864924
Hash of ‘https://vorbrodt.blog’: 13293888041758778516


Hash of ‘Vorbrodt’s C++ Blog,https://vorbrodt.blog’: 8570762348687434484
Hash of ‘Vorbrodt’s C++ Blog,https://vorbrodt.blog’: 8570762348687434484
Hash of ‘https://vorbrodt.blog,Vorbrodt’s C++ Blog’: 13000220508453909292

Data alignment the C++ way

Before modern C++ the only way to align variables or structures on a given byte boundary was to inject padding; to align a struct to 16 bytes you had to do this:

Not any more! Modern C++ introduced a keyword just for that: alignas (read more about it here). Now you can specify struct’s alignment like this:

This can be of great help when dealing with constructive or destructive interference of L1 cache lines. You can also space local variables apart, as well as struct/class members. Here’s a complete example (alignas.cpp):
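A minimal sketch of these uses of alignas, assuming hypothetical types Vec3 and Counters and a 64-byte cache line (the real line size depends on the CPU); it is not the alignas.cpp listing:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct alignas(16) Vec3 { float x, y, z; }; // whole struct aligned to 16 bytes

struct Counters
{
    alignas(64) int a; // each member starts on its own 64-byte boundary
    alignas(64) int b;
};

static_assert(alignof(Vec3) == 16, "type alignment follows alignas");
static_assert(sizeof(Vec3) % 16 == 0, "size is padded to a multiple of the alignment");
static_assert(offsetof(Counters, b) == 64, "b lands on the next 64-byte boundary");

// Hypothetical helper: checks an address against a boundary.
bool is_aligned(const void* p, std::size_t boundary)
{
    return reinterpret_cast<std::uintptr_t>(p) % boundary == 0;
}

void example()
{
    alignas(16) int x = 0; // local variables can be aligned as well
    alignas(16) int y = 0;
    assert(is_aligned(&x, 16) && is_aligned(&y, 16));

    Counters c{};
    assert(is_aligned(&c.a, 64) && is_aligned(&c.b, 64));
}
```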

sizeof(Old): 16
sizeof(New): 16
Address of ‘x’      : 0x7ffee4a448c0
Address of ‘y’      : 0x7ffee4a448d0
Address of ‘z’      : 0x7ffee4a448e0
Distance ‘x’ to ‘y’ : 16
Distance ‘y’ to ‘z’ : 16
sizeof(Empty)  : 1
sizeof(Empty64): 64
sizeof(Full): 64

Program output.

Simple file I/O

I was playing around with file I/O the C++ way and decided to create a file hashing program using ifstream and Botan crypto library. The program reads an entire file specified as the command line argument and takes the SHA1 hash of the content. It’s amazing what you can accomplish with well designed frameworks in very little code. Here’s the program (file_hash.cpp):

Better bloom filter

Based on this implementation, it supports multiple hashes for a lower false positive rate. It can be initialized with the size in bits and the number of hashes to perform, like this: bloom_filter bloom(128, 5);
As always, complete implementation on GitHub: bloom.hpp.

Bloom Filters

From Wikipedia:

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set.

In other words, given a set of elements, bloom filter can tell you that: A) given element is definitely not in the set, or B) given element is maybe in the set. It can give you a false positive: it can say that an element is in the set when it in fact is not. But it will never give you a false negative: it will never say that an element is not in the set when in fact it is.
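Those two guarantees can be sketched in a few lines. The class below is hypothetical (it is not bloom.hpp); it assumes a non-zero number of bits and derives its hash values from a random number generator seeded with std::hash of the key:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <string>
#include <vector>

class tiny_bloom // hypothetical minimal bloom filter; bits must be > 0
{
public:
    tiny_bloom(std::size_t bits, std::size_t hashes)
        : bits_(bits, false), hashes_(hashes) {}

    void add(const std::string& key)
    {
        std::mt19937_64 rng(std::hash<std::string>{}(key));
        for (std::size_t i = 0; i < hashes_; ++i)
            bits_[rng() % bits_.size()] = true; // set one bit per hash
    }

    // false: definitely not in the set; true: maybe in the set.
    bool contains(const std::string& key) const
    {
        std::mt19937_64 rng(std::hash<std::string>{}(key));
        for (std::size_t i = 0; i < hashes_; ++i)
            if (!bits_[rng() % bits_.size()])
                return false; // a missing bit proves the key was never added
        return true;
    }

private:
    std::vector<bool> bits_;
    std::size_t hashes_;
};
```

A key that was added always finds all of its bits set, which is why false negatives cannot happen; a key that was never added may find them set by other keys, which is where false positives come from.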

In my past life I used bloom filters to check whether or not I should perform an expensive database index search 🙂

In the following example I construct a bloom filter given a set of strings set1, I then verify that each element of set1 is in the set according to the bloom filter. Finally I try a different set of elements set2, and test what the bloom filter says about those elements. Given a big enough bloom filter I get 100% correct answers (that none of the elements in set2 are present). Here’s the code (bloom.cpp):

Contains “Martin” : 1
Contains “Vorbrodt” : 1
Contains “C++” : 1
Contains “Blog” : 1
Contains “Not” : 0
Contains “In” : 0
Contains “The” : 0
Contains “Set” : 0

Program output.

My implementation of the bloom filter is primitive: it uses only one hashing function which increases false positives, but it is very short and clean and can be built upon; maybe at some point I’ll write a full blown implementation; though there’s plenty of examples online; I want this post to be an introduction to the topic.

In any case, here’s my implementation of the bloom filter (bloom.hpp):

C++ Attributes

C++11 introduced standard attributes: a way to mark fragments of code with useful information for the developer or optimization information for the compiler. See a complete list of standard attributes here, Clang attributes here, and Microsoft attributes here. I will go over a few of them in this post.

  • [[nodiscard]] – when specified with a function declaration, tells the compiler to emit a warning if function’s return value is ignored; when specified with a struct emits a warning wherever the struct is returned and ignored.
  • [[fallthrough]] – suppresses compiler warning about a switch-case statement without a break; in other words, a case statement that falls through into another case.
  • [[no_unique_address]] – tells the compiler to perform empty base optimization on marked data member.
  • [[deprecated]] – emits a warning when marked function, struct, namespace, or variable is used.
  • [[noreturn]] – tells the compiler that the marked function never returns; emits a warning if it does.
  • [[maybe_unused]] – suppresses a warning when marked variable, function, or argument is not used.
  • [[likely]] – tells the compiler this is the most likely path of execution; allows for optimizations.

Note: not all are supported on every compiler; I tested on LLVM 8.0.0 and GCC 8.2. Luckily the unsupported ones do not cause a compile error 🙂
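A few of the attributes from the list, in a sketch that compiles as C++17 (the function names are made up for illustration, and the commented-out lines are the ones that would trigger warnings):

```cpp
#include <cassert>

[[nodiscard]] int compute() { return 42; }

[[deprecated("use compute() instead")]] int old_compute() { return 42; }

int classify(int n)
{
    int score = 0;
    switch (n)
    {
        case 0:
            score += 1;
            [[fallthrough]]; // intentional: case 0 also does the work of case 1
        case 1:
            score += 10;
            break;
        default:
            break;
    }
    return score;
}

void example([[maybe_unused]] int unused_arg)
{
    [[maybe_unused]] int unused_local = 0;

    // compute();     // would warn: result of a [[nodiscard]] function ignored
    // old_compute(); // would warn: function is deprecated

    assert(compute() == 42);
    assert(classify(0) == 11);
    assert(classify(1) == 10);
}
```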

Below is example code and a screenshot of the compiler messages.

attributes.cpp:


Compiler warnings; LLVM 8.0.0.

Initialization list exceptions and raw pointers

What to do when an exception is thrown on the initialization list when allocating memory for a raw pointer? The situation is easy if your class only has one raw pointer member, but it gets complicated with two or more. Here’s a code example that’s guaranteed to leak memory if the second new int throws an exception (because the destructor will not be called):

There is no way to free the memory allocated to p1 if p2(new int) throws! Let’s build on my previous example and see what happens if we use a function try-catch block on the constructor:

Still no good! Because accessing p1 and p2 in the catch block leads to undefined behavior. See here.

The only way to guarantee correct behavior is to use smart pointers. This works because 1) the initialization list allocates in pre-defined order (the order of member declaration) and 2) the destructors of already created members will be called. Here’s the correct way of allocating multiple pointers:

This is guaranteed to do proper cleanup if the second make_unique<int>() throws std::bad_alloc 🙂
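The cleanup can be observed with a sketch like the one below. Everything in it is hypothetical: Counted counts live instances, and failing_alloc simulates the second allocation failing by throwing std::runtime_error instead of std::bad_alloc.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct Counted // hypothetical type that counts its live instances
{
    static int live;
    Counted() { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

// Stand-in for an allocation that fails.
std::unique_ptr<Counted> failing_alloc()
{
    throw std::runtime_error("simulated allocation failure");
}

struct Holder
{
    Holder() : p1(std::make_unique<Counted>()), p2(failing_alloc()) {}

    std::unique_ptr<Counted> p1; // declared first, so initialized first
    std::unique_ptr<Counted> p2;
};

// Returns true if nothing leaked after the constructor threw.
bool construction_cleans_up()
{
    try
    {
        Holder h;
    }
    catch (const std::runtime_error&)
    {
        return Counted::live == 0; // p1 was already destroyed during unwinding
    }
    return false;
}
```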

Complete listing (bad_pointer.cpp):

Function try-catch blocks

Syntactic sugar or a useful feature? More than just sweet sweet sugar baby! This little known feature is a nice way of wrapping an entire function in a try catch block. So instead of writing this:

You can write this:

The meaning of the two functions is identical. Notice here I’m swallowing the exception instead of propagating it out. I could call throw to re-throw it, or in both cases I could throw a different exception inside the catch block.

The caveat is with function try-catch blocks around constructors: they have to re-throw the same or different exception. If you don’t re-throw explicitly the compiler will do it for you. It is also useful for catching exceptions emitted during the initialization of member variables, and throwing something else (or re-throwing the same). Like this:
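Here is a condensed sketch of both cases. The names P, Q and eat_it and the exception messages follow the program output shown further down; build_q is a hypothetical helper added so the result can be inspected, and the bodies are assumptions rather than the try_block.cpp listing.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct P
{
    P() { throw std::logic_error("Logic error from P::P()"); }
};

struct Q
{
    // Function try block around the constructor, including member initialization.
    Q() try : p_()
    {
    }
    catch (const std::logic_error&)
    {
        // Reaching the end of this handler would re-throw the logic_error
        // automatically; throwing a different exception is allowed instead.
        throw std::runtime_error("Runtime error from Q::Q()");
    }

    P p_;
};

// On an ordinary function the handler may swallow the exception.
int eat_it() try
{
    throw std::runtime_error("System error from eat_it()");
}
catch (const std::exception&)
{
    return -1;
}

// Hypothetical helper: reports which exception escapes Q's constructor.
std::string build_q()
{
    try
    {
        Q q;
    }
    catch (const std::runtime_error& e)
    {
        return e.what();
    }
    return "no exception";
}
```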

Complete listing (try_block.cpp):

Swallowing: System error from eat_it()
Swallowing: System error from eat_it_sugar()
Inside Q::Q() caught: Logic error from P::P()
Inside main() caught: Runtime error from Q::Q()

Program output.

The #1 rule of cryptography

The #1 rule of cryptography: Don’t invent your own!

OK wiseman, now what? You want to add crypto to your program but you don’t want to code it all yourself. I’ll show you three libraries that make it possible. The choice will be yours as to which one to use.

For this example I wanted to write a simple function that accepts a std::string message and returns hex encoded SHA-1 hash. I picked the following libraries: Crypto++, WolfSSL, and Botan. All three made it pretty easy, and I don’t want to get into the business of picking winners and losers, but… Botan made it a breeze and I think it will be my choice going forward 🙂

crypto.cpp:

Message: Vorbrodt’s C++ Blog @ https://vorbrodt.blog
Digest : 24BCAC1359AA8B773D38D6A05B22BB43DAB5B8E5

Message: Vorbrodt’s C++ Blog @ https://vorbrodt.blog
Digest : 24BCAC1359AA8B773D38D6A05B22BB43DAB5B8E5

Message: Vorbrodt’s C++ Blog @ https://vorbrodt.blog
Digest : 24BCAC1359AA8B773D38D6A05B22BB43DAB5B8E5

Program output.

{fmt}

I found this cool little text formatting library with a very clean interface and wanted to share it with you. I decided the best way to introduce it to you is not through an extensive tutorial but rather code which illustrates how to use it; so I wrote a program which does the same thing in twelve different ways using this library… plus a few extra examples of text coloring, formatting, and alignment. Take a look at the code and the program output and it will all make sense.

fmt.cpp:

The answer is 42
The answer is 42
The answer is 42
The answer is 42
The answer is 42
The answer is 42
The answer is 42
The answer is 42
The answer is 42.00
The answer is 42.00
The answer is 42.00
The answer is 42.00
The text is bold
The color is red and green
The date and time is 2019-03-31 09:03:45
left aligned------------------
-----------------right aligned
-----------centered-----------

Program output.
Linux screenshot.

SSO of std::string

What is short/small string optimization? It’s a way to squeeze some bytes into a std::string object without actually allocating them on the heap. It’s hackery involving C++ unions and clever space management. Say sizeof(std::string) is, oh I don’t know, 24 bytes on Mac’s LLVM? The implementation manages to squeeze 22 characters into that (not including the terminating null character) before having to allocate on the heap. Impressive. Less impressive is GCC’s implementation on Linux, with sizeof(std::string) being 32 bytes but only 15 characters stored in place before going to the heap. I used to have this number for Visual Studio’s implementation but… see the rant above 😛 The capacity of an empty string is the giveaway for how much you can fit in it before going to the heap 😉

Check it out yourself on your favorite compiler with the code below!
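Besides looking at capacity(), one can also check whether the characters live inside the string object itself. The helper below is a hypothetical sketch of that check (it is not part of sso.cpp) and relies on std::less to compare pointers that may belong to different objects:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: true when the character buffer
// is located inside the std::string object itself.
bool stored_inline(const std::string& s)
{
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(std::string);
    std::less<const char*> before; // well-defined order even for unrelated pointers
    return !before(s.data(), begin) && before(s.data(), end);
}

void example()
{
    // A string this long cannot fit into the object, so it must be on the heap.
    std::string big(1000, 'x');
    assert(!stored_inline(big));
}
```

On implementations with small string optimization, stored_inline(std::string()) is expected to be true, and it flips to false as soon as the string grows past the capacity reported for an empty string.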

sso.cpp:

sizeof  : 24
Capacity: 22
Small   : 22
Big     : 31

Program output (LLVM on Mac).

HTTP queries

Today I want to show you how to use cURLpp (C++ wrapper around libcURL) to make a simple HTTP query to ip-api.com in order to retrieve geolocation information of a given host or IP address. I chose cURLpp because it’s simple and easy to use; the example program would not have been any harder using libcURL C API but this is a C++ blog after all 🙂 I will be using Boost Property Tree library to deserialize the JSON geo-ip data. All of that is achieved in 15, give or take, lines of actual code… that’s the power of simple and well designed C++ libraries!

The program starts off by setting up a RAII object of type curlpp::Cleanup which initializes and cleans up cURLpp library. We then create a request object of type curlpp::Easy and an output std::stringstream where the received data will be placed. Next we set up some options like verbosity level, URL, port, the output stream, and we execute the query. Finally we parse the JSON data using read_json and iterate over the ptree structure to print it to the console.

geoip.cpp:

*   Trying 69.195.146.130…
* TCP_NODELAY set
* Connected to ip-api.com (69.195.146.130) port 80 (#0)
> GET /json/vorbrodt.blog HTTP/1.1
Host: ip-api.com
Accept: */*

< HTTP/1.1 200 OK
< Access-Control-Allow-Origin: *
< Content-Type: application/json; charset=utf-8
< Date: Fri, 29 Mar 2019 23:17:53 GMT
< Content-Length: 284

* Connection #0 to host ip-api.com left intact


as = AS46606 Unified Layer
city = Provo
country = United States
countryCode = US
isp = Unified Layer
lat = 40.2067
lon = -111.643
org = Unified Layer
query = 162.241.253.105
region = UT
regionName = Utah
status = success
timezone = America/Denver
zip = 84606

Program output.

C-style callbacks and lambda functions

You can use a non-capturing lambda function with C-style APIs that expect a function pointer. As long as the signatures of the callback and the lambda match, the lambda will be cast to a function pointer (or you could define a “positive lambda”, one with a + in front of it; this causes automatic conversion to a function pointer). This works because the compiler converts non-capturing lambdas to actual functions and stores them inside the compiled binary. Effectively a pointer to locally defined lambda is valid for the life of the program.

In the program below I define a callback with the following signature: typedef void(*FuncPtr)(int arg) and two C-style functions that use it: void set_callback(FuncPtr fp) and void fire_callback(int arg). I then call set_callback with a positive lambda. The program works 🙂
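A minimal sketch along those lines. The FuncPtr, set_callback and fire_callback signatures come from the description above; the global variables and function bodies are assumptions, not the c_api_lambda.cpp listing.

```cpp
#include <cassert>

typedef void(*FuncPtr)(int arg);

FuncPtr g_callback = nullptr; // hypothetical storage for the registered callback
int g_last = 0;               // hypothetical: last value seen by a callback

void set_callback(FuncPtr fp) { g_callback = fp; }

void fire_callback(int arg)
{
    if (g_callback)
        g_callback(arg);
}

void example()
{
    // "Positive lambda": the unary + forces the conversion to a function pointer.
    set_callback(+[](int arg) { g_last = arg; });
    fire_callback(42);
    assert(g_last == 42);

    // Without the +, the implicit conversion kicks in because nothing is captured.
    set_callback([](int arg) { g_last = arg * 2; });
    fire_callback(5);
    assert(g_last == 10);
}
```

A lambda that captures anything, for example [&] or [x], has no such conversion, so passing it to set_callback would fail to compile.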

c_api_lambda.cpp:

42

Program output.