Brief bookclub talk I gave at work introducing the visitor design pattern and multiple-dispatch in C++.
Chapter 18 from Hands-On Design Patterns with C++ by Fedor G. Pikus.
Source code:
visitor.cpp
Brief bookclub talk I gave at work introducing the adapter design pattern in C++.
Chapter 17 from Hands-On Design Patterns with C++ by Fedor G. Pikus.
Source code:
adapter.cpp
Brief bookclub talk I gave at work introducing the decorator design pattern in C++.
Chapter 17 from Hands-On Design Patterns with C++ by Fedor G. Pikus.
Source code:
decorator.cpp
Brief bookclub talk I gave at work introducing policy-based design in C++.
Chapter 16 from Hands-On Design Patterns with C++ by Fedor G. Pikus.
Brief bookclub talk I gave at work introducing the singleton design pattern in C++.
Chapter 15 from Hands-On Design Patterns with C++ by Fedor G. Pikus.
Source code:
singleton.hpp | singleton.cpp
Last week I explained what a memory pool is and how to use it. This week my lesson was about implementing a simple memory pool capable of fixed-size memory chunk allocations. The pool preallocates a block of memory on demand, carves out equal-sized chunks from it, and returns a raw pointer each time you call pool.malloc().
Memory chunks are described using a simple struct chunk_t and stored in a singly-linked list; the struct has an internal anonymous union for size optimization: the same memory region can either represent the list node's next pointer or hold user data.
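The layout might be sketched like this (my own minimal reconstruction of the idea; chunk_t follows the post, while pool_t and the member names are my assumptions, not the exact code from the class):

```cpp
#include <cstddef>

// Free-list node: while a chunk is free, its bytes store the `next`
// pointer; once handed out by malloc() the same bytes hold user data.
struct chunk_t
{
    union
    {
        chunk_t*         next; // valid only while the chunk is on the free list
        std::max_align_t data; // placeholder for the user's fixed-size payload
    };
};

class pool_t
{
    chunk_t* m_free = nullptr; // head of the singly-linked free list
public:
    // Carve a preallocated block of `n` chunks into the free list.
    void add_block(chunk_t* chunks, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i) free(&chunks[i]);
    }

    void* malloc()
    {
        if (!m_free) return nullptr; // a real pool would grab a new block here
        chunk_t* c = m_free;
        m_free = c->next;            // pop the head of the free list
        return c;                    // these bytes now belong to the caller
    }

    void free(void* p)
    {
        chunk_t* c = static_cast<chunk_t*>(p);
        c->next = m_free;            // reuse the user's bytes as a list node
        m_free = c;
    }
};
```

Because the pointer and the payload share storage, a free chunk costs no bookkeeping memory beyond the block itself.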
P.S. Watch in HD, though I did increase the font size per your requests 🙂
Code from the class:
lesson_memory_pool_how_to.cpp
Cleaned up implementation:
memory_pool.hpp | memory_pool.cpp
In this week’s class I explained what memory pools are and how to use them to optimize frequent allocations and deallocations of objects. I also demonstrated how to overload operator new / operator new[] and operator delete / operator delete[] for a struct / class type.
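The mechanism looks roughly like this (a sketch with a hypothetical widget type; it forwards to malloc/free where a pool allocator would plug in, and is not the code from the class):

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

struct widget // hypothetical type; any struct/class works the same way
{
    int value = 0;

    // Single-object forms: used by `new widget` / `delete p`.
    static void* operator new(std::size_t size)
    {
        void* p = std::malloc(size); // a pool would hand out a chunk here
        if (!p) throw std::bad_alloc{};
        return p;
    }
    static void operator delete(void* p) noexcept
    {
        std::free(p);                // and a pool would take the chunk back
    }

    // Array forms: used by `new widget[n]` / `delete[] p`.
    static void* operator new[](std::size_t size)
    {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc{};
        return p;
    }
    static void operator delete[](void* p) noexcept
    {
        std::free(p);
    }
};
```

Because the operators are static members of the type, only widget allocations go through them; everything else still uses the global allocator.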
The code for this class contains a simple benchmark which measures the pool’s performance against the general-purpose allocator in the C++ runtime. Notice how the multi-threaded benchmark causes the pool to perform poorly (I suspect due to lock contention, though I have not yet investigated this). I didn’t even bother measuring array allocations; the memory pool benchmark causes massive fragmentation of the free chunks (by allocating all available chunks, shuffling them randomly, then freeing them all), making contiguous allocations virtually impossible without falling back onto the runtime allocator.
Next week I plan on actually implementing a simple memory pool and then again measuring its performance against the runtime. Enjoy the video!
P.S. Remember to pick HD version of the video. It is also best to watch in full screen mode due to the font size used. I will try to remember to increase it during my next class. Apologies to anyone finding it hard to read the code.
Code from the class:
lesson_pool_allocator.cpp | lesson_pool_allocator_benchmark.cpp
During last week’s class I discussed a source-code method of finding memory leaks in C++. What do I mean by that? A source-code method is not a way of debugging a process like one could with Valgrind on Linux or Deleaker on Windows (which I’ve used recently and really enjoyed for its integration with Visual Studio). What I am referring to is a combination of #define macro magic and replacing the global void* operator new: a technique to keep track of all your new / new[] and delete / delete[] statements and find A) leaks due to missing delete statements and B) mismatches between the array and non-array versions of said operators.
Simply by including a header file in your source code you enable this tracing capability. It has one limitation though: unlike Deleaker or Valgrind, it only works for plain vanilla new / new[] expressions; my solution does not support any other form of them.
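The core of the trick looks roughly like this (a minimal sketch of the idea, not the actual newtrace.hpp; here it only prints each call site instead of keeping a table of live allocations):

```cpp
#include <cstdio>
#include <new>

// Placement overload that records the call site, then forwards to the
// regular allocator. A real tracer would store the pointer in a table
// keyed by file/line, and report any entries still present at exit.
void* operator new(std::size_t size, const char* file, int line)
{
    std::printf("new of %zu bytes at %s:%d\n", size, file, line);
    return ::operator new(size);
}

// Matching placement-delete; only called if a constructor throws.
void operator delete(void* p, const char*, int) noexcept
{
    ::operator delete(p);
}

// The macro magic: after this line, every plain `new` in code that
// includes the header becomes `new (__FILE__, __LINE__)`, routing the
// allocation through the tracing operator above. `delete` is unaffected.
#define new new (__FILE__, __LINE__)
```

This also shows why only vanilla new works: explicit placement-new expressions in user code would collide with the macro expansion.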
Code from the class:
newtrace.hpp (C++17 or earlier) | newtrace.cpp | README.txt
Using boost::stacktrace:
newtrace.st.hpp | newtrace.st.cpp
During my most recent class I discussed implementations of simple, multi-threaded, producer-consumer queues: unbounded (can grow in size indefinitely) and bounded (can only grow to a certain size). In short: the unbounded queue will block consumers’ pop() calls if it happens to be empty, until a producer pushes something onto it. The bounded queue will do the same, and in addition it will block producers’ push() calls if the queue fills up to its maximum capacity, unblocking only after a consumer frees up a spot with pop(). Finally, I reworked the simple thread pool from last week’s class to use said queues, greatly reducing its complexity.
In the past I have implemented both types of queues (you can find them in my archives), but the bounded one used a less-than-optimal approach: semaphores for counting free and occupied spots. This was costly because it required three std::mutex and two std::condition_variable objects. The one I implemented during this class requires only a single std::mutex and two std::condition_variable objects; the unbounded queue requires only one of each.
Code from the class:
lesson_queue_how_to.cpp
By popular demand, and at the request of my coworkers, this week’s class was about implementing a simple thread pool using only standard C++ components. It would have taken 1h-30m if I hadn’t completely forgotten how to use a condition variable; it’s 2h long! A bad std::condition_variable::wait(...) predicate caused a deadlock, and I spent 30 minutes trying to figure it out. If you’re interested in the step-by-step implementation only, fast forward from 1h-15m-15s to 1h-45m-0s; otherwise watch me reason through it. I eventually realized that the predicate should have returned true if the queue contained any elements; instead it tried to answer whether 1 element was added to it.
For me the lesson was…
If you want to master something, teach it.
Richard Feynman
Code from the class:
lesson_thread_pool_how_to.cpp
During my latest mentoring session with coworkers I was asked to explain how modern RPC frameworks work under the hood. What better way to illustrate this than to build one from scratch using components I blogged about previously (event objects, serializer framework, and TCP sockets).
Although not part of the video recording (which I will share later in this post), I explained that RPC frameworks generally work in one of two ways.
I did not design an IDL nor did I build a parser and code-generator for my students; there wasn’t nearly enough time for that. What I wanted to illustrate were different components which would have been generated by a modern RPC framework and how they would fit together. Starting with the client interface and code created to connect and issue calls to the server, serialization of parameters, network transport, command parsing and dispatching on the server side, actual command logic (this part would not be generated by a framework), and finally everything needed to send responses back to the client.
The starting point of creating a simple calculator server capable of adding and subtracting numbers would have been the server’s interface declaration using some form of an IDL:
interface Calculator
{
    int Add(int, int);
    int Sub(int, int);
}
This would then be processed by the framework’s generator to produce actual C++ code (or any other language) that would allow the programmer to easily connect, issue calls, and receive responses from the server:
auto c = RPCClient(server_host_name, server_port);
auto a = c.Add(2, 2);
auto s = c.Sub(10, 5);
Creating instances of the RPC server would be equally easy, but would also require providing the core logic of the server procedures:
auto s = RPCServer(listening_port);
s.SetAddHandler([](int x, int y) { return x + y; });
s.SetSubHandler([](int x, int y) { return x - y; });
s.Run();
In order for this to work, a lot more code would have been generated and then compiled against the static parts of the RPC framework: serialization, network transport, compression, encryption, authentication code, etc.
During my class I implemented the parts that would have been generated based on the desired set of procedure calls (the IDL), and reused code I developed for this blog in the past in place of the framework’s static parts.
Keep in mind that I had prepared some of the supporting code before recording the class and had it open on another screen, but the IDL-based code I was coming up with on the fly, aka freestyling it 🙂
I figured it’s worth a mention in case my delivery appears less than… perfect.
Reworked and cleaned up code from the class:
Protocol: lrpc_proto.hpp. Client: lrpc_c.cpp. Server: lrpc_s.cpp.
Serializer: lsf.hpp. Network: socket.hpp. Event: event.hpp.
In my last post I described a light-weight serialization framework I recently implemented (video where I go over it in detail). After receiving great feedback from r/cpp I implemented several improvements that were suggested to me: a better unpack transform signature, safer unpacking code, and an unpack casting iterator, just to name a few.
The latest code is available on GitHub; the serializer source lsf.hpp, and usage example lsf.cpp.
Video where I describe the changes in greater detail:
UPDATE: Better serializer
In my very first YouTube Channel video I discuss in detail the design of a serialization framework I came up with in response to a question I was recently asked. In this post I want to give a brief overview of how it works and how to get started using it.
The primary goal was to create a serialization framework that is uncoupled from the types it operates on, meaning no class or struct needs to inherit any sort of serialization interface. Instead, per-type pack and unpack transforms are registered with the framework; for example, 16-bit integers can be converted to network byte order on the way in and out:
add_pack_transform<std::uint16_t>(
    [](std::uint16_t v, buffer_output_t& it)
    { pack_value(it, htons(v)); });

add_unpack_transform<std::uint16_t>(
    [](buffer_input_t& it, std::uint16_t* v)
    { *v = ntohs(unpack_value<std::uint16_t>(it)); });
By default every type is packed by doing a bitwise copy from its memory location into a buffer of bytes:

auto buf_1 = pack(std::uint16_t(0x1234));
auto tup_1 = unpack<std::uint16_t>(buf_1);
The code above packs a single std::uint16_t value into a buffer, then unpacks it back out into a tuple.
It is possible to perform type conversions inside the pack and unpack transforms:
add_pack_transform<std::size_t>([](std::size_t v, buffer_output_t& it)
{ pack_value(it, std::uint16_t(v)); });

add_unpack_transform<std::size_t>([](buffer_input_t& it, std::size_t* v)
{ *v = unpack_value<std::uint16_t>(it); });
The code above illustrates it: every std::size_t value is packed as a std::uint16_t and widened back to std::size_t during unpacking. The same mechanism handles variable-length types:
add_pack_transform<const char*>(
    [](const char* v, buffer_output_t& it)
    {
        auto len = std::strlen(v);
        pack_type(it, len);
        pack_bytes(it, v, len);
    });

add_pack_transform<std::string>(
    [](const std::string& v, buffer_output_t& it)
    {
        pack_type(it, v.length());
        pack_bytes(it, v.data(), v.length());
    });

add_unpack_transform<std::string>(
    [](buffer_input_t& it, std::string* v)
    {
        auto len = unpack_type<std::string::size_type>(it);
        new (v) std::string(len, std::string::value_type());
        unpack_bytes(it, v->data(), len);
    });
Here I define how to serialize C-style strings and std::string objects: the length is packed first, followed by the raw characters.
Having defined transforms for simple types, we can now add more complex ones, let’s say a vector of strings:
using strings_t = std::vector<std::string>;

add_pack_transform<strings_t>(
    [](const strings_t& vs, buffer_output_t& it)
    {
        pack_type(it, vs.size());
        std::for_each(std::begin(vs), std::end(vs),
            [&](const std::string& s) { pack_type(it, s); });
    });

add_unpack_transform<strings_t>(
    [](buffer_input_t& it, strings_t* vs)
    {
        auto size = unpack_type<strings_t::size_type>(it);
        new (vs) strings_t();
        vs->reserve(size);
        std::generate_n(std::back_inserter(*vs), size,
            [&]() { return unpack_type<std::string>(it); });
    });
The packing code is trivial: pack the size of the vector followed by packing each entry one at a time. Notice here I am calling pack_type and unpack_type recursively: the vector transforms reuse the std::string transforms defined earlier.
One thing I have not yet mentioned is an optimization I introduced into the code: before types are packed, the exact amount of memory is reserved inside the resulting buffer. To do this the framework needs to know each type’s packed size; the default pack-size procedure simply returns sizeof(T):

static auto k_default_pack_size_proc = pack_size_proc_t(
    [](const void*) { return sizeof(T); });
There is a void pointer parameter, which in this case is ignored. It is, however, required to compute the sizes of more complex types like strings: to store all pack-size procedures inside one internal data structure, each one is wrapped in a lambda with this default signature. That lambda receives the memory location of the value as a const void* and casts it back to the concrete type:
add_pack_size_proc<const char*>(
    [](const char* v)
    { return pack_size(std::strlen(v)) + std::strlen(v); });

add_pack_size_proc<std::string>(
    [](const std::string& v)
    { return pack_size(v.length()) + v.length(); });
Here I tell the framework how to compute the number of bytes needed to serialize strings: the packed size of the length field plus the characters themselves. The vector of strings builds on top of that:
add_pack_size_proc<strings_t>(
    [](const strings_t& v)
    {
        return pack_size(v.size()) + std::accumulate(
            std::begin(v), std::end(v), 0,
            [](auto sum, const std::string& v) { return sum + pack_size(v); });
    });
I initially tagged each type with a unique value so that I could verify it during unpacking, to catch errors like packing an int but trying to unpack a string. This type tagging became more complicated than I first anticipated, so I got rid of it, though I may revisit it and add additional packing and unpacking functions that perform this verification optionally.
As always, the code is available on my GitHub page.
The framework is a single header file: lsf.hpp. Example program: lsf.cpp.