Understanding the "std::variant"/"std::visit" Pattern Matching Like Construct in C++
Modern C++ features coming together
2023-05-07 • 8 min read
- Background
- Lambda Is Really A Class with
operator()
- List Initialization
- Template Parameter Pack
- Template Deduction Guide
- Aggregate Initialization
- Putting Everything Together: Function Overloading
- Summary
Background
I have recently been using
std::variant
for some
abstraction that I am working on. As someone who picked up modern C++ in the
past 1.5 year, I am glad there is a vocabulary sum type in the C++ standard
library. And the standard library provides a handy function std::visit
allowing us to perform operation on the variant. However, there is a very
interesting example listed in the std::visit
:
// the variant to visit
using var_t = std::variant<int, long, double, std::string>;
// helper type for the visitor #4
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
// explicit deduction guide (not needed as of C++20)
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
int main() {
std::vector<var_t> vec = {10, 15l, 1.5, "hello"};
for (auto& v: vec) {
// 4. another type-matching visitor: a class with 3 overloaded operator()'s
// Note: The `(auto arg)` template operator() will bind to `int` and `long`
// in this case, but in its absence the `(double arg)` operator()
// *will also* bind to `int` and `long` because both are implicitly
// convertible to double. When using this form, care has to be taken
// that implicit conversions are handled correctly.
std::visit(overloaded {
[](auto arg) { std::cout << arg << ' '; },
[](double arg) { std::cout << std::fixed << arg << ' '; },
[](const std::string& arg) { std::cout << std::quoted(arg) << ' '; }
}, v);
}
}
First, let’s take a look at the syntax of the callsite:
std::visit(overloaded {
[](auto arg) { std::cout << arg << ' '; },
[](double arg) { std::cout << std::fixed << arg << ' '; },
[](const std::string& arg) { std::cout << std::quoted(arg) << ' '; }
}, v);
This reminds me of pattern matching in some other languages. If we were to express the same thing in OCaml:
match v with
| String arg -> print_string arg
| Double arg -> print_float arg
| _ arg -> (** some how polymorphically print this fallback, but I don't know how to *)
The pattern matching like construct is not the same as pattern matching in other languages. For one, we cannot match against the structure of the object, but only the type of the object. However, this is not the focus of this article.
I want to understand, from a language feature perspective, how C++ enabled this pattern matching like pattern. It amazes me how elegant the construct is written out. It is easy to write since it is not very verbose. It is also very easy to read since each lambda corresponds to one case. This pattern make a lot of sense to me yet at the same time, I fail to understand how the pattern gets put together. So in this article, we will break down the pattern piece by piece and understand all the C++ language features that combined to give such beautiful pattern.
Lambda Is Really A Class with operator()
First, let’s understand what a lambda really is. In C++, there are 2 types of
entity that are callable: normal functions, and any object with operator()
defined on it (also known as function objects or functor). Lambda is actually
just an object with operator()
that the compiler creates for us.
auto lambda = [](int i) { return i * i };
// will get translated to something like
struct __some_compiler_generated_name
{
int operator()(int i) const
{
return i * i;
}
};
__some_compiler_generated_name lambda = __some_compiler_generated_name{};
So it is simply a syntactic sugar for a quickly making a function object. The important fact to note here is that lambda is just an object and therefore a value.
List Initialization
One of the part of this pattern that trips me is the syntax of overloaded
with the lambdas:
overloaded {
[](auto arg) { std::cout << arg << ' '; },
[](double arg) { std::cout << std::fixed << arg << ' '; },
[](const std::string& arg) { std::cout << std::quoted(arg) << ' '; }
}
What is this syntax with a type name bracket with a list of lambdas? To understand this, we first look at something simpler:
std::vector{1, 2, 3}
This initializes a vector using the list initialization syntax, meaning these values are passed to the constructor that accepts these arguments and the object is constructed that way.
Lambdas are objects after all, so we can pass them as values in list
initialization. However, we never defined a constructor for the overloaded
class. What constructor is it calling?
// calling some weird constructor that takes 3 lambdas?
overloaded {lambda1, lambda2, lambda3}
Template Parameter Pack
Now we analyze how the overloaded class is defined:
// helper type for the visitor #4
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
// explicit deduction guide (not needed as of C++20)
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
Wow, this is pretty complicated. Let’s first analyze the ellipsis:
template <class... Ts>
The ellipsis in the template parameter is the template parameter pack. This is also known as variadic template parameter. It allows us to pass in an arbitrary number of template argument when using this class, meaning that these are all valid:
// assuming class A, B, C are defined:
using overloaded_A = overloaded<A>;
using overloaded_A_B = overloaded<A, B>;
using overloaded_A_B_C = overloaded<A, B, C>;
And how does the template argument get used in overloaded
?
template<class... Ts>
struct overloaded : Ts...
{
using Ts::operator()...;
};
They are used as the base class of the overloaded
class, meaning that
overloaded will inherit from the parameter types. They are also used to
reference the operator()
from the parameter types. So all the operator()
will get introduced into the overloaded
class.
Here is how overloaded<A, B, C>
gets expanded:
struct overloaded : public A, public B, public C
{
using A::operator(), B::operator(), C::operator();
}
Template Deduction Guide
Now, let’s look at second part of the overloaded
class.
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
This is known as a template deduction guide. It is meant to help the compiler derive template argument from the constructor call. This template deduction guide is telling the compiler that for each of the argument being passed in the constructor, use their types as the template arguments. For example,
overloaded overloaded_A_B_C_obj { A(), B(), C() };
The overloaded
template class will receive the concrete template arguments
because of the deduction guide:
overloaded<A, B, C> overloaded_A_B_C_obj { A(), B(), C() };
Aggregate Initialization
Now we know fully how the class will be defined. But we still did not
define a constructor. So how does the list initialization syntax works
for the overloaded
class?
When an class is defined without any constructor, the class is known as an aggregate in C++. C++ will allow us to list-initialize an aggregate’s data members. And this special list-initialization is called aggregate initialization.
Here is a simple example:
struct Person
{
std::string name;
int age;
};
Person p { "Galen", 23 };
std::cout << p.name << " " << p.age; // outputs Galen 23
But our overloaded class has no data members. Turns out, C++ will allow us to list-initialize the aggregate’s base class with list-initialization as well:
struct A {};
struct B {};
struct C : public A, public B {};
C c_obj {A(), B()};
Therefore, our initialization implicitly has the initializer list constructor that accepts the values to copy/move initialize in the base class:
overloaded<A, B, C> overloaded_A_B_C_obj {
A(), // copy or move initialize the A base class
B(), // copy or move initialize the B base class
C() // copy or move initialize the C base class
};
Putting Everything Together: Function Overloading
Now, we have everything we need to piece together how this pattern works:
overloaded {
[](auto arg) { std::cout << arg << ' '; },
[](double arg) { std::cout << std::fixed << arg << ' '; },
[](const std::string& arg) { std::cout << std::quoted(arg) << ' '; }
}
Expanding the lambda to function objects:
struct __lambda_1 {
void operator()(auto arg) const { std::cout << arg << ' '; }
};
struct __lambda_2 {
void operator()(double arg) const { std::cout << std::fixed << arg << ' ' ; }
};
struct __lambda_3 {
void operator()(const std::string& arg) const { std::cout << std::quoted(arg) << ' '; }
};
overloaded { __lambda_1(), __lambda_2(), __lambda3() }
Then, using the template deduction guide we deduce the exact concrete template implemention:
struct overloaded : public __lambda_1, public __lambda_2, public __lambda_3
{
using __lambda_1::operator(), __lambda_2::operator(), __lambda_3::operator()
};
which is effectively equivalent to:
struct overloaded_function_class
{
void operator()(auto arg) const { std::cout << arg << ' '; }
void operator()(double arg) const { std::cout << std::fixed << arg << ' ' ; }
void operator()(const std::string& arg) const { std::cout << std::quoted(arg) << ' '; }
};
Therefore,
overloaded {
[](auto arg) { std::cout << arg << ' '; },
[](double arg) { std::cout << std::fixed << arg << ' '; },
[](const std::string& arg) { std::cout << std::quoted(arg) << ' '; }
}
// is behaviorally equivalent to
overloaded_function_class obj;
Then, whenever our obj is called, the implementation will be selected based on function overloading resolution. For example:
overloaded_function_class overloaded_functions;
double d = 2.4;
overloaded_functions(2.4); // this will call the double overload
std::string s = "hello world!";
overloaded_functions(s); // this will call the string overload
And in the context of std::visit
, std::visit
will extract the underlying
type stored inside the std::variant
and call the function passed to it with
the underlying object. And the correct implementation will be called using
overload resolution.
Summary
Isn’t this so cool? All these C++ features coming together to enable this cute pattern of overloaded lambdas to emulate the pattern matching syntax.
Hope this article is helpful to you, and you are able to learn something new through my analysis of this pattern. Here are all the C++ features we talked about today:
- Lambda
- List initialization and aggregate initialization
- Template parameter pack
- Template argument deduction guide
- Function overloading