Wednesday 14 August 2013

Capturing lvalue references in C++11 lambdas

Recently the question "what is the type of an lvalue reference when captured by reference in a C++11 lambda?" was asked. It turns out that it's a reference to whatever the original reference referred to. This is just like taking a reference to an existing reference, e.g.

int foo = 7;
int& rfoo = foo;
int& rfoo1 = rfoo;
int& rfoo2 = rfoo1;

All of the references refer to foo directly, rather than forming a chain rfoo2->rfoo1->rfoo->foo. Take the following code:

std::cout << "foo:" << foo << ", rfoo:" << rfoo 
          << ", rfoo1:" << rfoo1 << ", rfoo2:" << rfoo2 
          << '\n';
++foo;

std::cout << "foo:" << foo << ", rfoo:" << rfoo 
          << ", rfoo1:" << rfoo1 << ", rfoo2:" << rfoo2 
          << '\n';

std::cout << "&foo:" << &foo << ", &rfoo:" << &rfoo 
          << ", &rfoo1:" << &rfoo1 << ", &rfoo2:" << &rfoo2 
          << '\n';

Which gives:

foo:7, rfoo:7, rfoo1:7, rfoo2:7
foo:8, rfoo:8, rfoo1:8, rfoo2:8
&foo:00D3FB0C, &rfoo:00D3FB0C, &rfoo1:00D3FB0C, &rfoo2:00D3FB0C

I.e. all the references are aliases for the original foo. Hence the same value is displayed for each of them, including after the original is modified, and the address of each variable is the same: that of foo.

There is nothing surprising here; it's just basic C++. But it's a long time since I've thought about it, which is why, with lambdas, lvalue, rvalue and universal references, I sometimes do a double take on what was once obvious.

The same happens with lambda capture but it's a slightly more interesting story. Take the following example:

int foo = 99;
int& rfoo = foo;
int& rfoo1 = foo;

std::cout << "foo:" << foo << ", rfoo:" << rfoo 
          << ", rfoo1:" << rfoo1 
          << '\n';

std::cout << "&foo:" << &foo << ", &rfoo:" << &rfoo 
          << ", &rfoo1:" << &rfoo1 
          << '\n';

auto l = [foo, rfoo, &rfoo1]()
{
    std::cout << "foo:" << foo << '\n';
    std::cout << "rfoo:" << rfoo << '\n';
    std::cout << "rfoo1:" << rfoo1 << '\n';

    std::cout << "&foo:" << &foo << ", &rfoo:" 
              << &rfoo << ", &rfoo1:" << &rfoo1 
              << '\n';
};

foo = 100;

l();

Which gives:

foo:99, rfoo:99, rfoo1:99
&foo:00D3FB0C, &rfoo:00D3FB0C, &rfoo1:00D3FB0C
foo:99
rfoo:99
rfoo1:100
&foo:00D3FAE0, &rfoo:00D3FAE4, &rfoo1:00D3FB0C

To begin with it behaves as per the first example: foo, rfoo and rfoo1 all give the same value, as rfoo and rfoo1 are effectively aliases for foo. Displaying their addresses confirms this; they're all the same.

However, when these same variables are captured it's a different story. The capture of foo is no surprise: it is by value, so it displays the captured value of 99 despite the original foo being changed to 100 before the lambda is invoked. Its address is that of a new variable, a member of the lambda.

It starts to get interesting with the capture of rfoo. When the lambda is invoked this too displays 99, the original captured value. Also, its address is not that of the original foo. It seems that the reference itself has not been captured but rather what it refers to, in this case an int with the value of 99. It appears to have been magically dereferenced as part of the capture.

This is the correct behaviour and, when thought about, becomes somewhat obvious. It's just like initialising a variable from a reference, e.g.

int foo = 7;
int& rfoo = foo;
int bar = rfoo;

bar doesn't become an int&; rfoo is "magically" dereferenced, except that in this scenario there is nothing magical at all, it's as expected. If int were replaced with auto, e.g.

auto bar = rfoo;

then it would be expected that bar is an int, as auto strips off top-level CV and reference qualifiers.

Finally, there is rfoo1. This too looks odd, as it appears to be taking a reference to a reference. As seen in the first example this is perfectly fine. There is no such thing as a reference to a reference; every one of them ends up as an alias of the original variable.

This is pretty much what's happening here. It's irrelevant that the target of the capture is a reference. In the end the capture by reference is a capture by reference of the underlying variable, i.e. what rfoo1 refers to, in this case foo, not rfoo1 itself. This is demonstrated twofold: rfoo1 within the lambda displays the updated value of foo, and the address of rfoo1 within the lambda is that of foo outside it.

This is as per the standard, section 5.1.2 Lambda expressions, paragraph 14:

An entity is captured by copy if it is implicitly captured and the capture-default is = or if it is explicitly
captured with a capture that does not include an &. For each entity captured by copy, an unnamed nonstatic
data member is declared in the closure type. The declaration order of these members is unspecified.
The type of such a data member is the type of the corresponding captured entity if the entity is not a
reference to an object, or the referenced type otherwise. [ Note: If the captured entity is a reference to a
function, the corresponding data member is also a reference to a function. —end note ]

The sentence beginning "The type of such a data member" states that for a reference captured by value the type of the captured value is the type referred to, i.e. the reference aspect has been removed, the crucial part being "or the referenced type otherwise". (NOTE: I haven't experimented with references to functions).

Finally, a vivid example showing that a reference captured by value involves a dereference.

class Bar
{
private:
    int mValue;

public:
    Bar(const Bar&) : mValue(9999)
    {
    }

public:
    Bar(const int value) : mValue(value) {}
    int GetValue() const { return mValue; }
    void SetValue(const int value) { mValue = value; }
};

Bar bar(1);
Bar& rbar = bar;
Bar& rbar1 = bar;

std::cout << "&bar:" << &bar << ", &rbar:" << &rbar
          << ", &rbar1:" << &rbar1 << '\n';

auto l2 = [bar, rbar, &rbar1]()
{
    std::cout << "bar:" << bar.GetValue() << '\n';
    std::cout << "rbar:" << rbar.GetValue() << '\n';
    std::cout << "rbar1:" << rbar1.GetValue() << '\n';

    std::cout << "&bar:" << &bar << ", &rbar:" << &rbar
              << ", &rbar1:" << &rbar1 << '\n';
};

bar.SetValue(2);

l2();

The class Bar provides a crude copy constructor that sets the stored value to 9999. The following output is similar to that of the previous example: the addresses of bar and rbar in the lambda differ from that of the original bar, showing they're copies, whilst the address of rbar1 is the same. Secondly, the value of mValue stored within Bar is shown as 9999 for the first two captured variables, meaning they were copy-constructed.

&bar:00D3FB0C, &rbar:00D3FB0C, &rbar1:00D3FB0C
bar:9999
rbar:9999
rbar1:2
&bar:00D3FAE0, &rbar:00D3FAE4, &rbar1:00D3FB0C

Making the copy constructor private (by commenting out the seemingly unnecessary 'public:') prevents compilation:

1>------ Build started: Project: References, Configuration: Debug Win32 ------
1>  main.cpp
1>c:\users\pete\desktop\references\references\main.cpp(85): error C2248: 'Bar::Bar' : cannot access private member declared in class 'Bar'
1>          c:\users\pete\desktop\references\references\main.cpp(59) : see declaration of 'Bar::Bar'
1>          c:\users\pete\desktop\references\references\main.cpp(54) : see declaration of 'Bar'
1>          c:\users\pete\desktop\references\references\main.cpp(59) : see declaration of 'Bar::Bar'
1>          c:\users\pete\desktop\references\references\main.cpp(54) : see declaration of 'Bar'

Writing this post has clarified the situation for me; I hope it helps you as well.

The sample code is available here.
