Prevent low-likelihood hang on shutdown (RIPD-1392):

Calling OverlayImpl::list_[].second->stop() may cause list_ to be
modified (OverlayImpl::remove() may be called on this same thread).
So iterating directly over OverlayImpl::list_ to call
OverlayImpl::list_[].second->stop() could give undefined behavior.
On MacOS that undefined behavior exhibited as a hang.

Therefore we copy all of the weak/shared ptrs out of
OverlayImpl::list_ before we start calling stop() on them.  That
guarantees OverlayImpl::remove() won't be called until
OverlayImpl::stop() completes.
This commit is contained in:
Scott Schurr
2017-02-17 13:40:41 -08:00
parent 1d482eeecb
commit 9ff9fa0aea

View File

@@ -970,17 +970,32 @@ OverlayImpl::remove (Child& child)
void
OverlayImpl::stop()
{
std::lock_guard<decltype(mutex_)> lock(mutex_);
if (work_)
// Calling list_[].second->stop() may cause list_ to be modified
// (OverlayImpl::remove() may be called on this same thread). So
// iterating directly over list_ to call child->stop() could lead to
// undefined behavior.
//
// Therefore we copy all of the weak/shared ptrs out of list_ before we
// start calling stop() on them. That guarantees OverlayImpl::remove()
// won't be called until vector<> children leaves scope.
std::vector<std::shared_ptr<Child>> children;
{
std::lock_guard<decltype(mutex_)> lock(mutex_);
if (!work_)
return;
work_ = boost::none;
for (auto& _ : list_)
children.reserve (list_.size());
for (auto const& element : list_)
{
auto const child = _.second.lock();
// Happens when the child is about to be destroyed
if (child != nullptr)
child->stop();
children.emplace_back (element.second.lock());
}
} // lock released
for (auto const& child : children)
{
if (child != nullptr)
child->stop();
}
}