MB-20351: Fix lock-order inversion in ~CheckpointManager
As identified by TSan. Seen whilst testing sherlock->watson merge,
analysed the code and it seems this is a latent issue and hard to
re-produce.
The issue is that when the executor thread does a reset on the current
task, the VBCBAdapator is the last one holding the ref-counted
vbucket, so destruction occurs and ~VBucket calls the destructor of
the checkpoint manager, which is the reverse locks ordering of a
previous code path.
Number of threads in play here, but main ones of interest:
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=170834)
Cycle in lock order graph: M_checkpoint (0x7d640002e9a8) => M_exepool (0x7d4c00008288) => M_taskqueue (0x7d4400008e80) => M_exethread (0x7d380000df60) => M_checkpoint
Mutex M_exepool acquired here while holding mutex M_checkpoint in thread T35:
#0 pthread_mutex_lock <null> (engine_testapp+0x000000486760)
#1 std::mutex::lock() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/x86_64-linux-gnu/c++/4.9/bits/gthr-default.h:748 (ep.so+0x0000000f47b0)
#2 ExecutorPool::wake(unsigned long) ep-engine/src/executorpool.cc:355 (ep.so+0x0000000f48f1)
#3 Flusher::wake() ep-engine/src/flusher.cc:155 (ep.so+0x000000101ee6)
#4 NotifyFlusherCB::callback(unsigned short&) ep-engine/src/flusher.h:88 (ep.so+0x00000010d194)
#5 Checkpoint::queueDirty(SingleThreadedRCPtr<Item> const&, CheckpointManager*) ep-engine/src/checkpoint.h:675 (ep.so+0x0000000271b0)
#6 CheckpointManager::closeOpenCheckpoint_UNLOCKED() ep-engine/src/checkpoint.cc:454 (ep.so+0x000000028dcb)
#7 CheckpointManager::addNewCheckpoint_UNLOCKED(unsigned long, unsigned long, unsigned long) ep-engine/src/checkpoint.cc:371 (ep.so+0x00000002881f)
#8 CheckpointManager::checkOpenCheckpoint_UNLOCKED(bool, bool) ep-engine/src/checkpoint.cc:361 (ep.so+0x00000002bd71)
#9 CheckpointVisitor::visitBucket(RCPtr<VBucket>&) ep-engine/src/checkpoint_remover.cc:43 (ep.so+0x00000003c3bd)
#10 VBCBAdaptor::run() ep-engine/src/ep.cc:3924 (ep.so+0x0000000a6174)
#11 ExecutorThread::run() ep-engine/src/executorthread.cc:115 (ep.so+0x0000000fe1b6)
#12 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000fdd15)
#13 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:54 (libplatform.so.0.1.0+0x0000000057fb)
...
Mutex M_checkpoint acquired here while holding mutex M_exethread in thread T36:
#0 pthread_mutex_lock <null> (engine_testapp+0x000000486760)
#1 CheckpointManager::~CheckpointManager() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/x86_64-linux-gnu/c++/4.9/bits/gthr-default.h:748 (ep.so+0x000000027fdd)
#2 VBucket::~VBucket() ep-engine/src/vbucket.cc:152 (ep.so+0x00000014a018)
#3 PagingVisitor::~PagingVisitor() ep-engine/src/atomic.h:190 (ep.so+0x00000010a5e6)
#4 PagingVisitor::~PagingVisitor() ep-engine/src/item_pager.cc:43 (ep.so+0x00000010a645)
#5 std::_Sp_counted_ptr<PagingVisitor*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/bits/shared_ptr_base.h:373 (ep.so+0x00000010a2b0)
#6 VBCBAdaptor::~VBCBAdaptor() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/bits/shared_ptr_base.h:149 (ep.so+0x0000000aea7e)
#7 ExecutorThread::run() ep-engine/src/atomic.h:325 (ep.so+0x0000000fdee4)
#8 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000fdd15)
#9 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:54 (libplatform.so.0.1.0+0x0000000057fb)
Change-Id: I0a966b3d112963243e17647184123fd8b3200656
Reviewed-on: http://review.couchbase.org/66357
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>