ep-engine.git
4 years ago MB-19636: Initialise failovers correctly from 2.5.x vbstate 56/64156/6
Jim Walker [Tue, 17 May 2016 16:41:10 +0000 (17:41 +0100)]
MB-19636: Initialise failovers correctly from 2.5.x vbstate

(Note: backport of MB-19635 to 3.0.x branch).

When loading a vb file, don't force the failover table data
to "[{\"id\":0,\"seq\":0}]" if the file doesn't contain
any such data.

Change-Id: I41673bf848fcbab9b616edec5c7fd2ab9a3ddd6b
Reviewed-on: http://review.couchbase.org/64156
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
4 years ago MB-19673: Log the actual last seqno sent before closing the stream. 57/64157/5
Manu Dhundi [Mon, 16 May 2016 23:33:09 +0000 (16:33 -0700)]
MB-19673: Log the actual last seqno sent before closing the stream.

(Note: backport of MB-19627 to 3.0.x)

When a DCP stream closes, we log the last sent seqno at the time the
stream transitions to the dead state. However, we continue to stream
items from the readyQ while in the dead state. This commit logs the
actual last seqno sent.
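
To illustrate the discrepancy, here is a minimal sketch rather than the ep-engine code. StreamModel, CloseLog, closeAndDrain and the seqno values are hypothetical names chosen for this example; the only assumption taken from the message above is that items left in the readyQ can still be sent after the stream is marked dead.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical model of a stream; not the real DCP stream class.
class StreamModel {
public:
    void queue(uint64_t seqno) { readyQ_.push_back(seqno); }

    // Mark the stream dead and return the last seqno sent so far, i.e. the
    // value that would be logged if we logged at the transition.
    uint64_t setDead() {
        dead_ = true;
        return lastSentSeqno_;
    }

    // Send the next queued item, even in the dead state.
    // Returns false once the readyQ is empty.
    bool next() {
        if (readyQ_.empty()) {
            return false;
        }
        lastSentSeqno_ = readyQ_.front();
        readyQ_.pop_front();
        return true;
    }

    bool isDead() const { return dead_; }
    uint64_t lastSentSeqno() const { return lastSentSeqno_; }

private:
    std::deque<uint64_t> readyQ_;
    uint64_t lastSentSeqno_ = 0;
    bool dead_ = false;
};

// The two candidate values for the close log message.
struct CloseLog {
    uint64_t atDeadTransition;
    uint64_t actualLastSent;
};

CloseLog closeAndDrain(StreamModel& s) {
    CloseLog log{s.setDead(), 0};
    while (s.next()) {
        // keep sending what is still queued
    }
    log.actualLastSent = s.lastSentSeqno();
    return log;
}
```

In this model the value captured at the dead transition lags behind whenever the readyQ is non-empty, which is why the value is better read after the queue has drained.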

Change-Id: I0f0bfd199544dc5bf20e0ca97b3c5ea8d207c6a8
Reviewed-on: http://review.couchbase.org/64157
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
4 years ago MB-19503: Fix ConnMap so notifications don't go missing [2] 15/64115/2
Jim Walker [Mon, 16 May 2016 15:24:35 +0000 (16:24 +0100)]
MB-19503: Fix ConnMap so notifications don't go missing [2]

The previous patch [1] cleared the isNotificationScheduled flag
in the wrong place, which meant notifications could never be
scheduled again.

This is because we only cleared the flag if tp->isPaused(),
yet we always pop the notification from the queue, so we
left tp->isNotificationScheduled set while the queue is empty.
From then on no more notifications will ever get scheduled!

So we need to clear the notification-scheduled boolean
unconditionally, regardless of the other flags on tp.

[1] - Commit 0856e0b3d3fc6
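
The difference between the two behaviours can be modelled in a few lines. This is a sketch under assumptions, not the ConnMap code: ConnModel, NotifyQueue, schedule, drainConditional and drainUnconditional are hypothetical names, and the model is single-threaded so that the stuck state is deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <deque>

// Hypothetical stand-in for a connection; not the real ep-engine class.
struct ConnModel {
    std::atomic<bool> notificationScheduled{false};
    bool paused = false;
    int notified = 0;
};

class NotifyQueue {
public:
    // Producer: queue the connection unless a notification is already pending.
    bool schedule(ConnModel& c) {
        bool expected = false;
        if (!c.notificationScheduled.compare_exchange_strong(expected, true)) {
            return false;  // believes a notification is already queued
        }
        pending_.push_back(&c);
        return true;
    }

    // Consumer that clears the flag only for paused connections.
    void drainConditional() { drain(false); }

    // Consumer that clears the flag for every popped connection.
    void drainUnconditional() { drain(true); }

private:
    void drain(bool alwaysClear) {
        while (!pending_.empty()) {
            ConnModel* c = pending_.front();
            pending_.pop_front();
            if (alwaysClear) {
                c->notificationScheduled.store(false);
            }
            if (c->paused) {
                if (!alwaysClear) {
                    c->notificationScheduled.store(false);
                }
                ++c->notified;
            }
        }
    }

    std::deque<ConnModel*> pending_;
};
```

With drainConditional, a connection that is not paused when it is popped keeps its flag set, so every later schedule() call returns false even though nothing is queued.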

Change-Id: I11c9fd72f4b35102328022bd4c334a9e09a61cd0
Reviewed-on: http://review.couchbase.org/64115
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
4 years ago MB-19503: Fix ConnMap so notifications don't go missing. 34/63934/4
Jim Walker [Wed, 11 May 2016 15:26:47 +0000 (16:26 +0100)]
MB-19503: Fix ConnMap so notifications don't go missing.

There's a reliance on an atomic bool and cmpxchg to
prevent the producer of a notification from queueing
itself if it already has a notification scheduled.

There's an ordering issue though: the producer's code
can execute, see that the flag is true and not bother queueing
a notification, while the consumer side is about to clear the
flag and finish. The notification thus never gets queued,
yet the producer side thinks it will get a notification.

In my terminology:
producer is ConnMap::notifyPausedConnection
consumer is ConnMap::notifyAllPausedConnections
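
One way to picture the ordering concern is a consumer that clears the flag before it delivers, so that a producer arriving during or after delivery queues a fresh notification instead of relying on one that has already been consumed. The sketch below is an illustration under that assumption, not the patch itself; PausedConn, PendingNotifications and the deliver callback are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>

// Hypothetical connection model.
struct PausedConn {
    std::atomic<bool> scheduled{false};
    int deliveries = 0;
};

class PendingNotifications {
public:
    // Producer side: the cmpxchg collapses repeated requests into one entry.
    bool notifyPaused(PausedConn& c) {
        bool expected = false;
        if (!c.scheduled.compare_exchange_strong(expected, true)) {
            return false;  // relies on the consumer still delivering for us
        }
        std::lock_guard<std::mutex> lh(mutex_);
        queue_.push_back(&c);
        return true;
    }

    // Consumer side: take the current batch, then for each entry clear the
    // flag BEFORE delivering, so a late producer re-queues itself.
    void notifyAll(const std::function<void(PausedConn&)>& deliver) {
        std::deque<PausedConn*> batch;
        {
            std::lock_guard<std::mutex> lh(mutex_);
            batch.swap(queue_);
        }
        for (PausedConn* c : batch) {
            c->scheduled.store(false);
            ++c->deliveries;
            deliver(*c);
        }
    }

private:
    std::mutex mutex_;
    std::deque<PausedConn*> queue_;
};
```

If the flag were cleared only after deliver() returned, a producer calling notifyPaused() in between would see the flag still set, skip queueing, and its notification would be lost.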

Change-Id: Id324b6369c5ee3a6b6758a7a93e017a4ff7c4a78
Reviewed-on: http://review.couchbase.org/63934
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago MB-19404: [BP] Address data race in DCP-Producer seen while making a stats request 40/63440/4
abhinavdangeti [Thu, 28 Apr 2016 00:18:50 +0000 (17:18 -0700)]
MB-19404: [BP] Address data race in DCP-Producer seen while making a stats request

WARNING: ThreadSanitizer: data race (pid=82258)
  Read of size 1 at 0x7d4c0000a208 by thread T11 (mutexes: write M2483, write M19044):
    #0 DcpProducer::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/dcp/producer.cc:601 (ep.so+0x000000063e2d)
    #1 ConnStatBuilder::operator()(SingleThreadedRCPtr<ConnHandler>&) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/ep_engine.cc:3903 (ep.so+0x0000000d6931)
    #2 EventuallyPersistentEngine::doDcpStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/ep_engine.cc:4160 (ep.so+0x0000000b904a)
    #3 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/ep_engine.cc:4580 (ep.so+0x0000000bcba4)
    #4 EventuallyPersistentStore::snapshotStats() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/ep.cc:1700 (ep.so+0x000000088386)
    #5 StatSnap::run() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/tasks.cc:98 (ep.so+0x00000012ba26)
    #6 ExecutorThread::run() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/executorthread.cc:115 (ep.so+0x0000000eaeed)
    #7 launch_executor_thread(void*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000eaab5)
    #8 platform_thread_wrap(void*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/platform/src/cb_pthreads.cc:53 (libplatform.so.0.1.0+0x0000000048bb)

  Previous write of size 1 at 0x7d4c0000a208 by main thread:
    #0 DcpProducer::handleResponse(protocol_binary_response_header*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/dcp/producer.cc:547 (ep.so+0x000000063231)
    #1 EvpDcpResponseHandler(engine_interface*, void const*, protocol_binary_response_header*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/ep_engine.cc:1765 (ep.so+0x0000000ae08b)
    #2 mock_dcp_response_handler(engine_interface*, void const*, protocol_binary_response_header*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/memcached/programs/engine_testapp/engine_testapp.cc:796 (engine_testapp+0x0000004c68e5)
    #3 sendDcpAck(engine_interface*, engine_interface_v1*, void const*, protocol_binary_command, protocol_binary_response_status, unsigned int) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/tests/ep_test_apis.cc:983 (ep_testsuite.so+0x0000000a6a22)
    #4 test_dcp_noop(engine_interface*, engine_interface_v1*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/tests/ep_testsuite.cc:3975 (ep_testsuite.so+0x000000068deb)
    #5 execute_test(test, char const*, char const*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000004c4192)
    #6 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

(Reviewed-on: http://review.couchbase.org/56306)

Change-Id: Ice7236da5cc885d9e7612894ba3d37e357e13b4a
Reviewed-on: http://review.couchbase.org/63440
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years ago MB-19405: [BP] Address possible data races in PassiveStream context 46/63446/3
abhinavdangeti [Thu, 28 Apr 2016 01:08:56 +0000 (18:08 -0700)]
MB-19405: [BP] Address possible data races in PassiveStream context

WARNING: ThreadSanitizer: data race (pid=3212)

  Write of size 8 at 0x7d5000016908 by thread T5 (mutexes: write M26478):
    #0 PassiveStream::reconnectStream(RCPtr<VBucket>&, unsigned int, unsigned long) /home/abhinav/couchbase/ep-engine/src/dcp/stream.cc:1097 (ep.so+0x000000076c0f)
    #1 DcpConsumer::doRollback(unsigned int, unsigned short, unsigned long) /home/abhinav/couchbase/ep-engine/src/dcp/consumer.cc:676 (ep.so+0x00000005db67)
    #2 RollbackTask::run() /home/abhinav/couchbase/ep-engine/src/dcp/consumer.cc:574 (ep.so+0x00000005d9d4)
    #3 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f8916)
    #4 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f84b5)
    #5 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous read of size 8 at 0x7d5000016908 by main thread (mutexes: write M1367):
    #0 PassiveStream::setDead_UNLOCKED(end_stream_status_t, LockHolder*) /home/abhinav/couchbase/ep-engine/src/dcp/stream.cc:1046 (ep.so+0x0000000759ca)
    #1 PassiveStream::setDead(end_stream_status_t) /home/abhinav/couchbase/ep-engine/src/dcp/stream.cc:1056 (ep.so+0x0000000766d7)
    #2 DcpConsumer::closeAllStreams() /home/abhinav/couchbase/ep-engine/src/dcp/consumer.cc:860 (ep.so+0x00000005a006)
    #3 DcpConnMap::disconnect_UNLOCKED(void const*) /home/abhinav/couchbase/ep-engine/src/connmap.cc:1137 (ep.so+0x000000049972)
    #4 DcpConnMap::disconnect(void const*) /home/abhinav/couchbase/ep-engine/src/connmap.cc:1111 (ep.so+0x00000004969b)
    #5 EventuallyPersistentEngine::handleDisconnect(void const*) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:6224 (ep.so+0x0000000d3bea)
    #6 EvpHandleDisconnect(void const*, ENGINE_EVENT_TYPE, void const*, void const*) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:1783 (ep.so+0x0000000b7046)
    #7 mock_perform_callbacks /home/abhinav/couchbase/memcached/programs/engine_testapp/mock_server.c:296 (engine_testapp+0x0000000bd420)
    #8 test_rollback_to_zero(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:5434 (ep_testsuite.so+0x00000007f45f)
    #9 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #10 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

(Reviewed-on: http://review.couchbase.org/55785)

Change-Id: I287bd95f8b03cb207419d0a0e57ca71be6058b19
Reviewed-on: http://review.couchbase.org/63446
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years ago MB-19359: [3] Address lock inversion with vb's state lock and snapshot lock 79/63379/4
abhinavdangeti [Tue, 26 Apr 2016 23:07:40 +0000 (16:07 -0700)]
MB-19359: [3] Address lock inversion with vb's state lock and snapshot lock

+ [Not a backport, this code was altered/removed in master]
+ Address this lock inversion by moving the code that reads the vbucket
  snapshot range to outside the vbucket's state lock context.
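
As a rough sketch of that approach (not the actual change), the snapshot range can be copied under its own mutex first, and the state lock taken only afterwards, so the two locks are never held together on this path. VBucketModel, StreamRange and this streamRequest are hypothetical, and a plain std::mutex stands in for the reader/writer state lock.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <utility>

// Hypothetical vbucket model with the two locks involved in the inversion.
struct VBucketModel {
    std::mutex stateLock;      // stands in for the vbucket state lock
    std::mutex snapshotMutex;  // protects the snapshot range
    bool active = true;
    uint64_t snapStart = 0;
    uint64_t snapEnd = 0;

    void setCurrentSnapshot(uint64_t start, uint64_t end) {
        std::lock_guard<std::mutex> lh(snapshotMutex);
        snapStart = start;
        snapEnd = end;
    }

    std::pair<uint64_t, uint64_t> getCurrentSnapshot() {
        std::lock_guard<std::mutex> lh(snapshotMutex);
        return std::make_pair(snapStart, snapEnd);
    }
};

struct StreamRange {
    bool accepted;
    uint64_t snapStart;
    uint64_t snapEnd;
};

StreamRange streamRequest(VBucketModel& vb) {
    // Read the snapshot range with no other lock held...
    const std::pair<uint64_t, uint64_t> snap = vb.getCurrentSnapshot();

    // ...and only then enter the state lock context.
    std::lock_guard<std::mutex> stateHolder(vb.stateLock);
    if (!vb.active) {
        return StreamRange{false, 0, 0};
    }
    return StreamRange{true, snap.first, snap.second};
}
```

The trade-off in this sketch is that the snapshot range may be slightly older than the state observed under the state lock; whether that is acceptable depends on the caller.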

15:30:43 WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=235352)
15:30:43   Cycle in lock order graph: M21536 (0x7d640002f720) => M21533 (0x7d640002f5f0) => M21536
15:30:43
15:30:43   Mutex M21533 acquired here while holding mutex M21536 in thread T17:
15:30:43     #0 pthread_rwlock_rdlock <null> (engine_testapp+0x000000462260)
15:30:43     #1 cb_rw_reader_enter <null> (libplatform.so.0.1.0+0x000000004800)
15:30:43     #2 RWLock::readerLock() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/rwlock.h:38 (ep.so+0x0000001327e0)
15:30:43     #3 ReaderLockHolder::ReaderLockHolder(RWLock&) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/locks.h:167 (ep.so+0x0000000f84c7)
15:30:43     #4 EventuallyPersistentStore::addTAPBackfillItem(Item const&, unsigned char, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep.cc:851 (ep.so+0x0000000d9c67)
15:30:43     #5 PassiveStream::commitMutation(MutationResponse*, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-stream.cc:1370 (ep.so+0x00000029e25c)
15:30:43     #6 PassiveStream::processMutation(MutationResponse*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-stream.cc:1346 (ep.so+0x00000029d0a0)
15:30:43     #7 PassiveStream::processBufferedMessages(unsigned int&) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-stream.cc:1286 (ep.so+0x00000029c9f2)
15:30:43     #8 DcpConsumer::processBufferedItems() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-consumer.cc:599 (ep.so+0x0000002632d4)
15:30:43     #9 Processer::run() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-consumer.cc:48 (ep.so+0x000000262ecf)
15:30:43     #10 ExecutorThread::run() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/executorthread.cc:109 (ep.so+0x0000001e3dc1)
15:30:43     #11 launch_executor_thread(void*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e33ea)
15:30:43     #12 platform_thread_wrap /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/platform/src/cb_pthreads.c (libplatform.so.0.1.0+0x00000000377c)
15:30:43
15:30:43   Mutex M21536 acquired here while holding mutex M21533 in main thread:
15:30:43     #0 pthread_mutex_lock <null> (engine_testapp+0x00000047e9e0)
15:30:43     #1 cb_mutex_enter <null> (libplatform.so.0.1.0+0x0000000039c0)
15:30:43     #2 Mutex::acquire() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/mutex.cc:31 (ep.so+0x0000001e28ee)
15:30:43     #3 LockHolder::lock() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/locks.h:71 (ep.so+0x000000080bc3)
15:30:43     #4 LockHolder::LockHolder(Mutex&, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/locks.h:48 (ep.so+0x000000080832)
15:30:43     #5 VBucket::getCurrentSnapshot(unsigned long&, unsigned long&) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/vbucket.h:233 (ep.so+0x0000000fb245)
15:30:43     #6 ActiveStream::ActiveStream(EventuallyPersistentEngine*, SingleThreadedRCPtr<DcpProducer>, std::string const&, unsigned int, unsigned int, unsigned short, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-stream.cc:293 (ep.so+0x000000291276)
15:30:43     #7 DcpProducer::streamRequest(unsigned int, unsigned int, unsigned short, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long*, ENGINE_ERROR_CODE (*)(vbucket_failover_t*, unsigned long, void const*)) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-producer.cc:259 (ep.so+0x00000027b9a3)
15:30:43     #8 EvpDcpStreamReq(engine_interface*, void const*, unsigned int, unsigned int, unsigned short, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long*, ENGINE_ERROR_CODE (*)(vbucket_failover_t*, unsigned long, void const*)) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep_engine.cc:1471 (ep.so+0x0000001395e3)
15:30:43     #9 mock_dcp_stream_req /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004caf81)
15:30:43     #10 dcp_stream(engine_interface*, engine_interface_v1*, char const*, void const*, unsigned short, unsigned int, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, int, int, int, int, int, bool, bool, unsigned long, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/tests/ep_testsuite.cc:3427 (ep_testsuite.so+0x0000000b357e)
15:30:43     #11 test_dcp_replica_stream_backfill(engine_interface*, engine_interface_v1*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/tests/ep_testsuite.cc:5311 (ep_testsuite.so+0x00000008e78a)
15:30:43     #12 execute_test /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c4e9f)
15:30:43     #13 main crtstuff.c (engine_testapp+0x0000004c2e01)

Change-Id: Idc09ce9af98669f74f28d1fd4b1cc15f7d8b1152
Reviewed-on: http://review.couchbase.org/63379
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years ago MB-19383: [BP] Address possible data race with startuptime 19/63419/4
abhinavdangeti [Tue, 26 Apr 2016 21:03:11 +0000 (14:03 -0700)]
MB-19383: [BP] Address possible data race with startuptime

WARNING: ThreadSanitizer: data race (pid=14344)

  Read of size 8 at 0x7d780000fa58 by thread T6:
    #0 void STATWRITER_NAMESPACE::add_casted_stat<long>(char const*, long const&, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/abhinav/couchbase/ep-engine/src/statwriter.h:45 (ep.so+0x000000037ff5)
    #1 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:3557 (ep.so+0x0000000be990)
    #2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:4554 (ep.so+0x0000000c5c8c)
    #3 EventuallyPersistentStore::snapshotStats() /home/abhinav/couchbase/ep-engine/src/ep.cc:1671 (ep.so+0x00000008f1fe)
    #4 StatSnap::run() /home/abhinav/couchbase/ep-engine/src/tasks.cc:97 (ep.so+0x00000013cea6)
    #5 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f94e3)
    #6 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f9065)
    #7 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous write of size 8 at 0x7d780000fa58 by main thread:
    #0 EventuallyPersistentEngine::initialize(char const*) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:2167 (ep.so+0x0000000b728a)
    #1 EvpInitialize(engine_interface*, char const*) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:133 (ep.so+0x0000000b4aa8)
    #2 init_engine_instance /home/abhinav/couchbase/memcached/utilities/engine_loader.c:157 (libmcd_util.so.1.0.0+0x0000000058bb)
    #3 create_bucket(bool, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:980 (engine_testapp+0x0000000b9e12)
    #4 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1083 (engine_testapp+0x0000000b93db)
    #5 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

(Reviewed-on: http://review.couchbase.org/55776)
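
The report above shows one thread writing an 8-byte value during initialize() while a stats task reads it. A common remedy for this pattern is to make the field atomic. The sketch below assumes that approach and is not taken from the patch; EngineModel and its members are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <ctime>

// Hypothetical engine model: the startup time is written once during
// initialisation and read by a stats thread.
class EngineModel {
public:
    void initialize(std::time_t now) { startupTime_.store(now); }

    // Safe to call from another thread while initialize() runs: the reader
    // sees either the old or the new value, never a torn one.
    std::time_t getStartupTime() const { return startupTime_.load(); }

private:
    std::atomic<std::time_t> startupTime_{0};
};
```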

Change-Id: Ibec6c267f9138aab626359c703fc067f91e1ee43
Reviewed-on: http://review.couchbase.org/63419
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: abhinav dangeti <abhinav@couchbase.com>
4 years ago MB-19380: Address data race observed with vb's pendingBGFetches 69/63369/4
abhinavdangeti [Tue, 26 Apr 2016 19:17:22 +0000 (12:17 -0700)]
MB-19380: Address data race observed with vb's pendingBGFetches

[Not a backport, this code was altered/removed in master]

11:56:19   Read of size 8 at 0x7d6400050df8 by main thread (mutexes: write M45364, write M44294):
11:56:19     #0 std::_Hashtable<std::string, std::pair<std::string const, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> > >, std::allocator<std::pair<std::string const, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> > > >, std::__detail::_Select1st, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::size() const /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/bits/hashtable.h:500 (ep.so+0x00000008d98e)
11:56:19     #1 std::unordered_map<std::string, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> >, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> > > > >::size() const /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/bits/unordered_map.h:264 (ep.so+0x000000085be0)
11:56:19     #2 VBucket::numPendingBGFetchItems() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/vbucket.h:333 (ep.so+0x000000101789)
11:56:19     #3 EventuallyPersistentStore::bgFetch(std::string const&, unsigned short, unsigned long, void const*, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep.cc:1651 (ep.so+0x0000000d71f7)
11:56:19     #4 EventuallyPersistentStore::getInternal(std::string const&, unsigned short, void const*, bool, bool, vbucket_state_t, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep.cc:1708 (ep.so+0x0000000e3e21)
11:56:19     #5 EventuallyPersistentStore::get(std::string const&, unsigned short, void const*, bool, bool, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep.h:242 (ep.so+0x00000019eee9)
11:56:19     #6 EventuallyPersistentEngine::get(void const*, void**, void const*, int, unsigned short, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep_engine.h:259 (ep.so+0x00000016fe75)
11:56:19     #7 EvpGet(engine_interface*, void const*, void**, void const*, int, unsigned short) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep_engine.cc:202 (ep.so+0x000000136911)
11:56:19     #8 mock_get /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c6e84)
11:56:19     #9 get_item_info(engine_interface*, engine_interface_v1*, item_info*, char const*, unsigned short) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/tests/ep_test_apis.cc:356 (ep_testsuite.so+0x0000000e2f2d)
11:56:19     #10 check_key_value(engine_interface*, engine_interface_v1*, char const*, char const*, unsigned long, unsigned short) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/tests/ep_testsuite.cc:155 (ep_testsuite.so+0x0000000b2590)
11:56:19     #11 test_duplicate_items_disk(engine_interface*, engine_interface_v1*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/tests/ep_testsuite.cc:7839 (ep_testsuite.so+0x00000005a4b6)
11:56:19     #12 execute_test /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c4e9f)
11:56:19     #13 main crtstuff.c (engine_testapp+0x0000004c2e01)
11:56:19
11:56:19   Previous write of size 8 at 0x7d6400050df8 by thread T1 (mutexes: write M44318):
11:56:19     #0 std::_Hashtable<std::string, std::pair<std::string const, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> > >, std::allocator<std::pair<std::string const, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> > > >, std::__detail::_Select1st, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::clear() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/bits/hashtable.h:1943 (ep.so+0x000000087ba8)
11:56:19     #1 std::unordered_map<std::string, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> >, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> > > > >::clear() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/bits/unordered_map.h:528 (ep.so+0x000000086f40)
11:56:19     #2 VBucket::getBGFetchItems(std::unordered_map<std::string, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> >, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, std::list<VBucketBGFetchItem*, std::allocator<VBucketBGFetchItem*> > > > >&) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/vbucket.cc:294 (ep.so+0x0000002b4026)
11:56:19     #3 BgFetcher::run(GlobalTask*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/bgfetcher.cc:155 (ep.so+0x00000008449f)
11:56:19     #4 BgFetcherTask::run() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/tasks.cc:89 (ep.so+0x00000025205e)
11:56:19     #5 ExecutorThread::run() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/executorthread.cc:109 (ep.so+0x0000001e38f1)
11:56:19     #6 launch_executor_thread(void*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e2f1a)
11:56:19     #7 platform_thread_wrap /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/platform/src/cb_pthreads.c (libplatform.so.0.1.0+0x00000000377c)
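
In the trace above, VBucket::numPendingBGFetchItems() reads the map's size while VBucket::getBGFetchItems() clears it on another thread. One way to remove such a race is to take the same mutex in the size accessor as in the code that empties the map. The sketch below assumes that approach; it is not the patch, and PendingFetchesModel, FetchMap and the int item ids are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>
#include <string>
#include <unordered_map>

// Key -> queued fetch items (ints stand in for VBucketBGFetchItem pointers).
using FetchMap = std::unordered_map<std::string, std::list<int>>;

class PendingFetchesModel {
public:
    void queue(const std::string& key, int itemId) {
        std::lock_guard<std::mutex> lh(mutex_);
        pending_[key].push_back(itemId);
    }

    // The size read takes the same mutex as the writers.
    std::size_t numPending() const {
        std::lock_guard<std::mutex> lh(mutex_);
        return pending_.size();
    }

    // Hand all pending items to the fetcher and leave the map empty.
    FetchMap take() {
        FetchMap out;
        std::lock_guard<std::mutex> lh(mutex_);
        out.swap(pending_);
        return out;
    }

private:
    mutable std::mutex mutex_;
    FetchMap pending_;
};
```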

Change-Id: I66e3c2af1f58448a68fbfedf3dfa030a657ed9a7
Reviewed-on: http://review.couchbase.org/63369
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years ago MB-19360: Init mock server in stream module tests 67/63367/2
abhinavdangeti [Tue, 26 Apr 2016 18:12:11 +0000 (11:12 -0700)]
MB-19360: Init mock server in stream module tests

This needs to be done so that time_mutex in mock_server
is initialized, which allows the mock_get_current_time and
mock_time_travel APIs to be invoked without crashing.

Change-Id: I06e6469a227df1108892c9616344ff3789c72cb8
Reviewed-on: http://review.couchbase.org/63367
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years ago MB-19382: [BP] Create a variable to get correct locking scope 18/63418/3
abhinavdangeti [Tue, 26 Apr 2016 19:48:22 +0000 (12:48 -0700)]
MB-19382: [BP] Create a variable to get correct locking scope

A mistake in 495e00acc24 means that no variable is
created for the ReaderLockHolder, so the compiler either
optimises away the lock constructor/destructor or the lock
scope is wrong.

Either way we need to create a variable.

Includes some lock ordering changes as per ThreadSanitizer
warnings.
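
For background: in standard C++ an unnamed temporary is destroyed at the end of the full-expression that creates it, so an RAII guard written without a variable name releases its lock immediately. The sketch below demonstrates this with hypothetical types (ResourceModel and ScopedHold stand in for the locked object and ReaderLockHolder; a bool stands in for the lock).

```cpp
#include <cassert>

// Hypothetical resource whose "lock" is just a flag we can inspect.
struct ResourceModel {
    bool held = false;
    bool& flag() { return held; }
};

// Hypothetical RAII guard: sets the flag on construction, clears it on
// destruction.
class ScopedHold {
public:
    explicit ScopedHold(bool& held) : held_(held) { held_ = true; }
    ~ScopedHold() { held_ = false; }
    ScopedHold(const ScopedHold&) = delete;
    ScopedHold& operator=(const ScopedHold&) = delete;

private:
    bool& held_;
};

bool heldAfterUnnamedTemporary() {
    ResourceModel res;
    ScopedHold(res.flag());  // temporary: destroyed at the end of this statement
    return res.held;         // the "lock" has already been released here
}

bool heldWithNamedGuard() {
    ResourceModel res;
    ScopedHold guard(res.flag());  // named: lives until the end of the scope
    return res.held;               // evaluated before guard is destroyed
}
```

Here heldAfterUnnamedTemporary() returns false and heldWithNamedGuard() returns true, which is the difference that naming the guard variable makes.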

(Reviewed-on: http://review.couchbase.org/56978)

This will address the following lock inversion:

11:56:19 WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=51509)
11:56:19   Cycle in lock order graph: M21441 (0x7d780000f450) => M21477 (0x7d640005edf0) => M21441
11:56:19
11:56:19   Mutex M21477 acquired here while holding mutex M21441 in main thread:
11:56:19     #0 pthread_rwlock_rdlock <null> (engine_testapp+0x000000462260)
11:56:19     #1 cb_rw_reader_enter <null> (libplatform.so.0.1.0+0x000000004800)
11:56:19     #2 RWLock::readerLock() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/rwlock.h:38 (ep.so+0x000000132360)
11:56:19     #3 ReaderLockHolder::ReaderLockHolder(RWLock&) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/locks.h:167 (ep.so+0x0000000f8087)
11:56:19     #4 EventuallyPersistentStore::getAndUpdateTtl(std::string const&, unsigned short, void const*, long) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep.cc:1970 (ep.so+0x0000000e6b45)
11:56:19     #5 EventuallyPersistentEngine::touch(void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep_engine.cc:4619 (ep.so+0x000000155fe8)
11:56:19     #6 processUnknownCommand(EventuallyPersistentEngine*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep_engine.cc:1126 (ep.so+0x000000163a9b)
11:56:19     #7 EvpUnknownCommand(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep_engine.cc:1312 (ep.so+0x000000137365)
11:56:19     #8 mock_unknown_command /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c8f1a)
11:56:19     #9 gat(engine_interface*, engine_interface_v1*, char const*, unsigned short, unsigned int, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/tests/ep_test_apis.cc:348 (ep_testsuite.so+0x0000000e2d6b)
11:56:19     #10 test_expired_item_with_item_eviction(engine_interface*, engine_interface_v1*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/tests/ep_testsuite.cc:11401 (ep_testsuite.so+0x0000000acbd4)
11:56:19     #11 execute_test /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c4e9f)
11:56:19     #12 main crtstuff.c (engine_testapp+0x0000004c2e01)
11:56:19
11:56:19   Mutex M21441 acquired here while holding mutex M21477 in thread T8:
11:56:19     #0 pthread_mutex_lock <null> (engine_testapp+0x00000047e9e0)
11:56:19     #1 cb_mutex_enter <null> (libplatform.so.0.1.0+0x0000000039c0)
11:56:19     #2 Mutex::acquire() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/mutex.cc:31 (ep.so+0x0000001e241e)
11:56:19     #3 LockHolder::lock() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/locks.h:71 (ep.so+0x000000080bc3)
11:56:19     #4 LockHolder::LockHolder(Mutex&, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/locks.h:48 (ep.so+0x000000080832)
11:56:19     #5 HashTable::getLockedBucket(int, int*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/stored-value.h:1266 (ep.so+0x00000008280a)
11:56:19     #6 HashTable::getLockedBucket(std::string const&, int*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/stored-value.h:1295 (ep.so+0x00000007c61b)
11:56:19     #7 EventuallyPersistentStore::deleteExpiredItem(unsigned short, std::string&, long, unsigned long) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep.cc:481 (ep.so+0x0000000d4d80)
11:56:19     #8 ExpiredItemsCallback::callback(compaction_ctx&) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep.cc:1258 (ep.so+0x000000123ecb)
11:56:19     #9 CouchKVStore::compactVBucket(unsigned short, compaction_ctx*, Callback<compaction_ctx>&, Callback<KVStatsCtx>&) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/couch-kvstore/couch-kvstore.cc:862 (ep.so+0x0000003159f7)
11:56:19     #10 EventuallyPersistentStore::compactVBucket(unsigned short, compaction_ctx*, void const*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep.cc:1326 (ep.so+0x0000000df2ec)
11:56:19     #11 CompactVBucketTask::run() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/tasks.cc:76 (ep.so+0x000000251ed1)
11:56:19     #12 ExecutorThread::run() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/executorthread.cc:109 (ep.so+0x0000001e38f1)
11:56:19     #13 launch_executor_thread(void*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e2f1a)
11:56:19     #14 platform_thread_wrap /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/platform/src/cb_pthreads.c (libplatform.so.0.1.0+0x00000000377c)

Change-Id: I5d5ca33fdd3c17df2be9d2b2d6acc8c254f1cb2d
Reviewed-on: http://review.couchbase.org/63418
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years agoMB-19359: [2] Address lock inversion with vb's state lock and snapshot lock 66/63366/6
abhinavdangeti [Tue, 26 Apr 2016 18:03:18 +0000 (11:03 -0700)]
MB-19359: [2] Address lock inversion with vb's state lock and snapshot lock

+ [Not a backport, this code was altered/removed in master]
+ Address this lock inversion by moving reading the vbucket snapshot
  range to outside the vbucket's state lock context.
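
The reordering described above can be sketched as follows. This is only an
illustration using standard C++ mutexes; SnapshotVBucket, SeqnoStats and
readSeqnoStats are hypothetical names, not the ep-engine code, and the real
state lock is a reader/writer lock rather than a plain mutex. The point is
that the snapshot mutex is released before the state lock is taken, so the
stats path never holds both at once.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a vbucket; not the real ep-engine class.
struct SnapshotVBucket {
    std::mutex stateLock;      // guards 'state' (a RWLock in the real code)
    std::mutex snapshotMutex;  // guards the snapshot range
    int state = 0;
    uint64_t snapStart = 0;
    uint64_t snapEnd = 0;

    // Takes and releases snapshotMutex internally.
    std::pair<uint64_t, uint64_t> getCurrentSnapshot() {
        std::lock_guard<std::mutex> lh(snapshotMutex);
        return std::make_pair(snapStart, snapEnd);
    }
};

struct SeqnoStats {
    int state;
    uint64_t snapStart;
    uint64_t snapEnd;
};

SeqnoStats readSeqnoStats(SnapshotVBucket& vb) {
    // Read the snapshot range first; snapshotMutex is already released
    // by the time the state lock is acquired below.
    std::pair<uint64_t, uint64_t> range = vb.getCurrentSnapshot();
    assert(range.first <= range.second);

    std::lock_guard<std::mutex> lh(vb.stateLock);
    return SeqnoStats{vb.state, range.first, range.second};
}
```

The trade-off is that the state and the snapshot range are no longer read
under one lock, which is acceptable for a stats read.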

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=245522)
  Cycle in lock order graph: M21518 (0x7d640003e220) => M21515 (0x7d640003e0f0) => M21518

  Mutex M21515 acquired here while holding mutex M21518 in thread T17:
    #0 pthread_rwlock_rdlock <null> (engine_testapp+0x000000462260)
    #1 cb_rw_reader_enter <null> (libplatform.so.0.1.0+0x000000004800)
    #2 RWLock::readerLock() ep-engine/src/rwlock.h:38 (ep.so+0x000000132360)
    #3 ReaderLockHolder::ReaderLockHolder(RWLock&) ep-engine/src/locks.h:167 (ep.so+0x0000000f8087)
    #4 EventuallyPersistentStore::addTAPBackfillItem(Item const&, unsigned char, bool) ep-engine/src/ep.cc:851 (ep.so+0x0000000d9ba7)
    #5 PassiveStream::commitMutation(MutationResponse*, bool) ep-engine/src/dcp-stream.cc:1370 (ep.so+0x00000029dd8c)
    #6 PassiveStream::processMutation(MutationResponse*) ep-engine/src/dcp-stream.cc:1346 (ep.so+0x00000029cbd0)
    #7 PassiveStream::processBufferedMessages(unsigned int&) ep-engine/src/dcp-stream.cc:1286 (ep.so+0x00000029c522)
    #8 DcpConsumer::processBufferedItems() ep-engine/src/dcp-consumer.cc:599 (ep.so+0x000000262e04)
    #9 Processer::run() ep-engine/src/dcp-consumer.cc:48 (ep.so+0x0000002629ff)
    #10 ExecutorThread::run() ep-engine/src/executorthread.cc:109 (ep.so+0x0000001e38f1)
    #11 launch_executor_thread(void*) ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e2f1a)
    #12 platform_thread_wrap platform/src/cb_pthreads.c (libplatform.so.0.1.0+0x00000000377c)

  Mutex M21518 acquired here while holding mutex M21515 in main thread:
    #0 pthread_mutex_lock <null> (engine_testapp+0x00000047e9e0)
    #1 cb_mutex_enter <null> (libplatform.so.0.1.0+0x0000000039c0)
    #2 Mutex::acquire() ep-engine/src/mutex.cc:31 (ep.so+0x0000001e241e)
    #3 LockHolder::lock() ep-engine/src/locks.h:71 (ep.so+0x000000080bc3)
    #4 LockHolder::LockHolder(Mutex&, bool) ep-engine/src/locks.h:48 (ep.so+0x000000080832)
    #5 VBucket::getCurrentSnapshot(unsigned long&, unsigned long&) ep-engine/src/vbucket.h:233 (ep.so+0x0000000fae05)
    #6 EventuallyPersistentEngine::doSeqnoStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), char const*, int) ep-engine/src/ep_engine.cc:4255 (ep.so+0x00000014f202)
    #7 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4372 (ep.so+0x000000150bcb)
    #8 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:214 (ep.so+0x000000136a72)
    #9 mock_get_stats memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c8403)
    #10 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) ep-engine/tests/ep_test_apis.cc:771 (ep_testsuite.so+0x0000000e21ea)
    #11 wait_for_stat_to_be(engine_interface*, engine_interface_v1*, char const*, int, char const*) ep-engine/tests/ep_test_apis.cc:860 (ep_testsuite.so+0x0000000e8d2b)
    #12 test_dcp_replica_stream_backfill(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite.cc:5306 (ep_testsuite.so+0x00000008e601)
    #13 execute_test memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c4e9f)
    #14 main crtstuff.c (engine_testapp+0x0000004c2e01)

Change-Id: Ia4dd34ab152d1cc1d1658ebe957da7c3b8d32c06
Reviewed-on: http://review.couchbase.org/63366
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19359: [1] Address lock inversion with vb's state lock and snapshot lock 19/63319/7
abhinavdangeti [Mon, 25 Apr 2016 17:51:07 +0000 (10:51 -0700)]
MB-19359: [1] Address lock inversion with vb's state lock and snapshot lock

+ [Not a backport, this code was altered/removed in master]
+ This scenario should not occur in real operation.
+ Most DCP unit tests would however flag this as an issue because
  of how we do things in the tests --> [setting vbucket's state to
  replica at the very beginning (by the main thread)].
+ Suppressing this lock inversion, by moving the function call to
  update the vbucket's snapshot range to outside the state lock
  context in setState(), as it isn't necessary to acquire the state
  lock to update the snapshot range.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=39750)
  Cycle in lock order graph: M43306 (0x7d640000fcf0) => M43309 (0x7d640000fe18) => M43306

  Mutex M43309 acquired here while holding mutex M43306 in main thread:
    #0 pthread_mutex_lock <null> (engine_testapp+0x000000474420)
    #1 cb_mutex_enter /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:85 (libplatform.so.0.1.0+0x0000000034a0)
    #2 Mutex::acquire() /home/daver/repos/couchbase/server/ep-engine/src/mutex.cc:31 (ep.so+0x0000001c611e)
    #3 LockHolder::lock() /home/daver/repos/couchbase/server/ep-engine/src/locks.h:71 (ep.so+0x00000006a4e3)
    #4 LockHolder /home/daver/repos/couchbase/server/ep-engine/src/locks.h:48 (ep.so+0x00000006a172)
    #5 VBucket::setCurrentSnapshot(unsigned long, unsigned long) /home/daver/repos/couchbase/server/ep-engine/src/vbucket.h:217 (ep.so+0x0000000e5ee5)
    #6 VBucket::setState(vbucket_state_t, server_handle_v1_t*) /home/daver/repos/couchbase/server/ep-engine/src/vbucket.cc:196 (ep.so+0x0000002932e9)
    #7 EventuallyPersistentStore::setVBucketState(unsigned short, vbucket_state_t, bool, bool) /home/daver/repos/couchbase/server/ep-engine/src/ep.cc:1060 (ep.so+0x0000000c0b61)
    #8 EventuallyPersistentEngine::setVBucketState(unsigned short, vbucket_state_t, bool) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.h:628 (ep.so+0x000000188a12)
    #9 setVBucket(EventuallyPersistentEngine*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:824 (ep.so+0x00000014aaaa)
    #10 processUnknownCommand(EventuallyPersistentEngine*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:1118 (ep.so+0x000000147707)
    #11 EvpUnknownCommand(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:1312 (ep.so+0x00000011a055)
    #12 mock_unknown_command /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:335 (engine_testapp+0x0000004be97a)
    #13 set_vbucket_state(engine_interface*, engine_interface_v1*, unsigned short, vbucket_state_t) /home/daver/repos/couchbase/server/ep-engine/tests/ep_test_apis.cc:484 (ep_testsuite.so+0x0000000e1562)
    #14 test_dcp_replica_stream_backfill(engine_interface*, engine_interface_v1*) /home/daver/repos/couchbase/server/ep-engine/tests/ep_testsuite.cc:5278 (ep_testsuite.so+0x00000008c84a)
    #15 execute_test /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1042 (engine_testapp+0x0000004ba8ff)
    #16 main /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1296 (engine_testapp+0x0000004b8861)

  Mutex M43306 acquired here while holding mutex M43309 in thread T17:
    #0 pthread_rwlock_rdlock <null> (engine_testapp+0x000000457ca0)
    #1 cb_rw_reader_enter /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:264 (libplatform.so.0.1.0+0x0000000042e0)
    #2 RWLock::readerLock() /home/daver/repos/couchbase/server/ep-engine/src/rwlock.h:38 (ep.so+0x000000115cf0)
    #3 ReaderLockHolder /home/daver/repos/couchbase/server/ep-engine/src/locks.h:167 (ep.so+0x0000000dbbe7)
    #4 EventuallyPersistentStore::addTAPBackfillItem(Item const&, unsigned char, bool) /home/daver/repos/couchbase/server/ep-engine/src/ep.cc:851 (ep.so+0x0000000be35d)
    #5 PassiveStream::commitMutation(MutationResponse*, bool) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:1370 (ep.so+0x00000027e8cc)
    #6 PassiveStream::processMutation(MutationResponse*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:1346 (ep.so+0x00000027d680)
    #7 PassiveStream::processBufferedMessages(unsigned int&) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:1286 (ep.so+0x00000027cfbc)
    #8 DcpConsumer::processBufferedItems() /home/daver/repos/couchbase/server/ep-engine/src/dcp-consumer.cc:599 (ep.so+0x0000002454cc)
    #9 Processer::run() /home/daver/repos/couchbase/server/ep-engine/src/dcp-consumer.cc:48 (ep.so+0x0000002450ef)
    #10 ExecutorThread::run() /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:109 (ep.so+0x0000001c76d9)
    #11 launch_executor_thread(void*) /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001c6caa)
    #12 platform_thread_wrap /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x00000000325c)

Change-Id: I2f3cf88e6aebbf6078f533fb1ed87bd9fe618616
Reviewed-on: http://review.couchbase.org/63319
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19343: Use cb_gmtime_r instead of gmtime_r 14/63314/2
Trond Norbye [Tue, 18 Nov 2014 07:49:11 +0000 (08:49 +0100)]
MB-19343: Use cb_gmtime_r instead of gmtime_r

Backport / cherry-pick from: bc660d479709b5eee74357920a1940294c786216
to fix Windows build break.
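
For illustration, a portable wrapper of this kind might look like the sketch
below. portable_gmtime_r is a hypothetical name and its return convention
(0 on success, -1 on failure) is an assumption; it is not the cb_gmtime_r
implementation from the platform library.

```cpp
#include <cassert>
#include <time.h>

// Hypothetical wrapper, NOT the platform library's cb_gmtime_r.
// Returns 0 on success and -1 on failure (assumed convention).
int portable_gmtime_r(const time_t* clock, struct tm* result) {
    assert(clock != nullptr && result != nullptr);
#ifdef _WIN32
    // MSVC has no gmtime_r; gmtime_s takes the output first and
    // returns 0 on success.
    return gmtime_s(result, clock) == 0 ? 0 : -1;
#else
    // POSIX gmtime_r returns 'result' on success, NULL on failure.
    return gmtime_r(clock, result) != nullptr ? 0 : -1;
#endif
}
```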

Change-Id: I49d19bbea22e31bd600f694acf89d98ffa3a62f3
Reviewed-on: http://review.couchbase.org/63314
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago[BP] MB-16366: Obtain vbstate readlock in numerous operations 65/62965/4
Jim Walker [Fri, 9 Oct 2015 14:14:28 +0000 (15:14 +0100)]
[BP] MB-16366: Obtain vbstate readlock in numerous operations

Any KV update operation now grabs the vbstate read lock early, tests that
the VB state is active, and keeps the lock until the operation completes.
This protects queueDirty from colliding with a VB state change, and also
covers any other paths we're unaware of.

The GET operations only use the read lock if the GET has triggered an
expiry/queueDirty.

A couple of other locations that trigger queueDirty are also interlocked
with VB state changes.
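
A minimal sketch of the pattern, assuming C++17's std::shared_mutex in place
of ep-engine's own RWLock. GuardedVBucket, set() and dirtyQueue are
hypothetical names used only for illustration. The read lock is held across
both the state check and the queueing step, so a state change (which needs
the write lock) cannot slip in between them.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <vector>

enum class VBState { Active, Replica, Dead };

// Hypothetical stand-in for a vbucket; not the real ep-engine class.
struct GuardedVBucket {
    std::shared_mutex stateLock;  // readers: KV updates, writer: setState
    std::mutex queueMutex;        // protects dirtyQueue between readers
    VBState state = VBState::Active;
    std::vector<int> dirtyQueue;  // stand-in for the queueDirty target

    // Returns false if the vbucket is not active.
    bool set(int item) {
        // Read lock is taken early and held until the function returns.
        std::shared_lock<std::shared_mutex> rlh(stateLock);
        if (state != VBState::Active) {
            return false;
        }
        std::lock_guard<std::mutex> qlh(queueMutex);
        dirtyQueue.push_back(item);
        return true;
    }

    void setState(VBState to) {
        // Write lock: waits until all in-flight updates have finished.
        std::unique_lock<std::shared_mutex> wlh(stateLock);
        state = to;
    }
};
```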

(Already Reviewed-on: http://review.couchbase.org/55868)

Change-Id: Icaee69520da230a9fdde6eb85365a7ddae790fd6
Reviewed-on: http://review.couchbase.org/62965
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19280: Fix data race in CouchKVStore stats access 81/63081/6
Dave Rigby [Wed, 7 Oct 2015 15:27:03 +0000 (15:27 +0000)]
MB-19280: Fix data race in CouchKVStore stats access

As reported by ThreadSanitizer. CouchKVStore maintains a map of
vBucketID to counter - dbFileRevMap. This is read by some of the stats
functions (e.g. getNumPersistedDeletes) without a lock and hence there
is a potential race.

Solve this by changing the type of these counters to RelaxedAtomic<>.
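
As a rough sketch of the idea (ep-engine's actual RelaxedAtomic<> may differ),
a counter wrapper that always uses relaxed memory ordering could look like
this. RelaxedCounter and FileRevMap are hypothetical names. Relaxed ordering
makes each individual read and write atomic but gives no ordering guarantee
relative to other memory, which is enough for a stats read.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal wrapper; not the ep-engine RelaxedAtomic<>.
template <typename T>
class RelaxedCounter {
public:
    RelaxedCounter() : value(T()) {}
    // Copyable (unlike std::atomic) so it can live in a std::vector.
    RelaxedCounter(const RelaxedCounter& other) : value(other.load()) {}
    RelaxedCounter& operator=(const RelaxedCounter& other) {
        store(other.load());
        return *this;
    }
    T load() const { return value.load(std::memory_order_relaxed); }
    void store(T v) { value.store(v, std::memory_order_relaxed); }

private:
    std::atomic<T> value;
};

// Stand-in for a per-vbucket revision map read by stats without a lock.
class FileRevMap {
public:
    explicit FileRevMap(std::size_t numVBuckets) : revs(numVBuckets) {}
    void update(uint16_t vbid, uint64_t rev) {
        assert(vbid < revs.size());
        revs[vbid].store(rev);
    }
    uint64_t get(uint16_t vbid) const {
        assert(vbid < revs.size());
        return revs[vbid].load();
    }

private:
    std::vector<RelaxedCounter<uint64_t>> revs;
};
```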

WARNING: ThreadSanitizer: data race (pid=10155)
  Read of size 8 at 0x7d9000008000 by main thread (mutexes: write M21730):
    #0 CouchKVStore::getNumPersistedDeletes(unsigned short) ep-engine/src/couch-kvstore/couch-kvstore.cc:2095 (ep.so+0x000000326779)
    #1 EventuallyPersistentEngine::doDcpVbTakeoverStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), std::string&, unsigned short) ep-engine/src/ep_engine.cc:5312 (ep.so+0x000000155ca5)
    #2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4462 (ep.so+0x000000154622)

  Previous write of size 8 at 0x7d9000008000 by thread T8 (mutexes: write M15079):
    #0 CouchKVStore::updateDbFileMap(unsigned short, unsigned long) ep-engine/src/couch-kvstore/couch-kvstore.cc:1306 (ep.so+0x000000311d3c)
    #1 CouchKVStore::openDB(unsigned short, unsigned long, _db**, unsigned long, unsigned long*) ep-engine/src/couch-kvstore/couch-kvstore.cc:1336 (ep.so+0x00000030f7ae)
    #2 CouchKVStore::setVBucketState(unsigned short, vbucket_state&, Callback<KVStatsCtx>*) ep-engine/src/couch-kvstore/couch-kvstore.cc:981 (ep.so+0x00000031a557)
    #3 CouchKVStore::snapshotVBucket(unsigned short, vbucket_state&, Callback<KVStatsCtx>*) ep-engine/src/couch-kvstore/couch-kvstore.cc:891 (ep.so+0x00000031a11c)
    #4 EventuallyPersistentStore::snapshotVBuckets(Priority const&, unsigned short) ep-engine/src/ep.cc:949 (ep.so+0x0000000dce69)

Change-Id: I83db17ffce0d0a49cfe80f23a34e5dac25ede719
Reviewed-on: http://review.couchbase.org/63081
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19279: Fix race in use of gmtime() 80/63080/5
Dave Rigby [Wed, 12 Nov 2014 17:05:56 +0000 (17:05 +0000)]
MB-19279: Fix race in use of gmtime()

As identified by ThreadSanitizer:

    WARNING: ThreadSanitizer: data race (pid=17259)
      Write of size 8 at 0x7fec86e44de0 by main thread (mutexes: write M1161):
#0 gmtime ??:0 (libtsan.so.0+0x000000025135)
#1 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:3369 (ep.so+0x00000010f4be)
#2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:4339 (ep.so+0x000000113c35)
#3 EvpGetStats /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:217 (ep.so+0x000000102b14)
#4 mock_get_stats /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:195 (exe+0x0000000026de)
#5 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:799 (ep_testsuite.so+0x0000000832d8)
#6 wait_for_stat_change(engine_interface*, engine_interface_v1*, char const*, int, char const*) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:846 (ep_testsuite.so+0x0000000838d6)
#7 test_setup /repos/couchbase/server/source/ep-engine/tests/ep_testsuite.cc:178 (ep_testsuite.so+0x00000001ee69)
#8 execute_test /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:1050 (exe+0x000000005970)
#9 main /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x000000006606)

      Previous write of size 8 at 0x7fec86e44de0 by thread T7:
#0 gmtime ??:0 (libtsan.so.0+0x000000025135)
#1 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:3369 (ep.so+0x00000010f4be)
#2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:4339 (ep.so+0x000000113c35)
#3 EventuallyPersistentStore::snapshotStats() /repos/couchbase/server/source/ep-engine/src/ep.cc:1465 (ep.so+0x0000000e150e)
#4 StatSnap::run() /repos/couchbase/server/source/ep-engine/src/tasks.cc:79 (ep.so+0x000000174db2)
#5 ExecutorThread::run() /repos/couchbase/server/source/ep-engine/src/executorthread.cc:110 (ep.so+0x00000014a0e7)
#6 launch_executor_thread /repos/couchbase/server/source/ep-engine/src/executorthread.cc:34 (ep.so+0x000000149930)
#7 platform_thread_wrap /repos/couchbase/server/source/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x000000002d8b)
#8 __tsan_write_range ??:0 (libtsan.so.0+0x00000001b1c9)

Switch to gmtime_r which is thread-safe.
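
To illustrate why this removes the race: gmtime() returns a pointer to shared
static storage, whereas gmtime_r() writes into a buffer owned by the caller.
The helper below is a hypothetical POSIX-only sketch (formatUtc is not an
ep-engine function).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <time.h>

// Hypothetical helper, not the ep-engine code. POSIX only.
std::string formatUtc(time_t when) {
    struct tm parts;  // caller-owned, so no shared static buffer is touched
    if (gmtime_r(&when, &parts) == nullptr) {
        return std::string();
    }
    char buf[32];
    std::size_t n = strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", &parts);
    assert(n > 0);  // 32 bytes is enough for this fixed-width format
    return std::string(buf, n);
}
```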

Change-Id: Id0773df65f4fc569c0a173b6185b1ef8bd91862d
Reviewed-on: http://review.couchbase.org/63080
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19113: Suppress test_mb16357 when on thread sanitizer 22/62922/10
abhinavdangeti [Fri, 15 Apr 2016 18:59:14 +0000 (11:59 -0700)]
MB-19113: Suppress test_mb16357 when on thread sanitizer

This is to suppress a false positive thrown by thread
sanitizer regarding a lock inversion that would never
occur in operation.
    The lock inversion pointed out is between the front
end work load thread, that grabs the hash table partition
lock and then the vbucket snapshot lock, while the dcp
consumer processer task grabs the snapshot lock and then
the hash table partition lock. Note that the first thread
always works on an active vbucket, while the second thread
always works on a replica vbucket, and the vbucket cannot
be in active and replica state(s) at the same time.
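
The commit message does not say how the test is suppressed. One possible
approach, shown here purely as an assumption, is to detect ThreadSanitizer at
compile time and skip the test by name. SKETCH_TSAN and shouldRunTest are
hypothetical names.

```cpp
#include <cassert>
#include <string>

// GCC defines __SANITIZE_THREAD__ under -fsanitize=thread;
// Clang exposes the same information through __has_feature.
#if defined(__SANITIZE_THREAD__)
#define SKETCH_TSAN 1
#elif defined(__has_feature)
#if __has_feature(thread_sanitizer)
#define SKETCH_TSAN 1
#endif
#endif
#ifndef SKETCH_TSAN
#define SKETCH_TSAN 0
#endif

// Hypothetical filter: skip the known false positive only under TSan.
bool shouldRunTest(const std::string& name) {
    assert(!name.empty());
    if (SKETCH_TSAN && name == "test_mb16357") {
        return false;
    }
    return true;
}
```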

Change-Id: I5e42e14a2333b0720d8c43c9e2a4d7190696f5e9
Reviewed-on: http://review.couchbase.org/62922
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>

4 years agoMB-19278: Fix lock-order inversion on ActiveStream::streamMutex 33/63033/6
Dave Rigby [Tue, 19 Apr 2016 13:41:06 +0000 (14:41 +0100)]
MB-19278: Fix lock-order inversion on ActiveStream::streamMutex

ThreadSanitizer has identified a potential deadlock due to a cycle in
the lock order graph:

    M43515         => M36787      => M36848            => M43515
   [ActiveStream::   [TaskQueue::   [ExecutorThread::    [ActiveStream::
    streamMutex]      mutex]         currentTaskMutex]    streamMutex]

The crux of the problem appears to be the acquisition of streamMutex
in the destructor of ActiveStream. This is ultimately a Bad Idea - if
you still have multiple threads accessing an object when it's been
deleted then you are already into undefined behaviour.
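
The message does not spell out the fix. One common way to honour the
principle above, sketched here as an assumption, is to manage the object with
std::shared_ptr so the destructor only runs once the last owner has let go,
and therefore needs no lock. SketchStream is a hypothetical class, not
ActiveStream.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>

// Hypothetical stream; not the real ActiveStream.
class SketchStream {
public:
    explicit SketchStream(int& destroyedCount) : destroyed(destroyedCount) {}

    ~SketchStream() {
        // Deliberately no lock here: if another thread could still reach
        // this object, locking would not make destruction safe anyway.
        assert(destroyed >= 0);
        ++destroyed;
    }

    void push(int item) {
        std::lock_guard<std::mutex> lh(streamMutex);
        readyQ.push(item);
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lh(streamMutex);
        return readyQ.size();
    }

private:
    std::mutex streamMutex;
    std::queue<int> readyQ;
    int& destroyed;  // lets the caller observe when destruction happens
};
```

With two std::shared_ptr owners, resetting the first leaves the stream alive
and usable; the destructor runs only when the second is reset.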

Change-Id: I2353b5a8ed93a4f9e8cc036cb85680c185cbcc2f
Reviewed-on: http://review.couchbase.org/63033
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Daniel Owen <owend@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19277: Set executorThread's waketime to atomic 29/63029/6
abhinavdangeti [Thu, 22 Oct 2015 01:55:28 +0000 (18:55 -0700)]
MB-19277: Set executorThread's waketime to atomic

This one looks benign - we only perform the dirty read when
calculating the %s:waketime stat which is not used by anyone apart
from end-users.

WARNING: ThreadSanitizer: data race (pid=41666)
  Read of size 8 at 0x7d4400008370 by main thread (mutexes: write M21616):
    #0 ExecutorThread::getWaketime() ep-engine/src/executorthread.h:120 (ep.so+0x0000001cac0e)
    #1 addWorkerStats(char const*, ExecutorThread*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/executorpool.cc:692 (ep.so+0x0000001b6f06)
    #2 ExecutorPool::doWorkerStat(EventuallyPersistentEngine*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/executorpool.cc:706 (ep.so+0x0000001b6734)
    #3 EventuallyPersistentEngine::doDispatcherStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4139 (ep.so+0x00000015223e)
    #4 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4375 (ep.so+0x000000155336)

  Previous write of size 8 at 0x7d4400008370 by thread T6 (mutexes: write M14985):
    #0 TaskQueue::_fetchNextTask(ExecutorThread&, bool) ep-engine/src/taskqueue.cc:125 (ep.so+0x00000025b3fd)
    #1 TaskQueue::fetchNextTask(ExecutorThread&, bool) ep-engine/src/taskqueue.cc:161 (ep.so+0x00000025bfcf)
    #2 ExecutorPool::_nextTask(ExecutorThread&, unsigned char) ep-engine/src/executorpool.cc:217 (ep.so+0x0000001afc67)
    #3 ExecutorPool::nextTask(ExecutorThread&, unsigned char) ep-engine/src/executorpool.cc:232 (ep.so+0x0000001afe3f)
    #4 ExecutorThread::run() ep-engine/src/executorthread.cc:81 (ep.so+0x0000001e85c9)

Change-Id: I34b12681dd9dfc87c889f301692ca714f04d2a82
Reviewed-on: http://review.couchbase.org/63029
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19276: Fix data race on ExecutorThread::taskStart 28/63028/6
abhinavdangeti [Tue, 8 Dec 2015 21:48:20 +0000 (13:48 -0800)]
MB-19276: Fix data race on ExecutorThread::taskStart

WARNING: ThreadSanitizer: data race (pid=236996)
  Write of size 8 at 0x7d380000dbd8 by thread T5:
    #0 ExecutorThread::run() ep-engine/src/executorthread.cc:105 (ep.so+0x0000000ee0cc)
    #1 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000edd75)
    #2 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:54 (libplatform.so.0.1.0+0x000000004e7b)

  Previous read of size 8 at 0x7d380000dbd8 by main thread (mutexes: write M266835151185571736):
    #0 addWorkerStats(char const*, ExecutorThread*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/executorthread.h:108 (ep.so+0x0000000ea8dc)
    #1 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4363 (ep.so+0x0000000c05dd)
    #2 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:219 (ep.so+0x0000000af40e)
    #3 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:240 (engine_testapp+0x0000004cc71d)
    #4 test_worker_stats(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite.cc:9464 (ep_testsuite.so+0x00000003c038)
    #5 execute_test(test, char const*, char const*) memcached/programs/engine_testapp/engine_testapp.cc:1091 (engine_testapp+0x0000004cb315)
    #6 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

Change-Id: If01aba3cf6b3591f19c5bb62119e40998f12c8ff
Reviewed-on: http://review.couchbase.org/63028
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19275: Address data race on a DCP stream's state 27/63027/6
abhinavdangeti [Wed, 28 Oct 2015 21:58:55 +0000 (14:58 -0700)]
MB-19275: Address data race on a DCP stream's state

WARNING: ThreadSanitizer: data race (pid=139161)
  Read of size 4 at 0x7d480000b150 by thread T31 (mutexes: write M51120):
    #0 DCPBackfill::scan() ep-engine/src/dcp/stream.h:126 (ep.so+0x000000053391)
    #1 DCPBackfill::run() ep-engine/src/dcp/backfill.cc:118 (ep.so+0x000000052737)
    #2 BackfillManager::backfill() ep-engine/src/dcp/backfill-manager.cc:240 (ep.so+0x00000004cf65)
    #3 BackfillManagerTask::run() ep-engine/src/dcp/backfill-manager.cc:43 (ep.so+0x00000004cb8f)
    #4 ExecutorThread::run() ep-engine/src/executorthread.cc:115 (ep.so+0x0000000eb94d)
    #5 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000eb515)
    #6 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:53 (libplatform.so.0.1.0+0x0000000048ab)

  Previous write of size 4 at 0x7d480000b150 by main thread (mutexes: write M1241, write M32448, write M51071, write M51087):
    #0 ActiveStream::transitionState(stream_state_t) ep-engine/src/dcp/stream.cc:829 (ep.so+0x00000006accb)
    #1 ActiveStream::endStream(end_stream_status_t) ep-engine/src/dcp/stream.cc:688 (ep.so+0x00000006a8c2)
    #2 ActiveStream::setDead(end_stream_status_t) ep-engine/src/dcp/stream.cc:654 (ep.so+0x00000006f27b)
    #3 DcpProducer::setDisconnect(bool) ep-engine/src/dcp/producer.cc:835 (ep.so+0x000000065605)
    #4 DcpConnMap::disconnect_UNLOCKED(void const*) ep-engine/src/connmap.cc:1116 (ep.so+0x000000045d6c)
    #5 DcpConnMap::disconnect(void const*) ep-engine/src/connmap.cc:1109 (ep.so+0x000000045c8b)
    #6 EventuallyPersistentEngine::handleDisconnect(void const*) ep-engine/src/ep_engine.cc:6265 (ep.so+0x0000000ca38a)
    #7 EvpHandleDisconnect(void const*, ENGINE_EVENT_TYPE, void const*, void const*) ep-engine/src/ep_engine.cc:1802 (ep.so+0x0000000af976)
    #8 destroy_mock_cookie memcached/programs/engine_testapp/mock_server.cc:325 (engine_testapp+0x0000004f4082)
    #9 dcp_stream_req(engine_interface*, engine_interface_v1*, unsigned int, unsigned short, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, ENGINE_ERROR_CODE) ep-engine/tests/ep_testsuite.cc:4331 (ep_testsuite.so+0x000000090b06)
    #10 test_failover_log_dcp(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite.cc:14127 (ep_testsuite.so+0x00000007ce7a)
    #11 execute_test(test, char const*, char const*) memcached/programs/engine_testapp/engine_testapp.cc:1091 (engine_testapp+0x0000004cb315)
    #12 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

Change-Id: Icfc82fa877999d128184c9cac8f8c0e1cafc4e67
Reviewed-on: http://review.couchbase.org/63027
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19273: Fix data race on PassiveStream::buffer.{bytes,items} 26/63026/6
Dave Rigby [Tue, 19 Apr 2016 10:38:37 +0000 (11:38 +0100)]
MB-19273: Fix data race on PassiveStream::buffer.{bytes,items}

As reported by ThreadSanitizer (see below), there is a dirty read on
{{buffer.items}} & {{buffer.bytes}} during stat writing, due to
PassiveStream::addStats not acquiring the {{bufMutex}} lock before
reading.

This appears benign as the stat is only sent to users, not used
by ns_server etc. for any calculation.
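
One way to close such a dirty read, sketched under the assumption that the
stats path simply takes the same mutex as the writer, is to copy both
counters under the lock and report the copies. SketchBuffer is a hypothetical
class, not PassiveStream's buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>

// Hypothetical buffer accounting; not the real PassiveStream::buffer.
class SketchBuffer {
public:
    void push(std::size_t messageBytes) {
        std::lock_guard<std::mutex> lh(bufMutex);
        bytes += messageBytes;
        ++items;
    }

    // Returns {items, bytes} as one consistent snapshot.
    std::pair<std::size_t, std::size_t> stats() {
        std::lock_guard<std::mutex> lh(bufMutex);
        assert(items > 0 || bytes == 0);  // an empty buffer holds no bytes
        return std::make_pair(items, bytes);
    }

private:
    std::mutex bufMutex;
    std::size_t bytes = 0;
    std::size_t items = 0;
};
```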

WARNING: ThreadSanitizer: data race (pid=28418)
  Read of size 8 at 0x7d5000018128 by main thread (mutexes: write M23810, write M969):
    #0 void STATWRITER_NAMESPACE::add_casted_stat<unsigned long>(char const*, unsigned long const&, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) ep-engine/src/statwriter.h:47 (ep.so+0x0000000afe89)
    #1 PassiveStream::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) ep-engine/src/dcp-stream.cc:1512 (ep.so+0x0000002a04ba)
    #2 DcpConsumer::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) ep-engine/src/dcp-consumer.cc:555 (ep.so+0x00000026e867)

  Previous write of size 8 at 0x7d5000018128 by thread T18 (mutexes: write M23762):
    #0 PassiveStream::processBufferedMessages(unsigned int&) ep-engine/src/dcp-stream.cc:1311 (ep.so+0x00000029e196)
    #1 DcpConsumer::processBufferedItems() ep-engine/src/dcp-consumer.cc:599 (ep.so+0x0000002647e9)
    #2 Processer::run() ep-engine/src/dcp-consumer.cc:48 (ep.so+0x0000002643ef)

Change-Id: I443e85d59ffda3827b670e747794b3fcb69c4f7c
Reviewed-on: http://review.couchbase.org/63026
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19260: Make cookie atomic to serialize set/get in ConnHandler 78/62978/8
abhinavdangeti [Tue, 6 Oct 2015 01:34:06 +0000 (18:34 -0700)]
MB-19260: Make cookie atomic to serialize set/get in ConnHandler

WARNING: ThreadSanitizer: data race (pid=20056)

  Write of size 8 at 0x7d600000f038 by main thread (mutexes: write M1412):
    #0 ConnHandler::setCookie(void const*) /home/abhinav/couchbase/ep-engine/src/tapconnection.h:344 (ep.so+0x000000042367)
    #1 EventuallyPersistentEngine::createTapQueue(void const*, std::string&, unsigned int, void const*, unsigned long) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:2655 (ep.so+0x0000000b86da)
    #2 EvpGetTapIterator(engine_interface*, void const*, void const*, unsigned long, unsigned int, void const*, unsigned long) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:1462 (ep.so+0x0000000b46a3)
    #3 mock_get_tap_iterator(engine_interface*, void const*, void const*, unsigned long, unsigned int, void const*, unsigned long) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:467 (engine_testapp+0x0000000bae3e)
    #4 test_tap_ack_stream(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:7341 (ep_testsuite.so+0x000000050416)
    #5 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #6 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

  Previous read of size 8 at 0x7d600000f038 by thread T9 (mutexes: write M1411):
    #0 ConnHandler::getCookie() const /home/abhinav/couchbase/ep-engine/src/tapconnection.h:340 (ep.so+0x00000004067c)
    #1 bool TapConnMap::performOp<Item*>(std::string const&, TapOperation<Item*>&, Item*) /home/abhinav/couchbase/ep-engine/src/connmap.h:389 (ep.so+0x00000001fa08)
    #2 ItemResidentCallback::callback(CacheLookup&) /home/abhinav/couchbase/ep-engine/src/backfill.cc:63 (ep.so+0x00000001d9ca)
    #3 CouchKVStore::recordDbDump(_db*, _docinfo*, void*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:1654 (ep.so+0x000000180ca0)
    #4 recordDbDumpC(_db*, _docinfo*, void*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:66 (ep.so+0x00000017fe95)
    #5 lookup_callback(couchfile_lookup_request*, _sized_buf const*, _sized_buf const*) /home/abhinav/couchbase/couchstore/src/couch_db.cc:767 (libcouchstore.so+0x00000000d7e5)
    #6 btree_lookup_inner(couchfile_lookup_request*, unsigned long, int, int) /home/abhinav/couchbase/couchstore/src/btree_read.cc:99 (libcouchstore.so+0x00000000b5a2)
    #7 btree_lookup /home/abhinav/couchbase/couchstore/src/btree_read.cc:131 (libcouchstore.so+0x00000000affc)
    #8 couchstore_changes_since /home/abhinav/couchbase/couchstore/src/couch_db.cc:812 (libcouchstore.so+0x00000000d5f1)
    #9 CouchKVStore::scan(ScanContext*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:1264 (ep.so+0x00000017f94e)
    #10 BackfillDiskLoad::run() /home/abhinav/couchbase/ep-engine/src/backfill.cc:131 (ep.so+0x00000001e449)
    #11 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f8956)
    #12 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f84f5)
    #13 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: I8a668f17013c95abc9786d853ed2c6462cae5320
Reviewed-on: http://review.couchbase.org/62978
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19259: Fix data race on DcpConsumer::backoffs 77/62977/8
Dave Rigby [Thu, 8 Oct 2015 13:55:09 +0000 (13:55 +0000)]
MB-19259: Fix data race on DcpConsumer::backoffs

This member is read by stats processing concurrently with DcpConsumer
updating it. Change it to RelaxedAtomic<> to fix the race.

As reported by ThreadSanitizer:

WARNING: ThreadSanitizer: data race (pid=24629)
  Read of size 4 at 0x7d5000016b9c by main thread (mutexes: write M27034, write M2479):
    #0 void ConnHandler::addStat<unsigned int>(), void const*) ep-engine/src/tapconnection.h:291:18 (ep.so+0x00000004e993)
    #1 DcpConsumer::addStats(), void const*) ep-engine/src/dcp/consumer.cc:731:5 (ep.so+0x00000005a2c3)
    ...

  Previous write of size 4 at 0x7d5000016b9c by thread T10:
    #0 DcpConsumer::processBufferedItems() ep-engine/src/dcp/consumer.cc:755:17 (ep.so+0x0000000539c3)
    #1 Processer::run() ep-engine/src/dcp/consumer.cc:57:13 (ep.so+0x00000005376b)
    #2 ExecutorThread::run() ep-engine/src/executorthread.cc:115:26 (ep.so+0x0000000e944e)
    ...

Change-Id: Iabddcc06213fbb80815d4b464c459adb922a0eff
Reviewed-on: http://review.couchbase.org/62977
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19258: Address data race with replicationThrottle parameters 76/62976/8
abhinavdangeti [Thu, 28 Jan 2016 20:31:56 +0000 (12:31 -0800)]
MB-19258: Address data race with replicationThrottle parameters

The detected usage is only in stats; however, there is an indirect
usage of this variable (persistenceQueueSmallEnough, via
stats.replicationThrottleWriteQueueCap) which /might/ result in an
incorrect queue size being used.

WARNING: ThreadSanitizer: data race (pid=31345)
  Read of size 8 at 0x7d08000066c0 by thread T12:
    #0 ReplicationThrottle::adjustWriteQueueCap(unsigned long) ep-engine/src/replicationthrottle.cc:49 (ep.so+0x000000114c7f)
    #1 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:3100 (ep.so+0x0000000bb424)
    #2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4597 (ep.so+0x0000000c54d5)
    #3 EventuallyPersistentStore::snapshotStats() ep-engine/src/ep.cc:1744 (ep.so+0x000000090f2e)
    #4 StatSnap::run() ep-engine/src/tasks.cc:100 (ep.so+0x000000136dd6)
    #5 ExecutorThread::run() ep-engine/src/executorthread.cc:115 (ep.so+0x0000000f6966)
    #6 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f6515)
    #7 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:54 (libplatform.so.0.1.0+0x00000000551b)
  Previous write of size 8 at 0x7d08000066c0 by main thread (mutexes: write M2287262534014660504):
    #0 EPStoreValueChangeListener::sizeValueChanged(std::string const&, unsigned long) ep-engine/src/replicationthrottle.h:42 (ep.so+0x0000000b35ee)
    #1 Configuration::setParameter(std::string const&, long) ep-engine/src/configuration.cc:225 (ep.so+0x000000197ac5)
    #2 Configuration::setReplicationThrottleQueueCap(long const&) build/ep-engine/src/generated_configuration.cc:459 (ep.so+0x0000001a6fa8)
    #3 setTapParam(EventuallyPersistentEngine*, char const*, char const*, std::string&) ep-engine/src/ep_engine.cc:323 (ep.so+0x0000000d7f3f)
    #4 EvpUnknownCommand(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1365 (ep.so+0x0000000b4f68)
    #5 mock_unknown_command(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:382 (engine_testapp+0x0000004ce149)
    #6 set_param(engine_interface*, engine_interface_v1*, protocol_binary_engine_param_t, char const*, char const*) ep-engine/tests/ep_test_apis.cc:597 (ep_testsuite_dcp.so+0x000000038e17)
    #7 test_consumer_backoff_stat(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite_dcp.cc:2184 (ep_testsuite_dcp.so+0x000000017411)
    #8 execute_test(test, char const*, char const*) memcached/programs/engine_testapp/engine_testapp.cc:1131 (engine_testapp+0x0000004cc600)
    #9 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

Change-Id: Ie4ff039603f2ddfc5b44d5d7f217544307655d31
Reviewed-on: http://review.couchbase.org/62976
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19281: [BP] Add template class RelaxedAtomic<> 69/63169/4
Dave Rigby [Thu, 21 Apr 2016 08:38:54 +0000 (09:38 +0100)]
MB-19281: [BP] Add template class RelaxedAtomic<>

Backport of the RelaxedAtomic template class from
platform/watson. Changed from C++11 std::atomic<> to our own
AtomicValue<> as 3.x doesn't have C++11 support on all platforms, and
moved to ep-engine as AtomicValue is an ep-engine specific class.

Doesn't include unit tests as they depend on GTest which isn't present
in 3.0.x.

Merge of the following platform commits:

* http://review.couchbase.org/54973 - Add template class RelaxedAtomic<>
* http://review.couchbase.org/55870 - RelaxedAtomic: Allow construction from template type
* http://review.couchbase.org/55889 - RelaxedAtomic: Remove 'explicit' definition for copy constructor
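
For readers unfamiliar with the class, the following is a minimal sketch of
what a RelaxedAtomic<> wrapper might look like. It is written against C++11
std::atomic<> (as the message says the platform/watson version was), not the
AtomicValue<> used by this backport. The exact member set and the
demoBackoffCounter() helper are assumptions for illustration, not the
ep-engine code.

```cpp
#include <atomic>
#include <cassert>

// Sketch only: every load/store uses std::memory_order_relaxed, so accesses
// are atomic (no torn reads or writes) but impose no ordering on other memory.
template <typename T>
class RelaxedAtomic {
public:
    RelaxedAtomic() : value(T()) {}

    // Construction from the template type.
    RelaxedAtomic(const T& initial) : value(initial) {}

    // Non-explicit copy constructor; std::atomic itself is not copyable.
    RelaxedAtomic(const RelaxedAtomic& other) : value(other.load()) {}

    T load() const { return value.load(std::memory_order_relaxed); }
    void store(T desired) { value.store(desired, std::memory_order_relaxed); }

    operator T() const { return load(); }

    RelaxedAtomic& operator=(const RelaxedAtomic& rhs) {
        store(rhs.load());
        return *this;
    }
    RelaxedAtomic& operator=(T desired) {
        store(desired);
        return *this;
    }

    // Only usable for integral T; these members are instantiated only if called.
    T operator++() { return value.fetch_add(1, std::memory_order_relaxed) + 1; }
    T operator++(int) { return value.fetch_add(1, std::memory_order_relaxed); }

private:
    std::atomic<T> value;
};

// Hypothetical usage: a stats counter bumped by a worker and read by stats code.
unsigned int demoBackoffCounter() {
    RelaxedAtomic<unsigned int> backoffs(0);
    ++backoffs;
    backoffs++;
    unsigned int snapshot = backoffs;  // implicit conversion does a relaxed load
    assert(snapshot == 2);
    return snapshot;
}
```

Relaxed ordering suits counters that are only reported in stats: the reader
needs an untorn value, not synchronisation with other memory.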

Change-Id: I16a5e2ebe85201aae85592329a2212c8a5c3a464
Reviewed-on: http://review.couchbase.org/63169
Reviewed-by: Trond Norbye <trond.norbye@gmail.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19257: Fix data race on ExecutorThread::now 75/62975/6
Dave Rigby [Thu, 8 Oct 2015 15:49:52 +0000 (15:49 +0000)]
MB-19257: Fix data race on ExecutorThread::now

This value is accessed without a lock from addWorkerStats. Change it to
an atomic to fix the race.

As reported by ThreadSanitizer:

WARNING: ThreadSanitizer: data race (pid=18761)
  Read of size 8 at 0x7d4400007fa8 by main thread (mutexes: write M19371):
    #0 ExecutorThread::getCurTime() /home/couchbase/server/ep-engine/src/executorthread.h:129:46 (ep.so+0x0000000e67a0)
    #1 addWorkerStats(char const*, ExecutorThread*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/executorpool.cc:748 (ep.so+0x0000000e67a0)
    #2 ExecutorPool::doWorkerStat(EventuallyPersistentEngine*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/executorpool.cc:760 (ep.so+0x0000000e67a0)
    #3 EventuallyPersistentEngine::doDispatcherStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/ep_engine.cc:4352:5 (ep.so+0x0000000bc72d)
    #4 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/ep_engine.cc:4588 (ep.so+0x0000000bc72d)
    #5 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/ep_engine.cc:214:38 (ep.so+0x0000000ab70e)
    #6 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/memcached/programs/engine_testapp/engine_testapp.cc:239:19 (engine_testapp+0x0000004c553d)
    #7 test_worker_stats(engine_interface*, engine_interface_v1*) /home/couchbase/server/ep-engine/tests/ep_testsuite.cc:8901:24 (ep_testsuite.so+0x000000039908)
    #8 execute_test(test, char const*, char const*) /home/couchbase/server/memcached/programs/engine_testapp/engine_testapp.cc:1090:19 (engine_testapp+0x0000004c4142)
    #9 main /home/couchbase/server/memcached/programs/engine_testapp/engine_testapp.cc:1439 (engine_testapp+0x0000004c4142)

  Previous write of size 8 at 0x7d4400007fa8 by thread T7 (mutexes: write M12645):
    #0 TaskQueue::_doSleep(ExecutorThread&) /home/couchbase/server/ep-engine/src/taskqueue.cc:78:5 (ep.so+0x00000012eb21)
    #1 TaskQueue::_fetchNextTask(ExecutorThread&, bool) /home/couchbase/server/ep-engine/src/taskqueue.cc:117:21 (ep.so+0x00000012ed66)
    #2 TaskQueue::fetchNextTask(ExecutorThread&, bool) /home/couchbase/server/ep-engine/src/taskqueue.cc:160:17 (ep.so+0x00000012f907)
    #3 ExecutorPool::_nextTask(ExecutorThread&, unsigned char) /home/couchbase/server/ep-engine/src/executorpool.cc:226:17 (ep.so+0x0000000dfa6f)
    #4 ExecutorPool::nextTask(ExecutorThread&, unsigned char) /home/couchbase/server/ep-engine/src/executorpool.cc:241:21 (ep.so+0x0000000dfac6)
    #5 ExecutorThread::run() /home/couchbase/server/ep-engine/src/executorthread.cc:81:28 (ep.so+0x0000000e9cfe)
    #6 launch_executor_thread(void*) /home/couchbase/server/ep-engine/src/executorthread.cc:33:9 (ep.so+0x0000000e9b05)
    #7 platform_thread_wrap /home/couchbase/server/platform/src/cb_pthreads.c:23:5 (libplatform.so.0.1.0+0x000000003dc1)

Change-Id: Idcaea9a157293dbf95ca236354673556a2f3c4ac
Reviewed-on: http://review.couchbase.org/62975
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19256: Address possible data race on VBCBAdaptor::currentvb 74/62974/6
abhinavdangeti [Mon, 5 Oct 2015 20:06:13 +0000 (13:06 -0700)]
MB-19256: Address possible data race on VBCBAdaptor::currentvb

WARNING: ThreadSanitizer: data race (pid=12312)

Read of size 2 at 0x7d400000fff8 by main thread (mutexes: write M12542):
    #0 VBCBAdaptor::getDescription() /home/abhinav/couchbase/ep-engine/src/ep.h:128 (ep.so+0x0000000a7fe1)
    #1 ExecutorPool::_stopTaskGroup(unsigned long, task_type_t, bool) /home/abhinav/couchbase/ep-engine/src/executorpool.cc:562 (ep.so+0x0000000f3c21)
    #2 ExecutorPool::stopTaskGroup(unsigned long, task_type_t, bool) /home/abhinav/couchbase/ep-engine/src/executorpool.cc:585 (ep.so+0x0000000f3f5e)
    #3 ~EventuallyPersistentStore /home/abhinav/couchbase/ep-engine/src/ep.cc:468 (ep.so+0x0000000836c6)
    #4 ~EventuallyPersistentEngine /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:6326 (ep.so+0x0000000d4eda)
    #5 EvpDestroy(engine_interface*, bool) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:141 (ep.so+0x0000000b4b8c)
    #6 mock_destroy(engine_interface*, bool) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:98 (engine_testapp+0x0000000ba027)
    #7 destroy_bucket(engine_interface*, engine_interface_v1*, bool) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:995 (engine_testapp+0x0000000b952e)
    #8 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

Previous write of size 2 at 0x7d400000fff8 by thread T10:
    #0 VBCBAdaptor::run() /home/abhinav/couchbase/ep-engine/src/ep.cc:3776 (ep.so+0x00000009d7e3)
    #1 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f94d3)
    #2 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f9055)
    #3 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: I316fb1a845fbee09f634d39e64057c170fab4795
Reviewed-on: http://review.couchbase.org/62974
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19253: Fix race in void ExecutorPool::doWorkerStat 73/62973/6
Dave Rigby [Tue, 6 Oct 2015 14:29:34 +0000 (14:29 +0000)]
MB-19253: Fix race in void ExecutorPool::doWorkerStat

As reported by ThreadSanitizer (see below), there is a race between
setting the current task associated with an ExecutorThread and reading
the name of that thread.

Unfortunately there doesn't seem to be a straightforward way to solve
this without adding a new mutex; currentTask (the variable the race is
on) is a SingleThreadedRCPtr, which is non-trivial to make thread-safe
(i.e. atomic). I did consider changing currentTask (and all other
ExTask variables) to be a std::shared_ptr, as in C++11 this has support
for updating atomically; however, the support in mainstream compilers
apparently isn't quite there yet.

Therefore I've just added a mutex to guard currentTask.
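
The shape of such a fix can be sketched as follows. This is an illustration
only: Task, WorkerThread, the "idle" placeholder and demoTaskName() are
hypothetical stand-ins, and std::mutex / std::shared_ptr are used for brevity
in place of ep-engine's own locking and SingleThreadedRCPtr types.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a task object; only the name matters here.
struct Task {
    std::string name;
};

// Hypothetical stand-in for ExecutorThread.
class WorkerThread {
public:
    // Called by the worker itself when it picks up or finishes a task.
    void setCurrentTask(std::shared_ptr<Task> task) {
        std::lock_guard<std::mutex> guard(currentTaskMutex);
        currentTask = std::move(task);
    }

    // Called from a stats thread; copies the name while holding the lock so
    // the task cannot be swapped out mid-read.
    std::string getTaskName() const {
        std::lock_guard<std::mutex> guard(currentTaskMutex);
        return currentTask ? currentTask->name : std::string("idle");
    }

private:
    mutable std::mutex currentTaskMutex;  // guards currentTask
    std::shared_ptr<Task> currentTask;
};

// Single-threaded walk through the two accessors.
std::string demoTaskName() {
    WorkerThread worker;
    assert(worker.getTaskName() == "idle");
    std::shared_ptr<Task> task = std::make_shared<Task>();
    task->name = "flusher";
    worker.setCurrentTask(task);
    return worker.getTaskName();
}
```

Returning the name by value means the caller never touches the task object
after the lock has been released.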

ThreadSanitizer report:

WARNING: ThreadSanitizer: data race (pid=27332)
  Read of size 8 at 0x7d340000c8f0 by main thread (mutexes: write M19366):
    #0 ExecutorThread::getTaskableName() const /home/couchbase/couchbase/ep-engine/src/atomic.h:309 (ep.so+0x0000000e6178)
    #1 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/couchbase/ep-engine/src/ep_engine.cc:4346 (ep.so+0x0000000bc4dd)
    #2 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/couchbase/ep-engine/src/ep_engine.cc:213 (ep.so+0x0000000ab49e)
    #3 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:239 (engine_testapp+0x0000004c54ad)
    #4 test_worker_stats(engine_interface*, engine_interface_v1*) /home/couchbase/couchbase/ep-engine/tests/ep_testsuite.cc:8901 (ep_testsuite.so+0x000000039768)
    #5 execute_test(test, char const*, char const*) /home/couchbase/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000004c40b2)
    #6 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

  Previous write of size 8 at 0x7d340000c8f0 by thread T5:
    #0 ExecutorThread::run() /home/couchbase/couchbase/ep-engine/src/atomic.h:322 (ep.so+0x0000000e9906)
    #1 launch_executor_thread(void*) /home/couchbase/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000e9795)
    #2 platform_thread_wrap /home/couchbase/.ccache/tmp/cb_pthread.tmp.00b591814417.18511.i (libplatform.so.0.1.0+0x000000003d91)

Change-Id: Id02f7a98b40b952a415cf9027a8f2243af38fc4d
Reviewed-on: http://review.couchbase.org/62973
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19252: Fix data race on Stream::readyQueueMemory 72/62972/6
Dave Rigby [Mon, 18 Apr 2016 13:47:19 +0000 (14:47 +0100)]
MB-19252: Fix data race on Stream::readyQueueMemory

As detected by TSan:

WARNING: ThreadSanitizer: data race (pid=17244)
  Read of size 8 at 0x7d480000b370 by main thread (mutexes: write M24165, write M969, read M24121):
    #0 Stream::getReadyQueueMemory() /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:234 (ep.so+0x00000028f51e)
    #1 ActiveStream::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:563 (ep.so+0x00000029452f)
    #2 DcpProducer::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-producer.cc:551 (ep.so+0x00000027f1a0)
    #3 ConnStatBuilder::operator()(SingleThreadedRCPtr<ConnHandler>&) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:3696 (ep.so+0x000000182d54)

  Previous write of size 8 at 0x7d480000b370 by thread T16 (mutexes: write M24143):
    #0 Stream::pushToReadyQ(DcpResponse*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:211 (ep.so+0x00000028f4a6)
    #1 ActiveStream::backfillReceived(Item*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:407 (ep.so+0x00000028d6e5)
    #2 CacheCallback::callback(CacheLookup&) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:87 (ep.so+0x00000028d4b3)
    #3 CouchKVStore::recordDbDump(_db*, _docinfo*, void*) /home/daver/repos/couchbase/server/ep-engine/src/couch-kvstore/couch-kvstore.cc:1563 (ep.so+0x00000031dec5)

See also: http://review.couchbase.org/54314 which originally fixed
this issue in watson; however it also fixed a couple of other issues
in the same patch.

Change-Id: Iae6a34403394e54c9d7213a7c2703be761e7dc0f
Reviewed-on: http://review.couchbase.org/62972
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19251: Fix race in updating Vbucket.file{SpaceUsed,Size} 71/62971/6
Dave Rigby [Wed, 12 Nov 2014 17:52:41 +0000 (17:52 +0000)]
MB-19251: Fix race in updating Vbucket.file{SpaceUsed,Size}

These variables are used in the calculation of ep_db_data_size and
ep_db_file_size stats, and crucially those stats are used by ns_server
when determining if a bucket should be compacted.

It is possible that, due to this issue, compaction may not be triggered
when expected, or may be triggered when it shouldn't be.

As identified by ThreadSanitizer:

    WARNING: ThreadSanitizer: data race (pid=4009)
      Write of size 8 at 0x7d440000c5b0 by thread T6 (mutexes: write M55):
        #0 KVStatsCallback::callback(KVStatsCtx&) /repos/couchbase/server/source/ep-engine/src/ep.cc:933 (ep.so+0x0000000f0a22)
        #1 CouchKVStore::commit2couchstore(Callback<KVStatsCtx>*, unsigned long, unsigned long) /repos/couchbase/server/source/ep-engine/src/couch-kvstore/couch-kvstore.cc:1697 (ep.so+0x0000001aa8c6)
        #2 CouchKVStore::commit(Callback<KVStatsCtx>*, unsigned long, unsigned long) /repos/couchbase/server/source/ep-engine/src/couch-kvstore/couch-kvstore.cc:1040 (ep.so+0x0000001a6483)
        #3 EventuallyPersistentStore::flushVBucket(unsigned short) /repos/couchbase/server/source/ep-engine/src/ep.cc:2909 (ep.so+0x0000000e780b)
        #4 Flusher::flushVB() /repos/couchbase/server/source/ep-engine/src/flusher.cc:283 (ep.so+0x00000013f363)
        #5 Flusher::step(GlobalTask*) /repos/couchbase/server/source/ep-engine/src/flusher.cc:174 (ep.so+0x00000013e9c8)
        #6 FlusherTask::run() /repos/couchbase/server/source/ep-engine/src/tasks.cc:44 (ep.so+0x000000174a85)
        #7 ExecutorThread::run() /repos/couchbase/server/source/ep-engine/src/executorthread.cc:110 (ep.so+0x00000014a0c1)
        #8 launch_executor_thread /repos/couchbase/server/source/ep-engine/src/executorthread.cc:34 (ep.so+0x00000014990a)
        #9 platform_thread_wrap /repos/couchbase/server/source/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x000000002d8b)
        #10 __tsan_write_range ??:0 (libtsan.so.0+0x00000001b1c9)

      Previous read of size 8 at 0x7d440000c5b0 by main thread (mutexes: write M193510842443017784):
        #0 VBucketCountVisitor::visitBucket(RCPtr<VBucket>&) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:2889 (ep.so+0x00000010c631)
        #1 VBucketCountAggregator::visitBucket(RCPtr<VBucket>&) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:2926 (ep.so+0x000000121392)
        #2 EventuallyPersistentStore::visit(VBucketVisitor&) /repos/couchbase/server/source/ep-engine/src/ep.cc:3278 (ep.so+0x0000000e99d9)
        #3 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:2955 (ep.so+0x00000010cb46)
        #4 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:4344 (ep.so+0x000000113c0f)
        #5 EvpGetStats /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:217 (ep.so+0x000000102b14)
        #6 mock_get_stats /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:195 (exe+0x0000000026de)
        #7 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:799 (ep_testsuite.so+0x0000000832d8)
        #8 wait_for_flusher_to_settle(engine_interface*, engine_interface_v1*) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:900 (ep_testsuite.so+0x000000083cd4)
        #9 wait_for_persisted_value(engine_interface*, engine_interface_v1*, char const*, char const*, unsigned short) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:917 (ep_testsuite.so+0x000000083de5)
        #10 test_io_stats /repos/couchbase/server/source/ep-engine/tests/ep_testsuite.cc:6279 (ep_testsuite.so+0x0000000468b3)
        #11 execute_test /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:1055 (exe+0x0000000059fc)
        #12 main /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x000000006606)

Change both variables to AtomicValue<> to fix this.

Change-Id: Ie7f9a403f809e751ff1802cceb5d2a77a483a586
Reviewed-on: http://review.couchbase.org/62971
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19249: Address possible data races in ConnHandler context 70/62970/6
abhinavdangeti [Mon, 5 Oct 2015 22:55:34 +0000 (15:55 -0700)]
MB-19249: Address possible data races in ConnHandler context

WARNING: ThreadSanitizer: data race (pid=2443)

  Read of size 1 at 0x7d5000016a58 by thread T10:
    #0 ConnHandler::doDisconnect() /home/abhinav/couchbase/ep-engine/src/tapconnection.h:375 (ep.so+0x000000058416)
    #1 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f87f6)
    #2 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f8395)
    #3 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous write of size 1 at 0x7d5000016a58 by main thread:
    [failed to restore the stack]

  Location is heap block of size 480 at 0x7d5000016a00 allocated by main thread:
    #0 operator new(unsigned long) <null>:0 (engine_testapp+0x00000005084d)
    #1 DcpConnMap::newConsumer(void const*, std::string const&) /home/abhinav/couchbase/ep-engine/src/connmap.cc:969 (ep.so+0x000000048384)
    #2 EventuallyPersistentEngine::dcpOpen(void const*, unsigned int, unsigned int, unsigned int, void*, unsigned short) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:6189 (ep.so+0x0000000d3668)
    #3 EvpDcpOpen(engine_interface*, void const*, unsigned int, unsigned int, unsigned int, void*, unsigned short) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:1494 (ep.so+0x0000000b4e5f)
    #4 mock_dcp_open(engine_interface*, void const*, unsigned int, unsigned int, unsigned int, void*, unsigned short) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:488 (engine_testapp+0x0000000bb015)
    #5 test_dcp_consumer_flow_control_aggressive(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:3826 (ep_testsuite.so+0x00000006ecfd)
    #6 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #7 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

Change-Id: Id5223e93c416e5e5287c137d561aea1e453cbd41
Reviewed-on: http://review.couchbase.org/62970
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19248: Fix race in TaskQueue.{ready,future,pending}Queue access 69/62969/6
Dave Rigby [Wed, 12 Nov 2014 17:26:17 +0000 (17:26 +0000)]
MB-19248: Fix race in TaskQueue.{ready,future,pending}Queue access

Fix race as identified by ThreadSanitizer:

    WARNING: ThreadSanitizer: data race (pid=4243)
      Read of size 8 at 0x7d04000fde60 by main thread (mutexes: write M1367):
        #0 std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >::operator++() /usr/include/c++/4.8/bits/stl_list.h:235 (ep.so+0x00000013a129)
        #1 std::iterator_traits<std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> > >::difference_type std::__distance<std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> > >(std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >, std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >, std::input_iterator_tag) /usr/include/c++/4.8/bits/stl_iterator_base_funcs.h:82 (ep.so+0x000000138b67)
        #2 std::iterator_traits<std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> > >::difference_type std::distance<std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> > >(std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >, std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >) /usr/include/c++/4.8/bits/stl_iterator_base_funcs.h:118 (ep.so+0x000000136b63)
        #3 std::list<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::size() const /usr/include/c++/4.8/bits/stl_list.h:874 (ep.so+0x000000135538)
        #4 ExecutorPool::doTaskQStat(EventuallyPersistentEngine*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/executorpool.cc:654 (ep.so+0x0000001331b6)
        #5 EventuallyPersistentEngine::doWorkloadStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4198 (ep.so+0x000000112ed3)
        #6 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4454 (ep.so+0x000000114cb4)
        #7 EvpGetStats ep-engine/src/ep_engine.cc:217 (ep.so+0x000000102b14)
        #8 mock_get_stats memcached/programs/engine_testapp/engine_testapp.c:195 (exe+0x0000000026de)
        #9 test_workload_stats ep-engine/tests/ep_testsuite.cc:7094 (ep_testsuite.so+0x00000004e931)
        #10 execute_test memcached/programs/engine_testapp/engine_testapp.c:1055 (exe+0x0000000059fc)
        #11 main memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x000000006606)

      Previous write of size 8 at 0x7d04000fde60 by thread T5 (mutexes: write M45):
        #0 RCValue::_rc_decref() const ep-engine/src/atomic.h:293 (ep.so+0x000000096797)
        #1 __gnu_cxx::new_allocator<std::_List_node<SingleThreadedRCPtr<GlobalTask> > >::allocate(unsigned long, void const*) /usr/include/c++/4.8/ext/new_allocator.h:104 (ep.so+0x00000017b42b)
        #2 std::_List_base<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::_M_get_node() /usr/include/c++/4.8/bits/stl_list.h:334 (ep.so+0x00000017b0dc)
        #3 std::_List_node<SingleThreadedRCPtr<GlobalTask> >* std::list<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::_M_create_node<SingleThreadedRCPtr<GlobalTask> const&>(SingleThreadedRCPtr<GlobalTask> const&) /usr/include/c++/4.8/bits/stl_list.h:502 (ep.so+0x00000017a25a)
        #4 void std::list<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::_M_insert<SingleThreadedRCPtr<GlobalTask> const&>(std::_List_iterator<SingleThreadedRCPtr<GlobalTask> >, SingleThreadedRCPtr<GlobalTask> const&) /usr/include/c++/4.8/bits/stl_list.h:1561 (ep.so+0x000000178e7b)
        #5 std::list<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::push_back(SingleThreadedRCPtr<GlobalTask> const&) /usr/include/c++/4.8/bits/stl_list.h:1016 (ep.so+0x00000017817b)
        #6 TaskQueue::_fetchNextTask(ExecutorThread&, bool) ep-engine/src/taskqueue.cc:126 (ep.so+0x000000176932)
        #7 TaskQueue::fetchNextTask(ExecutorThread&, bool) ep-engine/src/taskqueue.cc:142 (ep.so+0x000000176acf)
        #8 ExecutorPool::_nextTask(ExecutorThread&, unsigned char) ep-engine/src/executorpool.cc:214 (ep.so+0x000000130707)
        #9 ExecutorPool::nextTask(ExecutorThread&, unsigned char) ep-engine/src/executorpool.cc:229 (ep.so+0x0000001307a5)
        #10 ExecutorThread::run() ep-engine/src/executorthread.cc:78 (ep.so+0x000000149da0)
        #11 launch_executor_thread ep-engine/src/executorthread.cc:34 (ep.so+0x00000014990a)
        #12 platform_thread_wrap platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x000000002d8b)
        #13 __tsan_write_range ??:0 (libtsan.so.0+0x00000001b1c9)

Fix by adding new helper methods which will return the size of the
various queues (while holding the TaskQueue's mutex).
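
A rough sketch of such helpers is shown below. The class name, the method
names and the use of std::list<std::string> for all three queues are
assumptions made for illustration; the real TaskQueue stores task pointers
and uses ep-engine's own mutex type.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>
#include <string>

// Hypothetical stand-in for TaskQueue: all queue access goes through methods
// that take queueMutex, so stats code never touches a list being modified.
class TaskQueueSketch {
public:
    void pushReady(const std::string& task) {
        std::lock_guard<std::mutex> guard(queueMutex);
        readyQueue.push_back(task);
    }
    void pushFuture(const std::string& task) {
        std::lock_guard<std::mutex> guard(queueMutex);
        futureQueue.push_back(task);
    }
    void pushPending(const std::string& task) {
        std::lock_guard<std::mutex> guard(queueMutex);
        pendingQueue.push_back(task);
    }

    // Size helpers: each takes the lock before calling size().
    std::size_t getReadyQueueSize() const {
        std::lock_guard<std::mutex> guard(queueMutex);
        return readyQueue.size();
    }
    std::size_t getFutureQueueSize() const {
        std::lock_guard<std::mutex> guard(queueMutex);
        return futureQueue.size();
    }
    std::size_t getPendingQueueSize() const {
        std::lock_guard<std::mutex> guard(queueMutex);
        return pendingQueue.size();
    }

private:
    mutable std::mutex queueMutex;  // guards all three queues
    std::list<std::string> readyQueue;
    std::list<std::string> futureQueue;
    std::list<std::string> pendingQueue;
};

// Single-threaded walk through the helpers; returns the total queued count.
std::size_t demoQueueSizes() {
    TaskQueueSketch queue;
    queue.pushReady("flusher");
    queue.pushReady("compactor");
    queue.pushPending("backfill");
    assert(queue.getFutureQueueSize() == 0);
    return queue.getReadyQueueSize() + queue.getFutureQueueSize() +
           queue.getPendingQueueSize();
}
```

The trace above shows std::list::size() walking the list via std::distance
in this libstdc++ version, which is why an unlocked size() call raced with
push_back().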

Change-Id: If5e8d357e45803d78c4ba6ed1475e6e1a90e1c89
Reviewed-on: http://review.couchbase.org/62969
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19247: Fix possible data race in workload.h: workloadPattern 68/62968/6
abhinavdangeti [Mon, 5 Oct 2015 21:22:38 +0000 (14:22 -0700)]
MB-19247: Fix possible data race in workload.h: workloadPattern

This variable is reported in the ThreadSanitizer output as being read
by a stats function, which could result in incorrect stats printed to
the user.  However, from code inspection the compactor
(EventuallyPersistentStore::scheduleCompaction) also reads this
variable, and hence could result in incorrect task scheduling.
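
This message does not say how the race was fixed. One common approach, shown
here purely as a sketch, is to hold the pattern in an atomic so the stats
reader and the writer never race. The enum members, the returned strings and
the WorkLoadPolicySketch class are assumptions, and C++11 std::atomic<> is
used for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical enum; the real workload_pattern_t may have different members.
enum workload_pattern_t { READ_HEAVY, WRITE_HEAVY, MIXED };

// Hypothetical stand-in for WorkLoadPolicy.
class WorkLoadPolicySketch {
public:
    WorkLoadPolicySketch() : workloadPattern(READ_HEAVY) {}

    // Writer side, e.g. a background task re-evaluating the workload.
    void setWorkLoadPattern(workload_pattern_t pattern) {
        workloadPattern.store(pattern);
    }

    // Reader side, e.g. a scheduler deciding task priority.
    workload_pattern_t getWorkLoadPattern() const {
        return workloadPattern.load();
    }

    // Reader side for stats: load once, then switch on the local copy so the
    // value cannot change between comparisons.
    const char* stringOfWorkLoadPattern() const {
        const workload_pattern_t pattern = workloadPattern.load();
        switch (pattern) {
        case READ_HEAVY:
            return "read_heavy";
        case WRITE_HEAVY:
            return "write_heavy";
        default:
            return "mixed";
        }
    }

private:
    std::atomic<workload_pattern_t> workloadPattern;
};

// Single-threaded walk through the accessors.
std::string demoWorkloadPattern() {
    WorkLoadPolicySketch policy;
    assert(policy.getWorkLoadPattern() == READ_HEAVY);
    policy.setWorkLoadPattern(WRITE_HEAVY);
    return policy.stringOfWorkLoadPattern();
}
```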

WARNING: ThreadSanitizer: data race (pid=24180)

  Read of size 4 at 0x7d040000f608 by main thread (mutexes: write M1308043):
    #0 WorkLoadPolicy::stringOfWorkLoadPattern() ep-engine/src/workload.h:65 (ep.so+0x0000000bee15)
    #1 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4554 (ep.so+0x0000000c5cac)
    #2 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:213 (ep.so+0x0000000b4dee)
    #3 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:239 (engine_testapp+0x0000000ba9ad)
    #4 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) ep-engine/tests/ep_test_apis.cc:990 (ep_testsuite.so+0x0000000aebb1)
    #5 test_access_scanner(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite.cc:8569 (ep_testsuite.so+0x00000002efd7)
    #6 execute_test(test, char const*, char const*) memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #7 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

  Previous write of size 4 at 0x7d040000f608 by thread T10:
    #0 WorkLoadPolicy::setWorkLoadPattern(workload_pattern_t) ep-engine/src/workload.h:76 (ep.so+0x00000013d75b)
    #1 ExecutorThread::run() ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f9503)
    #2 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f9085)
    #3 platform_thread_wrap platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: If1dd4885a7beefc804e425d077ff18b117be8bdd
Reviewed-on: http://review.couchbase.org/62968
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago MB-19246: Fix potentially incorrect persist_time in OBSERVE response 67/62967/6
abhinavdangeti [Mon, 5 Oct 2015 23:20:08 +0000 (16:20 -0700)]
MB-19246: Fix potentially incorrect persist_time in OBSERVE response

There is a data race in updating EPStore::lastTransTimePerItem. This
could result in an incorrect value for the persist_time field in an
OBSERVE response.

WARNING: ThreadSanitizer: data race (pid=4590)

  Write of size 8 at 0x7d540000fe88 by thread T8 (mutexes: write M11599):
    #0 EventuallyPersistentStore::flushVBucket(unsigned short) /home/abhinav/couchbase/ep-engine/src/ep.cc:3307 (ep.so+0x00000009954f)
    #1 Flusher::flushVB() /home/abhinav/couchbase/ep-engine/src/flusher.cc:296 (ep.so+0x0000000ff32f)
    #2 Flusher::step(GlobalTask*) /home/abhinav/couchbase/ep-engine/src/flusher.cc:186 (ep.so+0x0000000fd825)
    #3 FlusherTask::run() /home/abhinav/couchbase/ep-engine/src/tasks.cc:63 (ep.so+0x00000013bbb2)
    #4 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f89b6)
    #5 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f8555)
    #6 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous write of size 8 at 0x7d540000fe88 by thread T6 (mutexes: write M11602):
    #0 EventuallyPersistentStore::flushVBucket(unsigned short) /home/abhinav/couchbase/ep-engine/src/ep.cc:3307 (ep.so+0x00000009954f)
    #1 Flusher::flushVB() /home/abhinav/couchbase/ep-engine/src/flusher.cc:296 (ep.so+0x0000000ff32f)
    #2 Flusher::step(GlobalTask*) /home/abhinav/couchbase/ep-engine/src/flusher.cc:186 (ep.so+0x0000000fd825)
    #3 FlusherTask::run() /home/abhinav/couchbase/ep-engine/src/tasks.cc:63 (ep.so+0x00000013bbb2)
    #4 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f89b6)
    #5 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f8555)
    #6 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: Iccf1b0eacba495a8147fe81922361d566cb1d6a0
Reviewed-on: http://review.couchbase.org/62967
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago MB-19229: Address possible data race in vbucket.cc: numHpChks 66/62966/5
abhinavdangeti [Mon, 5 Oct 2015 21:21:54 +0000 (14:21 -0700)]
MB-19229: Address possible data race in vbucket.cc: numHpChks

WARNING: ThreadSanitizer: data race (pid=23450)

  Write of size 8 at 0x7d680001f678 by thread T5 (mutexes: write M11600, write M15531):
    #0 VBucket::notifyCheckpointPersisted(EventuallyPersistentEngine&, unsigned long, bool) /home/abhinav/couchbase/ep-engine/src/vbucket.cc:356 (ep.so+0x00000014b3e3)
    #1 EventuallyPersistentStore::flushVBucket(unsigned short) /home/abhinav/couchbase/ep-engine/src/ep.cc:3334 (ep.so+0x000000099b9f)
    #2 Flusher::flushVB() /home/abhinav/couchbase/ep-engine/src/flusher.cc:296 (ep.so+0x0000000ffe7f)
    #3 Flusher::step(GlobalTask*) /home/abhinav/couchbase/ep-engine/src/flusher.cc:186 (ep.so+0x0000000fe375)
    #4 FlusherTask::run() /home/abhinav/couchbase/ep-engine/src/tasks.cc:62 (ep.so+0x00000013c928)
    #5 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f9503)
    #6 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f9085)
    #7 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous read of size 8 at 0x7d680001f678 by main thread (mutexes: write M910146210457775232):
    #0 VBucket::getHighPriorityChkSize() /home/abhinav/couchbase/ep-engine/src/vbucket.cc:401 (ep.so+0x00000014b7d9)
    #1 VBucketCountVisitor::visitBucket(RCPtr<VBucket>&) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:3030 (ep.so+0x0000000ba940)
    #2 VBucketCountAggregator::visitBucket(RCPtr<VBucket>&) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:3067 (ep.so+0x0000000baf43)
    #3 EventuallyPersistentStore::visit(VBucketVisitor&) /home/abhinav/couchbase/ep-engine/src/ep.cc:3719 (ep.so+0x000000089917)
    #4 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:3093 (ep.so+0x0000000bb5ab)
    #5 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:4554 (ep.so+0x0000000c5cac)
    #6 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:213 (ep.so+0x0000000b4dee)
    #7 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:239 (engine_testapp+0x0000000ba9ad)
    #8 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /home/abhinav/couchbase/ep-engine/tests/ep_test_apis.cc:990 (ep_testsuite.so+0x0000000aebb1)
    #9 test_access_scanner(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:8569 (ep_testsuite.so+0x00000002efd7)
    #10 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #11 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

Change-Id: I98021a535bd78d34e31d428e364192bb2ef33dcf
Reviewed-on: http://review.couchbase.org/62966
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago MB-19228: Address possible data races in ActiveStream context 18/62918/6
abhinavdangeti [Mon, 5 Oct 2015 22:22:48 +0000 (15:22 -0700)]
MB-19228: Address possible data races in ActiveStream context

Address possible data races in ActiveStream context when gathering
stats.

WARNING: ThreadSanitizer: data race (pid=27028)

  Read of size 8 at 0x7d480000b1f8 by main thread (mutexes: write M32941632, write M1367, write M32940809):
    #0 void STATWRITER_NAMESPACE::add_casted_stat<unsigned long>(char const*, unsigned long const&, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/abhinav/couchbase/ep-engine/src/statwriter.h:45 (ep.so+0x000000037825)
    #1 ActiveStream::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/abhinav/couchbase/ep-engine/src/dcp/stream.cc:477 (ep.so+0x000000071d16)
    #2 DcpProducer::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/abhinav/couchbase/ep-engine/src/dcp/producer.cc:602 (ep.so+0x000000068057)
    #3 ConnStatBuilder::operator()(SingleThreadedRCPtr<ConnHandler>&) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:3887 (ep.so+0x0000000e13e1)
    #4 EventuallyPersistentEngine::doDcpStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:4144 (ep.so+0x0000000c151a)
    #5 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:4564 (ep.so+0x0000000c5405)
    #6 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:213 (ep.so+0x0000000b422e)
    #7 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:239 (engine_testapp+0x0000000ba9ad)
    #8 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /home/abhinav/couchbase/ep-engine/tests/ep_test_apis.cc:990 (ep_testsuite.so+0x0000000aeb81)
    #9 dcp_stream(engine_interface*, engine_interface_v1*, char const*, void const*, unsigned short, unsigned int, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, int, int, int, int, bool, bool, unsigned char, bool, unsigned long*, bool) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:4090 (ep_testsuite.so+0x00000009790c)
    #10 test_dcp_producer_stream_req_dgm(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:4564 (ep_testsuite.so+0x000000077604)
    #11 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #12 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

  Previous write of size 8 at 0x7d480000b1f8 by thread T9 (mutexes: write M32940880, write M32940855):
    #0 ActiveStream::backfillReceived(Item*, backfill_source_t) /home/abhinav/couchbase/ep-engine/src/dcp/stream.cc:287 (ep.so+0x00000007054e)
    #1 DiskCallback::callback(GetValue&) /home/abhinav/couchbase/ep-engine/src/dcp/backfill.cc:94 (ep.so+0x000000056067)
    #2 CouchKVStore::recordDbDump(_db*, _docinfo*, void*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:1757 (ep.so+0x00000018103f)
    #3 recordDbDumpC(_db*, _docinfo*, void*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:66 (ep.so+0x00000017fcc5)
    #4 lookup_callback(couchfile_lookup_request*, _sized_buf const*, _sized_buf const*) /home/abhinav/couchbase/couchstore/src/couch_db.cc:767 (libcouchstore.so+0x00000000d7f5)
    #5 btree_lookup_inner(couchfile_lookup_request*, unsigned long, int, int) /home/abhinav/couchbase/couchstore/src/btree_read.cc:99 (libcouchstore.so+0x00000000b5b2)
    #6 btree_lookup_inner(couchfile_lookup_request*, unsigned long, int, int) /home/abhinav/couchbase/couchstore/src/btree_read.cc:69 (libcouchstore.so+0x00000000b370)
    #7 btree_lookup_inner(couchfile_lookup_request*, unsigned long, int, int) /home/abhinav/couchbase/couchstore/src/btree_read.cc:69 (libcouchstore.so+0x00000000b370)
    #8 btree_lookup /home/abhinav/couchbase/couchstore/src/btree_read.cc:131 (libcouchstore.so+0x00000000b00c)
    #9 couchstore_changes_since /home/abhinav/couchbase/couchstore/src/couch_db.cc:812 (libcouchstore.so+0x00000000d601)
    #10 CouchKVStore::scan(ScanContext*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:1264 (ep.so+0x00000017f77e)
    #11 DCPBackfill::scan() /home/abhinav/couchbase/ep-engine/src/dcp/backfill.cc:193 (ep.so+0x000000057672)
    #12 DCPBackfill::run() /home/abhinav/couchbase/ep-engine/src/dcp/backfill.cc:118 (ep.so+0x000000056647)
    #13 BackfillManager::backfill() /home/abhinav/couchbase/ep-engine/src/dcp/backfill-manager.cc:240 (ep.so+0x0000000508d5)
    #14 BackfillManagerTask::run() /home/abhinav/couchbase/ep-engine/src/dcp/backfill-manager.cc:43 (ep.so+0x00000005052f)
    #15 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f8796)
    #16 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f8335)
    #17 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: I166917524b5fcad285b3623ff160e875c316d983
Reviewed-on: http://review.couchbase.org/62918
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago MB-19227: Fix race in ConnNotifier.task access 17/62917/4
Dave Rigby [Mon, 17 Nov 2014 15:22:22 +0000 (15:22 +0000)]
MB-19227: Fix race in ConnNotifier.task access

Race identified by ThreadSanitizer:

      Read of size 8 at 0x7d08000c1430 by thread T10:
        #0 ConnNotifier::notifyConnections() /home/vagrant/couchbase-server/ep-engine/src/connmap.cc:127 (ep.so+0x0000002537e8)
        #1 ConnNotifierCallback::run() /home/vagrant/couchbase-server/ep-engine/src/connmap.cc:80 (ep.so+0x000000277e8b)
        #2 ExecutorThread::run() /home/vagrant/couchbase-server/ep-engine/src/executorthread.cc:110 (ep.so+0x0000002163af)
        #3 launch_executor_thread(void*) /home/vagrant/couchbase-server/ep-engine/src/executorthread.cc:34 (ep.so+0x0000002158ba)
        #4 platform_thread_wrap /home/vagrant/couchbase-server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000033f4)

      Previous write of size 8 at 0x7d08000c1430 by main thread:
        #0 ConnNotifier::start() /home/vagrant/couchbase-server/ep-engine/src/connmap.cc:99 (ep.so+0x00000025336f)
        #1 ConnMap::initialize(conn_notifier_type) /home/vagrant/couchbase-server/ep-engine/src/connmap.cc:194 (ep.so+0x0000002544a6)
        #2 EventuallyPersistentEngine::initialize(char const*) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:2060 (ep.so+0x000000179b4a)
        #3 EvpInitialize(engine_interface*, char const*) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:135 (ep.so+0x000000173675)
        #4 init_engine /home/vagrant/couchbase-server/memcached/utilities/engine_loader.c:116 (libmcd_util.so.1.0.0+0x000000003fac)
        #5 start_your_engines /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:913 (exe+0x0000000a2fb5)
        #6 execute_test /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:1048 (exe+0x0000000a3d29)
        #7 main /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x0000000a1d84)

Change-Id: I16cfdff1ea363bbb07a62a92f09f829483276b3d
Reviewed-on: http://review.couchbase.org/62917
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago MB-19226: Address potential data races in the warmup code 16/62916/4
abhinavdangeti [Thu, 16 Jul 2015 23:59:01 +0000 (16:59 -0700)]
MB-19226: Address potential data races in the warmup code

Context: warmupState, estimatedWarmupCount,
warmup, metadata

As reported by ThreadSanitizer:

WARNING: ThreadSanitizer: data race (pid=7023)
  Write of size 8 at 0x7d240000d5e0 by thread T7:
    #0 Warmup::checkForAccessLog() /home/daver/repos/couchbase/server/ep-engine/src/warmup.cc:590 (ep.so+0x0000002d1ffc)
    #1 WarmupCheckforAccessLog::run() /home/daver/repos/couchbase/server/ep-engine/src/warmup.h:303 (ep.so+0x0000002e2bfb)
    #2 ExecutorThread::run() /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:106 (ep.so+0x0000001e34f9)
    #3 launch_executor_thread(void*) /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e2b7a)
    #4 platform_thread_wrap /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000035dc)

Previous read of size 8 at 0x7d240000d5e0 by main thread (mutexes: write M699180732592992152):
  #0 Warmup::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) const /home/daver/repos/couchbase/server/ep-engine/src/warmup.cc:893 (ep.so+0x0000002d6086)
  #1 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:4422 (ep.so+0x000000151813)
  #2 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:214 (ep.so+0x0000001367b2)
  #3 mock_get_stats /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:194 (engine_testapp+0x0000004bde63)
  #4 wait_for_warmup_complete(engine_interface*, engine_interface_v1*) /home/daver/repos/couchbase/server/ep-engine/tests/ep_test_apis.cc:898 (ep_testsuite.so+0x0000000ead95)
  #5 test_setup(engine_interface*, engine_interface_v1*) /home/daver/repos/couchbase/server/ep-engine/tests/ep_testsuite.cc:168 (ep_testsuite.so+0x0000000237d3)
  #6 execute_test /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1037 (engine_testapp+0x0000004ba82b)
  #7 main /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1296 (engine_testapp+0x0000004b8861)

Change-Id: If96933b3b8b0aa1ed75073a0d8d629f138da081f
Reviewed-on: http://review.couchbase.org/62916
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago MB-19225: Fix data race on Flusher::taskId 15/62915/4
Dave Rigby [Wed, 15 Jul 2015 08:40:25 +0000 (08:40 +0000)]
MB-19225: Fix data race on Flusher::taskId

Flusher::taskId is accessed from different threads without correct
synchronization - see ThreadSanitizer report:

  WARNING: ThreadSanitizer: data race (pid=13527)
    Write of size 8 at 0x7d4400009650 by thread T14 (mutexes: write M5523):
      #0 Flusher::step(GlobalTask*) couchbase/ep-engine/src/flusher.cc:200 (ep.so+0x0000000e78ae)
      #1 FlusherTask::run() couchbase/ep-engine/src/tasks.cc:61 (ep.so+0x0000001202e2)
      #2 ExecutorThread::run() couchbase/ep-engine/src/executorthread.cc:124 (ep.so+0x0000000e2e05)
      #3 launch_executor_thread(void*) couchbase/ep-engine/src/executorthread.cc:34 (ep.so+0x0000000e28b9)
      #4 platform_thread_wrap .ccache/tmp/cb_pthread.tmp.7e5bc917ff0e.45579.i:0 (libplatform.so.0.1.0+0x000000003891)

    Previous read of size 8 at 0x7d4400009650 by main thread:
      #0 Flusher::wait() couchbase/ep-engine/src/flusher.cc:41 (ep.so+0x0000000e66cf)
      #1 EventuallyPersistentStore::~EventuallyPersistentStore() couchbase/ep-engine/src/ep.cc:514 (ep.so+0x000000071386)
      #2 EventuallyPersistentEngine::~EventuallyPersistentEngine() couchbase/ep-engine/src/ep_engine.cc:6201 (ep.so+0x0000000bfeea)
      #3 EvpDestroy(engine_interface*, bool) couchbase/ep-engine/src/ep_engine.cc:141 (ep.so+0x0000000a0e9c)
      #4 mock_destroy(engine_interface*, bool) couchbase/memcached/programs/engine_testapp/engine_testapp.cc:98 (engine_testapp+0x0000000c4b87)
      #5 execute_test(test, char const*, char const*) couchbase/memcached/programs/engine_testapp/engine_testapp.cc:995 (engine_testapp+0x0000000c4076)
      #6 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

Given that taskId is a simple primitive type (size_t), fix by removing
the mutex (which wasn't acquired for all accesses) and replacing taskId
with an atomic type.
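
A minimal sketch of that change, using std::atomic as a stand-in for whichever atomic wrapper the code base uses. SketchFlusher and its accessors are hypothetical names:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Flusher, showing only the taskId member.
class SketchFlusher {
public:
    // Written by the flusher task when it is (re)scheduled.
    void setTaskId(size_t id) {
        taskId.store(id);
    }

    // Read from another thread (e.g. during shutdown) without a mutex.
    size_t getTaskId() const {
        return taskId.load();
    }

private:
    // A primitive held in an atomic needs no separate lock for plain
    // loads and stores.
    std::atomic<size_t> taskId{0};
};
```

Because every access goes through the atomic, there is no longer a code path that can forget to take the lock.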

Change-Id: Idc75278ed2882abd173297b77bdb72834cbe4163
Reviewed-on: http://review.couchbase.org/62915
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago MB-19225: Fix race in Flusher._state 14/62914/4
Dave Rigby [Mon, 17 Nov 2014 15:01:16 +0000 (15:01 +0000)]
MB-19225: Fix race in Flusher._state

Fix race identified by ThreadSanitizer:

    WARNING: ThreadSanitizer: data race (pid=10664)
      Read of size 4 at 0x7d4400009888 by main thread (mutexes: write M18321):
        #0 Flusher::stateName() const /home/vagrant/couchbase-server/ep-engine/src/flusher.cc:119 (ep.so+0x0000001fc22e)
        #1 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:3025 (ep.so+0x000000180649)
        #2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:4374 (ep.so+0x00000018d960)
        #3 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:214 (ep.so+0x000000174062)
        #4 mock_get_stats /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:196 (exe+0x0000000a741d)
        #5 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /home/vagrant/couchbase-server/ep-engine/tests/ep_test_apis.cc:799 (ep_testsuite.so+0x0000000dae46)
        #6 verify_curr_items(engine_interface*, engine_interface_v1*, int, char const*) /home/vagrant/couchbase-server/ep-engine/tests/ep_test_apis.cc:834 (ep_testsuite.so+0x0000000e1d31)
        #7 test_dcp_producer_stream_req_disk(engine_interface*, engine_interface_v1*) /home/vagrant/couchbase-server/ep-engine/tests/ep_testsuite.cc:3589 (ep_testsuite.so+0x000000094c9a)
        #8 execute_test /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:1055 (exe+0x0000000a3e83)
        #9 main /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x0000000a1d84)

      Previous write of size 4 at 0x7d4400009888 by thread T5:
        #0 Flusher::transition_state(flusher_state) /home/vagrant/couchbase-server/ep-engine/src/flusher.cc:114 (ep.so+0x0000001fb7df)
        #1 Flusher::step(GlobalTask*) /home/vagrant/couchbase-server/ep-engine/src/flusher.cc:167 (ep.so+0x0000001fc9d1)
        #2 FlusherTask::run() /home/vagrant/couchbase-server/ep-engine/src/tasks.cc:44 (ep.so+0x00000027870e)
        #3 ExecutorThread::run() /home/vagrant/couchbase-server/ep-engine/src/executorthread.cc:110 (ep.so+0x0000002160ff)
        #4 launch_executor_thread(void*) /home/vagrant/couchbase-server/ep-engine/src/executorthread.cc:34 (ep.so+0x00000021560a)
        #5 platform_thread_wrap /home/vagrant/couchbase-server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000033f4)

Change-Id: Iaeb60efdc1032de7ba344e04cce454cc1d876d40
Reviewed-on: http://review.couchbase.org/62914
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago MB-19224: Address possible data race with global task's waketime 13/62913/4
abhinavdangeti [Mon, 5 Oct 2015 19:18:47 +0000 (12:18 -0700)]
MB-19224: Address possible data race with global task's waketime

WARNING: ThreadSanitizer: data race (pid=7728)

Write of size 8 at 0x7d140000e8a8 by main thread (mutexes: write M11519, write M11535):
  #0 TaskQueue::_wake(SingleThreadedRCPtr<GlobalTask>&) /home/abhinav/couchbase/ep-engine/src/taskqueue.cc:272 (ep.so+0x000000142b59)
  #1 TaskQueue::wake(SingleThreadedRCPtr<GlobalTask>&) /home/abhinav/couchbase/ep-engine/src/taskqueue.cc:299 (ep.so+0x00000014382e)
  #2 ExecutorPool::_stopTaskGroup(unsigned long, task_type_t, bool) /home/abhinav/couchbase/ep-engine/src/executorpool.cc:568 (ep.so+0x0000000f2f46)
  #3 ExecutorPool::stopTaskGroup(unsigned long, task_type_t, bool) /home/abhinav/couchbase/ep-engine/src/executorpool.cc:585 (ep.so+0x0000000f31ee)
  #4 ~EventuallyPersistentStore /home/abhinav/couchbase/ep-engine/src/ep.cc:468 (ep.so+0x0000000830f6)
  #5 ~EventuallyPersistentEngine /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:6326 (ep.so+0x0000000d42ba)
  #6 EvpDestroy(engine_interface*, bool) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:141 (ep.so+0x0000000b3fbc)
  #7 mock_destroy(engine_interface*, bool) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:98 (engine_testapp+0x0000000ba027)
  #8 destroy_bucket(engine_interface*, engine_interface_v1*, bool) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:995 (engine_testapp+0x0000000b952e)
  #9 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

Previous write of size 8 at 0x7d140000e8a8 by thread T10:
  #0 GlobalTask::snooze(double) /home/abhinav/couchbase/ep-engine/src/tasks.cc:56 (ep.so+0x00000013b6fa)
  #1 ConnManager::run() /home/abhinav/couchbase/ep-engine/src/connmap.cc:151 (ep.so+0x00000005032e)
  #2 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f86da)
  #3 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f82e5)
  #4 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: Ib11f9b3cd6919e292f84cc08260eabd8e1381aa6
Reviewed-on: http://review.couchbase.org/62913
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years ago MB-19223: Switch to hrtime from timeval in Global Thread Pool 12/62912/4
Sundar Sridharan [Thu, 23 Jul 2015 22:39:59 +0000 (15:39 -0700)]
MB-19223: Switch to hrtime from timeval in Global Thread Pool

This yields small improvements in memory and CPU usage.
Also fixes several ThreadSanitizer races from unit tests - for example:

WARNING: ThreadSanitizer: data race (pid=21672)
  Write of size 8 at 0x7d140000e7b8 by main thread (mutexes: write M14972, write M14985):
    #0 memcpy <null> (engine_testapp+0x000000453040)
    #1 TaskQueue::_wake(SingleThreadedRCPtr<GlobalTask>&) /home/daver/repos/couchbase/server/ep-engine/src/taskqueue.cc:255 (ep.so+0x0000002577f6)
    #2 TaskQueue::wake(SingleThreadedRCPtr<GlobalTask>&) /home/daver/repos/couchbase/server/ep-engine/src/taskqueue.cc:282 (ep.so+0x000000257c73)
    #3 ExecutorPool::_wake(unsigned long) /home/daver/repos/couchbase/server/ep-engine/src/executorpool.cc:320 (ep.so+0x0000001acc76)
    #4 ExecutorPool::wake(unsigned long) /home/daver/repos/couchbase/server/ep-engine/src/executorpool.cc:328 (ep.so+0x0000001ace13)
    #5 Flusher::wait() /home/daver/repos/couchbase/server/ep-engine/src/flusher.cc:41 (ep.so+0x0000001cc4ff)
    #6 EventuallyPersistentStore::stopFlusher() /home/daver/repos/couchbase/server/ep-engine/src/ep.cc:402 (ep.so+0x0000000d54d5)
    #7 ~EventuallyPersistentStore /home/daver/repos/couchbase/server/ep-engine/src/ep.cc:364 (ep.so+0x0000000d49cb)
    #8 ~EventuallyPersistentEngine /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:5778 (ep.so+0x000000161043)
    #9 EvpDestroy(engine_interface*, bool) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:143 (ep.so+0x000000135efa)
    #10 mock_destroy /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:61 (engine_testapp+0x0000004bb9d6)
    #11 destroy_engine /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:998 (engine_testapp+0x0000004bb646)
    #12 execute_test /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1048 (engine_testapp+0x0000004baa11)
    #13 main /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1296 (engine_testapp+0x0000004b8861)

  Previous read of size 8 at 0x7d140000e7b8 by thread T14:
    #0 ExecutorThread::run() /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:106 (ep.so+0x0000001e3488)
    #1 launch_executor_thread(void*) /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e2a5a)
    #2 platform_thread_wrap /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000035dc)
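
The trace above shows a memcpy into the task's waketime racing with a read in ExecutorThread::run(). A single 64-bit timestamp can instead be loaded and stored atomically. The sketch below assumes that hrtime is a 64-bit nanosecond count; it uses std::chrono::steady_clock as a stand-in clock, and SketchTask and sketchNow are hypothetical names:

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Assumption: a high-resolution time is a 64-bit nanosecond count.
typedef uint64_t sketch_hrtime_t;

// Stand-in for the platform's high-resolution clock.
sketch_hrtime_t sketchNow() {
    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now().time_since_epoch());
    return static_cast<sketch_hrtime_t>(ns.count());
}

// Hypothetical stand-in for GlobalTask, showing only the waketime member.
class SketchTask {
public:
    // Called by the task's own thread to sleep for 'secs' seconds.
    void snooze(double secs) {
        waketime.store(sketchNow() + static_cast<sketch_hrtime_t>(secs * 1e9));
    }

    // Called by another thread (e.g. at shutdown) to wake the task now.
    void wakeNow() {
        waketime.store(sketchNow());
    }

    sketch_hrtime_t getWaketime() const {
        return waketime.load();
    }

private:
    // One 64-bit value replaces a two-field struct, so no memcpy is needed.
    std::atomic<sketch_hrtime_t> waketime{0};
};
```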

Change-Id: I78fdddb832251fc062058c04f75f8d22c4c2f68d
Reviewed-on: http://review.couchbase.org/62912
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years ago MB-19222: Fix race condition in TaskQueue shutdown 11/62911/4
Dave Rigby [Wed, 12 Nov 2014 14:58:28 +0000 (14:58 +0000)]
MB-19222: Fix race condition in TaskQueue shutdown

There is a bug in the use of ExecutorThread.state when sleeping a
TaskQueue - TaskQueue::_doSleep() doesn't atomically transition the
state from RUNNING -> SLEEPING. This can cause a deadlock when
shutting down an ExecutorThread:

    Thread A:                           Thread B:
    --------------------------------    ------------------------------
    if (t.state == RUNNING) {  // true
                                        t.state = SHUTDOWN
        t.state = SLEEPING              cb_join_thread(Thread A)
                                        // wait forever
    ...
    if (t.state == SHUTDOWN) { // FALSE
      exit(0) // NEVER REACHED
    }

Fix by changing ExecutorThread.state to be an AtomicValue, and use
compare-and-exchange to move from RUNNING -> SLEEPING (and SLEEPING ->
RUNNING).
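
The transitions described above can be sketched with std::atomic and compare_exchange_strong, used here as a stand-in for AtomicValue. SketchExecutorThread and the enum names are hypothetical:

```cpp
#include <atomic>
#include <cassert>

// Illustrative state names; the real enum may differ.
enum sketch_executor_state_t {
    SKETCH_RUNNING,
    SKETCH_SLEEPING,
    SKETCH_SHUTDOWN
};

// Hypothetical stand-in for ExecutorThread, showing only the state member.
struct SketchExecutorThread {
    std::atomic<sketch_executor_state_t> state{SKETCH_RUNNING};

    // RUNNING -> SLEEPING only. Returns false, leaving state untouched,
    // if another thread has already moved it (e.g. to SHUTDOWN).
    bool trySleep() {
        sketch_executor_state_t expected = SKETCH_RUNNING;
        return state.compare_exchange_strong(expected, SKETCH_SLEEPING);
    }

    // SLEEPING -> RUNNING only, with the same guarantee.
    bool tryWake() {
        sketch_executor_state_t expected = SKETCH_SLEEPING;
        return state.compare_exchange_strong(expected, SKETCH_RUNNING);
    }

    void shutdown() {
        state.store(SKETCH_SHUTDOWN);
    }
};
```

Because the check and the write are one atomic step, a SHUTDOWN set by Thread B can no longer be overwritten with SLEEPING by Thread A.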

Change-Id: I9fab90a83978ae2aa6a0dcdd3b079a1c2f369402
Reviewed-on: http://review.couchbase.org/62911
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years ago MB-19220: Ensure HashTable::size is atomic 10/62910/3
Dave Rigby [Thu, 14 Apr 2016 10:31:27 +0000 (11:31 +0100)]
MB-19220: Ensure HashTable::size is atomic

Fix data race as reported by ThreadSanitizer. I'm pretty sure this is
benign, as size can only be modified if *all* HashTable mutexes are
acquired (see HashTable::resize()), and all callers of
getBucketForHash() either do so with at least 1 HashTable mutex
acquired, or they perform double-checked locking (as per this
instance).
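
A minimal sketch of the pattern described above, assuming striped mutexes and an atomic size. SketchHashTable and its methods are hypothetical and much simpler than the real HashTable:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <vector>

// Hypothetical stand-in for HashTable, showing only size and its locks.
class SketchHashTable {
public:
    SketchHashTable(size_t buckets, size_t locks)
        : size(buckets), mutexes(locks) {
    }

    // May be called without any mutex held, hence the atomic load.
    size_t getBucketForHash(int h) const {
        return static_cast<size_t>(std::abs(h)) % size.load();
    }

    // Double-checked locking: pick a bucket, lock its mutex, then confirm
    // that a concurrent resize() has not changed the answer.
    // (Real code would return with the lock still held.)
    size_t lockedBucket(int h) {
        while (true) {
            size_t bucket = getBucketForHash(h);
            std::unique_lock<std::mutex> lh(mutexes[bucket % mutexes.size()]);
            if (bucket == getBucketForHash(h)) {
                return bucket;
            }
        }
    }

    // size is only modified while *all* mutexes are held.
    void resize(size_t newSize) {
        std::vector<std::unique_lock<std::mutex>> held;
        for (auto& m : mutexes) {
            held.emplace_back(m);
        }
        size.store(newSize);
    }

private:
    std::atomic<size_t> size;
    std::vector<std::mutex> mutexes;
};
```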

TSan report:
    WARNING: ThreadSanitizer: data race (pid=11329)
      Write of size 8 at 0x7ffe3a0ef1f8 by thread T9 (mutexes: write M45069, write M45070, write M45071):
        #0 HashTable::resize(unsigned long) ep-engine/src/stored-value.cc:370 (ep-engine_hash_table_test+0x00000050173c)
        #1 AccessGenerator::resize() ep-engine/tests/module_tests/hash_table_test.cc:314 (ep-engine_hash_table_test+0x0000004df4d9)
        #2 AccessGenerator::operator()() ep-engine/tests/module_tests/hash_table_test.cc:304 (ep-engine_hash_table_test+0x0000004df3fd)
        #3 SyncTestThread<bool>::run() ep-engine/tests/module_tests/threadtests.h:94 (ep-engine_hash_table_test+0x0000004e6bd8)
        #4 _ZL23launch_sync_test_threadIbEvPv ep-engine/tests/module_tests/threadtests.h:66 (ep-engine_hash_table_test+0x0000004c9ca8)
        #5 platform_thread_wrap platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000035dc)

      Previous read of size 8 at 0x7ffe3a0ef1f8 by thread T8:
        #0 HashTable::getBucketForHash(int) ep-engine/src/stored-value.h:1470 (ep-engine_hash_table_test+0x0000004d32b6)
        #1 HashTable::getLockedBucket(int, int*) ep-engine/src/stored-value.h:1265 (ep-engine_hash_table_test+0x0000004d2e69)
        #2 HashTable::getLockedBucket(std::string const&, int*) ep-engine/src/stored-value.h:1295 (ep-engine_hash_table_test+0x0000004d277b)
        #3 HashTable::del(std::string const&) ep-engine/src/stored-value.h:1370 (ep-engine_hash_table_test+0x0000004d8f88)
        #4 AccessGenerator::operator()() ep-engine/tests/module_tests/hash_table_test.cc:306 (ep-engine_hash_table_test+0x0000004df430)
        #5 SyncTestThread<bool>::run() ep-engine/tests/module_tests/threadtests.h:94 (ep-engine_hash_table_test+0x0000004e6bd8)
        #6 _ZL23launch_sync_test_threadIbEvPv ep-engine/tests/module_tests/threadtests.h:66 (ep-engine_hash_table_test+0x0000004c9ca8)
        #7 platform_thread_wrap platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000035dc)

Change-Id: I97c189310aafa8a002299f73cab9cbfb0e619768
Reviewed-on: http://review.couchbase.org/62910
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19204: ep_testsuite: Don't release the item while we're using it 09/62909/3
Trond Norbye [Wed, 5 Nov 2014 19:24:03 +0000 (20:24 +0100)]
MB-19204: ep_testsuite: Don't release the item while we're using it

Fixes a number of issues detected by ThreadSanitizer.

Change-Id: I6ab6c9fee2497d0843af47647f33bcce73111f76
Reviewed-on: http://review.couchbase.org/62909
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Trond Norbye <trond.norbye@gmail.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19204: Address data race in ep_test_apis/testsuite 08/62908/3
Dave Rigby [Thu, 14 Apr 2016 13:08:17 +0000 (14:08 +0100)]
MB-19204: Address data race in ep_test_apis/testsuite

WARNING: ThreadSanitizer: data race (pid=18824)

  Write of size 4 at 0x7fcc31350244 by thread T12 (mutexes: write M44413):
    #0 add_response ep-engine/tests/ep_test_apis.cc:75 (ep_testsuite.so+0x0000000acbcc)
    #1 sendResponse(bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*), void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*) ep-engine/src/ep_engine.cc:92 (ep.so+0x0000000d004c)
    #2 processUnknownCommand(EventuallyPersistentEngine*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1266 (ep.so+0x0000000d603c)
    #3 EvpUnknownCommand(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1387 (ep.so+0x0000000b3f58)
    #4 mock_unknown_command(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:380 (engine_testapp+0x0000000bab29)
    #5 del_with_meta(engine_interface*, engine_interface_v1*, char const*, unsigned long, unsigned int, ItemMetaData*, unsigned long, bool, bool, long, unsigned char, void const*) ep-engine/tests/ep_test_apis.cc:360 (ep_testsuite.so+0x0000000ae69f)
    #6 multi_del_with_meta(void*) ep-engine/tests/ep_testsuite.cc:13299 (ep_testsuite.so+0x00000009572e)
    #7 platform_thread_wrap platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous write of size 4 at 0x7fcc31350244 by thread T11 (mutexes: write M1638603450185076640):
    #0 add_response ep-engine/tests/ep_test_apis.cc:75 (ep_testsuite.so+0x0000000acbcc)
    #1 sendResponse(bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*), void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*) ep-engine/src/ep_engine.cc:92 (ep.so+0x0000000ced72)
    #2 processUnknownCommand(EventuallyPersistentEngine*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1258 (ep.so+0x0000000d5718)
    #3 EvpUnknownCommand(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1387 (ep.so+0x0000000b3f58)
    #4 mock_unknown_command(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:380 (engine_testapp+0x0000000bab29)
    #5 set_with_meta(engine_interface*, engine_interface_v1*, char const*, unsigned long, char const*, unsigned long, unsigned int, ItemMetaData*, unsigned long, bool, unsigned char, bool, long, unsigned char, void const*) ep-engine/tests/ep_test_apis.cc:702 (ep_testsuite.so+0x0000000b24db)
    #6 multi_set_with_meta(void*) ep-engine/tests/ep_testsuite.cc:13279 (ep_testsuite.so+0x000000094e33)
    #7 platform_thread_wrap platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: Ic53d401fb674dbe161aa73381e2c08c5995f262a
Reviewed-on: http://review.couchbase.org/62908
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Trond Norbye <trond.norbye@gmail.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19204: ep_testsuite: Use std::string for last_key/body 07/62907/4
Dave Rigby [Thu, 2 Jul 2015 15:25:10 +0000 (15:25 +0000)]
MB-19204: ep_testsuite: Use std::string for last_key/body

Replace the manually-managed char* for last_body and last_key with
std::string. This solves the issue of leaving these two buffers
un-free'd at the end of a test, and simplifies managing and
testing the last body & key values.
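
As an illustration of the change, a sketch assuming a response callback
that receives raw pointers and lengths. record_response() is a
hypothetical helper; only last_key and last_body are named in the
message.

```cpp
#include <cstddef>
#include <string>

// The strings own their storage, so nothing needs freeing at test exit.
static std::string last_key;
static std::string last_body;

// Hypothetical helper: remember the key and body of the latest response.
// assign() copies exactly `len` bytes, so embedded NULs are preserved.
void record_response(const void* key, std::size_t keylen,
                     const void* body, std::size_t bodylen) {
    last_key.assign(static_cast<const char*>(key), keylen);
    last_body.assign(static_cast<const char*>(body), bodylen);
}
```

Tests can then compare with plain operator==, for example
last_body == "expected", instead of memcmp() plus a length check.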

Change-Id: Ic1c64032e34e7abbe5ba8de3e16c115a78a6632f
Reviewed-on: http://review.couchbase.org/62907
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19204: Remove alarm() call from atomic_ptr_test, reduce iteration count 06/62906/2
Dave Rigby [Thu, 14 Apr 2016 11:14:39 +0000 (12:14 +0100)]
MB-19204: Remove alarm() call from atomic_ptr_test, reduce iteration count

This test runs slower under ThreadSanitizer than normal. Given we
already have CTest enforcing timeouts, remove the explicit alarm calls
and handle the timeout at the CTest level.

Also reduce the iteration count by 10x, so the test runs in a more
reasonable time.

Change-Id: Ia6914c10a3073f5fea121cb7e600568ca5081beb
Reviewed-on: http://review.couchbase.org/62906
Tested-by: buildbot <build@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years agoMB-19204: hash_table_test: Fix TSan issues 05/62905/2
Dave Rigby [Tue, 6 Oct 2015 10:53:21 +0000 (10:53 +0000)]
MB-19204: hash_table_test: Fix TSan issues

Fix issues with hash_table_test on 3.x:

* The default number of HashTable locks (193) causes problems for
  ThreadSanitizer as it exceeds the maximum number of acquired locks
  it can track. Given that the tests where we do not already set the
  lock count are single-threaded, change these to have 1 lock.

* Remove alarm() calls - the tests take longer when run under TSan,
  and given that CTest already enforces a test-level timeout these
  are redundant inside the test functions.

* Fix data race on AccessGenerator::size test harness.

Change-Id: Ib30b36bbd6517f1326660ae578a12d93e4d828c7
Reviewed-on: http://review.couchbase.org/62905
Tested-by: buildbot <build@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-16656: Send snapshotEnd as highSeqno for replica vb in GET_ALL_VB_SEQNOS call 89/62889/5 v3.1.5
Manu Dhundi [Fri, 15 Apr 2016 15:38:44 +0000 (08:38 -0700)]
MB-16656: Send snapshotEnd as highSeqno for replica vb in GET_ALL_VB_SEQNOS call

For replica vbucket we must send snapshotEnd received in the last snapshotMarker
as the high seqno. Sending lastClosedChkSeqno can cause problems for view engine
which builds an index from replica vbucket.

Previously this was sent correctly in seqno stats, now adding it for
GET_ALL_VB_SEQNOS as well.

Change-Id: I58dd168f9248263172759616bc53e751b536e5e3
Reviewed-on: http://review.couchbase.org/62889
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-19153: Break circular dependency while deleting bucket 99/62699/4
abhinavdangeti [Tue, 12 Apr 2016 01:28:37 +0000 (18:28 -0700)]
MB-19153: Break circular dependency while deleting bucket

As part of unregistering the last bucket, when stopTaskGroup
is invoked, all the running threads will be cancelled. In the
reported issue, when DcpBackfill was closed, the ref count of
the DcpProducer whose reference it was holding on to became
zero, causing its destructor to be invoked. In the DcpProducer's
destructor, an attempt was made to cancel the checkpoint creator
task which needed to acquire the executorpool's tMutex that
unregisterBucket had already acquired.

Reproduction steps:
<delete_bucket> --> <unregister_bucket> --> <stop_task_group>
    --> <acquire tMutex> --> .. --> <cancel DcpBackfill> -->
    --> <destroy DcpBackfill> --> <destroy DcpProducer>
    --> <cancel Checkpoint creator task> --> [tries to acquire tMutex]

The fix here would be to not attempt to kill the task within
the DcpProducer's destructor, but to do so when the producer is
being disconnected.

+ Unit test case that reproduces the hang.

Change-Id: Ia3c0597e3d8f85a1b40ef56e251e38339023b471
Reviewed-on: http://review.couchbase.org/62699
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-19113: Address false positive lock inversion seen with test_mb16357 67/62567/7
abhinavdangeti [Fri, 8 Apr 2016 20:57:42 +0000 (13:57 -0700)]
MB-19113: Address false positive lock inversion seen with test_mb16357

The inversion reported for this test case is between the snapshot
lock and the hash table lock. This scenario would never happen during
regular operation, because the first thread operates on an active
vbucket while the second operates on a replica vbucket. These 2
operations can never occur simultaneously.

ThreadSanitizer assumes that this is a lock inversion because the 2
operations are done by different threads (main_thread and dcp_thread).

The fix here suppresses this thread sanitizer warning by getting rid of
the dcp_thread, and instead having the main_thread perform the activity
that the dcp thread is responsible for.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=5899)
  Cycle in lock order graph: M21372 (0x7d780000f510) => M21408 (0x7d640000f920) => M21372

  Mutex M21408 acquired here while holding mutex M21372 in main thread:
    #0 pthread_mutex_lock <null> (engine_testapp+0x00000047e970)
    #1 cb_mutex_enter <null> (libplatform.so.0.1.0+0x000000003870)
    #2 Mutex::acquire() /home/couchbase/couchbase/ep-engine/src/mutex.cc:31 (ep.so+0x0000001e287e)
    #3 LockHolder::lock() /home/couchbase/couchbase/ep-engine/src/locks.h:71 (ep.so+0x000000082543)
    #4 LockHolder::LockHolder(Mutex&, bool) /home/couchbase/couchbase/ep-engine/src/locks.h:48 (ep.so+0x0000000821b2)
    #5 VBucket::getSnapshotLock() /home/couchbase/couchbase/ep-engine/src/vbucket.h:212 (ep.so+0x000000104c72)
    #6 EventuallyPersistentStore::queueDirty(RCPtr<VBucket>&, StoredValue*, LockHolder*, bool, bool, bool) /home/couchbase/couchbase/ep-engine/src/ep.cc:2863 (ep.so+0x0000000d7123)
    #7 EventuallyPersistentStore::set(Item const&, void const*, bool, unsigned char) /home/couchbase/couchbase/ep-engine/src/ep.cc:683 (ep.so+0x0000000d9dfa)
    #8 EventuallyPersistentEngine::store(void const*, void*, unsigned long*, ENGINE_STORE_OPERATION, unsigned short) /home/couchbase/couchbase/ep-engine/src/ep_engine.cc:2128 (ep.so+0x00000013d538)
    #9 EvpStore(engine_interface*, void const*, void*, unsigned long*, ENGINE_STORE_OPERATION, unsigned short) /home/couchbase/couchbase/ep-engine/src/ep_engine.cc:229 (ep.so+0x00000013712d)
    #10 mock_store /home/couchbase/couchbase/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c7304)
    #11 storeCasVb11(engine_interface*, engine_interface_v1*, void const*, ENGINE_STORE_OPERATION, char const*, char const*, unsigned long, unsigned int, void**, unsigned long, unsigned short, unsigned int, unsigned char) /home/couchbase/couchbase/ep-engine/tests/ep_test_apis.cc:659 (ep_testsuite.so+0x0000000e8d17)
    #12 store(engine_interface*, engine_interface_v1*, void const*, ENGINE_STORE_OPERATION, char const*, char const*, void**, unsigned long, unsigned short, unsigned int, unsigned char) /home/couchbase/couchbase/ep-engine/tests/ep_test_apis.cc:631 (ep_testsuite.so+0x0000000e654a)
    #13 test_mb16357(engine_interface*, engine_interface_v1*) /home/couchbase/couchbase/ep-engine/tests/ep_testsuite.cc:11713 (ep_testsuite.so+0x0000000afc36)
    #14 execute_test /home/couchbase/couchbase/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c4e2f)
    #15 main crtstuff.c (engine_testapp+0x0000004c2d91)

  Mutex M21372 acquired here while holding mutex M21408 in thread T10:
    #0 pthread_mutex_lock <null> (engine_testapp+0x00000047e970)
    #1 cb_mutex_enter <null> (libplatform.so.0.1.0+0x000000003870)
    #2 Mutex::acquire() /home/couchbase/couchbase/ep-engine/src/mutex.cc:31 (ep.so+0x0000001e287e)
    #3 LockHolder::lock() /home/couchbase/couchbase/ep-engine/src/locks.h:71 (ep.so+0x000000082543)
    #4 LockHolder::LockHolder(Mutex&, bool) /home/couchbase/couchbase/ep-engine/src/locks.h:48 (ep.so+0x0000000821b2)
    #5 HashTable::getLockedBucket(int, int*) /home/couchbase/couchbase/ep-engine/src/stored-value.h:1266 (ep.so+0x00000008418a)
    #6 HashTable::getLockedBucket(std::string const&, int*) /home/couchbase/couchbase/ep-engine/src/stored-value.h:1295 (ep.so+0x00000007df9b)
    #7 EventuallyPersistentStore::setWithMeta(Item const&, unsigned long, void const*, bool, bool, unsigned char, bool, bool) /home/couchbase/couchbase/ep-engine/src/ep.cc:1827 (ep.so+0x0000000e6b4f)
    #8 PassiveStream::commitMutation(MutationResponse*, bool) /home/couchbase/couchbase/ep-engine/src/dcp-stream.cc:1369 (ep.so+0x00000029ba21)
    #9 PassiveStream::processMutation(MutationResponse*) /home/couchbase/couchbase/ep-engine/src/dcp-stream.cc:1341 (ep.so+0x00000029a7a0)
    #10 PassiveStream::processBufferedMessages(unsigned int&) /home/couchbase/couchbase/ep-engine/src/dcp-stream.cc:1281 (ep.so+0x00000029a0f2)
    #11 DcpConsumer::processBufferedItems() /home/couchbase/couchbase/ep-engine/src/dcp-consumer.cc:599 (ep.so+0x000000262a23)
    #12 Processer::run() /home/couchbase/couchbase/ep-engine/src/dcp-consumer.cc:48 (ep.so+0x0000002625ff)
    #13 ExecutorThread::run() /home/couchbase/couchbase/ep-engine/src/executorthread.cc:110 (ep.so+0x0000001e3dd9)
    #14 launch_executor_thread(void*) /home/couchbase/couchbase/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e32ea)
    #15 platform_thread_wrap /home/couchbase/couchbase/platform/src/cb_pthreads.c (libplatform.so.0.1.0+0x00000000362c)

Change-Id: I6c7b1fadf76529a044341a4a9b6ed0ea829c4999
Reviewed-on: http://review.couchbase.org/62567
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-19093 [BP]: [ActiveStream] Address potential lock-inversion scenarios 98/62498/4
abhinavdangeti [Fri, 15 Jan 2016 16:36:52 +0000 (08:36 -0800)]
MB-19093 [BP]: [ActiveStream] Address potential lock-inversion scenarios

Acquire vbucket state lock only when really necessary
in the ActiveStream context. Also avoid acquiring one
lock within the other wherever possible in the ActiveStream
context again.

This change is to avert potential deadlocks due to
lock inversion that will be induced by upcoming changes.
Here are the scenarios:
(i)     Locking between streamsMutex, streamMutex and
        vb_stateLock in the set operation - handle
        response scenario.
        (http://factory.couchbase.com/job/ep-engine-threadsanitizer-master/1225/console)
(ii)    In case of a set operation, vb_stateLock is
        acquired and then streamMutex is acquired for
        notification. During markDiskSnapshot, the
        streamMutex is acquired before the vb_stateLock
        lock is acquired.
        (http://factory.couchbase.com/job/ep-engine-threadsanitizer-master/1268/console)

(Already reviewed at: http://review.couchbase.org/58557)

Change-Id: I5e5a3e2cc5ba9ae17090e1a3ee4bde100d305f1c
Reviewed-on: http://review.couchbase.org/62498
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-19075: Remove printing of empty string in CouchKVStore::getMulti() 25/62325/7
Sriram Ganesan [Fri, 1 Apr 2016 23:12:10 +0000 (16:12 -0700)]
MB-19075: Remove printing of empty string in CouchKVStore::getMulti()

In case of an error in opening a file, an error message is logged.
But the string that is supposed to hold the name of the file is
not populated, thus resulting in an empty string getting printed.
Remove the string from the printed message, as openDB already prints the name
of the file in case of an open failure.

Change-Id: Ife3aec8381ead4f2e0b84c921a3781efa39a2126
Reviewed-on: http://review.couchbase.org/62325
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17889: No notify from streamRequest 52/60352/3 v3.1.4
Jim Walker [Mon, 22 Feb 2016 21:01:09 +0000 (21:01 +0000)]
MB-17889: No notify from streamRequest

The 3.1.3 code (i.e. before the DCP churn) never notified from
streamRequest, so that behaviour is restored. This helps to bring
view_query latency down (view engine is constantly creating
streams).

A second tweak is to not call stream->next whilst holding
the streamMutex. Holding it can block streamRequest, which again
affects the view-engine's DCP stream requests.

Change-Id: I5b57fd7998003251fb32897f37c8a2f15f687a13
Reviewed-on: http://review.couchbase.org/60352
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17889: Address view_query latency regression 50/59750/3
Jim Walker [Wed, 10 Feb 2016 16:33:10 +0000 (16:33 +0000)]
MB-17889: Address view_query latency regression

Some of the stale=false query latency tests show a 2x increase
in the 80th and 95th percentiles. This is observed when moving from
3.1.4 1835 to 1836.

The prime suspect is that the snapshot yield parameter is now
stalling the view engine's DCP data.

In the change from 1835 to 1836 this value was tuned based upon
queue lengths observed during rebalance, moving from 10 to 256.

It could be that the task now runs for longer periods blocking
DCP backfill from running, and view engine drives many backfills
due to the way it frequently closes and opens streams.

Note that DCP Backfill and the snapshot task both share the same
task type, so can block each other.

This patch moves this config value back to 10 (as it was in 1835).

The view-query latency is very difficult to reliably reproduce, observe
and tune, but there is evidence (a trend) that with this config value
at 10, the performance (view latency) is improved.

Some small scale rebalance tests (3 node cluster, swap 1 node for 1)
showed that with 10, rebalance was not adversely affected, but it's a risk.

A latency comparison is attached to the MB which hints that
the latency is better.

https://issues.couchbase.com/secure/attachment/29513/29513_benchmark.png

Change-Id: I6ecf8ff950f77638eb03e4fedaefb700cf945d54
Reviewed-on: http://review.couchbase.org/59750
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17889: DCP producer can leave an operation behind 96/60196/3
Jim Walker [Thu, 18 Feb 2016 15:14:28 +0000 (15:14 +0000)]
MB-17889: DCP producer can leave an operation behind

In low-traffic setups there's a case in DcpProducer::getNextItem
where the function exits and pauses the task, yet an item was
waiting to be sent.

getNextItem() looked like

  1. setpaused=false;
  2. while(ready.pop(vbucket)) {
  3.   process(vbucket);
  4. }
  5. setpaused=true;
  6. return NULL;

The notifier

  a. if(ready.pushUnique(vbucket))
  b.   wakeupIfPaused;

If a and b execute between 4 and 5, then the producer will sleep
and not process the vbucket until the next wakeup (which may be never).

This is not good: the first operation will have a long latency before
it can be seen on DCP, namely as long as it takes for a second
operation to arrive and wake the producer.

If a and b occur between 5 and 6, that's fine, wakeupIfPaused will re-wake
the producer.

a,b,5,6 is bad
5,a,6,b is ok
5,6,a,b is ok
5,a,b,6 is ok
...

The fix.

getNextItem()

  0. do {
  1.   setpaused=false;
  2.   while(ready.pop(vbucket)) {
  3.     process(vbucket);
  4.   }
  5.   setpaused=true;
  6.  } while(!ready.empty());
  7. return NULL;

Now if a and b occur after 4, but before 5, it's ok as 6 will now consume
the vbucket.

5,a,b,6 is ok, as 6 will loop and consume
5,a,6,b is ok, "    "    "    "    "
6,a,b,7 is ok, paused is true (5), b will wake the task
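
The fixed loop can be sketched in C++ as below. This is an illustration
under stated assumptions, not the DcpProducer code: ReadySet and
MiniProducer are hypothetical, the ready set is a std::set (so it is not
FIFO), process(vbucket) is replaced by recording the vbucket, and the
wakeup is replaced by a counter.

```cpp
#include <stdint.h>
#include <atomic>
#include <mutex>
#include <set>
#include <vector>

// Hypothetical ready set: unique by construction, with its own mutex.
class ReadySet {
public:
    // Returns true only if vb was not already queued.
    bool pushUnique(uint16_t vb) {
        std::lock_guard<std::mutex> lh(lock);
        return ready.insert(vb).second;
    }
    // Removes one queued vbucket into vb; false if nothing is queued.
    bool pop(uint16_t& vb) {
        std::lock_guard<std::mutex> lh(lock);
        if (ready.empty()) return false;
        vb = *ready.begin();
        ready.erase(ready.begin());
        return true;
    }
    bool empty() {
        std::lock_guard<std::mutex> lh(lock);
        return ready.empty();
    }
private:
    std::mutex lock;
    std::set<uint16_t> ready;
};

struct MiniProducer {
    MiniProducer() : paused(false), wakeups(0) {}

    // Steps 0-6 of the fixed getNextItem(): re-check the ready set
    // after setting paused, so a push that raced with step 5 is
    // consumed instead of being left behind.
    void getNextItem() {
        do {
            paused.store(false);
            uint16_t vb;
            while (ready.pop(vb)) {
                processed.push_back(vb);  // stand-in for process(vbucket)
            }
            paused.store(true);
        } while (!ready.empty());
    }

    // Steps a-b of the notifier.
    void notify(uint16_t vb) {
        if (ready.pushUnique(vb) && paused.load()) {
            ++wakeups;  // stand-in for waking the paused task
        }
    }

    ReadySet ready;
    std::atomic<bool> paused;
    std::atomic<int> wakeups;
    std::vector<uint16_t> processed;
};
```

A push that lands before step 5 is caught by the re-check in step 6,
and a push that lands after step 5 sees paused == true and triggers the
wakeup, so no interleaving leaves a vbucket queued with nobody awake.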

Change-Id: Ib412a85ee10de0e2a2ca4116d0cc85bbad538da2
Reviewed-on: http://review.couchbase.org/60196
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-18171: Break cyclic reference between ActiveStream & ChkptProcesser 60/60060/4
abhinavdangeti [Tue, 16 Feb 2016 18:49:09 +0000 (10:49 -0800)]
MB-18171: Break cyclic reference between ActiveStream & ChkptProcesser

Removing circular dependency between ActiveStream and
ActiveStreamCheckpointProcesserTask where each holds a reference
to the other causing a memory leak during shutdown.

Also explicitly clear the queues of checkpointProcessor task upon
disconnection of the DcpProducer, so as to remove a cyclic reference
between DcpProducer, ActiveStream, and ActiveStreamCheckpointProcesserTask.
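
A sketch of how clearing the task's queues on disconnect breaks such a
cycle. It uses C++11 std::shared_ptr as a stand-in for ep-engine's
reference-counted pointers; Producer, Stream, ProcessorTask and the
live counter are hypothetical simplifications of the classes above.

```cpp
#include <deque>
#include <memory>

struct Producer;

struct Stream {
    Stream(const std::shared_ptr<Producer>& p, int& liveCount)
        : producer(p), live(liveCount) { ++live; }
    ~Stream() { --live; }
    std::shared_ptr<Producer> producer;  // stream keeps producer alive
    int& live;                           // counts objects still alive
};

struct ProcessorTask {
    // Queued streams are kept alive by the task.
    std::deque<std::shared_ptr<Stream>> queue;
    void schedule(const std::shared_ptr<Stream>& s) { queue.push_back(s); }
    void clearQueues() { queue.clear(); }
};

struct Producer {
    explicit Producer(int& liveCount)
        : task(std::make_shared<ProcessorTask>()), live(liveCount) {
        ++live;
    }
    ~Producer() { --live; }
    // Dropping the queued streams breaks the cycle
    // Producer -> ProcessorTask -> Stream -> Producer.
    void disconnect() { task->clearQueues(); }
    std::shared_ptr<ProcessorTask> task;
    int& live;
};
```

Without the disconnect() call, releasing the last external reference to
the producer would leave both objects alive, since each is still
reachable from the other.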

Change-Id: Ifac03a40132431476a6b5000725ce972068b47f4
Reviewed-on: http://review.couchbase.org/60060
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17766: Set maxNumAuxIO in stream_test to zero 59/60059/2
abhinavdangeti [Tue, 16 Feb 2016 19:12:53 +0000 (11:12 -0800)]
MB-17766: Set maxNumAuxIO in stream_test to zero

Setting maxNumAuxIO to zero ensures that the producer's
ActiveStreamCheckpointProcesserTask never runs; if it ran, it would
cause unexpected results in the test context.

Change-Id: I5e7f4b18b1b72af1f99e83cadc5ee979dbcd4cae
Reviewed-on: http://review.couchbase.org/60059
Tested-by: buildbot <build@couchbase.com>
Well-Formed: Hari Kodungallur <hari.kodungallur@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years ago[BP] MB-17766: Fix intermittent stream_test failure 58/60058/2
Dave Rigby [Tue, 16 Feb 2016 12:54:38 +0000 (12:54 +0000)]
[BP] MB-17766: Fix intermittent stream_test failure

Address two issues:

1) end sequence numbers were incorrect, which could result in not
   having any items in our cursor.
2) Don't check CheckpointManager::registerCursor() return value, we
   don't actually care if any other cursors are already registered for
   a given checkpoint (persistence cursor sometimes registers before
   us).

Change-Id: I1145d5fb61c0c12f019154c979afdd50b4060509
Reviewed-on: http://review.couchbase.org/60058
Tested-by: buildbot <build@couchbase.com>
Well-Formed: Hari Kodungallur <hari.kodungallur@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17766: Regression test that checks for race during takeover 66/59666/8
Dave Rigby [Tue, 9 Feb 2016 18:21:25 +0000 (18:21 +0000)]
MB-17766: Regression test that checks for race during takeover

Module test: ep-engine_stream_test

Change-Id: I8e11722b1ed1029c8b969dcb88000c5903fbb0ca
Reviewed-on: http://review.couchbase.org/59666
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17766: Incorrect ordering of messages during ActiveStream's takeover-send phase 27/59627/7
abhinavdangeti [Tue, 9 Feb 2016 20:18:01 +0000 (12:18 -0800)]
MB-17766: Incorrect ordering of messages during ActiveStream's takeover-send phase

A race between the step() and ActiveStreamCheckpointProcessorTask
can cause mutations to be queued into the readyQ after the
setVBucketState(active) message.

Here's the scenario in chronology (T: Front-end thread, BT: IO thread):
1. T1: ActiveStream::setVBucketStateAckRecieved()
2. T1: transitionState(takeoverSend) => schedules ActiveStreamCheckpointProcessorTask
3. BT1: manageConnections() notifies memcached about specific conn (max idle time: 5s)
4. BT2: ActiveStreamCheckpointProcessorTask runs, gets all Items For Cursor
5. T1: step() -> takeoverSendPhase() -> readyQ is empty
        => nextCheckpointItem() returns false, as getNumItemsForCursor returns 0
        => setVbucketState(active) added to readyQ
6. BT2: ActiveStreamCheckpointProcessorTask continues, adds mutations acquired into readyQ
        -> Note the mutations were acquired in step 4
        => Notified memcached connections
7. T1: step() .. ships messages in incorrect order

On the new master, the vbucket is promoted to active state and then more
mutations are received from the old master. If there were front end ops
at this time, there could be an inconsistency in highSeqno or in worst
cases crashes in checkpoint manager due to highSeqno not belonging in
the designated range.

The fix: Add an atomic flag that is also checked for along with
getNumItemsForCursor in nextCheckpointItem(). This flag is set before
retrieving all items for a cursor (getAllItemsForCursor) and unset after
all the retrieved items have been added to the ready queue of the stream.
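
The flag can be sketched as follows, assuming C++11 std::atomic and
std::mutex. MiniStream, itemsPending, queueMutation() and
processItems() are hypothetical names, and integers stand in for
mutations; only nextCheckpointItem and getOutstandingItems are named in
this series.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

struct MiniStream {
    MiniStream() : itemsPending(false) {}

    // Front end queues a mutation (identified here by its seqno).
    void queueMutation(int seqno) {
        std::lock_guard<std::mutex> lh(lock);
        checkpoint.push_back(seqno);
    }

    std::size_t numItemsForCursor() {
        std::lock_guard<std::mutex> lh(lock);
        return checkpoint.size();
    }

    // True while items remain in the checkpoint *or* the background
    // task holds items it has not yet put on the ready queue.
    bool nextCheckpointItem() {
        return numItemsForCursor() > 0 || itemsPending.load();
    }

    // Background task, part 1: set the flag first, then take the items.
    std::vector<int> getOutstandingItems() {
        itemsPending.store(true);
        std::lock_guard<std::mutex> lh(lock);
        std::vector<int> items;
        items.swap(checkpoint);
        return items;
    }

    // Background task, part 2: queue the items, then clear the flag.
    void processItems(const std::vector<int>& items) {
        {
            std::lock_guard<std::mutex> lh(lock);
            readyQ.insert(readyQ.end(), items.begin(), items.end());
        }
        itemsPending.store(false);
    }

    std::mutex lock;
    std::vector<int> checkpoint;
    std::deque<int> readyQ;
    std::atomic<bool> itemsPending;
};
```

Between the two background steps the checkpoint is empty but
nextCheckpointItem() still returns true, so the takeover path would not
queue setVBucketState(active) ahead of the pending mutations.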

Change-Id: I5c04d47cc99c7dd3b2d87cb68dd30d36473226e5
Reviewed-on: http://review.couchbase.org/59627
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17766: Avoid copy overhead of std::deque in getOutstandingItems 54/59754/2
abhinavdangeti [Wed, 10 Feb 2016 18:03:03 +0000 (10:03 -0800)]
MB-17766: Avoid copy overhead of std::deque in getOutstandingItems

Change-Id: I771182bd54a0f702f70287ff4728d26b7ffaa323
Reviewed-on: http://review.couchbase.org/59754
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17766: Refactor nextCheckpointItemTask to allow testing 64/59664/4
Dave Rigby [Tue, 9 Feb 2016 13:33:38 +0000 (13:33 +0000)]
MB-17766: Refactor nextCheckpointItemTask to allow testing

Split nextCheckpointItemTask() into two inner (protected) functions,
to allow testing of the fix for MB-17766.

Change-Id: I9d441d873cf7f727f90a966d4dda03043c7f6480
Reviewed-on: http://review.couchbase.org/59664
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17885: Address compilation errors in ep_testsuite.cc 78/59678/4
abhinavdangeti [Tue, 9 Feb 2016 22:27:10 +0000 (14:27 -0800)]
MB-17885: Address compilation errors in ep_testsuite.cc

3.0.x doesn't support C++11!

Change-Id: Ia3adf0c7ace9b771b999427811f0872774740386
Reviewed-on: http://review.couchbase.org/59678
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>

5 years agoMB-17885: Update flow control bytesSent correctly on DCP producer 75/59575/4
Manu Dhundi [Mon, 8 Feb 2016 17:42:25 +0000 (09:42 -0800)]
MB-17885: Update flow control bytesSent correctly on DCP producer

This is a fix for a regression introduced recently. Also this adds
a DCP test case to test flow control behavior of DCP producer.

Change-Id: Ia56858cb9e687a0a045b582c18e4b68948cb460c
Reviewed-on: http://review.couchbase.org/59575
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17502: DCP performance regression fixed. 86/58886/9
Jim Walker [Wed, 6 Jan 2016 15:40:57 +0000 (15:40 +0000)]
MB-17502: DCP performance regression fixed.

Many patches were added to speed up DCP, however some of that
performance was lost when doing some code tidying without
re-profiling.

With all the DCP performance patches (particularly 87869fd39) straight
DCP performance is a touch slower. This is because DCP used to take
one lock and then do work. The new code has more locks, but holds
them for fewer lines of code. This means that DCP is friendlier/fairer
to the other threads interacting with a DCP producer.
The front-end operation threads are no longer stalled for long periods
whilst DCP holds the one lock.

Frontend latency before locking changes:

 === Latency [With background DCP] - 100000 items
                                 Percentile
                   Median     95th     99th  Std Dev
 Add               16.337   34.894   45.241   25.627
 Get                1.226    1.524    1.745    0.435
 Replace           16.311   34.386   42.097    8.435
 Delete            15.636   32.915   41.999    7.408

Frontend latency after locking changes:

 === Latency [With background DCP] - 100000 items
                                 Percentile
                   Median     95th     99th  Std Dev
 Add                3.996   12.159   20.724   11.376
 Get                1.299    1.629    1.730    0.634
 Replace            4.274   12.831   22.988    4.523
 Delete             3.142   10.302   14.292    3.350

The median and 95th/99th percentiles are improved for Add, Replace and
Delete; Get is essentially unchanged.

Fix details:

The roundRobin/vbReady code has a bufferLog.pauseIfFull call on the
"hot" part of the loop, this is the main cause of the regression.

With that fixed CPU profiling and benchmarking shows that DCP is back
to 3.1.3 levels but highlighted that:

1. DcpProducer::getNextItem was hot (5% of a DcpProducer thread).
2. DcpConsumer::processBufferedItems was hitting SpinLock hard.
   20 to 30% at times was consumed by SpinLock code.
3. Snapshot creation was frequently yielding even though it had work to do.

So to address 1. the fix is actually to remove the roundRobin/vbReady
code. It is no better and in some cases a little slower than
the original. This code is replaced with std:: structures *but* the
Mutex used has a much smaller scope.

Note that the DcpProducerReadyQueue has been profiled, and having
the std::map powering find() proved much faster than searching the list.
This is important because the find method is part of the front-end
operation thread.
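
As an illustration only, a ready queue of this shape could be sketched
as below. ToyReadyQueue and its method names are hypothetical and are not
the ep-engine DcpProducerReadyQueue; the sketch only shows a list that keeps
FIFO order, a std::map that answers find() without walking the list, and a
mutex held just for the duration of each call.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <map>
#include <mutex>

// Hypothetical stand-in for a producer ready queue (not the real class).
class ToyReadyQueue {
public:
    // Queue a vbucket once; returns false if it was already queued.
    bool pushUnique(uint16_t vb) {
        std::lock_guard<std::mutex> guard(lock);
        if (queued.count(vb) != 0) {
            return false;
        }
        order.push_back(vb);
        queued[vb] = true;
        return true;
    }

    // Pop the oldest vbucket; returns false if the queue is empty.
    bool popFront(uint16_t& vb) {
        std::lock_guard<std::mutex> guard(lock);
        if (order.empty()) {
            return false;
        }
        vb = order.front();
        order.pop_front();
        queued.erase(vb);
        return true;
    }

    // Map lookup: no linear search of the list.
    bool find(uint16_t vb) {
        std::lock_guard<std::mutex> guard(lock);
        return queued.count(vb) != 0;
    }

private:
    std::mutex lock;                  // held only inside each method
    std::list<uint16_t> order;        // FIFO order of ready vbuckets
    std::map<uint16_t, bool> queued;  // membership index used by find()
};
```

The cost of the extra map is paid on push and pop, which keeps the lookup
made from the front-end path cheap.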

To address 2., it was observed that the consumer code constructs
a passive_stream_t frequently and then tests whether it holds a pointer.
The construction uses the SpinLock code and can be avoided by
testing streams[vb] directly and only then constructing
a copy of the passive_stream_t. This avoids the SpinLock code on
every iteration of the for loop in the affected function.
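
A rough sketch of the "test first, copy second" pattern follows. It uses
std::shared_ptr as a stand-in for the ep-engine ref-counted pointer, and
ToyStream and processAll are hypothetical names. Concurrency is ignored
here; the point is only that empty slots never pay for a handle copy.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stream type; only counts how often it was processed.
struct ToyStream {
    int processed = 0;
};

// Returns how many handle copies the loop had to make.
std::size_t processAll(const std::vector<std::shared_ptr<ToyStream>>& streams) {
    std::size_t copies = 0;
    for (const auto& slot : streams) {
        if (!slot) {
            continue;  // cheap null test on the slot itself, no copy made
        }
        std::shared_ptr<ToyStream> stream = slot;  // copy only for live slots
        ++copies;
        ++stream->processed;
    }
    return copies;
}
```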

To address 3., ensure that the snapshot task's work queue doesn't hold
duplicates, as there is no need for them. Then raise the number of
snapshots processed before yielding. Various rebalances showed that
around 250 was enough, so let's go with 256.
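
The two ideas for point 3 can be sketched as follows. ToySnapshotTask is a
hypothetical class, not the real snapshot task: a std::set rejects duplicate
vbuckets at schedule time, and run() handles at most 256 entries before
returning so the caller can decide whether to reschedule.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <set>

// Hypothetical snapshot task with a de-duplicated work queue.
class ToySnapshotTask {
public:
    // Queue a vbucket unless it is already waiting.
    void schedule(uint16_t vb) {
        if (pending.insert(vb).second) {
            queue.push_back(vb);
        }
    }

    // Process up to kBatch entries; returns true if work remains.
    bool run() {
        std::size_t done = 0;
        while (!queue.empty() && done < kBatch) {
            uint16_t vb = queue.front();
            queue.pop_front();
            pending.erase(vb);
            ++processed;  // stand-in for creating the snapshot
            ++done;
        }
        return !queue.empty();
    }

    std::size_t queued() const { return queue.size(); }

    std::size_t processed = 0;

private:
    static constexpr std::size_t kBatch = 256;  // snapshots before yielding
    std::deque<uint16_t> queue;
    std::set<uint16_t> pending;
};
```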

Change-Id: I8fb0bd30f8e07d000192675de425726ad26e403a
Reviewed-on: http://review.couchbase.org/58886
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17086: Fix to performance regression. 80/58380/4
Jim Walker [Fri, 8 Jan 2016 14:02:43 +0000 (14:02 +0000)]
MB-17086: Fix to performance regression.

Revert "MB-16632: As part of queueDirty schedule a DCP connections notifier task"

This reverts commit fa17728e7ca0c637c84a2208b5decfe7ba7e54f1.

Performance testing showed that a regression has been introduced and that
fa17728 was the cause.

The regression was introduced by some fixes made during review that weren't
re-profiled.

Performance can be improved by making some further changes but the investigation
revealed that performance is actually at its best without fa17728.

Change-Id: I7ac3ff49d0b9ce8563f3a932dd337a58d03a0153
Reviewed-on: http://review.couchbase.org/58380
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-16632: Use a background task to handle snapshot creation 48/57148/17
Jim Walker [Thu, 12 Nov 2015 11:14:11 +0000 (11:14 +0000)]
MB-16632: Use a background task to handle snapshot creation

Frontend threads are delayed by large snapshots due to the time taken
in processing the items into the readyQ.

Moving this work to a background task frees frontend threads to
do other work.

Change-Id: Ic399ef06be996b7b7e179c4c8934a0f5a74cb8f7
Reviewed-on: http://review.couchbase.org/57148
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17220: [BP] Separate logs for notifying seqno/checkpoint persistence 33/58233/3
abhinavdangeti [Tue, 1 Dec 2015 19:08:41 +0000 (11:08 -0800)]
MB-17220: [BP] Separate logs for notifying seqno/checkpoint persistence

- Print different logs while notifying completion or timeouts
during seqno persistence and checkpoint persistence.
- Also adding additional information to the logs.

Change-Id: Idf29cab2197f37b180b0295b19f6b46542bdc6b6
Reviewed-on: http://review.couchbase.org/58233
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17168: Log lastSentSeqno during takeover state change 17/58217/3
abhinavdangeti [Mon, 4 Jan 2016 17:43:01 +0000 (09:43 -0800)]
MB-17168: Log lastSentSeqno during takeover state change

When an active vbucket state is changed to dead as part
of takeover, log a message that would indicate the last
sent seqno for the vbucket on the particular stream and
the vbucket's high seqno.

Change-Id: I7097b79cf41b2c62688ddb9345bc529ac08b2223
Reviewed-on: http://review.couchbase.org/58217
Reviewed-by: Sriram Ganesan <sriram@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-16656: Set the open chkpt id on replica to 0 when disk snapshot is recvd. 34/57834/4
Manu Dhundi [Sat, 19 Dec 2015 02:15:16 +0000 (18:15 -0800)]
MB-16656: Set the open chkpt id on replica to 0 when disk snapshot is recvd.

Currently due to a bug in 3.0.x the open checkpoint id is not set to 0
when replica receives a disk snapshot from active.

Change-Id: Iffda89b8da713539a52d50aa4acc33458ae7150e
Reviewed-on: http://review.couchbase.org/57834
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-16656: Stream a full (disk+mem) snapshot from DCP producer on replica vb 16/57816/7
Manu Dhundi [Mon, 21 Dec 2015 21:42:07 +0000 (13:42 -0800)]
MB-16656: Stream a full (disk+mem) snapshot from DCP producer on replica vb

A replica vbucket receives items from an active vbucket, and until a full
snapshot is received the data on the replica vbucket is not consistent due
to de-duplication and other reasons. Hence while streaming items to a DCP
client from a replica vbucket we need to combine backfill and in memory
snapshots and send items in one snapshot. A caveat here is the replica vb
might not have received all the items in the latest (memory) snapshot, so the
DCP client streaming from replica will have to wait till the replica gets
all the items in the latest snapshot from the active.

Change-Id: I4db622f967316d120506dc9b125211578194bb60
Reviewed-on: http://review.couchbase.org/57816
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
5 years ago[DcpStream] Removing extra exception/abort that was added recently 62/57962/3
abhinavdangeti [Mon, 21 Dec 2015 20:03:40 +0000 (12:03 -0800)]
[DcpStream] Removing extra exception/abort that was added recently

Exceptions in 3.0.x are unhandled, which makes them pretty
much the same as aborts/asserts.

Although it is impossible for an active stream to enter the
STREAM_READING state, it may be better to have the risk of
hitting this assertion be ZERO, for the maintenance releases only.

Change-Id: I0a1eff5ab6c8cec8ad6d97e9a1c2201844c25fbd
Reviewed-on: http://review.couchbase.org/57962
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Sriram Ganesan <sriram@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17051: [DcpProducer] Ensure no un-notified streams are left behind 67/57867/9
abhinavdangeti [Thu, 17 Dec 2015 18:57:56 +0000 (10:57 -0800)]
MB-17051: [DcpProducer] Ensure no un-notified streams are left behind

Re-iterate over the vbReady list at the end of a DcpProducer step to
ensure un-notified vbuckets are not left unprocessed.

Change-Id: I21065cf99f8be0af6dedf506237ce3dbe683387d
Reviewed-on: http://review.couchbase.org/57867
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years ago[DcpProducer] Refactor function name to indicate intent 72/57872/4
abhinavdangeti [Thu, 17 Dec 2015 03:56:44 +0000 (19:56 -0800)]
[DcpProducer] Refactor function name to indicate intent

unpauseIfSpace --> unpauseIfSpaceAvailable

Change-Id: Ifb4ec181e2228d819ab460bd03eccfefd75c48d6
Reviewed-on: http://review.couchbase.org/57872
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoFix heap-use-after-free issue detected by thread sanitizer 99/57699/5
abhinavdangeti [Thu, 10 Dec 2015 22:23:44 +0000 (14:23 -0800)]
Fix heap-use-after-free issue detected by thread sanitizer

No need to stop the Producer Notifier in the destructor of
dcpConnMap. This is already taken care of when the executor
pool is unregistered.

WARNING: ThreadSanitizer: heap-use-after-free (pid=158780)
  Read of size 8 at 0x7d180000c1a0 by main thread:
    #0 DcpConnMap::~DcpConnMap() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/tasks.h:103 (ep.so+0x0000000453e1)
    #1 DcpConnMap::~DcpConnMap() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/connmap.cc:954 (ep.so+0x0000000456f5)
    #2 EventuallyPersistentEngine::~EventuallyPersistentEngine() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/ep_engine.cc:6410 (ep.so+0x0000000d0e5c)
    #3 EvpDestroy(engine_interface*, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/ep_engine.cc:147 (ep.so+0x0000000b27f7)
    #4 mock_destroy(engine_interface*, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/memcached/programs/engine_testapp/engine_testapp.cc:99 (engine_testapp+0x0000004cbd97)
    #5 destroy_bucket(engine_interface*, engine_interface_v1*, bool) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/memcached/programs/engine_testapp/engine_testapp.cc:996 (engine_testapp+0x0000004cbc19)
    #6 perf_latency_baseline_multi_thread_bucket(test*, int, int, int) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/tests/ep_perfsuite.cc:386 (ep_perfsuite.so+0x00000000dfc4)
    #7 perf_latency_baseline_multi_bucket_4(test*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/tests/ep_perfsuite.cc:429 (ep_perfsuite.so+0x0000000091ef)
    #8 execute_test(test, char const*, char const*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/memcached/programs/engine_testapp/engine_testapp.cc:1104 (engine_testapp+0x0000004cb21c)
    #9 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

  Previous write of size 8 at 0x7d180000c1a0 by thread T15 (mutexes: write M11751):
    #0 operator delete(void*) <null> (engine_testapp+0x0000004641db)
    #1 DcpConnMap::DcpProducerNotifier::~DcpProducerNotifier() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/connmap.h:530 (ep.so+0x00000004ab85)
    #2 ExecutorThread::run() /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/atomic.h:325 (ep.so+0x0000000f17cb)
    #3 launch_executor_thread(void*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f15f5)
    #4 platform_thread_wrap(void*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-master/platform/src/cb_pthreads.cc:54 (libplatform.so.0.1.0+0x000000004e7b)

Change-Id: Ib458d0826cc33b4b233da5a422b90bcf08d408bb
Reviewed-on: http://review.couchbase.org/57699
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17006: [BP] DCP Producer could miss fetching items from a stream 41/57641/5
abhinavdangeti [Mon, 7 Dec 2015 21:02:55 +0000 (13:02 -0800)]
MB-17006: [BP] DCP Producer could miss fetching items from a stream

Here's the scenario:
1. Stream currently in backfill phase
2. When backfill is received, 1 item added to readyQ
    a. itemsReady of stream set to true (Producer notified)
3. Front end op comes in
    a. item added to checkpoint queue
    b. itemsReady not set to true, as it already is (Producer not
      notified)
4. Producer calls stream::next()
    a. stream in backfillPhase(): 1 item popped from readyQ
    b. backfill task still running => no state transition to IN_MEMORY
    c. 1 op returned to producer, producer re-adds vbucket to ready list
5. Backfill completes
6. Producer calls stream::next()
    a. stream in backfillPhase(): no items in readyQ
    b. As backfill task has completed, state transitions to IN_MEMORY
    c. no items in readyQ => NULL returned
    d. As no op obtained, producer doesn't re-add vbucket to ready list

=> Front end item remains stuck in checkpoint queue, until more front
end ops come in - which would notify the producer

The proposed fix here is: In step 6b, when the producer sees the
backfill task has completed, and the state for the stream transitions
to IN_MEMORY, move checkpoint items into readyQ. This way the readyQ
will not be empty, and the producer would re-add the vbucket back into
the ready list.
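
The scenario and the proposed fix can be replayed with a toy model. This is
a sketch under simplifying assumptions, not the ActiveStream code: ToyStream,
its queues and the applyFix switch are all hypothetical, and next() returning
true stands for "an op was returned, so the producer re-adds the vbucket".

```cpp
#include <cassert>
#include <deque>
#include <string>

enum class ToyState { Backfilling, InMemory };

// Hypothetical, heavily simplified stream.
struct ToyStream {
    ToyState state = ToyState::Backfilling;
    bool backfillRunning = true;
    std::deque<std::string> readyQ;
    std::deque<std::string> checkpointQ;

    bool next(std::string& out, bool applyFix) {
        if (state == ToyState::Backfilling && readyQ.empty() && !backfillRunning) {
            state = ToyState::InMemory;  // step 6b
            if (applyFix) {
                // Proposed fix: pull checkpoint items in on the transition.
                while (!checkpointQ.empty()) {
                    readyQ.push_back(checkpointQ.front());
                    checkpointQ.pop_front();
                }
            }
        }
        if (readyQ.empty()) {
            return false;  // no op: the vbucket is not re-added
        }
        out = readyQ.front();
        readyQ.pop_front();
        return true;
    }
};

// Replays steps 1 to 6; returns true if the front-end item was delivered.
bool frontEndItemDelivered(bool applyFix) {
    ToyStream s;
    s.readyQ.push_back("backfill-item");         // step 2
    s.checkpointQ.push_back("front-end-item");   // step 3
    std::string out;
    s.next(out, applyFix);                       // step 4: backfill item popped
    s.backfillRunning = false;                   // step 5
    bool got = s.next(out, applyFix);            // step 6
    return got && out == "front-end-item";
}
```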

Change-Id: I3403d3926f97788074990ef0e4c69cac902b2a93
Reviewed-on: http://review.couchbase.org/57516
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Change-Id: I0821402fefd01f0851572d7c22ccee5fc065778d
Reviewed-on: http://review.couchbase.org/57641
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoFix test case test_dcp_early_termination 42/57642/5
abhinavdangeti [Wed, 9 Dec 2015 19:42:43 +0000 (11:42 -0800)]
Fix test case test_dcp_early_termination

Account for tasks that are already in the future queue of the
auxIO dispatcher to ensure all DCP backfill tasks (auxIO) have
completed.

Change-Id: I9544a79436193f3ef42b08a2b6615eb4be4792ce
Reviewed-on: http://review.couchbase.org/57642
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMerge remote-tracking branch 'couchbase/3.1.3' into 'couchbase/3.0.x' 19/57619/1
Dave Rigby [Wed, 9 Dec 2015 10:12:25 +0000 (10:12 +0000)]
Merge remote-tracking branch 'couchbase/3.1.3' into 'couchbase/3.0.x'

* couchbase/3.1.3:
  Fix compilation issue on windows
  [BP] MB-16915: Remove cyclic reference between DcpConsumer and PassiveStream.
  [BP] MB-16915: RollbackTask to hold ref count ptr for DCP consumer instead of raw ptr
  [BP] MB-16915: Use refcounted pointers on producer/consumer

Change-Id: Ied8b262fef0eb06671277524e17f0b6cbf7acbeb

5 years agoFix compilation issue on windows 54/57454/4 3.1.3 v3.1.3
abhinavdangeti [Fri, 4 Dec 2015 00:15:47 +0000 (16:15 -0800)]
Fix compilation issue on windows

<http://factory.couchbase.com/job/win_cs_build/ws/couchbase\ep-engine\test
s\ep_testsuite.cc(4958)> : error C2782: 'void checkeqfn(T,T,const char
        ,const char ,const int)' : template parameter 'T' is ambiguous

<http://factory.couchbase.com/job/win_cs_build/ws/couchbase\ep-engine\test
s\ep_testsuite.cc(74)> : see declaration of 'checkeqfn'
could be 'unsigned __int64'
or       'unsigned long'
NMAKE : fatal error U1077:
'C:\PROGRA~2\MICROS~2.0\VC\bin\amd64\cl.exe' :
return code '0x2'
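
For context, the same kind of ambiguity can be reproduced with a much smaller
template. The sketch below uses a hypothetical sameValue helper, not the real
checkeqfn, and unsigned long long as a stand-in for 'unsigned __int64'. Naming
the template argument explicitly is one common way to resolve it; the commit
message does not say which approach this change took.

```cpp
#include <cassert>

// Hypothetical helper with the same shape of problem: both parameters
// must deduce to one single type T.
template <typename T>
bool sameValue(T expected, T actual) {
    return expected == actual;
}

bool compareCounts() {
    unsigned long expected = 10;
    unsigned long long actual = 10;  // stand-in for 'unsigned __int64'

    // sameValue(expected, actual);  // would not compile: T is ambiguous

    // Stating T explicitly removes the deduction conflict.
    return sameValue<unsigned long long>(expected, actual);
}
```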

Change-Id: I9a0bf5bd74276ebe9ac6a709302704a2bab06c25
Reviewed-on: http://review.couchbase.org/57454
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: abhinav dangeti <abhinav@couchbase.com>
5 years ago[BP] MB-16915: Remove cyclic reference between DcpConsumer and PassiveStream. 49/57449/2
Manu Dhundi [Thu, 3 Dec 2015 21:32:35 +0000 (13:32 -0800)]
[BP] MB-16915: Remove cyclic reference between DcpConsumer and PassiveStream.

DcpConsumer holds a reference to PassiveStream and vice versa. We must
make sure that one of them (DcpConsumer here) releases the reference
to another in a function other than the object destructor.
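
A minimal sketch of the cycle and of breaking it outside the destructor,
using std::shared_ptr in place of the ep-engine ref-counted pointers.
ToyConsumer, ToyStream and closeAllStreams are hypothetical names used only
for this illustration.

```cpp
#include <cassert>
#include <memory>

struct ToyConsumer;

struct ToyStream {
    std::shared_ptr<ToyConsumer> consumer;  // stream -> consumer
};

struct ToyConsumer {
    std::shared_ptr<ToyStream> stream;      // consumer -> stream

    // Explicit release point; a destructor would never get to run while
    // the cycle keeps the reference count above zero.
    void closeAllStreams() { stream.reset(); }
};

// Returns true if the consumer was destroyed once its owner let go.
bool consumerDestroyed(bool explicitClose) {
    std::weak_ptr<ToyConsumer> watch;
    {
        auto consumer = std::make_shared<ToyConsumer>();
        consumer->stream = std::make_shared<ToyStream>();
        consumer->stream->consumer = consumer;  // cycle formed
        watch = consumer;
        if (explicitClose) {
            consumer->closeAllStreams();        // cycle broken
        }
    }
    bool destroyed = watch.expired();
    if (auto leaked = watch.lock()) {
        leaked->closeAllStreams();  // tidy up the deliberate leak in the demo
    }
    return destroyed;
}
```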

Change-Id: I8e5c262bc5ac50342f85ba80d481987a26a7a21d
Reviewed-on: http://review.couchbase.org/57429
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-on: http://review.couchbase.org/57449
Tested-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years ago[BP] MB-16915: RollbackTask to hold ref count ptr for DCP consumer instead of raw ptr 48/57448/2
Manu Dhundi [Thu, 3 Dec 2015 01:19:51 +0000 (17:19 -0800)]
[BP] MB-16915: RollbackTask to hold ref count ptr for DCP consumer instead of raw ptr

Rollback task is spawned when a DCP consumer is asked to rollback by a DCP
producer. Rollback runs in background and there is a possibility that the DCP
consumer object gets deleted before rollback task completes. We can avoid this
if RollbackTask holds a ref counted ptr of DCP consumer instead of a raw ptr.
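
The ownership idea can be sketched with std::shared_ptr standing in for the
ref-counted pointer. ToyConsumer and ToyRollbackTask are hypothetical; the
sketch only shows that the task keeps the consumer alive after the original
owner has dropped its reference.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical consumer that counts rollbacks.
struct ToyConsumer {
    int rollbacks = 0;
    void doRollback() { ++rollbacks; }
};

// The task owns a counted reference rather than a raw pointer.
struct ToyRollbackTask {
    explicit ToyRollbackTask(std::shared_ptr<ToyConsumer> c)
        : consumer(std::move(c)) {}

    int run() {
        consumer->doRollback();  // safe: the task shares ownership
        return consumer->rollbacks;
    }

    std::shared_ptr<ToyConsumer> consumer;
};

int runAfterOwnerDropsConsumer() {
    auto owner = std::make_shared<ToyConsumer>();
    ToyRollbackTask task(owner);
    owner.reset();      // the connection map lets go of the consumer
    return task.run();  // the object is still alive for the task
}
```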

Change-Id: I00c1bced0ec445226e64e6f7647a3bfbfb063f94
Reviewed-on: http://review.couchbase.org/57427
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-on: http://review.couchbase.org/57448
Tested-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years ago[BP] MB-16915: Use refcounted pointers on producer/consumer 47/57447/2
Jim Walker [Mon, 30 Nov 2015 13:31:59 +0000 (13:31 +0000)]
[BP] MB-16915: Use refcounted pointers on producer/consumer

Prevents a race/crash occurring when the DcpProducer is destroyed
and there are backfill tasks running/pending.

The test case reveals the problem when run under valgrind as
a series of invalid reads of freed memory. E.g.

==40673== Thread 17:
==40673== Invalid read of size 8
==40673==    at 0x71A3CEE: DCPBackfill::run() (dcp-stream.cc:175)
==40673==    by 0x717215C: ExecutorThread::run() (executorthread.cc:110)
==40673==    by 0x7172868: launch_executor_thread (executorthread.cc:34)
==40673==    by 0x503EC67: platform_thread_wrap (cb_pthreads.c:24)
==40673==    by 0x524A181: start_thread (pthread_create.c:312)
==40673==    by 0x555A47C: clone (clone.S:111)
==40673==  Address 0x64c2380 is 48 bytes inside a block of size 384 free'd
==40673==    at 0x4C2C2BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==40673==    by 0x718C4ED: DcpConnMap::manageConnections() (atomic.h:430)
==40673==    by 0x71906A5: ConnManager::run() (connmap.cc:151)
==40673==    by 0x717215C: ExecutorThread::run() (executorthread.cc:110)
==40673==    by 0x7172868: launch_executor_thread (executorthread.cc:34)
==40673==    by 0x503EC67: platform_thread_wrap (cb_pthreads.c:24)
==40673==    by 0x524A181: start_thread (pthread_create.c:312)
==40673==    by 0x555A47C: clone (clone.S:111)

Change-Id: I32a7dfd10daa4565b9cbb4c8142ed8f71c13ca31
Reviewed-on: http://review.couchbase.org/57296
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Reviewed-on: http://review.couchbase.org/57447
Tested-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-16915: Remove cyclic reference between DcpConsumer and PassiveStream. 29/57429/5
Manu Dhundi [Thu, 3 Dec 2015 21:32:35 +0000 (13:32 -0800)]
MB-16915: Remove cyclic reference between DcpConsumer and PassiveStream.

DcpConsumer holds a reference to PassiveStream and vice versa. We must
make sure that one of them (DcpConsumer here) releases the reference
to another in a function other than the object destructor.

Change-Id: I8e5c262bc5ac50342f85ba80d481987a26a7a21d
Reviewed-on: http://review.couchbase.org/57429
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-16915: RollbackTask to hold ref count ptr for DCP consumer instead of raw ptr 27/57427/3
Manu Dhundi [Thu, 3 Dec 2015 01:19:51 +0000 (17:19 -0800)]
MB-16915: RollbackTask to hold ref count ptr for DCP consumer instead of raw ptr

Rollback task is spawned when a DCP consumer is asked to rollback by a DCP
producer. Rollback runs in background and there is a possibility that the DCP
consumer object gets deleted before rollback task completes. We can avoid this
if RollbackTask holds a ref counted ptr of DCP consumer instead of a raw ptr.

Change-Id: I00c1bced0ec445226e64e6f7647a3bfbfb063f94
Reviewed-on: http://review.couchbase.org/57427
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-16632: As part of queueDirty schedule a DCP connections notifier task 01/56301/28
abhinavdangeti [Thu, 22 Oct 2015 21:54:28 +0000 (14:54 -0700)]
MB-16632: As part of queueDirty schedule a DCP connections notifier task

This is how things are done for TAP.
This pretty much removed the notifications' lock overhead on
store/delete/(front-end) OP latencies.

Change-Id: I32c3c26daf6ea8cebeecc2a81fb1f0e957ba3e3d
Reviewed-on: http://review.couchbase.org/56301
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-16915: Use refcounted pointers on producer/consumer 96/57296/11
Jim Walker [Mon, 30 Nov 2015 13:31:59 +0000 (13:31 +0000)]
MB-16915: Use refcounted pointers on producer/consumer

Prevents a race/crash occurring when the DcpProducer is destroyed
and there are backfill tasks running/pending.

The test case reveals the problem when run under valgrind as
a series of invalid reads of freed memory. E.g.

==40673== Thread 17:
==40673== Invalid read of size 8
==40673==    at 0x71A3CEE: DCPBackfill::run() (dcp-stream.cc:175)
==40673==    by 0x717215C: ExecutorThread::run() (executorthread.cc:110)
==40673==    by 0x7172868: launch_executor_thread (executorthread.cc:34)
==40673==    by 0x503EC67: platform_thread_wrap (cb_pthreads.c:24)
==40673==    by 0x524A181: start_thread (pthread_create.c:312)
==40673==    by 0x555A47C: clone (clone.S:111)
==40673==  Address 0x64c2380 is 48 bytes inside a block of size 384 free'd
==40673==    at 0x4C2C2BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==40673==    by 0x718C4ED: DcpConnMap::manageConnections() (atomic.h:430)
==40673==    by 0x71906A5: ConnManager::run() (connmap.cc:151)
==40673==    by 0x717215C: ExecutorThread::run() (executorthread.cc:110)
==40673==    by 0x7172868: launch_executor_thread (executorthread.cc:34)
==40673==    by 0x503EC67: platform_thread_wrap (cb_pthreads.c:24)
==40673==    by 0x524A181: start_thread (pthread_create.c:312)
==40673==    by 0x555A47C: clone (clone.S:111)

Change-Id: I32a7dfd10daa4565b9cbb4c8142ed8f71c13ca31
Reviewed-on: http://review.couchbase.org/57296
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-16632: Replace std::list with std::deque in DCP checkpoint processing 47/57147/8
Jim Walker [Wed, 4 Nov 2015 12:17:57 +0000 (12:17 +0000)]
MB-16632: Replace std::list with std::deque in DCP checkpoint processing

The algorithm does not need a std::list when it is implementing
nothing more than a queue.

This change brings some performance improvement to snapshot marker
creation.

Change-Id: I2f1ac82364737e9f56ff9c0c11b3cc1775b3f0d2
Reviewed-on: http://review.couchbase.org/57147
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-16632: Reducing locking contention in DCP-Producer/Stream 00/56300/15
abhinavdangeti [Thu, 22 Oct 2015 22:36:12 +0000 (15:36 -0700)]
MB-16632: Reducing locking contention in DCP-Producer/Stream

- Adding a new RWLock for streams in Producer and avoid queueLock
- Improving BufferLog and remove need for queueLock on access
- Adding an array of atomic bool for lockless vbucket ready notification
- Changing some ActiveStream variables to be atomic to allow for lockless
  updates.
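
As a sketch of the third bullet, lockless ready notification with an array of
atomic flags might look like the following. ReadyFlags, its method names and
the figure of 1024 vbuckets are assumptions made for the example, not taken
from the patch.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-vbucket ready flags, updated without a mutex.
class ReadyFlags {
public:
    ReadyFlags() {
        for (auto& flag : flags) {
            flag.store(false);
        }
    }

    // Returns true only for the caller that flipped the flag to ready,
    // so only that caller needs to notify the producer.
    bool markReady(uint16_t vb) { return !flags.at(vb).exchange(true); }

    // Clears the flag; returns true if the vbucket had been marked ready.
    bool takeReady(uint16_t vb) { return flags.at(vb).exchange(false); }

private:
    static constexpr std::size_t kNumVBuckets = 1024;  // assumed count
    std::array<std::atomic<bool>, kNumVBuckets> flags;
};
```

Because exchange() is a single atomic read-modify-write, two threads marking
the same vbucket cannot both see themselves as the first notifier.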

Change-Id: I11c54f1058c4c8a3f013dfc858a39d17362c9531
Reviewed-on: http://review.couchbase.org/56300
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoIncorrect log parameters while logging backfill completion 17/56417/3
abhinavdangeti [Mon, 23 Nov 2015 17:05:01 +0000 (09:05 -0800)]
Incorrect log parameters while logging backfill completion

Change-Id: I877fd7067862f09801ffd16e7014a0c952e8c559
Reviewed-on: http://review.couchbase.org/56417
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Daniel Owen <owend@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-16836: Reset the stat 'ep_bg_fetched' to 0 on 'cbstats reset' command 29/57129/2
Manu Dhundi [Tue, 17 Nov 2015 22:55:18 +0000 (14:55 -0800)]
MB-16836: Reset the stat 'ep_bg_fetched' to 0 on 'cbstats reset' command

Change-Id: I444bd6c76265788d4366061d4d2b25e3c5e60518
Reviewed-on: http://review.couchbase.org/57129
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-16357: Create a variable to get correct locking scope 76/56976/2
Jim Walker [Thu, 12 Nov 2015 12:04:56 +0000 (12:04 +0000)]
MB-16357: Create a variable to get correct locking scope

A mistake in 495e00acc24 means that no variable is
created for the ReaderLockHolder, so the compiler either
optimises away the lock constructor/destructor or the lock
scope is wrong.

Either way we need to create a variable.
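
A small C++ sketch of why the variable matters, using a hypothetical
ScopedFlag class in place of ReaderLockHolder. An unnamed temporary is
destroyed at the end of its own statement, so the "lock" is already released
when the following line runs; a named variable lives until the end of the
enclosing scope.

```cpp
#include <cassert>

// Hypothetical RAII holder: sets the flag while it is alive.
struct ScopedFlag {
    explicit ScopedFlag(bool& f) : flag(f) { flag = true; }
    ~ScopedFlag() { flag = false; }
    bool& flag;
};

bool heldWithTemporary() {
    bool held = false;
    ScopedFlag{held};  // temporary: constructed and destroyed right here
    return held;       // the "lock" is no longer held at this point
}

bool heldWithVariable() {
    bool held = false;
    ScopedFlag guard{held};  // named: lives until the function returns
    return held;             // the "lock" is still held at this point
}
```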

Change-Id: I642ac64d71b73d3d78207ff50d33539a06ce0e7e
Reviewed-on: http://review.couchbase.org/56976
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-16686: Remove sanity check while adding TAP over DCP 64/56564/3 v3.1.2
abhinavdangeti [Fri, 30 Oct 2015 17:11:46 +0000 (10:11 -0700)]
MB-16686: Remove sanity check while adding TAP over DCP

This check isn't accurate as certain TAP messages from
the producer carry no vbucket information - initialized to
zero (expected), as they aren't vbucket specific operations.
In such a scenario, if the TAP consumer needs to be created,
it wouldn't be allowed to if a DCP passive stream exists
for vbucket 0. This would break an online upgrade.

Change-Id: I310b9cf4dbaf652c233cba02de7ca72469efa89d
Reviewed-on: http://review.couchbase.org/56564
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-15171: [BP] Initialize dcpConnMap_ to NULL in engine constructor 71/56171/3
Sriram Ganesan [Thu, 28 May 2015 01:51:23 +0000 (18:51 -0700)]
MB-15171: [BP] Initialize dcpConnMap_ to NULL in engine constructor

Not initializing this variable to NULL can cause access to an
invalid pointer during engine destroy.

Change-Id: Icc5d848f7826bb6331deb40b4832efcf64622dea
Reviewed-on: http://review.couchbase.org/51492
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-on: http://review.couchbase.org/56171

5 years agoMB-14825: [BP] While trying to stream next checkpoint item, check if vbucket is valid 70/56170/3
Manu Dhundi [Thu, 7 May 2015 01:30:24 +0000 (18:30 -0700)]
MB-14825: [BP] While trying to stream next checkpoint item, check if vbucket is valid

If a vbucket is deleted in the middle of a DCP connection streaming a
checkpoint item, we should handle such a scenario gracefully.

Change-Id: I24fe52adc572f504f492f015f82fc8d5e0325925
Reviewed-on: http://review.couchbase.org/50674
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: Chiyoung Seo <chiyoung@couchbase.com>
Reviewed-on: http://review.couchbase.org/56170
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-16500 [BP]: Address data race in DcpConsumer, by acquiring readyMutex 65/56065/4
abhinavdangeti [Wed, 7 Oct 2015 21:49:41 +0000 (14:49 -0700)]
MB-16500 [BP]: Address data race in DcpConsumer, by acquiring readyMutex

WARNING: ThreadSanitizer: data race (pid=27652)

  Write of size 8 at 0x7d08000443c0 by main thread (mutexes: write M57876):
    #0 operator delete(void*) <null>:0 (engine_testapp+0x000000050e7b)
    #1 __gnu_cxx::new_allocator<std::_List_node<unsigned short> >::deallocate(std::_List_node<unsigned short>*, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/ext/new_allocator.h:110 (ep.so+0x00000005d69a)
    #2 DcpConsumer::step(dcp_message_producers*) /home/abhinav/couchbase/ep-engine/src/dcp/consumer.cc:516 (ep.so+0x00000005c5cc)
    #3 EvpDcpStep(engine_interface*, void const*, dcp_message_producers*) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:1479 (ep.so+0x0000000b480b)
    #4 mock_dcp_step(engine_interface*, void const*, dcp_message_producers*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:476 (engine_testapp+0x0000000bb055)
    #5 dcp_step(engine_interface*, engine_interface_v1*, void const*) /home/abhinav/couchbase/ep-engine/tests/ep_test_apis.cc:1219 (ep_testsuite.so+0x0000000b61bd)
    #6 test_chk_manager_rollback(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:5526 (ep_testsuite.so+0x0000000809b4)
    #7 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b952c)
    #8 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

  Previous write of size 8 at 0x7d08000443c0 by thread T16:
    #0 operator new(unsigned long) <null>:0 (engine_testapp+0x00000005090d)
    #1 __gnu_cxx::new_allocator<std::_List_node<unsigned short> >::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/ext/new_allocator.h:104 (ep.so+0x00000005f265)
    #2 PassiveStream::reconnectStream(RCPtr<VBucket>&, unsigned int, unsigned long) /home/abhinav/couchbase/ep-engine/src/dcp/stream.cc:1104 (ep.so+0x000000076f5f)
    #3 DcpConsumer::doRollback(unsigned int, unsigned short, unsigned long) /home/abhinav/couchbase/ep-engine/src/dcp/consumer.cc:676 (ep.so+0x00000005db67)
    #4 RollbackTask::run() /home/abhinav/couchbase/ep-engine/src/dcp/consumer.cc:574 (ep.so+0x00000005d9d4)
    #5 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:115 (ep.so+0x0000000f834c)
    #6 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f7eb5)
    #7 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d71)

Change-Id: I196a78e54bf8014967a51cdb081126597153f77b
Reviewed-on: http://review.couchbase.org/55881
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Reviewed-on: http://review.couchbase.org/56065
Reviewed-by: Dave Rigby <daver@couchbase.com>
5 years agoMB-16500 [BP]: Removing unnecessary locking in consumer code 80/56080/3
abhinavdangeti [Thu, 13 Aug 2015 18:46:35 +0000 (11:46 -0700)]
MB-16500 [BP]: Removing unnecessary locking in consumer code

streamMutex is to protect the ready list, but not the streams list.

The front end operations: addStream, closeStream, handleResponse, step
- wouldn't race with each other over the streams list, as multiple
memcached threads will not serve a single cookie.

The back end operations: processBufferedMessages (doesn't grab lock any
way), doRollback just read from streams list.

An addstream (front end op) is the only one that updates streams, and
this wouldn't update when a rollback is in progress.

Therefore, renaming the streamMutex lock in DCPConsumer to readyMutex
which is more apt for its operation - guarding the ready list.

Change-Id: Ia342d7243fef4b97b729aa94fdc64ad020711589
Reviewed-on: http://review.couchbase.org/54406
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Reviewed-on: http://review.couchbase.org/56080
Reviewed-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: abhinav dangeti <abhinav@couchbase.com>