ep-engine.git
4 years agoMerge remote-tracking branch 'couchbase/3.0.x' into sherlock 28/65328/2
Dave Rigby [Wed, 29 Jun 2016 09:17:13 +0000 (10:17 +0100)]
Merge remote-tracking branch 'couchbase/3.0.x' into sherlock

* couchbase/3.0.x:
  MB-19359: [2] Address lock inversion with vb's state lock and snapshot lock
  MB-19359: [1] Address lock inversion with vb's state lock and snapshot lock

Change-Id: Ia068af9b26e8a4b980bf22341fa43ac5452aca60

4 years agoMerge remote-tracking branch 'couchbase/3.0.x' into sherlock 89/65189/2
Dave Rigby [Tue, 28 Jun 2016 15:13:14 +0000 (16:13 +0100)]
Merge remote-tracking branch 'couchbase/3.0.x' into sherlock

* couchbase/3.0.x:
  MB-19343: Use cb_gmtime_r instead of gmtime_r
  [BP] MB-16366: Obtain vbstate readlock in numerous operations
  MB-19280: Fix data race in CouchKVStore stats access
  MB-19279: Fix race in use of gmtime()
  MB-19113: Suppress test_mb16357 when on thread sanitizer

Change-Id: Id289ae95e6fc5e03a64957cc41f53af9ee262a2a

4 years agoMerge "Merge remote-tracking branch 'couchbase/3.0.x' into sherlock" into sherlock
Dave Rigby [Tue, 28 Jun 2016 15:10:12 +0000 (15:10 +0000)]
Merge "Merge remote-tracking branch 'couchbase/3.0.x' into sherlock" into sherlock

4 years agoMB-19948: Handle 18 bytes of metadata 16/65016/3
Jim Walker [Thu, 16 Jun 2016 15:57:49 +0000 (16:57 +0100)]
MB-19948: Handle 18 bytes of metadata

Correctly read metadata of various sizes.

* 16 bytes
* 18 bytes
* 19 bytes

Are all possible sizes stored in couchdb by ep-engine.

Change-Id: Iede967ba0ce45e95e38c1f6cdb47a5164ab3c5d3
Reviewed-on: http://review.couchbase.org/65016
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
4 years agoMB-19948: CouchKVStore metadata tests 14/65014/3
Jim Walker [Thu, 16 Jun 2016 12:00:02 +0000 (13:00 +0100)]
MB-19948: CouchKVStore metadata tests

This commit contains some new tests to exercise the code
which assembles our metadata into couchstore.

There are upstream fixes and refactoring which will utilise
these tests for some positive vibes about maintaining correctness
as the code is changed.

Change-Id: I4facbc343133db1ba9a7bf76b8ba9834c3f69cae
Reviewed-on: http://review.couchbase.org/65014
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
4 years agoMerge remote-tracking branch 'couchbase/3.0.x' into sherlock 87/65187/3
Dave Rigby [Thu, 23 Jun 2016 10:44:53 +0000 (11:44 +0100)]
Merge remote-tracking branch 'couchbase/3.0.x' into sherlock

* commit 'a430629':
  MB-19278: Fix lock-order inversion on ActiveStream::streamMutex
  MB-19277: Set executorThread's waketime to atomic
  MB-19276: Fix data race on ExecutorThread::taskStart
  MB-19275: Address data race on a DCP stream's state
  MB-19273: Fix data race on PassiveStream::buffer.{bytes,items}
  MB-19260: Make cookie atomic to serialize set/get in ConnHandler
  MB-19259: Fix data race on DcpConsumer::backoffs
  MB-19258: Address data race with replicationThrottle parameters
  MB-19281: [BP] Add template class RelaxedAtomic<>
  MB-19257: Fix data race on ExecutorThread::now
  MB-19256: Address possible data race on VBCBAdaptor::currentvb

Further merge of mostly TSan fixes from 3.0.x into sherlock.

Change-Id: Ic88c446c4e09d669f7a4da7f8cb2f97c13d70ab7

4 years agoMerge remote-tracking branch 'couchbase/3.0.x' into sherlock 08/65008/1
Dave Rigby [Fri, 17 Jun 2016 11:41:12 +0000 (12:41 +0100)]
Merge remote-tracking branch 'couchbase/3.0.x' into sherlock

* couchbase/3.0.x:
  MB-19253: Fix race in void ExecutorPool::doWorkerStat
  MB-19252: Fix data race on Stream::readyQueueMemory
  MB-19251: Fix race in updating Vbucket.file{SpaceUsed,Size}
  MB-19249: Address possible data races in ConnHandler context
  MB-19248: Fix race in TaskQueue.{ready,future,pending}Queue access
  MB-19247: Fix possible data race in workload.h: workloadPattern
  MB-19246: Fix potentially incorrect persist_time in OBSERVE response
  MB-19229: Address possible data race in vbucket.cc: numHpChks
  MB-19228: Address possible data races in ActiveStream context
  MB-19227: Fix race in ConnNotifier.task access

Change-Id: I184b86cd800e406b5be96ec5f7c456e73f54b05c

4 years agoMerge "Merge remote-tracking branch 'couchbase/3.0.x' into sherlock" into sherlock
Dave Rigby [Fri, 17 Jun 2016 10:20:42 +0000 (10:20 +0000)]
Merge "Merge remote-tracking branch 'couchbase/3.0.x' into sherlock" into sherlock

4 years agoMerge "Merge remote-tracking branch 'couchbase/3.0.x' into sherlock" into sherlock
Dave Rigby [Fri, 17 Jun 2016 08:32:22 +0000 (08:32 +0000)]
Merge "Merge remote-tracking branch 'couchbase/3.0.x' into sherlock" into sherlock

4 years agoMerge "Merge remote-tracking branch 'couchbase/3.0.x' into sherlock" into sherlock
Dave Rigby [Fri, 17 Jun 2016 08:29:58 +0000 (08:29 +0000)]
Merge "Merge remote-tracking branch 'couchbase/3.0.x' into sherlock" into sherlock

4 years agoMB-19897: Fix the data race on lastSendTime between stats and dcp worker threads 59/64959/2
Manu Dhundi [Wed, 15 Jun 2016 17:08:50 +0000 (10:08 -0700)]
MB-19897: Fix the data race on lastSendTime between stats and dcp worker threads

Fix the thread sanitizer warning
http://cv.jenkins.couchbase.com/job/ep-engine-threadsanitizer-3.0.x/258/console

WARNING: ThreadSanitizer: data race (pid=102290)
  Read of size 4 at 0x7d580000f71c by thread T14 (mutexes: write M969):
    #0 void ConnHandler::addStat<unsigned int>(char const*, unsigned int const&, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) const /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/tapconnection.h:294 (ep.so+0x00000020e9f3)
    #1 DcpProducer::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-producer.cc:557 (ep.so+0x00000027fbe5)
    #2 ConnStatBuilder::operator()(SingleThreadedRCPtr<ConnHandler>&) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep_engine.cc:3701 (ep.so+0x000000183c44)
    #3 ConnStatBuilder std::for_each<std::_List_iterator<SingleThreadedRCPtr<ConnHandler> >, ConnStatBuilder>(std::_List_iterator<SingleThreadedRCPtr<ConnHandler> >, std::_List_iterator<SingleThreadedRCPtr<ConnHandler> >, ConnStatBuilder) /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/bits/stl_algo.h:3755 (ep.so+0x0000001838a5)
    #4 void ConnMap::each_UNLOCKED<ConnStatBuilder>(ConnStatBuilder) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/connmap.h:148 (ep.so+0x000000183808)
    #5 void ConnMap::each<ConnStatBuilder>(ConnStatBuilder) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/connmap.h:140 (ep.so+0x00000017732e)
    #6 EventuallyPersistentEngine::doDcpStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep_engine.cc:3954 (ep.so+0x00000014c5ed)

  Previous write of size 4 at 0x7d580000f71c by main thread:
    #0 DcpProducer::maybeSendNoop(dcp_message_producers*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-producer.cc:740 (ep.so+0x00000027d8ce)
    #1 DcpProducer::step(dcp_message_producers*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/dcp-producer.cc:323 (ep.so+0x00000027c920)
    #2 EvpDcpStep(engine_interface*, void const*, dcp_message_producers*) /home/couchbase/jenkins/workspace/ep-engine-threadsanitizer-3.0.x/ep-engine/src/ep_engine.cc:1404 (ep.so+0x000000138baa)

Change-Id: I2a2b0b0f01b10ecb31701bfc2330881bbafc6b74
Reviewed-on: http://review.couchbase.org/64959
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
4 years agoMerge remote-tracking branch 'couchbase/3.0.x' into sherlock 81/64981/1
Dave Rigby [Thu, 16 Jun 2016 16:59:54 +0000 (17:59 +0100)]
Merge remote-tracking branch 'couchbase/3.0.x' into sherlock

* couchbase/3.0.x:
  MB-19226: Address potential data races in the warmup code
  MB-19225: Fix data race on Flusher::taskId
  MB-19225: Fix race in Flusher._state
  MB-19224: Address possible data race with global task's waketime

Change-Id: Idc461799bc50bd1274f3ffafab4b3257a024327b

4 years agoMerge remote-tracking branch 'couchbase/3.0.x' into sherlock 80/64980/2
Dave Rigby [Thu, 16 Jun 2016 16:01:14 +0000 (17:01 +0100)]
Merge remote-tracking branch 'couchbase/3.0.x' into sherlock

* couchbase/3.0.x:
  MB-19223: Switch to hrtime from timeval in Global Thread Pool
  MB-19222: Fix race condition in TaskQueue shutdown
  MB-19220: Ensure HashTable::size is atomic

Change-Id: I6e36e57c8394bb0eb147c697a81d0cbeeae423f7

4 years agoMerge remote-tracking branch 'couchbase/3.0.x' into sherlock 77/64977/1
Dave Rigby [Thu, 16 Jun 2016 08:42:22 +0000 (09:42 +0100)]
Merge remote-tracking branch 'couchbase/3.0.x' into sherlock

* couchbase/3.0.x:
  MB-19204: ep_testsuite: Don't release the item while we're using it
  MB-19204: Address data race in ep_test_apis/testsuite
  MB-19204: ep_testsuite: Use std::string for last_key/body
  MB-19204: Remove alarm() call from atomic_ptr_test, reduce iteration count
  MB-19204: hash_table_test: Fix TSan issues

Start of merge of 3.1.5+ changes into sherlock, broken into multiple
merges due to the size.

Change-Id: I65530d3c81d6b5e8b0171d0e3e1da3e14e0bb308

4 years agoMerge tag 'v3.1.5' into sherlock 55/64955/1
Dave Rigby [Wed, 15 Jun 2016 11:14:34 +0000 (12:14 +0100)]
Merge tag 'v3.1.5' into sherlock

3.1.5 release (ep-engine)

* tag 'v3.1.5':
  MB-16656: Send snapshotEnd as highSeqno for replica vb in GET_ALL_VB_SEQNOS call
  MB-19153: Break circular dependency while deleting bucket
  MB-19113: Address false positive lock inversion seen with test_mb16357

Change-Id: I2e7cd72f09c8b2b3780568ed7f7ca81fde064cb9

4 years agoMB-19843: Modify the end_seqno in DCP stream request after checking for rollback 06/64906/4
Manu Dhundi [Tue, 14 Jun 2016 18:05:04 +0000 (11:05 -0700)]
MB-19843: Modify the end_seqno in DCP stream request after checking for rollback

During a DCP stream request, we will update the end seqno when flags
DCP_ADD_STREAM_FLAG_LATEST/DCP_ADD_STREAM_FLAG_DISKONLY are used.
Currently in some cases when a rollback is required, the end_seqno could become
less than start_seqno before we check if a rollback is needed, resulting in
rejection of stream request.

Hence we should modify the end_seqno (if required as per the flags) only after
checking if a rollback is needed.

Change-Id: I23b112c16b9167023a990a5709ae6aae4838472e
Reviewed-on: http://review.couchbase.org/64906
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
4 years agoMB-19897: Only update sendTime if successfully send noop 78/64878/4
Daniel Owen [Thu, 21 Apr 2016 09:51:48 +0000 (10:51 +0100)]
MB-19897: Only update sendTime if successfully send noop

In the maybeSendNoop function when a DCP producer attempts
to send a noop to a consumer it can receive back
ENGINE_SUCCESS or ENGINE_E2BIG.

We should only set pendingRecv to true and update the
last sendTime if ENGINE_SUCCESS is returned.

Change-Id: Ice8a66dcae35505d7bab7d261f080d5ffb95c8e3
Reviewed-on: http://review.couchbase.org/64878
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19897: Record time for all DCP consumer messages 79/64879/3
Daniel Owen [Mon, 25 Apr 2016 13:06:41 +0000 (14:06 +0100)]
MB-19897: Record time for all DCP consumer messages

The DCP documentation states that the consumer should see
some sort of message or a No-Op message in a period
equal to twice the noop interval otherwise it should close
its connection.  See documentation/commands/no-op.md in
https://github.com/couchbaselabs/dcp-documentation

This patch changes from checking only the receival of a
no-op message to check for recieving the following messages
- add stream
- close stream
- deletion
- expiration
- flush
- mutation
- set VBucket state
- snapshot Marker
- stream end

Change-Id: Ib2268dba339cbf3701f3c7782ee8256bddc79ba3
Reviewed-on: http://review.couchbase.org/64879
Tested-by: buildbot <build@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
4 years agoMB-18452: Force DcpConsumer processor task to yield 18/64718/4
Jim Walker [Fri, 3 Jun 2016 10:49:55 +0000 (11:49 +0100)]
MB-18452: Force DcpConsumer processor task to yield

Introduce two config tunable values that limit the DCP processor from
running 'forever'.

* dcp_consumer_process_buffered_messages_yield_limit
* dcp_consumer_process_buffered_messages_batch_size

The yield parameter forces the NONIO task to yield when the
limit is reached.

Change-Id: Ifce5a18fc807285471b08e9737cedb5db2b7923f
Reviewed-on: http://review.couchbase.org/64718
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19678: Merge backfill and in-memory snapshots correctly on replica vb 92/64192/4
Manu Dhundi [Tue, 17 May 2016 18:55:28 +0000 (11:55 -0700)]
MB-19678: Merge backfill and in-memory snapshots correctly on replica vb

When a DCP client, on replica vb, opens a stream which it intends to
keep open forever, merge the backfill and in-memory snapshots by using the
the checkpoint snapshot_end as snapshot_end_seqno.

Change-Id: Ic05a59ccafa54bbee19882707404a78c47002be7
Reviewed-on: http://review.couchbase.org/64192
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19690: Fix compile warning introduced by fix for 16656 08/64208/4
Dave Rigby [Thu, 19 May 2016 11:13:07 +0000 (12:13 +0100)]
MB-19690: Fix compile warning introduced by fix for 16656

Fix compile warning in ep_test_apis.cc introduced by
00272e095b58fb5267b950b0511a0b9147b40dd5 - the fix for MB-16656.

Change-Id: I1f005116faeee8999dd2c2c87fd5c96a9b7f4a4b
Reviewed-on: http://review.couchbase.org/64208
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19635: Initialise failovers correctly from 2.5.x vbstate 55/64155/2
Jim Walker [Tue, 17 May 2016 16:41:10 +0000 (17:41 +0100)]
MB-19635: Initialise failovers correctly from 2.5.x vbstate

When loading a vb file, don't force the failover table data
to be ("[{\"id\":0,\"seq\":0}]"); if the file doesn't contain
any data.

Change-Id: I41673bf848fcbab9b616edec5c7fd2ab9a3ddd6b
Reviewed-on: http://review.couchbase.org/64155
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
4 years agoMB-19503: Fix ConnMap so notifications don't go missing [2] 72/64072/3
Jim Walker [Mon, 16 May 2016 15:24:35 +0000 (16:24 +0100)]
MB-19503: Fix ConnMap so notifications don't go missing [2]

Previous patch[1] cleared the isNotificationScheduled flag
at the wrong place and meant things could then never
again get scheduled.

This is because we only cleared the flag if tp->isPaused()
yet we still pop the notification from the queue, so we
left tp->isNotificationScheduled yet the queue is empty.
Now no more notifications will ever get scheduled!

So we need to clear the notification scheduled boolean
unconditionally of the other flags on tp.

[1] - Commit 0856e0b3d3fc6

Change-Id: I11c9fd72f4b35102328022bd4c334a9e09a61cd0
Reviewed-on: http://review.couchbase.org/64072
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19503: Fix ConnMap so notifications don't go missing. 25/64025/2
Jim Walker [Wed, 11 May 2016 15:26:47 +0000 (16:26 +0100)]
MB-19503: Fix ConnMap so notifications don't go missing.

There's a reliance on an atomic bool and cmpxchg to
prevent the producer of notification from queueing
himself if he's already got a notification scheduled.

There's an ordering issue though where the producers code
can execute, see the flag is true and not bother queueing
a notification, yet the consumer side is about to clear the
flag and finish. The notification thus never gets queued
and the producer side thinks he will get a notification.

In my terminology:
producer is ConnMap::notifyPausedConnection
consumer is ConnMap::notifyAllPausedConnections

Change-Id: Id324b6369c5ee3a6b6758a7a93e017a4ff7c4a78
Reviewed-on: http://review.couchbase.org/64025
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19359: [2] Address lock inversion with vb's state lock and snapshot lock 66/63366/6
abhinavdangeti [Tue, 26 Apr 2016 18:03:18 +0000 (11:03 -0700)]
MB-19359: [2] Address lock inversion with vb's state lock and snapshot lock

+ [Not a backport, this code was altered/removed in master]
+ Address this lock inversion by moving reading the vbucket snapshot
  range to outside the vbucket's state lock context.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=245522)
  Cycle in lock order graph: M21518 (0x7d640003e220) => M21515 (0x7d640003e0f0) => M21518

  Mutex M21515 acquired here while holding mutex M21518 in thread T17:
    #0 pthread_rwlock_rdlock <null> (engine_testapp+0x000000462260)
    #1 cb_rw_reader_enter <null> (libplatform.so.0.1.0+0x000000004800)
    #2 RWLock::readerLock() ep-engine/src/rwlock.h:38 (ep.so+0x000000132360)
    #3 ReaderLockHolder::ReaderLockHolder(RWLock&) ep-engine/src/locks.h:167 (ep.so+0x0000000f8087)
    #4 EventuallyPersistentStore::addTAPBackfillItem(Item const&, unsigned char, bool) ep-engine/src/ep.cc:851 (ep.so+0x0000000d9ba7)
    #5 PassiveStream::commitMutation(MutationResponse*, bool) ep-engine/src/dcp-stream.cc:1370 (ep.so+0x00000029dd8c)
    #6 PassiveStream::processMutation(MutationResponse*) ep-engine/src/dcp-stream.cc:1346 (ep.so+0x00000029cbd0)
    #7 PassiveStream::processBufferedMessages(unsigned int&) ep-engine/src/dcp-stream.cc:1286 (ep.so+0x00000029c522)
    #8 DccpConsumer::processBufferedItems() ep-engine/src/dcp-consumer.cc:599 (ep.so+0x000000262e04)
    #9 Processer::run() ep-engine/src/dcp-consumer.cc:48 (ep.so+0x0000002629ff)
    #10 ExecutorThread::run() ep-engine/src/executorthread.cc:109 (ep.so+0x0000001e38f1)
    #11 launch_executor_thread(void*) ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e2f1a)
    #12 platform_thread_wrap platform/src/cb_pthreads.c (libplatform.so.0.1.0+0x00000000377c)

  Mutex M21518 acquired here while holding mutex M21515 in main thread:
    #0 pthread_mutex_lock <null> (engine_testapp+0x00000047e9e0)
    #1 cb_mutex_enter <null> (libplatform.so.0.1.0+0x0000000039c0)
    #2 Mutex::acquire() ep-engine/src/mutex.cc:31 (ep.so+0x0000001e241e)
    #3 LockHolder::lock() ep-engine/src/locks.h:71 (ep.so+0x000000080bc3)
    #4 LockHolder::LockHolder(Mutex&, bool) ep-engine/src/locks.h:48 (ep.so+0x000000080832)
    #5 VBucket::getCurrentSnapshot(unsigned long&, unsigned long&) ep-engine/src/vbucket.h:233 (ep.so+0x0000000fae05)
    #6 EventuallyPersistentEngine::doSeqnoStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), char const*, int) ep-engine/src/ep_engine.cc:4255 (ep.so+0x00000014f202)
    #7 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4372 (ep.so+0x000000150bcb)
    #8 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:214 (ep.so+0x000000136a72)
    #9 mock_get_stats memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c8403)
    #10 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) ep-engine/tests/ep_test_apis.cc:771 (ep_testsuite.so+0x0000000e21ea)
    #11 wait_for_stat_to_be(engine_interface*, engine_interface_v1*, char const*, int, char const*) ep-engine/tests/ep_test_apis.cc:860 (ep_testsuite.so+0x0000000e8d2b)
    #12 test_dcp_replica_stream_backfill(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite.cc:5306 (ep_testsuite.so+0x00000008e601)
    #13 execute_test memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c4e9f)
    #14 main crtstuff.c (engine_testapp+0x0000004c2e01)

Change-Id: Ia4dd34ab152d1cc1d1658ebe957da7c3b8d32c06
Reviewed-on: http://review.couchbase.org/63366
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19359: [1] Address lock inversion with vb's state lock and snapshot lock 19/63319/7
abhinavdangeti [Mon, 25 Apr 2016 17:51:07 +0000 (10:51 -0700)]
MB-19359: [1] Address lock inversion with vb's state lock and snapshot lock

+ [Not a backport, this code was altered/removed in master]
+ This scenario should not occur in real operation.
+ Most DCP unit tests would however flag this as an issue because
  of how we do things in the tests --> [setting vbucket's state to
  replica at the very beginning (by the main thread)].
+ Suppressing this lock inversion, by moving the function call to
  update the vbucket's snapshot range to outside the state lock
  context in setState(), as it isn't necessary to acquire the state
  lock to update the snapshot range.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=39750)
  Cycle in lock order graph: M43306 (0x7d640000fcf0) => M43309 (0x7d640000fe18) => M43306

  Mutex M43309 acquired here while holding mutex M43306 in main thread:
    #0 pthread_mutex_lock <null> (engine_testapp+0x000000474420)
    #1 cb_mutex_enter /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:85 (libplatform.so.0.1.0+0x0000000034a0)
    #2 Mutex::acquire() /home/daver/repos/couchbase/server/ep-engine/src/mutex.cc:31 (ep.so+0x0000001c611e)
    #3 LockHolder::lock() /home/daver/repos/couchbase/server/ep-engine/src/locks.h:71 (ep.so+0x00000006a4e3)
    #4 LockHolder /home/daver/repos/couchbase/server/ep-engine/src/locks.h:48 (ep.so+0x00000006a172)
    #5 VBucket::setCurrentSnapshot(unsigned long, unsigned long) /home/daver/repos/couchbase/server/ep-engine/src/vbucket.h:217 (ep.so+0x0000000e5ee5)
    #6 VBucket::setState(vbucket_state_t, server_handle_v1_t*) /home/daver/repos/couchbase/server/ep-engine/src/vbucket.cc:196 (ep.so+0x0000002932e9)
    #7 EventuallyPersistentStore::setVBucketState(unsigned short, vbucket_state_t, bool, bool) /home/daver/repos/couchbase/server/ep-engine/src/ep.cc:1060 (ep.so+0x0000000c0b61)
    #8 EventuallyPersistentEngine::setVBucketState(unsigned short, vbucket_state_t, bool) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.h:628 (ep.so+0x000000188a12)
    #9 setVBucket(EventuallyPersistentEngine*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:824 (ep.so+0x00000014aaaa)
    #10 processUnknownCommand(EventuallyPersistentEngine*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:1118 (ep.so+0x000000147707)
    #11 EvpUnknownCommand(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:1312 (ep.so+0x00000011a055)
    #12 mock_unknown_command /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:335 (engine_testapp+0x0000004be97a)
    #13 set_vbucket_state(engine_interface*, engine_interface_v1*, unsigned short, vbucket_state_t) /home/daver/repos/couchbase/server/ep-engine/tests/ep_test_apis.cc:484 (ep_testsuite.so+0x0000000e1562)
    #14 test_dcp_replica_stream_backfill(engine_interface*, engine_interface_v1*) /home/daver/repos/couchbase/server/ep-engine/tests/ep_testsuite.cc:5278 (ep_testsuite.so+0x00000008c84a)
    #15 execute_test /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1042 (engine_testapp+0x0000004ba8ff)
    #16 main /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1296 (engine_testapp+0x0000004b8861)

  Mutex M43306 acquired here while holding mutex M43309 in thread T17:
    #0 pthread_rwlock_rdlock <null> (engine_testapp+0x000000457ca0)
    #1 cb_rw_reader_enter /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:264 (libplatform.so.0.1.0+0x0000000042e0)
    #2 RWLock::readerLock() /home/daver/repos/couchbase/server/ep-engine/src/rwlock.h:38 (ep.so+0x000000115cf0)
    #3 ReaderLockHolder /home/daver/repos/couchbase/server/ep-engine/src/locks.h:167 (ep.so+0x0000000dbbe7)
    #4 EventuallyPersistentStore::addTAPBackfillItem(Item const&, unsigned char, bool) /home/daver/repos/couchbase/server/ep-engine/src/ep.cc:851 (ep.so+0x0000000be35d)
    #5 PassiveStream::commitMutation(MutationResponse*, bool) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:1370 (ep.so+0x00000027e8cc)
    #6 PassiveStream::processMutation(MutationResponse*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:1346 (ep.so+0x00000027d680)
    #7 PassiveStream::processBufferedMessages(unsigned int&) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:1286 (ep.so+0x00000027cfbc)
    #8 DcpConsumer::processBufferedItems() /home/daver/repos/couchbase/server/ep-engine/src/dcp-consumer.cc:599 (ep.so+0x0000002454cc)
    #9 Processer::run() /home/daver/repos/couchbase/server/ep-engine/src/dcp-consumer.cc:48 (ep.so+0x0000002450ef)
    #10 ExecutorThread::run() /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:109 (ep.so+0x0000001c76d9)
    #11 launch_executor_thread(void*) /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001c6caa)
    #12 platform_thread_wrap /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x00000000325c)

Change-Id: I2f3cf88e6aebbf6078f533fb1ed87bd9fe618616
Reviewed-on: http://review.couchbase.org/63319
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19343: Use cb_gmtime_r instead of gmtime_r 14/63314/2
Trond Norbye [Tue, 18 Nov 2014 07:49:11 +0000 (08:49 +0100)]
MB-19343: Use cb_gmtime_r instead of gmtime_r

Backport / cherry-pick from: bc660d479709b5eee74357920a1940294c786216
to fix Windows build break.

Change-Id: I49d19bbea22e31bd600f694acf89d98ffa3a62f3
Reviewed-on: http://review.couchbase.org/63314
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years ago[BP] MB-16366: Obtain vbstate readlock in numerous operations 65/62965/4
Jim Walker [Fri, 9 Oct 2015 14:14:28 +0000 (15:14 +0100)]
[BP] MB-16366: Obtain vbstate readlock in numerous operations

Any KV update operations grab the lock early and test that VB state
is active, they keep the lock until complete, this certainly protects
queueDirty from colliding with a VB state change and also any other
paths we're unaware of.

The GET operations only use the read lock if the GET has triggered a
expiry/queueDirty.

A couple of other locations that trigger queueDirty are also interlocked
with VB state changes.

(Already Reviewed-on: http://review.couchbase.org/55868)

Change-Id: Icaee69520da230a9fdde6eb85365a7ddae790fd6
Reviewed-on: http://review.couchbase.org/62965
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19280: Fix data race in CouchKVStore stats access 81/63081/6
Dave Rigby [Wed, 7 Oct 2015 15:27:03 +0000 (15:27 +0000)]
MB-19280: Fix data race in CouchKVStore stats access

As reported by ThreadSanitizer. CouchKVStore maintains a map of
vBucketID to counter - dbFileRevMap. This is read by some of the stats
functions (e.g. getNumPersistedDeletes) without a lock and hence there
is a potential race.

Solve this by changing the type of these counters to RelaxedAtomic<>.

WARNING: ThreadSanitizer: data race (pid=10155)
  Read of size 8 at 0x7d9000008000 by main thread (mutexes: write M21730):
    #0 CouchKVStore::getNumPersistedDeletes(unsigned short) ep-engine/src/couch-kvstore/couch-kvstore.cc:2095 (ep.so+0x000000326779)
    #1 EventuallyPersistentEngine::doDcpVbTakeoverStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), std::string&, unsigned short) ep-engine/src/ep_engine.cc:5312 (ep.so+0x000000155ca5)
    #2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4462 (ep.so+0x000000154622)

  Previous write of size 8 at 0x7d9000008000 by thread T8 (mutexes: write M15079):
    #0 CouchKVStore::updateDbFileMap(unsigned short, unsigned long) ep-engine/src/couch-kvstore/couch-kvstore.cc:1306 (ep.so+0x000000311d3c)
    #1 CouchKVStore::openDB(unsigned short, unsigned long, _db**, unsigned long, unsigned long*) ep-engine/src/couch-kvstore/couch-kvstore.cc:1336 (ep.so+0x00000030f7ae)
    #2 CouchKVStore::setVBucketState(unsigned short, vbucket_state&, Callback<KVStatsCtx>*) ep-engine/src/couch-kvstore/couch-kvstore.cc:981 (ep.so+0x00000031a557)
    #3 CouchKVStore::snapshotVBucket(unsigned short, vbucket_state&, Callback<KVStatsCtx>*) ep-engine/src/couch-kvstore/couch-kvstore.cc:891 (ep.so+0x00000031a11c)
    #4 EventuallyPersistentStore::snapshotVBuckets(Priority const&, unsigned short) ep-engine/src/ep.cc:949 (ep.so+0x0000000dce69)

Change-Id: I83db17ffce0d0a49cfe80f23a34e5dac25ede719
Reviewed-on: http://review.couchbase.org/63081
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19279: Fix race in use of gmtime() 80/63080/5
Dave Rigby [Wed, 12 Nov 2014 17:05:56 +0000 (17:05 +0000)]
MB-19279: Fix race in use of gmtime()

As identified by ThreadSanitizer:

    WARNING: ThreadSanitizer: data race (pid=17259)
      Write of size 8 at 0x7fec86e44de0 by main thread (mutexes: write M1161):
#0 gmtime ??:0 (libtsan.so.0+0x000000025135)
#1 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:3369 (ep.so+0x00000010f4be)
#2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:4339 (ep.so+0x000000113c35)
#3 EvpGetStats /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:217 (ep.so+0x000000102b14)
#4 mock_get_stats /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:195 (exe+0x0000000026de)
#5 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:799 (ep_testsuite.so+0x0000000832d8)
#6 wait_for_stat_change(engine_interface*, engine_interface_v1*, char const*, int, char const*) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:846 (ep_testsuite.so+0x0000000838d6)
#7 test_setup /repos/couchbase/server/source/ep-engine/tests/ep_testsuite.cc:178 (ep_testsuite.so+0x00000001ee69)
#8 execute_test /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:1050 (exe+0x000000005970)
#9 main /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x000000006606)

      Previous write of size 8 at 0x7fec86e44de0 by thread T7:
#0 gmtime ??:0 (libtsan.so.0+0x000000025135)
#1 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:3369 (ep.so+0x00000010f4be)
#2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:4339 (ep.so+0x000000113c35)
#3 EventuallyPersistentStore::snapshotStats() /repos/couchbase/server/source/ep-engine/src/ep.cc:1465 (ep.so+0x0000000e150e)
#4 StatSnap::run() /repos/couchbase/server/source/ep-engine/src/tasks.cc:79 (ep.so+0x000000174db2)
#5 ExecutorThread::run() /repos/couchbase/server/source/ep-engine/src/executorthread.cc:110 (ep.so+0x00000014a0e7)
#6 launch_executor_thread /repos/couchbase/server/source/ep-engine/src/executorthread.cc:34 (ep.so+0x000000149930)
#7 platform_thread_wrap /repos/couchbase/server/source/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x000000002d8b)
#8 __tsan_write_range ??:0 (libtsan.so.0+0x00000001b1c9)

Switch to gmtime_r which is thread-safe.

Change-Id: Id0773df65f4fc569c0a173b6185b1ef8bd91862d
Reviewed-on: http://review.couchbase.org/63080
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19113: Suppress test_mb16357 when on thread sanitizer 22/62922/10
abhinavdangeti [Fri, 15 Apr 2016 18:59:14 +0000 (11:59 -0700)]
MB-19113: Suppress test_mb16357 when on thread sanitizer

This is to suppress a false positive thrown by thread
sanitizer regarding a lock inversion that would never
occur in operation.
    The lock inversion pointed out is between the front
end work load thread, that grabs the hash table partition
lock and then the vbucket snapshot lock, while the dcp
consumer processer task grabs the snapshot lock and then
the hash table partition lock. Note that the first thread
always works on an active vbucket, while the second thread
always works on a replica vbucket, and the vbucket cannot
be in active and replica state(s) at the same time.

Change-Id: I5e42e14a2333b0720d8c43c9e2a4d7190696f5e9
Reviewed-on: http://review.couchbase.org/62922
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>

4 years agoMB-19278: Fix lock-order inversion on ActiveStream::streamMutex 33/63033/6
Dave Rigby [Tue, 19 Apr 2016 13:41:06 +0000 (14:41 +0100)]
MB-19278: Fix lock-order inversion on ActiveStream::streamMutex

ThreadSanitizer has identified a potential deadlock due to a cycle in
  the lock order graph: Cycle in lock order graph:

    M43515         => M36787      => M36848            => M43515
   [ActiveStream::   [TaskQueue::   [ExecutorThread::    [ActiveStream::
    streamMutex]      mutex]         currentTaskMutex]    streamMutex]

The crux of the problem appears to be the acquisition of streamMutex
in the destructor of ActiveStream. This is ultimately a Bad Idea - if
you still have multiple threads accessing an object when it's been
deleted then you are already into undefined behaviour.

Change-Id: I2353b5a8ed93a4f9e8cc036cb85680c185cbcc2f
Reviewed-on: http://review.couchbase.org/63033
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Daniel Owen <owend@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19277: Set executorThread's waketime to atomic 29/63029/6
abhinavdangeti [Thu, 22 Oct 2015 01:55:28 +0000 (18:55 -0700)]
MB-19277: Set executorThread's waketime to atomic

This one looks benign - we only perform the dirty read when
calculating the %s:waketime stat which is not used by anyone apart
from end-users.

WARNING: ThreadSanitizer: data race (pid=41666)
  Read of size 8 at 0x7d4400008370 by main thread (mutexes: write M21616):
    #0 ExecutorThread::getWaketime() ep-engine/src/executorthread.h:120 (ep.so+0x0000001cac0e)
    #1 addWorkerStats(char const*, ExecutorThread*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/executorpool.cc:692 (ep.so+0x0000001b6f06)
    #2 ExecutorPool::doWorkerStat(EventuallyPersistentEngine*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/executorpool.cc:706 (ep.so+0x0000001b6734)
    #3 EventuallyPersistentEngine::doDispatcherStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4139 (ep.so+0x00000015223e)
    #4 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4375 (ep.so+0x000000155336)

  Previous write of size 8 at 0x7d4400008370 by thread T6 (mutexes: write M14985):
    #0 TaskQueue::_fetchNextTask(ExecutorThread&, bool) ep-engine/src/taskqueue.cc:125 (ep.so+0x00000025b3fd)
    #1 TaskQueue::fetchNextTask(ExecutorThread&, bool) ep-engine/src/taskqueue.cc:161 (ep.so+0x00000025bfcf)
    #2 ExecutorPool::_nextTask(ExecutorThread&, unsigned char) ep-engine/src/executorpool.cc:217 (ep.so+0x0000001afc67)
    #3 ExecutorPool::nextTask(ExecutorThread&, unsigned char) ep-engine/src/executorpool.cc:232 (ep.so+0x0000001afe3f)
    #4 ExecutorThread::run() ep-engine/src/executorthread.cc:81 (ep.so+0x0000001e85c9)

Change-Id: I34b12681dd9dfc87c889f301692ca714f04d2a82
Reviewed-on: http://review.couchbase.org/63029
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19276: Fix data race on ExecutorThread::taskStart 28/63028/6
abhinavdangeti [Tue, 8 Dec 2015 21:48:20 +0000 (13:48 -0800)]
MB-19276: Fix data race on ExecutorThread::taskStart

WARNING: ThreadSanitizer: data race (pid=236996)
  Write of size 8 at 0x7d380000dbd8 by thread T5:
    #0 ExecutorThread::run() ep-engine/src/executorthread.cc:105 (ep.so+0x0000000ee0cc)
    #1 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000edd75)
    #2 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:54 (libplatform.so.0.1.0+0x000000004e7b)

  Previous read of size 8 at 0x7d380000dbd8 by main thread (mutexes: write M266835151185571736):
    #0 addWorkerStats(char const*, ExecutorThread*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/executorthread.h:108 (ep.so+0x0000000ea8dc)
    #1 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4363 (ep.so+0x0000000c05dd)
    #2 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:219 (ep.so+0x0000000af40e)
    #3 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:240 (engine_testapp+0x0000004cc71d)
    #4 test_worker_stats(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite.cc:9464 (ep_testsuite.so+0x00000003c038)
    #5 execute_test(test, char const*, char const*) memcached/programs/engine_testapp/engine_testapp.cc:1091 (engine_testapp+0x0000004cb315)
    #6 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

Change-Id: If01aba3cf6b3591f19c5bb62119e40998f12c8ff
Reviewed-on: http://review.couchbase.org/63028
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19275: Address data race on a DCP stream's state 27/63027/6
abhinavdangeti [Wed, 28 Oct 2015 21:58:55 +0000 (14:58 -0700)]
MB-19275: Address data race on a DCP stream's state

WARNING: ThreadSanitizer: data race (pid=139161)
  Read of size 4 at 0x7d480000b150 by thread T31 (mutexes: write M51120):
    #0 DCPBackfill::scan() ep-engine/src/dcp/stream.h:126 (ep.so+0x000000053391)
    #1 DCPBackfill::run() ep-engine/src/dcp/backfill.cc:118 (ep.so+0x000000052737)
    #2 BackfillManager::backfill() ep-engine/src/dcp/backfill-manager.cc:240 (ep.so+0x00000004cf65)
    #3 BackfillManagerTask::run() ep-engine/src/dcp/backfill-manager.cc:43 (ep.so+0x00000004cb8f)
    #4 ExecutorThread::run() ep-engine/src/executorthread.cc:115 (ep.so+0x0000000eb94d)
    #5 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000eb515)
    #6 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:53 (libplatform.so.0.1.0+0x0000000048ab)

  Previous write of size 4 at 0x7d480000b150 by main thread (mutexes: write M1241, write M32448, write M51071, write M51087):
    #0 ActiveStream::transitionState(stream_state_t) ep-engine/src/dcp/stream.cc:829 (ep.so+0x00000006accb)
    #1 ActiveStream::endStream(end_stream_status_t) ep-engine/src/dcp/stream.cc:688 (ep.so+0x00000006a8c2)
    #2 ActiveStream::setDead(end_stream_status_t) ep-engine/src/dcp/stream.cc:654 (ep.so+0x00000006f27b)
    #3 DcpProducer::setDisconnect(bool) ep-engine/src/dcp/producer.cc:835 (ep.so+0x000000065605)
    #4 DcpConnMap::disconnect_UNLOCKED(void const*) ep-engine/src/connmap.cc:1116 (ep.so+0x000000045d6c)
    #5 DcpConnMap::disconnect(void const*) ep-engine/src/connmap.cc:1109 (ep.so+0x000000045c8b)
    #6 EventuallyPersistentEngine::handleDisconnect(void const*) ep-engine/src/ep_engine.cc:6265 (ep.so+0x0000000ca38a)
    #7 EvpHandleDisconnect(void const*, ENGINE_EVENT_TYPE, void const*, void const*) ep-engine/src/ep_engine.cc:1802 (ep.so+0x0000000af976)
    #8 destroy_mock_cookie memcached/programs/engine_testapp/mock_server.cc:325 (engine_testapp+0x0000004f4082)
    #9 dcp_stream_req(engine_interface*, engine_interface_v1*, unsigned int, unsigned short, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, ENGINE_ERROR_CODE) ep-engine/tests/ep_testsuite.cc:4331 (ep_testsuite.so+0x000000090b06)
    #10 test_failover_log_dcp(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite.cc:14127 (ep_testsuite.so+0x00000007ce7a)
    #11 execute_test(test, char const*, char const*) memcached/programs/engine_testapp/engine_testapp.cc:1091 (engine_testapp+0x0000004cb315)
    #12 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

Change-Id: Icfc82fa877999d128184c9cac8f8c0e1cafc4e67
Reviewed-on: http://review.couchbase.org/63027
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19273: Fix data race on PassiveStream::buffer.{bytes,items} 26/63026/6
Dave Rigby [Tue, 19 Apr 2016 10:38:37 +0000 (11:38 +0100)]
MB-19273: Fix data race on PassiveStream::buffer.{bytes,items}

As reported by ThreadSanitizer (see below), there is a dirty read on
{{buffer.items}} & {{buffer.bytes}} during stat writing, due to
PassiveStream::addStats not acquiring the {{bufMutex}} lock before
reading.

This appears benign as the stat is only user sent to users, not used
by ns_server etc for any calculation.

WARNING: ThreadSanitizer: data race (pid=28418)
  Read of size 8 at 0x7d5000018128 by main thread (mutexes: write M23810, write M969):
    #0 void STATWRITER_NAMESPACE::add_casted_stat<unsigned long>(char const*, unsigned long const&, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) ep-engine/src/statwriter.h:47 (ep.so+0x0000000afe89)
    #1 PassiveStream::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) ep-engine/src/dcp-stream.cc:1512 (ep.so+0x0000002a04ba)
    #2 DcpConsumer::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) ep-engine/src/dcp-consumer.cc:555 (ep.so+0x00000026e867)

  Previous write of size 8 at 0x7d5000018128 by thread T18 (mutexes: write M23762):
    #0 PassiveStream::processBufferedMessages(unsigned int&) ep-engine/src/dcp-stream.cc:1311 (ep.so+0x00000029e196)
    #1 DcpConsumer::processBufferedItems() ep-engine/src/dcp-consumer.cc:599 (ep.so+0x0000002647e9)
    #2 Processer::run() ep-engine/src/dcp-consumer.cc:48 (ep.so+0x0000002643ef)

Change-Id: I443e85d59ffda3827b670e747794b3fcb69c4f7c
Reviewed-on: http://review.couchbase.org/63026
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19260: Make cookie atomic to serialize set/get in ConnHandler 78/62978/8
abhinavdangeti [Tue, 6 Oct 2015 01:34:06 +0000 (18:34 -0700)]
MB-19260: Make cookie atomic to serialize set/get in ConnHandler

WARNING: ThreadSanitizer: data race (pid=20056)

  Write of size 8 at 0x7d600000f038 by main thread (mutexes: write M1412):
    #0 ConnHandler::setCookie(void const*) /home/abhinav/couchbase/ep-engine/src/tapconnection.h:344 (ep.so+0x000000042367)
    #1 EventuallyPersistentEngine::createTapQueue(void const*, std::string&, unsigned int, void const*, unsigned long) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:2655 (ep.so+0x0000000b86da)
    #2 EvpGetTapIterator(engine_interface*, void const*, void const*, unsigned long, unsigned int, void const*, unsigned long) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:1462 (ep.so+0x0000000b46a3)
    #3 mock_get_tap_iterator(engine_interface*, void const*, void const*, unsigned long, unsigned int, void const*, unsigned long) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:467 (engine_testapp+0x0000000bae3e)
    #4 test_tap_ack_stream(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:7341 (ep_testsuite.so+0x000000050416)
    #5 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #6 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

  Previous read of size 8 at 0x7d600000f038 by thread T9 (mutexes: write M1411):
    #0 ConnHandler::getCookie() const /home/abhinav/couchbase/ep-engine/src/tapconnection.h:340 (ep.so+0x00000004067c)
    #1 bool TapConnMap::performOp<Item*>(std::string const&, TapOperation<Item*>&, Item*) /home/abhinav/couchbase/ep-engine/src/connmap.h:389 (ep.so+0x00000001fa08)
    #2 ItemResidentCallback::callback(CacheLookup&) /home/abhinav/couchbase/ep-engine/src/backfill.cc:63 (ep.so+0x00000001d9ca)
    #3 CouchKVStore::recordDbDump(_db*, _docinfo*, void*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:1654 (ep.so+0x000000180ca0)
    #4 recordDbDumpC(_db*, _docinfo*, void*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:66 (ep.so+0x00000017fe95)
    #5 lookup_callback(couchfile_lookup_request*, _sized_buf const*, _sized_buf const*) /home/abhinav/couchbase/couchstore/src/couch_db.cc:767 (libcouchstore.so+0x00000000d7e5)
    #6 btree_lookup_inner(couchfile_lookup_request*, unsigned long, int, int) /home/abhinav/couchbase/couchstore/src/btree_read.cc:99 (libcouchstore.so+0x00000000b5a2)
    #7 btree_lookup /home/abhinav/couchbase/couchstore/src/btree_read.cc:131 (libcouchstore.so+0x00000000affc)
    #8 couchstore_changes_since /home/abhinav/couchbase/couchstore/src/couch_db.cc:812 (libcouchstore.so+0x00000000d5f1)
    #9 CouchKVStore::scan(ScanContext*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:1264 (ep.so+0x00000017f94e)
    #10 BackfillDiskLoad::run() /home/abhinav/couchbase/ep-engine/src/backfill.cc:131 (ep.so+0x00000001e449)
    #11 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f8956)
    #12 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f84f5)
    #13 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: I8a668f17013c95abc9786d853ed2c6462cae5320
Reviewed-on: http://review.couchbase.org/62978
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19259: Fix data race on DcpConsumer::backoffs 77/62977/8
Dave Rigby [Thu, 8 Oct 2015 13:55:09 +0000 (13:55 +0000)]
MB-19259: Fix data race on DcpConsumer::backoffs

This member is read by stats processing concurrently with DcpConsumer
updating it. Change to RelaxedAtomic<>

As reported by ThreadSanitizer:

WARNING: ThreadSanitizer: data race (pid=24629)
  Read of size 4 at 0x7d5000016b9c by main thread (mutexes: write M27034, write M2479):
    #0 void ConnHandler::addStat<unsigned int>(), void const*) ep-engine/src/tapconnection.h:291:18 (ep.so+0x00000004e993)
    #1 DcpConsumer::addStats(), void const*) ep-engine/src/dcp/consumer.cc:731:5 (ep.so+0x00000005a2c3)
    ...

  Previous write of size 4 at 0x7d5000016b9c by thread T10:
    #0 DcpConsumer::processBufferedItems() ep-engine/src/dcp/consumer.cc:755:17 (ep.so+0x0000000539c3)
    #1 Processer::run() ep-engine/src/dcp/consumer.cc:57:13 (ep.so+0x00000005376b)
    #2 ExecutorThread::run() ep-engine/src/executorthread.cc:115:26 (ep.so+0x0000000e944e)
    ...

Change-Id: Iabddcc06213fbb80815d4b464c459adb922a0eff
Reviewed-on: http://review.couchbase.org/62977
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19258: Address data race with replicationThrottle parameters 76/62976/8
abhinavdangeti [Thu, 28 Jan 2016 20:31:56 +0000 (12:31 -0800)]
MB-19258: Address data race with replicationThrottle parameters

The detected usage is just in stats, however there is an indirect
usage of this variable (persistenceQueueSmallEnough, via
stats.replicationThrottleWriteQueueCap) which /might/ result in an
incorrect queue size being used.

WARNING: ThreadSanitizer: data race (pid=31345)
  Read of size 8 at 0x7d08000066c0 by thread T12:
    #0 ReplicationThrottle::adjustWriteQueueCap(unsigned long) ep-engine/src/replicationthrottle.cc:49 (ep.so+0x000000114c7f)
    #1 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:3100 (ep.so+0x0000000bb424)
    #2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4597 (ep.so+0x0000000c54d5)
    #3 EventuallyPersistentStore::snapshotStats() ep-engine/src/ep.cc:1744 (ep.so+0x000000090f2e)
    #4 StatSnap::run() ep-engine/src/tasks.cc:100 (ep.so+0x000000136dd6)
    #5 ExecutorThread::run() ep-engine/src/executorthread.cc:115 (ep.so+0x0000000f6966)
    #6 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f6515)
    #7 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:54 (libplatform.so.0.1.0+0x00000000551b)
12:23:35
  Previous write of size 8 at 0x7d08000066c0 by main thread (mutexes: write M2287262534014660504):
    #0 EPStoreValueChangeListener::sizeValueChanged(std::string const&, unsigned long) ep-engine/src/replicationthrottle.h:42 (ep.so+0x0000000b35ee)
    #1 Configuration::setParameter(std::string const&, long) ep-engine/src/configuration.cc:225 (ep.so+0x000000197ac5)
    #2 Configuration::setReplicationThrottleQueueCap(long const&) build/ep-engine/src/generated_configuration.cc:459 (ep.so+0x0000001a6fa8)
    #3 setTapParam(EventuallyPersistentEngine*, char const*, char const*, std::string&) ep-engine/src/ep_engine.cc:323 (ep.so+0x0000000d7f3f)
    #4 EvpUnknownCommand(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1365 (ep.so+0x0000000b4f68)
    #5 mock_unknown_command(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:382 (engine_testapp+0x0000004ce149)
    #6 set_param(engine_interface*, engine_interface_v1*, protocol_binary_engine_param_t, char const*, char const*) ep-engine/tests/ep_test_apis.cc:597 (ep_testsuite_dcp.so+0x000000038e17)
    #7 test_consumer_backoff_stat(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite_dcp.cc:2184 (ep_testsuite_dcp.so+0x000000017411)
    #8 execute_test(test, char const*, char const*) memcached/programs/engine_testapp/engine_testapp.cc:1131 (engine_testapp+0x0000004cc600)
    #9 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

Change-Id: Ie4ff039603f2ddfc5b44d5d7f217544307655d31
Reviewed-on: http://review.couchbase.org/62976
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19281: [BP] Add template class RelaxedAtomic<> 69/63169/4
Dave Rigby [Thu, 21 Apr 2016 08:38:54 +0000 (09:38 +0100)]
MB-19281: [BP] Add template class RelaxedAtomic<>

Backport of the RelaxedAtomic template class from
platform/watson. Changed from C++11 std::atomic<> to our own
AtomicValue<> as 3.x doesn't have C++11 support on all platforms, and
moved to ep-engine as AtomicValue is an ep-engine specific class.

Doesn't include unit tests as they depend on GTest which isn't present
in 3.0.x.

Merge of the following platform commits:

* http://review.couchbase.org/54973 - Add template class RelaxedAtomic<>
* http://review.couchbase.org/55870 - RelaxedAtomic: Allow construction from template type
* http://review.couchbase.org/55889 - RelaxedAtomic: Remove 'explicit' definition for copy constructor

Change-Id: I16a5e2ebe85201aae85592329a2212c8a5c3a464
Reviewed-on: http://review.couchbase.org/63169
Reviewed-by: Trond Norbye <trond.norbye@gmail.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19257: Fix data race on ExecutorThread::now 75/62975/6
Dave Rigby [Thu, 8 Oct 2015 15:49:52 +0000 (15:49 +0000)]
MB-19257: Fix data race on ExecutorThread::now

This value is accessed without a lock from addWorkerStats. Change to
atomic to fix race.

As reported by ThreadSanitizer:

WARNING: ThreadSanitizer: data race (pid=18761)
  Read of size 8 at 0x7d4400007fa8 by main thread (mutexes: write M19371):
    #0 ExecutorThread::getCurTime() /home/couchbase/server/ep-engine/src/executorthread.h:129:46 (ep.so+0x0000000e67a0)
    #1 addWorkerStats(char const*, ExecutorThread*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/executorpool.cc:748 (ep.so+0x0000000e67a0)
    #2 ExecutorPool::doWorkerStat(EventuallyPersistentEngine*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/executorpool.cc:760 (ep.so+0x0000000e67a0)
    #3 EventuallyPersistentEngine::doDispatcherStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/ep_engine.cc:4352:5 (ep.so+0x0000000bc72d)
    #4 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/ep_engine.cc:4588 (ep.so+0x0000000bc72d)
    #5 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/ep-engine/src/ep_engine.cc:214:38 (ep.so+0x0000000ab70e)
    #6 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/server/memcached/programs/engine_testapp/engine_testapp.cc:239:19 (engine_testapp+0x0000004c553d)
    #7 test_worker_stats(engine_interface*, engine_interface_v1*) /home/couchbase/server/ep-engine/tests/ep_testsuite.cc:8901:24 (ep_testsuite.so+0x000000039908)
    #8 execute_test(test, char const*, char const*) /home/couchbase/server/memcached/programs/engine_testapp/engine_testapp.cc:1090:19 (engine_testapp+0x0000004c4142)
    #9 main /home/couchbase/server/memcached/programs/engine_testapp/engine_testapp.cc:1439 (engine_testapp+0x0000004c4142)

  Previous write of size 8 at 0x7d4400007fa8 by thread T7 (mutexes: write M12645):
    #0 TaskQueue::_doSleep(ExecutorThread&) /home/couchbase/server/ep-engine/src/taskqueue.cc:78:5 (ep.so+0x00000012eb21)
    #1 TaskQueue::_fetchNextTask(ExecutorThread&, bool) /home/couchbase/server/ep-engine/src/taskqueue.cc:117:21 (ep.so+0x00000012ed66)
    #2 TaskQueue::fetchNextTask(ExecutorThread&, bool) /home/couchbase/server/ep-engine/src/taskqueue.cc:160:17 (ep.so+0x00000012f907)
    #3 ExecutorPool::_nextTask(ExecutorThread&, unsigned char) /home/couchbase/server/ep-engine/src/executorpool.cc:226:17 (ep.so+0x0000000dfa6f)
    #4 ExecutorPool::nextTask(ExecutorThread&, unsigned char) /home/couchbase/server/ep-engine/src/executorpool.cc:241:21 (ep.so+0x0000000dfac6)
    #5 ExecutorThread::run() /home/couchbase/server/ep-engine/src/executorthread.cc:81:28 (ep.so+0x0000000e9cfe)
    #6 launch_executor_thread(void*) /home/couchbase/server/ep-engine/src/executorthread.cc:33:9 (ep.so+0x0000000e9b05)
    #7 platform_thread_wrap /home/couchbase/server/platform/src/cb_pthreads.c:23:5 (libplatform.so.0.1.0+0x000000003dc1)

Change-Id: Idcaea9a157293dbf95ca236354673556a2f3c4ac
Reviewed-on: http://review.couchbase.org/62975
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19256: Address possible data race on VBCBAdaptor::currentvb 74/62974/6
abhinavdangeti [Mon, 5 Oct 2015 20:06:13 +0000 (13:06 -0700)]
MB-19256: Address possible data race on VBCBAdaptor::currentvb

WARNING: ThreadSanitizer: data race (pid=12312)

Read of size 2 at 0x7d400000fff8 by main thread (mutexes: write M12542):
    #0 VBCBAdaptor::getDescription() /home/abhinav/couchbase/ep-engine/src/ep.h:128 (ep.so+0x0000000a7fe1)
    #1 ExecutorPool::_stopTaskGroup(unsigned long, task_type_t, bool) /home/abhinav/couchbase/ep-engine/src/executorpool.cc:562 (ep.so+0x0000000f3c21)
    #2 ExecutorPool::stopTaskGroup(unsigned long, task_type_t, bool) /home/abhinav/couchbase/ep-engine/src/executorpool.cc:585 (ep.so+0x0000000f3f5e)
    #3 ~EventuallyPersistentStore /home/abhinav/couchbase/ep-engine/src/ep.cc:468 (ep.so+0x0000000836c6)
    #4 ~EventuallyPersistentEngine /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:6326 (ep.so+0x0000000d4eda)
    #5 EvpDestroy(engine_interface*, bool) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:141 (ep.so+0x0000000b4b8c)
    #6 mock_destroy(engine_interface*, bool) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:98 (engine_testapp+0x0000000ba027)
    #7 destroy_bucket(engine_interface*, engine_interface_v1*, bool) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:995 (engine_testapp+0x0000000b952e)
    #8 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

Previous write of size 2 at 0x7d400000fff8 by thread T10:
    #0 VBCBAdaptor::run() /home/abhinav/couchbase/ep-engine/src/ep.cc:3776 (ep.so+0x00000009d7e3)
    #1 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f94d3)
    #2 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f9055)
    #3 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: I316fb1a845fbee09f634d39e64057c170fab4795
Reviewed-on: http://review.couchbase.org/62974
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19253: Fix race in void ExecutorPool::doWorkerStat 73/62973/6
Dave Rigby [Tue, 6 Oct 2015 14:29:34 +0000 (14:29 +0000)]
MB-19253: Fix race in void ExecutorPool::doWorkerStat

As reported by ThreadSanitizer (see below), there is a race between
setting the current task associated with a ExecutorThread and reading
the name of that thread.

Unfortunately there doesn't seem to be a straightforward way to solve
this without adding a new mutex; currentTask (the variable the race is
on) is a SingleThreadedRCPtr, which is non-trivial to make thread-safe
(i.e. atomic). I did consider changing currenTask (and all other
ExTask variables) to be a std::shared_ptr as in C++11 this has support
for updating atomically, however the support in mainstream compilers
apparently isn't quite there yet.

Therefore I've just added a mutex to guard currentTask.

ThreadSanitizer report:

WARNING: ThreadSanitizer: data race (pid=27332)
  Read of size 8 at 0x7d340000c8f0 by main thread (mutexes: write M19366):
    #0 ExecutorThread::getTaskableName() const /home/couchbase/couchbase/ep-engine/src/atomic.h:309 (ep.so+0x0000000e6178)
    #1 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/couchbase/ep-engine/src/ep_engine.cc:4346 (ep.so+0x0000000bc4dd)
    #2 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/couchbase/ep-engine/src/ep_engine.cc:213 (ep.so+0x0000000ab49e)
    #3 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/couchbase/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:239 (engine_testapp+0x0000004c54ad)
    #4 test_worker_stats(engine_interface*, engine_interface_v1*) /home/couchbase/couchbase/ep-engine/tests/ep_testsuite.cc:8901 (ep_testsuite.so+0x000000039768)
    #5 execute_test(test, char const*, char const*) /home/couchbase/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000004c40b2)
    #6 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

  Previous write of size 8 at 0x7d340000c8f0 by thread T5:
    #0 ExecutorThread::run() /home/couchbase/couchbase/ep-engine/src/atomic.h:322 (ep.so+0x0000000e9906)
    #1 launch_executor_thread(void*) /home/couchbase/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000e9795)
    #2 platform_thread_wrap /home/couchbase/.ccache/tmp/cb_pthread.tmp.00b591814417.18511.i (libplatform.so.0.1.0+0x000000003d91)

Change-Id: Id02f7a98b40b952a415cf9027a8f2243af38fc4d
Reviewed-on: http://review.couchbase.org/62973
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19252: Fix data race on Stream::readyQueueMemory 72/62972/6
Dave Rigby [Mon, 18 Apr 2016 13:47:19 +0000 (14:47 +0100)]
MB-19252: Fix data race on Stream::readyQueueMemory

As detected by TSan:

WARNING: ThreadSanitizer: data race (pid=17244)
  Read of size 8 at 0x7d480000b370 by main thread (mutexes: write M24165, write M969, read M24121):
    #0 Stream::getReadyQueueMemory() /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:234 (ep.so+0x00000028f51e)
    #1 ActiveStream::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:563 (ep.so+0x00000029452f)
    #2 DcpProducer::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-producer.cc:551 (ep.so+0x00000027f1a0)
    #3 ConnStatBuilder::operator()(SingleThreadedRCPtr<ConnHandler>&) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:3696 (ep.so+0x000000182d54)

  Previous write of size 8 at 0x7d480000b370 by thread T16 (mutexes: write M24143):
    #0 Stream::pushToReadyQ(DcpResponse*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:211 (ep.so+0x00000028f4a6)
    #1 ActiveStream::backfillReceived(Item*) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:407 (ep.so+0x00000028d6e5)
    #2 CacheCallback::callback(CacheLookup&) /home/daver/repos/couchbase/server/ep-engine/src/dcp-stream.cc:87 (ep.so+0x00000028d4b3)
    #3 CouchKVStore::recordDbDump(_db*, _docinfo*, void*) /home/daver/repos/couchbase/server/ep-engine/src/couch-kvstore/couch-kvstore.cc:1563 (ep.so+0x00000031dec5)

See also: http://review.couchbase.org/54314 which originally fixed
this issue in watson; however it also fixed a couple of other issues
in the same patch.

Change-Id: Iae6a34403394e54c9d7213a7c2703be761e7dc0f
Reviewed-on: http://review.couchbase.org/62972
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19251: Fix race in updating Vbucket.file{SpaceUsed,Size} 71/62971/6
Dave Rigby [Wed, 12 Nov 2014 17:52:41 +0000 (17:52 +0000)]
MB-19251: Fix race in updating Vbucket.file{SpaceUsed,Size}

These variables are used in the calculation of ep_db_data_size and
ep_db_file_size stats, and crucially those stats are used by ns_server
when determining if a bucket should be compacted.

It is possible that due to this issue, compaction may not be triggered
when expected, or triggered when it shouldn't.

As identified by ThreadSantizer:

    WARNING: ThreadSanitizer: data race (pid=4009)
      Write of size 8 at 0x7d440000c5b0 by thread T6 (mutexes: write M55):
#0 KVStatsCallback::callback(KVStatsCtx&) /repos/couchbase/server/source/ep-engine/src/ep.cc:933 (ep.so+0x0000000f0a22)
#1 CouchKVStore::commit2couchstore(Callback<KVStatsCtx>*, unsigned long, unsigned long) /repos/couchbase/server/source/ep-engine/src/couch-kvstore/couch-kvstore.cc:1697 (ep.so+0x0000001aa8c6)
#2 CouchKVStore::commit(Callback<KVStatsCtx>*, unsigned long, unsigned long) /repos/couchbase/server/source/ep-engine/src/couch-kvstore/couch-kvstore.cc:1040 (ep.so+0x0000001a6483)
#3 EventuallyPersistentStore::flushVBucket(unsigned short) /repos/couchbase/server/source/ep-engine/src/ep.cc:2909 (ep.so+0x0000000e780b)
#4 Flusher::flushVB() /repos/couchbase/server/source/ep-engine/src/flusher.cc:283 (ep.so+0x00000013f363)
#5 Flusher::step(GlobalTask*) /repos/couchbase/server/source/ep-engine/src/flusher.cc:174 (ep.so+0x00000013e9c8)
#6 FlusherTask::run() /repos/couchbase/server/source/ep-engine/src/tasks.cc:44 (ep.so+0x000000174a85)
#7 ExecutorThread::run() /repos/couchbase/server/source/ep-engine/src/executorthread.cc:110 (ep.so+0x00000014a0c1)
#8 launch_executor_thread /repos/couchbase/server/source/ep-engine/src/executorthread.cc:34 (ep.so+0x00000014990a)
#9 platform_thread_wrap /repos/couchbase/server/source/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x000000002d8b)
#10 __tsan_write_range ??:0 (libtsan.so.0+0x00000001b1c9)

      Previous read of size 8 at 0x7d440000c5b0 by main thread (mutexes: write M193510842443017784):
#0 VBucketCountVisitor::visitBucket(RCPtr<VBucket>&) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:2889 (ep.so+0x00000010c631)
#1 VBucketCountAggregator::visitBucket(RCPtr<VBucket>&) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:2926 (ep.so+0x000000121392)
#2 EventuallyPersistentStore::visit(VBucketVisitor&) /repos/couchbase/server/source/ep-engine/src/ep.cc:3278 (ep.so+0x0000000e99d9)
#3 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:2955 (ep.so+0x00000010cb46)
#4 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:4344 (ep.so+0x000000113c0f)
#5 EvpGetStats /repos/couchbase/server/source/ep-engine/src/ep_engine.cc:217 (ep.so+0x000000102b14)
#6 mock_get_stats /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:195 (exe+0x0000000026de)
#7 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:799 (ep_testsuite.so+0x0000000832d8)
#8 wait_for_flusher_to_settle(engine_interface*, engine_interface_v1*) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:900 (ep_testsuite.so+0x000000083cd4)
#9 wait_for_persisted_value(engine_interface*, engine_interface_v1*, char const*, char const*, unsigned short) /repos/couchbase/server/source/ep-engine/tests/ep_test_apis.cc:917 (ep_testsuite.so+0x000000083de5)
#10 test_io_stats /repos/couchbase/server/source/ep-engine/tests/ep_testsuite.cc:6279 (ep_testsuite.so+0x0000000468b3)
#11 execute_test /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:1055 (exe+0x0000000059fc)
#12 main /repos/couchbase/server/source/memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x000000006606)

Change to AtomicValues to fix this.

Change-Id: Ie7f9a403f809e751ff1802cceb5d2a77a483a586
Reviewed-on: http://review.couchbase.org/62971
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19249: Address possible data races in ConnHandler context 70/62970/6
abhinavdangeti [Mon, 5 Oct 2015 22:55:34 +0000 (15:55 -0700)]
MB-19249: Address possible data races in ConnHandler context

WARNING: ThreadSanitizer: data race (pid=2443)

  Read of size 1 at 0x7d5000016a58 by thread T10:
    #0 ConnHandler::doDisconnect() /home/abhinav/couchbase/ep-engine/src/tapconnection.h:375 (ep.so+0x000000058416)
    #1 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f87f6)
    #2 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f8395)
    #3 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous write of size 1 at 0x7d5000016a58 by main thread:
    [failed to restore the stack]

  Location is heap block of size 480 at 0x7d5000016a00 allocated by main thread:
    #0 operator new(unsigned long) <null>:0 (engine_testapp+0x00000005084d)
    #1 DcpConnMap::newConsumer(void const*, std::string const&) /home/abhinav/couchbase/ep-engine/src/connmap.cc:969 (ep.so+0x000000048384)
    #2 EventuallyPersistentEngine::dcpOpen(void const*, unsigned int, unsigned int, unsigned int, void*, unsigned short) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:6189 (ep.so+0x0000000d3668)
    #3 EvpDcpOpen(engine_interface*, void const*, unsigned int, unsigned int, unsigned int, void*, unsigned short) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:1494 (ep.so+0x0000000b4e5f)
    #4 mock_dcp_open(engine_interface*, void const*, unsigned int, unsigned int, unsigned int, void*, unsigned short) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:488 (engine_testapp+0x0000000bb015)
    #5 test_dcp_consumer_flow_control_aggressive(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:3826 (ep_testsuite.so+0x00000006ecfd)
    #6 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #7 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

Change-Id: Id5223e93c416e5e5287c137d561aea1e453cbd41
Reviewed-on: http://review.couchbase.org/62970
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19248: Fix race in TaskQueue.{ready,future,pending}Queue access 69/62969/6
Dave Rigby [Wed, 12 Nov 2014 17:26:17 +0000 (17:26 +0000)]
MB-19248: Fix race in TaskQueue.{ready,future,pending}Queue access

Fix race as identified by ThreadSanitizer:

    WARNING: ThreadSanitizer: data race (pid=4243)
      Read of size 8 at 0x7d04000fde60 by main thread (mutexes: write M1367):
#0 std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >::operator++() /usr/include/c++/4.8/bits/stl_list.h:235 (ep.so+0x00000013a129)
#1 std::iterator_traits<std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> > >::difference_type std::__distance<std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> > >(std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >, std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >, std::input_iterator_tag) /usr/include/c++/4.8/bits/stl_iterator_base_funcs.h:82 (ep.so+0x000000138b67)
#2 std::iterator_traits<std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> > >::difference_type std::distance<std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> > >(std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >, std::_List_const_iterator<SingleThreadedRCPtr<GlobalTask> >) /usr/include/c++/4.8/bits/stl_iterator_base_funcs.h:118 (ep.so+0x000000136b63)
#3 std::list<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::size() const /usr/include/c++/4.8/bits/stl_list.h:874 (ep.so+0x000000135538)
#4 ExecutorPool::doTaskQStat(EventuallyPersistentEngine*, void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/executorpool.cc:654 (ep.so+0x0000001331b6)
#5 EventuallyPersistentEngine::doWorkloadStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4198 (ep.so+0x000000112ed3)
#6 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4454 (ep.so+0x000000114cb4)
#7 EvpGetStats ep-engine/src/ep_engine.cc:217 (ep.so+0x000000102b14)
#8 mock_get_stats memcached/programs/engine_testapp/engine_testapp.c:195 (exe+0x0000000026de)
#9 test_workload_stats ep-engine/tests/ep_testsuite.cc:7094 (ep_testsuite.so+0x00000004e931)
#10 execute_test memcached/programs/engine_testapp/engine_testapp.c:1055 (exe+0x0000000059fc)
#11 main memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x000000006606)

      Previous write of size 8 at 0x7d04000fde60 by thread T5 (mutexes: write M45):
#0 RCValue::_rc_decref() const ep-engine/src/atomic.h:293 (ep.so+0x000000096797)
#1 __gnu_cxx::new_allocator<std::_List_node<SingleThreadedRCPtr<GlobalTask> > >::allocate(unsigned long, void const*) /usr/include/c++/4.8/ext/new_allocator.h:104 (ep.so+0x00000017b42b)
#2 std::_List_base<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::_M_get_node() /usr/include/c++/4.8/bits/stl_list.h:334 (ep.so+0x00000017b0dc)
#3 std::_List_node<SingleThreadedRCPtr<GlobalTask> >* std::list<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::_M_create_node<SingleThreadedRCPtr<GlobalTask> const&>(SingleThreadedRCPtr<GlobalTask> const&) /usr/include/c++/4.8/bits/stl_list.h:502 (ep.so+0x00000017a25a)
#4 void std::list<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::_M_insert<SingleThreadedRCPtr<GlobalTask> const&>(std::_List_iterator<SingleThreadedRCPtr<GlobalTask> >, SingleThreadedRCPtr<GlobalTask> const&) /usr/include/c++/4.8/bits/stl_list.h:1561 (ep.so+0x000000178e7b)
#5 std::list<SingleThreadedRCPtr<GlobalTask>, std::allocator<SingleThreadedRCPtr<GlobalTask> > >::push_back(SingleThreadedRCPtr<GlobalTask> const&) /usr/include/c++/4.8/bits/stl_list.h:1016 (ep.so+0x00000017817b)
#6 TaskQueue::_fetchNextTask(ExecutorThread&, bool) ep-engine/src/taskqueue.cc:126 (ep.so+0x000000176932)
#7 TaskQueue::fetchNextTask(ExecutorThread&, bool) ep-engine/src/taskqueue.cc:142 (ep.so+0x000000176acf)
#8 ExecutorPool::_nextTask(ExecutorThread&, unsigned char) ep-engine/src/executorpool.cc:214 (ep.so+0x000000130707)
#9 ExecutorPool::nextTask(ExecutorThread&, unsigned char) ep-engine/src/executorpool.cc:229 (ep.so+0x0000001307a5)
#10 ExecutorThread::run() ep-engine/src/executorthread.cc:78 (ep.so+0x000000149da0)
#11 launch_executor_thread ep-engine/src/executorthread.cc:34 (ep.so+0x00000014990a)
#12 platform_thread_wrap platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x000000002d8b)
#13 __tsan_write_range ??:0 (libtsan.so.0+0x00000001b1c9)

Fix by adding new helper methods which will return the size of the
various queues (while holding the TaskQueue's mutex).

Change-Id: If5e8d357e45803d78c4ba6ed1475e6e1a90e1c89
Reviewed-on: http://review.couchbase.org/62969
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19247: Fix possible data race in workload.h: workloadPattern 68/62968/6
abhinavdangeti [Mon, 5 Oct 2015 21:22:38 +0000 (14:22 -0700)]
MB-19247: Fix possible data race in workload.h: workloadPattern

This variable is reported in the ThreadSanitizer output as being read
by a stats function, which could result in incorrect stats printed to
the user.  However, from code inspection the compactor
(EventuallyPersistentStore::scheduleCompaction) also reads this
variable, and hence could result in incorrect task scheduling.

WARNING: ThreadSanitizer: data race (pid=24180)

  Read of size 4 at 0x7d040000f608 by main thread (mutexes: write M1308043):
    #0 WorkLoadPolicy::stringOfWorkLoadPattern() ep-engine/src/workload.h:65 (ep.so+0x0000000bee15)
    #1 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4554 (ep.so+0x0000000c5cac)
    #2 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:213 (ep.so+0x0000000b4dee)
    #3 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:239 (engine_testapp+0x0000000ba9ad)
    #4 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) ep-engine/tests/ep_test_apis.cc:990 (ep_testsuite.so+0x0000000aebb1)
    #5 test_access_scanner(engine_interface*, engine_interface_v1*) ep-engine/tests/ep_testsuite.cc:8569 (ep_testsuite.so+0x00000002efd7)
    #6 execute_test(test, char const*, char const*) memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #7 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

  Previous write of size 4 at 0x7d040000f608 by thread T10:
    #0 WorkLoadPolicy::setWorkLoadPattern(workload_pattern_t) ep-engine/src/workload.h:76 (ep.so+0x00000013d75b)
    #1 ExecutorThread::run() ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f9503)
    #2 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f9085)
    #3 platform_thread_wrap platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: If1dd4885a7beefc804e425d077ff18b117be8bdd
Reviewed-on: http://review.couchbase.org/62968
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19246: Fix potentially incorrect persist_time in OBSERVE response 67/62967/6
abhinavdangeti [Mon, 5 Oct 2015 23:20:08 +0000 (16:20 -0700)]
MB-19246: Fix potentially incorrect persist_time in OBSERVE response

There is a data race in updating EPStore::lastTransTimePerItem. This
could result in an incorrect value for the persist_time field in an
OBSERVE response.

WARNING: ThreadSanitizer: data race (pid=4590)

  Write of size 8 at 0x7d540000fe88 by thread T8 (mutexes: write M11599):
    #0 EventuallyPersistentStore::flushVBucket(unsigned short) /home/abhinav/couchbase/ep-engine/src/ep.cc:3307 (ep.so+0x00000009954f)
    #1 Flusher::flushVB() /home/abhinav/couchbase/ep-engine/src/flusher.cc:296 (ep.so+0x0000000ff32f)
    #2 Flusher::step(GlobalTask*) /home/abhinav/couchbase/ep-engine/src/flusher.cc:186 (ep.so+0x0000000fd825)
    #3 FlusherTask::run() /home/abhinav/couchbase/ep-engine/src/tasks.cc:63 (ep.so+0x00000013bbb2)
    #4 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f89b6)
    #5 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f8555)
    #6 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous write of size 8 at 0x7d540000fe88 by thread T6 (mutexes: write M11602):
    #0 EventuallyPersistentStore::flushVBucket(unsigned short) /home/abhinav/couchbase/ep-engine/src/ep.cc:3307 (ep.so+0x00000009954f)
    #1 Flusher::flushVB() /home/abhinav/couchbase/ep-engine/src/flusher.cc:296 (ep.so+0x0000000ff32f)
    #2 Flusher::step(GlobalTask*) /home/abhinav/couchbase/ep-engine/src/flusher.cc:186 (ep.so+0x0000000fd825)
    #3 FlusherTask::run() /home/abhinav/couchbase/ep-engine/src/tasks.cc:63 (ep.so+0x00000013bbb2)
    #4 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f89b6)
    #5 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f8555)
    #6 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: Iccf1b0eacba495a8147fe81922361d566cb1d6a0
Reviewed-on: http://review.couchbase.org/62967
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19229: Address possible data race in vbucket.cc: numHpChks 66/62966/5
abhinavdangeti [Mon, 5 Oct 2015 21:21:54 +0000 (14:21 -0700)]
MB-19229: Address possible data race in vbucket.cc: numHpChks

WARNING: ThreadSanitizer: data race (pid=23450)

  Write of size 8 at 0x7d680001f678 by thread T5 (mutexes: write M11600, write M15531):
    #0 VBucket::notifyCheckpointPersisted(EventuallyPersistentEngine&, unsigned long, bool) /home/abhinav/couchbase/ep-engine/src/vbucket.cc:356 (ep.so+0x00000014b3e3)
    #1 EventuallyPersistentStore::flushVBucket(unsigned short) /home/abhinav/couchbase/ep-engine/src/ep.cc:3334 (ep.so+0x000000099b9f)
    #2 Flusher::flushVB() /home/abhinav/couchbase/ep-engine/src/flusher.cc:296 (ep.so+0x0000000ffe7f)
    #3 Flusher::step(GlobalTask*) /home/abhinav/couchbase/ep-engine/src/flusher.cc:186 (ep.so+0x0000000fe375)
    #4 FlusherTask::run() /home/abhinav/couchbase/ep-engine/src/tasks.cc:62 (ep.so+0x00000013c928)
    #5 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f9503)
    #6 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f9085)
    #7 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous read of size 8 at 0x7d680001f678 by main thread (mutexes: write M910146210457775232):
    #0 VBucket::getHighPriorityChkSize() /home/abhinav/couchbase/ep-engine/src/vbucket.cc:401 (ep.so+0x00000014b7d9)
    #1 VBucketCountVisitor::visitBucket(RCPtr<VBucket>&) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:3030 (ep.so+0x0000000ba940)
    #2 VBucketCountAggregator::visitBucket(RCPtr<VBucket>&) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:3067 (ep.so+0x0000000baf43)
    #3 EventuallyPersistentStore::visit(VBucketVisitor&) /home/abhinav/couchbase/ep-engine/src/ep.cc:3719 (ep.so+0x000000089917)
    #4 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:3093 (ep.so+0x0000000bb5ab)
    #5 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:4554 (ep.so+0x0000000c5cac)
    #6 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:213 (ep.so+0x0000000b4dee)
    #7 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:239 (engine_testapp+0x0000000ba9ad)
    #8 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /home/abhinav/couchbase/ep-engine/tests/ep_test_apis.cc:990 (ep_testsuite.so+0x0000000aebb1)
    #9 test_access_scanner(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:8569 (ep_testsuite.so+0x00000002efd7)
    #10 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #11 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

Change-Id: I98021a535bd78d34e31d428e364192bb2ef33dcf
Reviewed-on: http://review.couchbase.org/62966
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19228: Address possible data races in ActiveStream context 18/62918/6
abhinavdangeti [Mon, 5 Oct 2015 22:22:48 +0000 (15:22 -0700)]
MB-19228: Address possible data races in ActiveStream context

Address possible data races in ActiveStream context when gathering
stats.

WARNING: ThreadSanitizer: data race (pid=27028)

  Read of size 8 at 0x7d480000b1f8 by main thread (mutexes: write M32941632, write M1367, write M32940809):
    #0 void STATWRITER_NAMESPACE::add_casted_stat<unsigned long>(char const*, unsigned long const&, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/abhinav/couchbase/ep-engine/src/statwriter.h:45 (ep.so+0x000000037825)
    #1 ActiveStream::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/abhinav/couchbase/ep-engine/src/dcp/stream.cc:477 (ep.so+0x000000071d16)
    #2 DcpProducer::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) /home/abhinav/couchbase/ep-engine/src/dcp/producer.cc:602 (ep.so+0x000000068057)
    #3 ConnStatBuilder::operator()(SingleThreadedRCPtr<ConnHandler>&) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:3887 (ep.so+0x0000000e13e1)
    #4 EventuallyPersistentEngine::doDcpStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:4144 (ep.so+0x0000000c151a)
    #5 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:4564 (ep.so+0x0000000c5405)
    #6 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:213 (ep.so+0x0000000b422e)
    #7 mock_get_stats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:239 (engine_testapp+0x0000000ba9ad)
    #8 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /home/abhinav/couchbase/ep-engine/tests/ep_test_apis.cc:990 (ep_testsuite.so+0x0000000aeb81)
    #9 dcp_stream(engine_interface*, engine_interface_v1*, char const*, void const*, unsigned short, unsigned int, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, int, int, int, int, bool, bool, unsigned char, bool, unsigned long*, bool) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:4090 (ep_testsuite.so+0x00000009790c)
    #10 test_dcp_producer_stream_req_dgm(engine_interface*, engine_interface_v1*) /home/abhinav/couchbase/ep-engine/tests/ep_testsuite.cc:4564 (ep_testsuite.so+0x000000077604)
    #11 execute_test(test, char const*, char const*) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:1090 (engine_testapp+0x0000000b946c)
    #12 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

  Previous write of size 8 at 0x7d480000b1f8 by thread T9 (mutexes: write M32940880, write M32940855):
    #0 ActiveStream::backfillReceived(Item*, backfill_source_t) /home/abhinav/couchbase/ep-engine/src/dcp/stream.cc:287 (ep.so+0x00000007054e)
    #1 DiskCallback::callback(GetValue&) /home/abhinav/couchbase/ep-engine/src/dcp/backfill.cc:94 (ep.so+0x000000056067)
    #2 CouchKVStore::recordDbDump(_db*, _docinfo*, void*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:1757 (ep.so+0x00000018103f)
    #3 recordDbDumpC(_db*, _docinfo*, void*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:66 (ep.so+0x00000017fcc5)
    #4 lookup_callback(couchfile_lookup_request*, _sized_buf const*, _sized_buf const*) /home/abhinav/couchbase/couchstore/src/couch_db.cc:767 (libcouchstore.so+0x00000000d7f5)
    #5 btree_lookup_inner(couchfile_lookup_request*, unsigned long, int, int) /home/abhinav/couchbase/couchstore/src/btree_read.cc:99 (libcouchstore.so+0x00000000b5b2)
    #6 btree_lookup_inner(couchfile_lookup_request*, unsigned long, int, int) /home/abhinav/couchbase/couchstore/src/btree_read.cc:69 (libcouchstore.so+0x00000000b370)
    #7 btree_lookup_inner(couchfile_lookup_request*, unsigned long, int, int) /home/abhinav/couchbase/couchstore/src/btree_read.cc:69 (libcouchstore.so+0x00000000b370)
    #8 btree_lookup /home/abhinav/couchbase/couchstore/src/btree_read.cc:131 (libcouchstore.so+0x00000000b00c)
    #9 couchstore_changes_since /home/abhinav/couchbase/couchstore/src/couch_db.cc:812 (libcouchstore.so+0x00000000d601)
    #10 CouchKVStore::scan(ScanContext*) /home/abhinav/couchbase/ep-engine/src/couch-kvstore/couch-kvstore.cc:1264 (ep.so+0x00000017f77e)
    #11 DCPBackfill::scan() /home/abhinav/couchbase/ep-engine/src/dcp/backfill.cc:193 (ep.so+0x000000057672)
    #12 DCPBackfill::run() /home/abhinav/couchbase/ep-engine/src/dcp/backfill.cc:118 (ep.so+0x000000056647)
    #13 BackfillManager::backfill() /home/abhinav/couchbase/ep-engine/src/dcp/backfill-manager.cc:240 (ep.so+0x0000000508d5)
    #14 BackfillManagerTask::run() /home/abhinav/couchbase/ep-engine/src/dcp/backfill-manager.cc:43 (ep.so+0x00000005052f)
    #15 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f8796)
    #16 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f8335)
    #17 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: I166917524b5fcad285b3623ff160e875c316d983
Reviewed-on: http://review.couchbase.org/62918
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19227: Fix race in ConnNotifier.task access 17/62917/4
Dave Rigby [Mon, 17 Nov 2014 15:22:22 +0000 (15:22 +0000)]
MB-19227: Fix race in ConnNotifier.task access

Race identified by ThreadSanitizer:

      Read of size 8 at 0x7d08000c1430 by thread T10:
        #0 ConnNotifier::notifyConnections() /home/vagrant/couchbase-server/ep-engine/src/connmap.cc:127 (ep.so+0x0000002537e8)
        #1 ConnNotifierCallback::run() /home/vagrant/couchbase-server/ep-engine/src/connmap.cc:80 (ep.so+0x000000277e8b)
        #2 ExecutorThread::run() /home/vagrant/couchbase-server/ep-engine/src/executorthread.cc:110 (ep.so+0x0000002163af)
        #3 launch_executor_thread(void*) /home/vagrant/couchbase-server/ep-engine/src/executorthread.cc:34 (ep.so+0x0000002158ba)
        #4 platform_thread_wrap /home/vagrant/couchbase-server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000033f4)

      Previous write of size 8 at 0x7d08000c1430 by main thread:
        #0 ConnNotifier::start() /home/vagrant/couchbase-server/ep-engine/src/connmap.cc:99 (ep.so+0x00000025336f)
        #1 ConnMap::initialize(conn_notifier_type) /home/vagrant/couchbase-server/ep-engine/src/connmap.cc:194 (ep.so+0x0000002544a6)
        #2 EventuallyPersistentEngine::initialize(char const*) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:2060 (ep.so+0x000000179b4a)
        #3 EvpInitialize(engine_interface*, char const*) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:135 (ep.so+0x000000173675)
        #4 init_engine /home/vagrant/couchbase-server/memcached/utilities/engine_loader.c:116 (libmcd_util.so.1.0.0+0x000000003fac)
        #5 start_your_engines /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:913 (exe+0x0000000a2fb5)
        #6 execute_test /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:1048 (exe+0x0000000a3d29)
        #7 main /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x0000000a1d84)

Change-Id: I16cfdff1ea363bbb07a62a92f09f829483276b3d
Reviewed-on: http://review.couchbase.org/62917
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19226: Address potential data races in the warmup code 16/62916/4
abhinavdangeti [Thu, 16 Jul 2015 23:59:01 +0000 (16:59 -0700)]
MB-19226: Address potential data races in the warmup code

Context: warmupState, estimatedWarmupCount,
warmup, metadata

As reported by ThreadSanitizer:

WARNING: ThreadSanitizer: data race (pid=7023)
  Write of size 8 at 0x7d240000d5e0 by thread T7:
    #0 Warmup::checkForAccessLog() /home/daver/repos/couchbase/server/ep-engine/src/warmup.cc:590 (ep.so+0x0000002d1ffc)
    #1 WarmupCheckforAccessLog::run() /home/daver/repos/couchbase/server/ep-engine/src/warmup.h:303 (ep.so+0x0000002e2bfb)
    #2 ExecutorThread::run() /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:106 (ep.so+0x0000001e34f9)
    #3 launch_executor_thread(void*) /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e2b7a)
    #4 platform_thread_wrap /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000035dc)

Previous read of size 8 at 0x7d240000d5e0 by main thread (mutexes: write M699180732592992152):
  #0 Warmup::addStats(void (*)(char const*, unsigned short, char const*, unsigned int, void const*), void const*) const /home/daver/repos/couchbase/server/ep-engine/src/warmup.cc:893 (ep.so+0x0000002d6086)
  #1 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:4422 (ep.so+0x000000151813)
  #2 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:214 (ep.so+0x0000001367b2)
  #3 mock_get_stats /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:194 (engine_testapp+0x0000004bde63)
  #4 wait_for_warmup_complete(engine_interface*, engine_interface_v1*) /home/daver/repos/couchbase/server/ep-engine/tests/ep_test_apis.cc:898 (ep_testsuite.so+0x0000000ead95)
  #5 test_setup(engine_interface*, engine_interface_v1*) /home/daver/repos/couchbase/server/ep-engine/tests/ep_testsuite.cc:168 (ep_testsuite.so+0x0000000237d3)
  #6 execute_test /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1037 (engine_testapp+0x0000004ba82b)
  #7 main /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1296 (engine_testapp+0x0000004b8861)

Change-Id: If96933b3b8b0aa1ed75073a0d8d629f138da081f
Reviewed-on: http://review.couchbase.org/62916
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19225: Fix data race on Flusher::taskId 15/62915/4
Dave Rigby [Wed, 15 Jul 2015 08:40:25 +0000 (08:40 +0000)]
MB-19225: Fix data race on Flusher::taskId

Flusher::taskId is accessed from different threads without correct
synchronization - see ThreadSanitizer report:

  WARNING: ThreadSanitizer: data race (pid=13527)
    Write of size 8 at 0x7d4400009650 by thread T14 (mutexes: write M5523):
      #0 Flusher::step(GlobalTask*) couchbase/ep-engine/src/flusher.cc:200 (ep.so+0x0000000e78ae)
      #1 FlusherTask::run() couchbase/ep-engine/src/tasks.cc:61 (ep.so+0x0000001202e2)
      #2 ExecutorThread::run() couchbase/ep-engine/src/executorthread.cc:124 (ep.so+0x0000000e2e05)
      #3 launch_executor_thread(void*) couchbase/ep-engine/src/executorthread.cc:34 (ep.so+0x0000000e28b9)
      #4 platform_thread_wrap .ccache/tmp/cb_pthread.tmp.7e5bc917ff0e.45579.i:0 (libplatform.so.0.1.0+0x000000003891)

    Previous read of size 8 at 0x7d4400009650 by main thread:
      #0 Flusher::wait() couchbase/ep-engine/src/flusher.cc:41 (ep.so+0x0000000e66cf)
      #1 EventuallyPersistentStore::~EventuallyPersistentStore() couchbase/ep-engine/src/ep.cc:514 (ep.so+0x000000071386)
      #2 EventuallyPersistentEngine::~EventuallyPersistentEngine() couchbase/ep-engine/src/ep_engine.cc:6201 (ep.so+0x0000000bfeea)
      #3 EvpDestroy(engine_interface*, bool) couchbase/ep-engine/src/ep_engine.cc:141 (ep.so+0x0000000a0e9c)
      #4 mock_destroy(engine_interface*, bool) couchbase/memcached/programs/engine_testapp/engine_testapp.cc:98 (engine_testapp+0x0000000c4b87)
      #5 execute_test(test, char const*, char const*) couchbase/memcached/programs/engine_testapp/engine_testapp.cc:995 (engine_testapp+0x0000000c4076)
      #6 __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226 (libc.so.6+0x00000002176c)

Given that taskId is a simple primitive type (size_t) fix by removing
the mutex (which wasn't acquired for all accesses) and replace taskId
with an atomic type.

Change-Id: Idc75278ed2882abd173297b77bdb72834cbe4163
Reviewed-on: http://review.couchbase.org/62915
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19225: Fix race in Flusher._state 14/62914/4
Dave Rigby [Mon, 17 Nov 2014 15:01:16 +0000 (15:01 +0000)]
MB-19225: Fix race in Flusher._state

Fix race identified by ThreadSanitizer:

    WARNING: ThreadSanitizer: data race (pid=10664)
      Read of size 4 at 0x7d4400009888 by main thread (mutexes: write M18321):
        #0 Flusher::stateName() const /home/vagrant/couchbase-server/ep-engine/src/flusher.cc:119 (ep.so+0x0000001fc22e)
        #1 EventuallyPersistentEngine::doEngineStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:3025 (ep.so+0x000000180649)
        #2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:4374 (ep.so+0x00000018d960)
        #3 EvpGetStats(engine_interface*, void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) /home/vagrant/couchbase-server/ep-engine/src/ep_engine.cc:214 (ep.so+0x000000174062)
        #4 mock_get_stats /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:196 (exe+0x0000000a741d)
        #5 get_int_stat(engine_interface*, engine_interface_v1*, char const*, char const*) /home/vagrant/couchbase-server/ep-engine/tests/ep_test_apis.cc:799 (ep_testsuite.so+0x0000000dae46)
        #6 verify_curr_items(engine_interface*, engine_interface_v1*, int, char const*) /home/vagrant/couchbase-server/ep-engine/tests/ep_test_apis.cc:834 (ep_testsuite.so+0x0000000e1d31)
        #7 test_dcp_producer_stream_req_disk(engine_interface*, engine_interface_v1*) /home/vagrant/couchbase-server/ep-engine/tests/ep_testsuite.cc:3589 (ep_testsuite.so+0x000000094c9a)
        #8 execute_test /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:1055 (exe+0x0000000a3e83)
        #9 main /home/vagrant/couchbase-server/memcached/programs/engine_testapp/engine_testapp.c:1313 (exe+0x0000000a1d84)

      Previous write of size 4 at 0x7d4400009888 by thread T5:
        #0 Flusher::transition_state(flusher_state) /home/vagrant/couchbase-server/ep-engine/src/flusher.cc:114 (ep.so+0x0000001fb7df)
        #1 Flusher::step(GlobalTask*) /home/vagrant/couchbase-server/ep-engine/src/flusher.cc:167 (ep.so+0x0000001fc9d1)
        #2 FlusherTask::run() /home/vagrant/couchbase-server/ep-engine/src/tasks.cc:44 (ep.so+0x00000027870e)
        #3 ExecutorThread::run() /home/vagrant/couchbase-server/ep-engine/src/executorthread.cc:110 (ep.so+0x0000002160ff)
        #4 launch_executor_thread(void*) /home/vagrant/couchbase-server/ep-engine/src/executorthread.cc:34 (ep.so+0x00000021560a)
        #5 platform_thread_wrap /home/vagrant/couchbase-server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000033f4)

Change-Id: Iaeb60efdc1032de7ba344e04cce454cc1d876d40
Reviewed-on: http://review.couchbase.org/62914
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
4 years agoMB-19224: Address possible data race with global task's waketime 13/62913/4
abhinavdangeti [Mon, 5 Oct 2015 19:18:47 +0000 (12:18 -0700)]
MB-19224: Address possible data race with global task's waketime

WARINING: ThreadSanitizer: data race (pid=7728)

Write of size 8 at 0x7d140000e8a8 by main thread (mutexes: write M11519, write M11535):
0 TaskQueue::_wake(SingleThreadedRCPtr<GlobalTask>&) /home/abhinav/couchbase/ep-engine/src/taskqueue.cc:272 (ep.so+0x000000142b59)
1 TaskQueue::wake(SingleThreadedRCPtr<GlobalTask>&) /home/abhinav/couchbase/ep-engine/src/taskqueue.cc:299 (ep.so+0x00000014382e)
2 ExecutorPool::_stopTaskGroup(unsigned long, task_type_t, bool) /home/abhinav/couchbase/ep-engine/src/executorpool.cc:568 (ep.so+0x0000000f2f46)
3 ExecutorPool::stopTaskGroup(unsigned long, task_type_t, bool) /home/abhinav/couchbase/ep-engine/src/executorpool.cc:585 (ep.so+0x0000000f31ee)
4 ~EventuallyPersistentStore /home/abhinav/couchbase/ep-engine/src/ep.cc:468 (ep.so+0x0000000830f6)
5 ~EventuallyPersistentEngine /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:6326 (ep.so+0x0000000d42ba)
6 EvpDestroy(engine_interface*, bool) /home/abhinav/couchbase/ep-engine/src/ep_engine.cc:141 (ep.so+0x0000000b3fbc)
7 mock_destroy(engine_interface*, bool) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:98 (engine_testapp+0x0000000ba027)
8 destroy_bucket(engine_interface*, engine_interface_v1*, bool) /home/abhinav/couchbase/memcached/programs/engine_testapp/engine_testapp.cc:995 (engine_testapp+0x0000000b952e)
9 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021ec4)

Previous write of size 8 at 0x7d140000e8a8 by thread T10:
0 GlobalTask::snooze(double) /home/abhinav/couchbase/ep-engine/src/tasks.cc:56 (ep.so+0x00000013b6fa)
1 ConnManager::run() /home/abhinav/couchbase/ep-engine/src/connmap.cc:151 (ep.so+0x00000005032e)
2 ExecutorThread::run() /home/abhinav/couchbase/ep-engine/src/executorthread.cc:112 (ep.so+0x0000000f86da)
3 launch_executor_thread(void*) /home/abhinav/couchbase/ep-engine/src/executorthread.cc:33 (ep.so+0x0000000f82e5)
4 platform_thread_wrap /home/abhinav/couchbase/platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: Ib11f9b3cd6919e292f84cc08260eabd8e1381aa6
Reviewed-on: http://review.couchbase.org/62913
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years agoMB-19223: Switch to hrtime from timeval in Global Thread Pool 12/62912/4
Sundar Sridharan [Thu, 23 Jul 2015 22:39:59 +0000 (15:39 -0700)]
MB-19223: Switch to hrtime from timeval in Global Thread Pool

This has small improvements in memory and cpu usage.
Also fixes several ThreadSanitizer races from unit tests - for example:

WARNING: ThreadSanitizer: data race (pid=21672)
  Write of size 8 at 0x7d140000e7b8 by main thread (mutexes: write M14972, write M14985):
    #0 memcpy <null> (engine_testapp+0x000000453040)
    #1 TaskQueue::_wake(SingleThreadedRCPtr<GlobalTask>&) /home/daver/repos/couchbase/server/ep-engine/src/taskqueue.cc:255 (ep.so+0x0000002577f6)
    #2 TaskQueue::wake(SingleThreadedRCPtr<GlobalTask>&) /home/daver/repos/couchbase/server/ep-engine/src/taskqueue.cc:282 (ep.so+0x000000257c73)
    #3 ExecutorPool::_wake(unsigned long) /home/daver/repos/couchbase/server/ep-engine/src/executorpool.cc:320 (ep.so+0x0000001acc76)
    #4 ExecutorPool::wake(unsigned long) /home/daver/repos/couchbase/server/ep-engine/src/executorpool.cc:328 (ep.so+0x0000001ace13)
    #5 Flusher::wait() /home/daver/repos/couchbase/server/ep-engine/src/flusher.cc:41 (ep.so+0x0000001cc4ff)
    #6 EventuallyPersistentStore::stopFlusher() /home/daver/repos/couchbase/server/ep-engine/src/ep.cc:402 (ep.so+0x0000000d54d5)
    #7 ~EventuallyPersistentStore /home/daver/repos/couchbase/server/ep-engine/src/ep.cc:364 (ep.so+0x0000000d49cb)
    #8 ~EventuallyPersistentEngine /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:5778 (ep.so+0x000000161043)
    #9 EvpDestroy(engine_interface*, bool) /home/daver/repos/couchbase/server/ep-engine/src/ep_engine.cc:143 (ep.so+0x000000135efa)
    #10 mock_destroy /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:61 (engine_testapp+0x0000004bb9d6)
    #11 destroy_engine /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:998 (engine_testapp+0x0000004bb646)
    #12 execute_test /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1048 (engine_testapp+0x0000004baa11)
    #13 main /home/daver/repos/couchbase/server/memcached/programs/engine_testapp/engine_testapp.c:1296 (engine_testapp+0x0000004b8861)

  Previous read of size 8 at 0x7d140000e7b8 by thread T14:
    #0 ExecutorThread::run() /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:106 (ep.so+0x0000001e3488)
    #1 launch_executor_thread(void*) /home/daver/repos/couchbase/server/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e2a5a)
    #2 platform_thread_wrap /home/daver/repos/couchbase/server/platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000035dc)

Change-Id: I78fdddb832251fc062058c04f75f8d22c4c2f68d
Reviewed-on: http://review.couchbase.org/62912
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years agoMB-19222: Fix race condition in TaskQueue shutdown 11/62911/4
Dave Rigby [Wed, 12 Nov 2014 14:58:28 +0000 (14:58 +0000)]
MB-19222: Fix race condition in TaskQueue shutdown

There is a bug in the use of ExecutorThread.state when sleeping a
TaskQueue - TaskQueue::_doSleep() doesn't atomically transition the
state from RUNNING -> SLEEPING. This can cause a deadlock when
shutting down a ExecutorThread:

    Thread A:                           Thread B:
    --------------------------------    ------------------------------
    if (t.state == RUNNING) {  // true
                                        t.state = SHUTDOWN
        t.state = SLEEPING              cb_join_thread(Thread A)
                                        // wait forever
    ...
    if (t.state == SHUTDOWN) { // FALSE
      exit(0) // NEVER REACHED
    }

Fix by changing ExecutorThread.state to be an AtomicValue, and use
compare-and-exchange to move from RUNNING -> SLEEPING (and SLEEPING ->
RUNNING).

Change-Id: I9fab90a83978ae2aa6a0dcdd3b079a1c2f369402
Reviewed-on: http://review.couchbase.org/62911
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19220: Ensure HashTable::size is atomic 10/62910/3
Dave Rigby [Thu, 14 Apr 2016 10:31:27 +0000 (11:31 +0100)]
MB-19220: Ensure HashTable::size is atomic

Fix data race as reported by ThreadSanitizer. I'm pretty sure this is
beniegn, as size can only be modified if *all* HashTable mutexes are
acquired (see HashTable::resize()), and all callers of
getBucketForHash() either do so with at least 1 HashTable mutex
acquired, or they perform double-checked locking (as per this
instance).

TSan report:
    WARNING: ThreadSanitizer: data race (pid=11329)
      Write of size 8 at 0x7ffe3a0ef1f8 by thread T9 (mutexes: write M45069, write M45070, write M45071):
        #0 HashTable::resize(unsigned long) ep-engine/src/stored-value.cc:370 (ep-engine_hash_table_test+0x00000050173c)
        #1 AccessGenerator::resize() ep-engine/tests/module_tests/hash_table_test.cc:314 (ep-engine_hash_table_test+0x0000004df4d9)
        #2 AccessGenerator::operator()() ep-engine/tests/module_tests/hash_table_test.cc:304 (ep-engine_hash_table_test+0x0000004df3fd)
        #3 SyncTestThread<bool>::run() ep-engine/tests/module_tests/threadtests.h:94 (ep-engine_hash_table_test+0x0000004e6bd8)
        #4 _ZL23launch_sync_test_threadIbEvPv ep-engine/tests/module_tests/threadtests.h:66 (ep-engine_hash_table_test+0x0000004c9ca8)
        #5 platform_thread_wrap platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000035dc)

      Previous read of size 8 at 0x7ffe3a0ef1f8 by thread T8:
        #0 HashTable::getBucketForHash(int) ep-engine/src/stored-value.h:1470 (ep-engine_hash_table_test+0x0000004d32b6)
        #1 HashTable::getLockedBucket(int, int*) ep-engine/src/stored-value.h:1265 (ep-engine_hash_table_test+0x0000004d2e69)
        #2 HashTable::getLockedBucket(std::string const&, int*) ep-engine/src/stored-value.h:1295 (ep-engine_hash_table_test+0x0000004d277b)
        #3 HashTable::del(std::string const&) ep-engine/src/stored-value.h:1370 (ep-engine_hash_table_test+0x0000004d8f88)
        #4 AccessGenerator::operator()() ep-engine/tests/module_tests/hash_table_test.cc:306 (ep-engine_hash_table_test+0x0000004df430)
        #5 SyncTestThread<bool>::run() ep-engine/tests/module_tests/threadtests.h:94 (ep-engine_hash_table_test+0x0000004e6bd8)
        #6 _ZL23launch_sync_test_threadIbEvPv ep-engine/tests/module_tests/threadtests.h:66 (ep-engine_hash_table_test+0x0000004c9ca8)
        #7 platform_thread_wrap platform/src/cb_pthreads.c:19 (libplatform.so.0.1.0+0x0000000035dc)

Change-Id: I97c189310aafa8a002299f73cab9cbfb0e619768
Reviewed-on: http://review.couchbase.org/62910
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19204: ep_testsuite: Don't release the item while we're using it 09/62909/3
Trond Norbye [Wed, 5 Nov 2014 19:24:03 +0000 (20:24 +0100)]
MB-19204: ep_testsuite: Don't release the item while we're using it

Fixes a number of issues detected by ThreadSanitizer.

Change-Id: I6ab6c9fee2497d0843af47647f33bcce73111f76
Reviewed-on: http://review.couchbase.org/62909
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Trond Norbye <trond.norbye@gmail.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19204: Address data race in ep_test_apis/testsuite 08/62908/3
Dave Rigby [Thu, 14 Apr 2016 13:08:17 +0000 (14:08 +0100)]
MB-19204: Address data race in ep_test_apis/testsuite

WARNING: ThreadSanitizer: data race (pid=18824)

  Write of size 4 at 0x7fcc31350244 by thread T12 (mutexes: write M44413):
    #0 add_response ep-engine/tests/ep_test_apis.cc:75 (ep_testsuite.so+0x0000000acbcc)
    #1 sendResponse(bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*), void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*) ep-engine/src/ep_engine.cc:92 (ep.so+0x0000000d004c)
    #2 processUnknownCommand(EventuallyPersistentEngine*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1266 (ep.so+0x0000000d603c)
    #3 EvpUnknownCommand(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1387 (ep.so+0x0000000b3f58)
    #4 mock_unknown_command(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:380 (engine_testapp+0x0000000bab29)
    #5 del_with_meta(engine_interface*, engine_interface_v1*, char const*, unsigned long, unsigned int, ItemMetaData*, unsigned long, bool, bool, long, unsigned char, void const*) ep-engine/tests/ep_test_apis.cc:360 (ep_testsuite.so+0x0000000ae69f)
    #6 multi_del_with_meta(void*) ep-engine/tests/ep_testsuite.cc:13299 (ep_testsuite.so+0x00000009572e)
    #7 platform_thread_wrap platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

  Previous write of size 4 at 0x7fcc31350244 by thread T11 (mutexes: write M1638603450185076640):
    #0 add_response ep-engine/tests/ep_test_apis.cc:75 (ep_testsuite.so+0x0000000acbcc)
    #1 sendResponse(bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*), void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*) ep-engine/src/ep_engine.cc:92 (ep.so+0x0000000ced72)
    #2 processUnknownCommand(EventuallyPersistentEngine*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1258 (ep.so+0x0000000d5718)
    #3 EvpUnknownCommand(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) ep-engine/src/ep_engine.cc:1387 (ep.so+0x0000000b3f58)
    #4 mock_unknown_command(engine_interface*, void const*, protocol_binary_request_header*, bool (*)(void const*, unsigned short, void const*, unsigned char, void const*, unsigned int, unsigned char, unsigned short, unsigned long, void const*)) memcached/programs/engine_testapp/engine_testapp.cc:380 (engine_testapp+0x0000000bab29)
    #5 set_with_meta(engine_interface*, engine_interface_v1*, char const*, unsigned long, char const*, unsigned long, unsigned int, ItemMetaData*, unsigned long, bool, unsigned char, bool, long, unsigned char, void const*) ep-engine/tests/ep_test_apis.cc:702 (ep_testsuite.so+0x0000000b24db)
    #6 multi_set_with_meta(void*) ep-engine/tests/ep_testsuite.cc:13279 (ep_testsuite.so+0x000000094e33)
    #7 platform_thread_wrap platform/src/cb_pthreads.c:23 (libplatform.so.0.1.0+0x000000003d31)

Change-Id: Ic53d401fb674dbe161aa73381e2c08c5995f262a
Reviewed-on: http://review.couchbase.org/62908
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Trond Norbye <trond.norbye@gmail.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19204: ep_testsuite: Use std::string for last_key/body 07/62907/4
Dave Rigby [Thu, 2 Jul 2015 15:25:10 +0000 (15:25 +0000)]
MB-19204: ep_testsuite: Use std::string for last_key/body

Replace the manually-managed char* for last_body and last_key with
std::string. This solves the issue of leaving these two buffers
un-free'd at the end of a test; and gives simplifies managing and
testing the last body & key values.

Change-Id: Ic1c64032e34e7abbe5ba8de3e16c115a78a6632f
Reviewed-on: http://review.couchbase.org/62907
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Will Gardner <will.gardner@couchbase.com>
4 years agoMB-19204: Remove alarm() call from atomic_ptr_test, reduce iteration count 06/62906/2
Dave Rigby [Thu, 14 Apr 2016 11:14:39 +0000 (12:14 +0100)]
MB-19204: Remove alarm() call from atomic_ptr_test, reduce iteration count

This test runs slower under ThreadSanitizer than normal. Given we
already have CTest enforcing timeouts, remove the explicit alarm calls
and handle the timeout at the CTest level.

Also reduce the iteration count by 10x, so the test runs in a more
resonable time.

Change-Id: Ia6914c10a3073f5fea121cb7e600568ca5081beb
Reviewed-on: http://review.couchbase.org/62906
Tested-by: buildbot <build@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years agoMB-19204: hash_table_test: Fix TSan issues 05/62905/2
Dave Rigby [Tue, 6 Oct 2015 10:53:21 +0000 (10:53 +0000)]
MB-19204: hash_table_test: Fix TSan issues

Fix issues with hash_table_test on 3.x:

* The default number of HashTable locks (193) causes problems for
  ThreadSanitizer as it exceeds the maximum number of acquired locks
  it can track. Given that the tests where we do not already set the
  lock count are single-threaded, change these to have 1 lock.

* Remove alarm() calls - the tests take longer when run under TSan,
  and given that CTest already enforeces a test-level timeout these
  are redundent inside the test functions.

* Fix data race on AccessGenerator::size test harness.

Change-Id: Ib30b36bbd6517f1326660ae578a12d93e4d828c7
Reviewed-on: http://review.couchbase.org/62905
Tested-by: buildbot <build@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
4 years agoMB-16656: Send snapshotEnd as highSeqno for replica vb in GET_ALL_VB_SEQNOS call 25/62925/3 v4.1.1
Manu Dhundi [Fri, 15 Apr 2016 20:54:42 +0000 (13:54 -0700)]
MB-16656: Send snapshotEnd as highSeqno for replica vb in GET_ALL_VB_SEQNOS call

For replica vbucket we must send snapshotEnd received in the last snapshotMarker
as the high seqno. Sending lastClosedChkSeqno can cause problems for view engine
which builds an index from replica vbucket.

Previously this was sent correctly in seqno stats, now adding it for
GET_ALL_VB_SEQNOS as well.

Change-Id: Ifad267521184c4976e1cb194e6814b56963298b0
Reviewed-on: http://review.couchbase.org/62925
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-16656: Send snapshotEnd as highSeqno for replica vb in GET_ALL_VB_SEQNOS call 89/62889/5 v3.1.5
Manu Dhundi [Fri, 15 Apr 2016 15:38:44 +0000 (08:38 -0700)]
MB-16656: Send snapshotEnd as highSeqno for replica vb in GET_ALL_VB_SEQNOS call

For replica vbucket we must send snapshotEnd received in the last snapshotMarker
as the high seqno. Sending lastClosedChkSeqno can cause problems for view engine
which builds an index from replica vbucket.

Previously this was sent correctly in seqno stats, now adding it for
GET_ALL_VB_SEQNOS as well.

Change-Id: I58dd168f9248263172759616bc53e751b536e5e3
Reviewed-on: http://review.couchbase.org/62889
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-19153: Break circular dependency while deleting bucket 99/62699/4
abhinavdangeti [Tue, 12 Apr 2016 01:28:37 +0000 (18:28 -0700)]
MB-19153: Break circular dependency while deleting bucket

As part of unregistering the last bucket, when stopTaskGroup
is invoked, all the running threads will cancelled. In this
issue reported, when DcpBackfill was closed, the ref count of
the DcpProducer whose reference it was holding on to became
zero, causing its destructor to be invoked. In the DcpProducer's
destructor, an attempt was made to cancel the checkpoint creator
task which needed to acquire the executorpool's tMutex that
unregisterBucket had already acquired.

Reproduction steps:
<delete_bucket> --> <unregister_bucket> --> <stop_task_group>
    --> <acquire tMutex> --> .. --> <cancel DcpBackfill> -->
    --> <destroy DcpBackfill> --> <destroy DcpProducer>
    --> <cancel Checkpoint creator task> --> [tries to acquire tMutex]

The fix here would be to not attempt to kill the task within
the DcpProducer's destructor, but to do so when the producer is
being disconnected.

+ Unit test case that reproduces the hang.

Change-Id: Ia3c0597e3d8f85a1b40ef56e251e38339023b471
Reviewed-on: http://review.couchbase.org/62699
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-19113: Address false positive lock inversion seen with test_mb16357 67/62567/7
abhinavdangeti [Fri, 8 Apr 2016 20:57:42 +0000 (13:57 -0700)]
MB-19113: Address false positive lock inversion seen with test_mb16357

The inversion being pointed out by this test case is between the snapshot
lock and the hash table lock and this scenario would never happen
during regular operation, this is because the first thread points out
that the vbucket is active while the second indicates that the vbucket
is replica. These 2 operations can never occur simulataneously.

ThreadSanitizer assumes that this is a lock inversion because the 2
operations are done by different threads (main_thread and dcp_thread).

The fix here suppresses this thread sanitizer warning by getting rid of
the dcp_thread, and instead have the main_thread perform the activity
that the dcp thread is responsible for.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=5899)
  Cycle in lock order graph: M21372 (0x7d780000f510) => M21408 (0x7d640000f920) => M21372

  Mutex M21408 acquired here while holding mutex M21372 in main thread:
    #0 pthread_mutex_lock <null> (engine_testapp+0x00000047e970)
    #1 cb_mutex_enter <null> (libplatform.so.0.1.0+0x000000003870)
    #2 Mutex::acquire() /home/couchbase/couchbase/ep-engine/src/mutex.cc:31 (ep.so+0x0000001e287e)
    #3 LockHolder::lock() /home/couchbase/couchbase/ep-engine/src/locks.h:71 (ep.so+0x000000082543)
    #4 LockHolder::LockHolder(Mutex&, bool) /home/couchbase/couchbase/ep-engine/src/locks.h:48 (ep.so+0x0000000821b2)
    #5 VBucket::getSnapshotLock() /home/couchbase/couchbase/ep-engine/src/vbucket.h:212 (ep.so+0x000000104c72)
    #6 EventuallyPersistentStore::queueDirty(RCPtr<VBucket>&, StoredValue*, LockHolder*, bool, bool, bool) /home/couchbase/couchbase/ep-engine/src/ep.cc:2863 (ep.so+0x0000000d7123)
    #7 EventuallyPersistentStore::set(Item const&, void const*, bool, unsigned char) /home/couchbase/couchbase/ep-engine/src/ep.cc:683 (ep.so+0x0000000d9dfa)
    #8 EventuallyPersistentEngine::store(void const*, void*, unsigned long*, ENGINE_STORE_OPERATION, unsigned short) /home/couchbase/couchbase/ep-engine/src/ep_engine.cc:2128 (ep.so+0x00000013d538)
    #9 EvpStore(engine_interface*, void const*, void*, unsigned long*, ENGINE_STORE_OPERATION, unsigned short) /home/couchbase/couchbase/ep-engine/src/ep_engine.cc:229 (ep.so+0x00000013712d)
    #10 mock_store /home/couchbase/couchbase/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c7304)
    #11 storeCasVb11(engine_interface*, engine_interface_v1*, void const*, ENGINE_STORE_OPERATION, char const*, char const*, unsigned long, unsigned int, void**, unsigned long, unsigned short, unsigned int, unsigned char) /home/couchbase/couchbase/ep-engine/tests/ep_test_apis.cc:659 (ep_testsuite.so+0x0000000e8d17)
    #12 store(engine_interface*, engine_interface_v1*, void const*, ENGINE_STORE_OPERATION, char const*, char const*, void**, unsigned long, unsigned short, unsigned int, unsigned char) /home/couchbase/couchbase/ep-engine/tests/ep_test_apis.cc:631 (ep_testsuite.so+0x0000000e654a)
    #13 test_mb16357(engine_interface*, engine_interface_v1*) /home/couchbase/couchbase/ep-engine/tests/ep_testsuite.cc:11713 (ep_testsuite.so+0x0000000afc36)
    #14 execute_test /home/couchbase/couchbase/memcached/programs/engine_testapp/engine_testapp.c (engine_testapp+0x0000004c4e2f)
    #15 main crtstuff.c (engine_testapp+0x0000004c2d91)

  Mutex M21372 acquired here while holding mutex M21408 in thread T10:
    #0 pthread_mutex_lock <null> (engine_testapp+0x00000047e970)
    #1 cb_mutex_enter <null> (libplatform.so.0.1.0+0x000000003870)
    #2 Mutex::acquire() /home/couchbase/couchbase/ep-engine/src/mutex.cc:31 (ep.so+0x0000001e287e)
    #3 LockHolder::lock() /home/couchbase/couchbase/ep-engine/src/locks.h:71 (ep.so+0x000000082543)
    #4 LockHolder::LockHolder(Mutex&, bool) /home/couchbase/couchbase/ep-engine/src/locks.h:48 (ep.so+0x0000000821b2)
    #5 HashTable::getLockedBucket(int, int*) /home/couchbase/couchbase/ep-engine/src/stored-value.h:1266 (ep.so+0x00000008418a)
    #6 HashTable::getLockedBucket(std::string const&, int*) /home/couchbase/couchbase/ep-engine/src/stored-value.h:1295 (ep.so+0x00000007df9b)
    #7 EventuallyPersistentStore::setWithMeta(Item const&, unsigned long, void const*, bool, bool, unsigned char, bool, bool) /home/couchbase/couchbase/ep-engine/src/ep.cc:1827 (ep.so+0x0000000e6b4f)
    #8 PassiveStream::commitMutation(MutationResponse*, bool) /home/couchbase/couchbase/ep-engine/src/dcp-stream.cc:1369 (ep.so+0x00000029ba21)
    #9 PassiveStream::processMutation(MutationResponse*) /home/couchbase/couchbase/ep-engine/src/dcp-stream.cc:1341 (ep.so+0x00000029a7a0)
    #10 PassiveStream::processBufferedMessages(unsigned int&) /home/couchbase/couchbase/ep-engine/src/dcp-stream.cc:1281 (ep.so+0x00000029a0f2)
    #11 DcpConsumer::processBufferedItems() /home/couchbase/couchbase/ep-engine/src/dcp-consumer.cc:599 (ep.so+0x000000262a23)
    #12 Processer::run() /home/couchbase/couchbase/ep-engine/src/dcp-consumer.cc:48 (ep.so+0x0000002625ff)
    #13 ExecutorThread::run() /home/couchbase/couchbase/ep-engine/src/executorthread.cc:110 (ep.so+0x0000001e3dd9)
    #14 launch_executor_thread(void*) /home/couchbase/couchbase/ep-engine/src/executorthread.cc:34 (ep.so+0x0000001e32ea)
    #15 platform_thread_wrap /home/couchbase/couchbase/platform/src/cb_pthreads.c (libplatform.so.0.1.0+0x00000000362c)

Change-Id: I6c7b1fadf76529a044341a4a9b6ed0ea829c4999
Reviewed-on: http://review.couchbase.org/62567
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMerge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock' 98/62598/1
abhinavdangeti [Fri, 8 Apr 2016 03:03:22 +0000 (20:03 -0700)]
Merge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock'

couchbase/3.0.x:
|\
| * 771859f MB-19093 [BP]: [ActiveStream] Address potential lock-inversion scenarios
| * 6811030 MB-19075: Remove printing of empty string in CouchKVStore::getMulti()

Change-Id: I06717dcaad2b764582ebe378a6f9023f7ca76d88

5 years agoMB-19093 [BP]: [ActiveStream] Address potential lock-inversion scenarios 98/62498/4
abhinavdangeti [Fri, 15 Jan 2016 16:36:52 +0000 (08:36 -0800)]
MB-19093 [BP]: [ActiveStream] Address potential lock-inversion scenarios

Acquire vbucket state lock only when really necessary
in the ActiveStream context. Also avoid acquiring one
lock within the other wherever possible in the ActiveStream
context again.

This change is to avert potential deadlocks due to
lock inversion that will be induced by upcoming changes,
here are the scenarios:
(i)     Locking between streamsMutex, streamMutex and
        vb_stateLock in the set operation - handle
        response scenario.
        (http://factory.couchbase.com/job/ep-engine-threadsanitizer-master/1225/console)
(ii)    In case of a set operation, vb_stateLock is
        acquired and then streamMutex is acquired for
        notification. During markDiskSnapshot, the
        streamMutex is acquired before the vb_stateLock
        lock is acquired.
        (http://factory.couchbase.com/job/ep-engine-threadsanitizer-master/1268/console)

(Already reviewed at: http://review.couchbase.org/58557)

Change-Id: I5e5a3e2cc5ba9ae17090e1a3ee4bde100d305f1c
Reviewed-on: http://review.couchbase.org/62498
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-19075: Remove printing of empty string in CouchKVStore::getMulti() 25/62325/7
Sriram Ganesan [Fri, 1 Apr 2016 23:12:10 +0000 (16:12 -0700)]
MB-19075: Remove printing of empty string in CouchKVStore::getMulti()

In case of an error in opening a file, an error message is logged.
But the string that is supposed to hold the name of the file is
not populated, thus resulting in an empty string getting printed.
Remove the string from printed as openDB already prints the name
of the file in case of an open failure.

Change-Id: Ife3aec8381ead4f2e0b84c921a3781efa39a2126
Reviewed-on: http://review.couchbase.org/62325
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-18476: Handle write failures more gracefully in the mutation log 16/61116/11
Sriram Ganesan [Tue, 8 Mar 2016 22:08:50 +0000 (14:08 -0800)]
MB-18476: Handle write failures more gracefully in the mutation log

Log and error message in case of a write failure and remove any unnecessary
asserts in that code path

Change-Id: I50b7e4de4d414e21bf00404a22863baff06c0f4f
Reviewed-on: http://review.couchbase.org/61116
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17517: Testcase for invalid max_cas in _local/vbstate 58/60858/3
Dave Rigby [Thu, 3 Mar 2016 15:03:27 +0000 (15:03 +0000)]
MB-17517: Testcase for invalid max_cas in _local/vbstate

Unit test for fe67145 - verify that if a vbucket is created with a max
CAS of -1, upon re-reading from disk it is detected and fixed.

Change-Id: Ifdf3e642589f93191786ac55c1cf02207159f657
Reviewed-on: http://review.couchbase.org/60858
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
5 years agoMB-17517: Don't create TAP_MUTATIONS with CAS of -1 01/60801/6
Dave Rigby [Wed, 2 Mar 2016 16:56:40 +0000 (16:56 +0000)]
MB-17517: Don't create TAP_MUTATIONS with CAS of -1

In TapProducer::getNextItem(), a TAP mutation is generated by asking
EPStore to get() an Item for the given key. However, if the key in
question is locked (for example, if there was a SET followed by a LOCK
operation), then get() returns an Item with a CAS of -1. This is
normally only intended for end-users (we 'hide' the CAS of locked
items so only the client which locked it, and has the correct CAS can
mutate it). However we incorrectly end up creating a TAP_MUTATION
packet with a CAS of -1; corrupting the data sent to the TAP consumer
(replica, backup, etc).

Fix by adding a new option to get() to make the CAS 'hiding' optional,
and choose to not hide it for the TapProducer.

Change-Id: I4e9f4963e77437c9dc5a9ebb482c727e8ef4beb7
Reviewed-on: http://review.couchbase.org/60801
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years ago[BP] MB-17517: Check for and disard an invalid max_cas in _local/vbstate 95/60795/3
Dave Rigby [Wed, 2 Mar 2016 16:43:41 +0000 (16:43 +0000)]
[BP] MB-17517: Check for and disard an invalid max_cas in _local/vbstate

If there is an invalid max CAS value on disk (in _local/vbstate),
discard it (resetting it to zero). Instead will need to rebuild the
max CAS value from the items we load from the file.

Change-Id: Ibe283160ab0b477b1a2a91985036269ce1d590af
Reviewed-on: http://review.couchbase.org/60795
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMerge "Merge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock'"...
Chiyoung Seo [Tue, 1 Mar 2016 17:59:44 +0000 (17:59 +0000)]
Merge "Merge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock'" into sherlock

5 years agoMB-17024: Add more logs during bucket deletion 96/60596/2
Sriram Ganesan [Sat, 27 Feb 2016 02:30:02 +0000 (18:30 -0800)]
MB-17024: Add more logs during bucket deletion

Add more logs and change certain logs to WARNING level while these
tasks are stopped during bucket deletion

Change-Id: Ia2ada9d8525cec4c3e50ed087d2d052a05663873
Reviewed-on: http://review.couchbase.org/60596
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
5 years agoMerge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock' 11/60711/1
Jim Walker [Tue, 1 Mar 2016 12:34:11 +0000 (12:34 +0000)]
Merge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock'

* couchbase/3.0.x:
  MB-17889: No notify from streamRequest
  MB-17889: Address view_query latency regression

Change-Id: Ib1a5c25fccdfddc85e6072ec1d6831569c84128f

5 years agoMB-17889: No notify from streamRequest 52/60352/3 v3.1.4
Jim Walker [Mon, 22 Feb 2016 21:01:09 +0000 (21:01 +0000)]
MB-17889: No notify from streamRequest

The 3.1.3 (i.e. before the DCP churn) didn't ever notify from
streamRequest, so that is reverted and helps to bring
view_query latency down (view engine is constantly creating
streams).

A second tweak is to not call stream->next whilst holding
the streamMutex. This can block streamRequest again
affecting the view-engine's DCP stream requests.

Change-Id: I5b57fd7998003251fb32897f37c8a2f15f687a13
Reviewed-on: http://review.couchbase.org/60352
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17889: Address view_query latency regression 50/59750/3
Jim Walker [Wed, 10 Feb 2016 16:33:10 +0000 (16:33 +0000)]
MB-17889: Address view_query latency regression

Some of the stale=false query latency tests show a 2x increase
in the 80 and 95 percentile. This is observed when moving from
3.1.4 1835 to 1836.

The prime suspect is that the snapshot yield parameter is now
stalling the view engine's DCP data.

In the change from 1835 to 1836 this value was tuned based upon
queue lengths observed during rebalance, moving from 10 to 256.

It could be that the task now runs for longer periods blocking
DCP backfill from running, and view engine drives many backfills
due to the way it frequently closes and opens streams.

Note that DCP Backfill and the snapshot task both share the same
task type, so can block each other.

This patch moves this config value back to 10 (as it was in 1835).

The view-query latency is very difficult to relibably reproduce, observe
and tune, but there is evidence (a trend) that with this config value
at 10, the performance (view latency) is improved.

Some small scale rebalance tests (3 node cluster, swap 1 node for 1)
showed that with 10 rebalance was not adversly affected, but it's a risk.

A latency comparison is attached to the MB which hints that
the latency is better.

https://issues.couchbase.com/secure/attachment/29513/29513_benchmark.png

Change-Id: I6ecf8ff950f77638eb03e4fedaefb700cf945d54
Reviewed-on: http://review.couchbase.org/59750
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMerge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock' 11/60211/1
Manu Dhundi [Thu, 18 Feb 2016 20:04:11 +0000 (12:04 -0800)]
Merge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock'

*
|\
| * 61ab952 2016-02-18 | MB-17889: DCP producer can leave an operation behind

Change-Id: Iad7e13aef50018432bedc17f776984c7a21a4cb4

5 years agoMB-17889: DCP producer can leave an operation behind 96/60196/3
Jim Walker [Thu, 18 Feb 2016 15:14:28 +0000 (15:14 +0000)]
MB-17889: DCP producer can leave an operation behind

In low-traffic setups there's a case in DcpProducer::getNextItem
where the function exits and pauses the task, yet an item was
waiting to be sent.

getNextItem() looked like

  1. setpaused=false;
  2. while(ready.pop(vbucket)) {
  3.   process(vbucket);
  4. }
  5. setpaused=true;
  6. return NULL;

The notifier

  a. if(ready.pushUnique(vbucket))
  b.   wakeupIfPaused;

If a and b execute between 4 and 5, then the producer will sleep
and not process the vbucket until the next wakeup (which maybe never).

This is not good, the first operation will have a long latency before
it can be seen on DCP. As long as it takes for the second operation.

If a and b occur between 5 and 6, that's fine, wakupIfPaused will re-wake
the producer.

a,b,5,6 is bad
5,a,6,b is ok
5,6,a,b is ok
5,a,b,6 is ok
5,6,a,b is ok
...

The fix.

getNextItem()

  0. do {
  1.   setpaused=false;
  2.   while(ready.pop(vbucket)) {
  3.     process(vbucket);
  4.   }
  5.   setpaused=true;
  6.  } while(!ready.empty());
  7. return NULL;

Now if ab occurs after 4, but before 5, it's ok as 6 will now consume
the vbucket.

5,a,b,6 is ok, as 6 will loop and consume
5,a,6,b is ok, "    "    "    "    "
6,a,b,7 is ok, paused is true (5), b will wake the task

Change-Id: Ib412a85ee10de0e2a2ca4116d0cc85bbad538da2
Reviewed-on: http://review.couchbase.org/60196
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMerge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock' 37/60137/1
abhinavdangeti [Wed, 17 Feb 2016 18:10:54 +0000 (10:10 -0800)]
Merge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock'

couchbase/3.0.x:
|\
| * 37d53b1 MB-18171: Break cyclic reference between ActiveStream & ChkptProcesser
| * 928ba39 MB-17766: Set maxNumAuxIO in stream_test to zero
| * 9ac7fd0 [BP] MB-17766: Fix intermittant stream_test failure

Change-Id: I5a01bbf360bd5e59a91ae8400f3de14cc50f5911

5 years agoMB-18171: Break cyclic reference between ActiveStream & ChkptProcesser 60/60060/4
abhinavdangeti [Tue, 16 Feb 2016 18:49:09 +0000 (10:49 -0800)]
MB-18171: Break cyclic reference between ActiveStream & ChkptProcesser

Removing circular dependency between ActiveStream and
ActiveStreamCheckpointProcesserTask where each holds a reference
to the other causing a memory leak during shutdown.

Also explicitly clear the queues of checkpointProcessor task upon
disconnection of the DcpProducer, so as to remove a cyclic reference
between DcpProducer, ActiveStream, and ActiveStreamCheckpointProcesserTask.

Change-Id: Ifac03a40132431476a6b5000725ce972068b47f4
Reviewed-on: http://review.couchbase.org/60060
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17766: Set maxNumAuxIO in stream_test to zero 59/60059/2
abhinavdangeti [Tue, 16 Feb 2016 19:12:53 +0000 (11:12 -0800)]
MB-17766: Set maxNumAuxIO in stream_test to zero

Setting maxNumAuxIO to zero will ensure that the producer's
ActiveStreamCheckpointProcesserTask will never run causing
unexpected results in the test context.

Change-Id: I5e7f4b18b1b72af1f99e83cadc5ee979dbcd4cae
Reviewed-on: http://review.couchbase.org/60059
Tested-by: buildbot <build@couchbase.com>
Well-Formed: Hari Kodungallur <hari.kodungallur@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years ago[BP] MB-17766: Fix intermittant stream_test failure 58/60058/2
Dave Rigby [Tue, 16 Feb 2016 12:54:38 +0000 (12:54 +0000)]
[BP] MB-17766: Fix intermittant stream_test failure

Address two issues:

1) end sequence numbers were incorrect, which could result in not
   having any items in our cursor.
2) Don't check CheckpointMamager::registerCursor() return falue, we
   don't actually care if any other cursors are already registered for
   a given checkpoint (persistence cursor sometimes registers before
   us).

Change-Id: I1145d5fb61c0c12f019154c979afdd50b4060509
Reviewed-on: http://review.couchbase.org/60058
Tested-by: buildbot <build@couchbase.com>
Well-Formed: Hari Kodungallur <hari.kodungallur@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMerge branch 'couchbase/3.0.x' into 'couchbase/sherlock' 66/59866/1
abhinavdangeti [Fri, 12 Feb 2016 17:13:58 +0000 (09:13 -0800)]
Merge branch 'couchbase/3.0.x' into 'couchbase/sherlock'

couchbase/3.0.x:
|\
| * b84d09d MB-17766: Regression test that checks for race during takeover
| * ba305c4 MB-17766: Incorrect ordering of messages during ActiveStream's takeover-send phase
| * 4f39683 MB-17766: Avoid copy overhead of std::deque in getOutstandingItems
| * e3f4855 MB-17766: Refactor nextCheckpointItemTask to allow testing

Change-Id: I57a2aa37abc4ab60f09648bd7b02740b4bf933e6

5 years agoMerge branch 'couchbase/3.0.x' into 'couchbase/sherlock' 65/59765/11
abhinavdangeti [Fri, 12 Feb 2016 16:54:56 +0000 (08:54 -0800)]
Merge branch 'couchbase/3.0.x' into 'couchbase/sherlock'

couchbase/3.0.x:
|\
| * 0da7d42 MB-17885: Address compilation errors in ep_testsuite.cc
| * b7ee24c MB-17885: Update flow control bytesSent correctly on DCP producer

Change-Id: I70cda64395781a433a8e40720bdc5c75f5d0e3c2

5 years agoMB-17766: Regression test that checks for race during takeover 66/59666/8
Dave Rigby [Tue, 9 Feb 2016 18:21:25 +0000 (18:21 +0000)]
MB-17766: Regression test that checks for race during takeover

Module test: ep-engine_stream_test

Change-Id: I8e11722b1ed1029c8b969dcb88000c5903fbb0ca
Reviewed-on: http://review.couchbase.org/59666
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17766: Incorrect ordering of messages during ActiveStream's takeover-send phase 27/59627/7
abhinavdangeti [Tue, 9 Feb 2016 20:18:01 +0000 (12:18 -0800)]
MB-17766: Incorrect ordering of messages during ActiveStream's takeover-send phase

A race between the step() and ActiveStreamCheckpointProcessorTask
can cause mutations to be queued into the readyQ after the
setVBucketState(active) message.

Here's the scenario in chronology (T: Front-end thread, BT: IO thread):
1. T1: ActiveStream::setVBucketStateAckRecieved()
2. T1: transitionState(takeoverSend) => schedules ActiveStreamCheckpointProcessorTask
3. BT1: manageConnections() notifies memcached about specific conn (max idle time: 5s)
4. BT2: ActiveStreamCheckpointProcessorTask runs, gets all Items For Cursor
5. T1: step() -> takeoverSendPhase() -> readyQ is empty
        => nextCheckpointItem() return false, as getNumItemsForCursor returns 0
        => setVbucketState(active) added to readyQ
6. BT2: ActiveStreamCheckpointProcessorTask continues, adds mutations acquired into readyQ
        -> Note the mutations were acquired in step 4
        => Notified memcached connections
7. T1: step() .. ships messages in incorrect order

On the new master, the vbucket is promoted to active state and then more
mutations are received from the old master. If there were front end ops
at this time, there could be an inconsistency in highSeqno or in worst
cases crashes in checkpoint manager due to highSeqno not belonging in
the designated range.

The fix: Add an atomic flag that is also checked for along with
getNumItemsForCursor in nextCheckpointItem(). This flag is set before
retrieving all items for a cursor (getAllItemsForCursor) and unset after
all the retrieved items have been added to the ready queue of the stream.

Change-Id: I5c04d47cc99c7dd3b2d87cb68dd30d36473226e5
Reviewed-on: http://review.couchbase.org/59627
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17766: Avoid copy overhead of std::deque in getOutstandingItems 54/59754/2
abhinavdangeti [Wed, 10 Feb 2016 18:03:03 +0000 (10:03 -0800)]
MB-17766: Avoid copy overhead of std::deque in getOutstandingItems

Change-Id: I771182bd54a0f702f70287ff4728d26b7ffaa323
Reviewed-on: http://review.couchbase.org/59754
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17766: Refactor nextCheckpointItemTask to allow testing 64/59664/4
Dave Rigby [Tue, 9 Feb 2016 13:33:38 +0000 (13:33 +0000)]
MB-17766: Refactor nextCheckpointItemTask to allow testing

Split nextCheckpointItemTask() into two inner (protected) functions,
to allow testing of the fix for MB-17766.

Change-Id: I9d441d873cf7f727f90a966d4dda03043c7f6480
Reviewed-on: http://review.couchbase.org/59664
Well-Formed: buildbot <build@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17885: Address compilation errors in ep_testsuite.cc 78/59678/4
abhinavdangeti [Tue, 9 Feb 2016 22:27:10 +0000 (14:27 -0800)]
MB-17885: Address compilation errors in ep_testsuite.cc

3.0.x don't support C++11!

Change-Id: Ia3adf0c7ace9b771b999427811f0872774740386
Reviewed-on: http://review.couchbase.org/59678
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>

5 years agoMB-17517: Regenerate CAS for TAP/DCP mutations with invalid CAS 97/59597/4
Dave Rigby [Fri, 5 Feb 2016 11:47:27 +0000 (11:47 +0000)]
MB-17517: Regenerate CAS for TAP/DCP mutations with invalid CAS

When receiving mutations via TAP or DCP, validate the CAS if invalid
generate a new one.

This is instead of the simply dropping the mutaiton (returning an
error to the producer), as by dropping the mutation we could cause
data loss.

Change-Id: I3183aab7c5eecb090ccc560319a7aac915cb35b8
Reviewed-on: http://review.couchbase.org/59597
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17885: Update flow control bytesSent correctly on DCP producer 75/59575/4
Manu Dhundi [Mon, 8 Feb 2016 17:42:25 +0000 (09:42 -0800)]
MB-17885: Update flow control bytesSent correctly on DCP producer

This is a fix for a regression introduced recently. Also this adds
a DCP test case to test flow control behavior of DCP producer.

Change-Id: Ia56858cb9e687a0a045b582c18e4b68948cb460c
Reviewed-on: http://review.couchbase.org/59575
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-16910: Stop logging multiple 'warmup is complete' messages 98/58898/8
Norair Khachiyan [Wed, 27 Jan 2016 23:54:30 +0000 (15:54 -0800)]
MB-16910: Stop logging multiple 'warmup is complete' messages

Fix prevents logging more than one 'Engine warmup is complete' message
for each bucket. Fix resolves problem of overloading log file with the
message mentioned above.

Change-Id: Icfdeb5def6e3f055159e1a57430c6dc661382e30
Reviewed-on: http://review.couchbase.org/58898
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMerge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock' 46/59146/1
abhinavdangeti [Wed, 27 Jan 2016 22:40:20 +0000 (14:40 -0800)]
Merge remote-tracking branch 'couchbase/3.0.x' into 'couchbase/sherlock'

couchbase/3.0.x:
|\
| * ae39d7f MB-17502: DCP performance regression fixed.

Change-Id: Id3262008479ffe7e350a9a4d88438ed541ac242a

5 years agoMB-17502: DCP performance regression fixed. 86/58886/9
Jim Walker [Wed, 6 Jan 2016 15:40:57 +0000 (15:40 +0000)]
MB-17502: DCP performance regression fixed.

Many patches were added to speed up DCP, however some of that
performance was lost when doing some code tidying without
re-profiling.

With all the DCP performance patches (particuarly) 87869fd39 straight
DCP performance is a touch slower. This is because DCP used to take
one lock and then do work. The new code has more locks, but holds
them for fewer lines of code. This means that DCP is friendlier/fairer
to the other threads interacting with a DCP producer.
The front-end operation threads are no longer stalled for long periods
whilst DCP holds the one lock.

Frontend latency before locking changes:

 === Latency [With background DCP] - 100000 items
                                 Percentile
                   Median     95th     99th  Std Dev
 Add               16.337   34.894   45.241   25.627
 Get                1.226    1.524    1.745    0.435
 Replace           16.311   34.386   42.097    8.435
 Delete            15.636   32.915   41.999    7.408

Frontend latency after locking changes:

 === Latency [With background DCP] - 100000 items
                                 Percentile
                   Median     95th     99th  Std Dev
 Add                3.996   12.159   20.724   11.376
 Get                1.299    1.629    1.730    0.634
 Replace            4.274   12.831   22.988    4.523
 Delete             3.142   10.302   14.292    3.350

The average and 95th/99th are all improved.

Fix details:

The roundRobin/vbReady code has a bufferLog.pauseIfFull call on the
"hot" part of the loop, this is the main cause of the regression.

With that fixed CPU profiling and benchmarking shows that DCP is back
to 3.1.3 levels but highlighted that:

1. DcpProducer::getNextItem was hot (5% of a DcpProducer thread).
2. DcpConsumer::processBufferedItems was hitting SpinLock hard.
   20 to 30% at times was consumed by SpinLock code.
3. snapshot creation was frequently yielding even though it had work todo.

So to address 1. the fix is actually to remove the roundRobin/vbReady
code. It is actually no better and in some cases a little slower than
the orginal. This code is replaces with std:: structures *but* the
Mutex used has a much smaller scope.

Note the DcpProducerReadyQueue has been profiled and proven that having
the std::map powering find() is much faster than searching the list.
This is important because the find method is part of the front-end
operation thread.

To address 2. it was observed that the consumer code is constructing
a passive_stream_t frequently, then testing if there is a pointer.
The construction uses the SpinLock code and can be avoided just by
testing the streams[vb] directly and only then do we construct
a copy of the passive_stream_t. This avoids the SpinLock code on
every iteration of the for loop in the affected function.

To address 3. ensure that the snapshot tasks work queue doesn't have
duplicates, there's no need. Then raise the number of snapshots before
yield. Various rebalances showed that around 250 was enough, so let's go
with 256.

Change-Id: I8fb0bd30f8e07d000192675de425726ad26e403a
Reviewed-on: http://review.couchbase.org/58886
Well-Formed: buildbot <build@couchbase.com>
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
Tested-by: buildbot <build@couchbase.com>
5 years agoMB-17517: check for invalid CAS value in [set/delete]WithMeta 10/58910/3
Sriram Ganesan [Thu, 21 Jan 2016 20:00:36 +0000 (12:00 -0800)]
MB-17517: check for invalid CAS value in [set/delete]WithMeta

In a [set/delete]WithMeta call from either XDCR or from DCP,
a corrupt CAS value for the incoming item should return an
error

Change-Id: Id87627ae35ef711de4704a960b3aa7d1e9b9a48b
Reviewed-on: http://review.couchbase.org/58910
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>
5 years agoMB-17517: return EINVAL instead of assert in arithmetic 68/58868/3
Sriram Ganesan [Thu, 21 Jan 2016 02:15:56 +0000 (18:15 -0800)]
MB-17517: return EINVAL instead of assert in arithmetic

If a get performed on an item returns a CAS value of zero, then
return EINVAL as opposed to asserting

Change-Id: If3d43c270bcc627029d0954dab0e570c83ddca74
Reviewed-on: http://review.couchbase.org/58868
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: abhinav dangeti <abhinav@couchbase.com>
Reviewed-by: Chiyoung Seo <chiyoung@couchbase.com>