MB-19886: In markDiskSnapshot() get current vb snapshot info outside streamMutex 40/64840/4
authorManu Dhundi <manu@couchbase.com>
Thu, 9 Jun 2016 20:22:45 +0000 (13:22 -0700)
committerDave Rigby <daver@couchbase.com>
Mon, 13 Jun 2016 09:26:49 +0000 (09:26 +0000)
We need this to overcome the lock inversion detected in
http://cv.jenkins.couchbase.com/job/ep-engine-threadsanitizer-3.0.x/263/console.

Explaining the lock inversion:
(1) Backfill thread sending disk snapshot:
    streamMutex (class Stream) ==> snapshotMutex (class VBucket)

(2) Front End thread receiving DCP mutation from active vb:
    snapshotMutex (class VBucket) ==> stateLock (class VBucket) ==>
                                              streamsMutex (class DcpProducer)

(3) Another front end thread disconnecting the view engine connection:
    streamsMutex (class DcpProducer) ==> streamMutex (class Stream)

Solution:
Break streamMutex (class Stream) ==> snapshotMutex (class VBucket).
This is done by making snapshot variables atomic and it should be good
as the backfill thread needs only snapshot_end.

Change-Id: Id1cff42dfe39151d9a19c826d7e47e23b3fc4d21
Reviewed-on: http://review.couchbase.org/64840
Tested-by: buildbot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Well-Formed: buildbot <build@couchbase.com>

src/dcp-stream.cc
src/vbucket.h

index 8e6a9aa..f211a51 100644 (file)
@@ -368,8 +368,8 @@ void ActiveStream::markDiskSnapshot(uint64_t startSeqno, uint64_t endSeqno) {
         if (end_seqno_ > endSeqno) {
             /* We possibly have items in the open checkpoint
                (incomplete snapshot) */
-            uint64_t snapshot_start, snapshot_end;
-            vb->getCurrentSnapshot(snapshot_start, snapshot_end);
+            uint64_t snapshot_end;
+            vb->getCurrentSnapshotEnd(snapshot_end);
             LOG(EXTENSION_LOG_WARNING,
                 "(vb %" PRIu16 ") Merging backfill and memory snapshot for a "
                 "replica vbucket, backfill start seqno %" PRIu64 ", "
index db5fb52..fc2f951 100644 (file)
@@ -225,8 +225,8 @@ public:
             cb_assert(end >= (uint64_t)checkpointManager.getHighSeqno());
         }
 
-        cur_snapshot_start = start;
-        cur_snapshot_end = end;
+        cur_snapshot_start.store(start);
+        cur_snapshot_end.store(end);
     }
 
     void getCurrentSnapshot(uint64_t& start, uint64_t& end) {
@@ -235,8 +235,12 @@ public:
     }
 
     void getCurrentSnapshot_UNLOCKED(uint64_t& start, uint64_t& end) {
-        start = cur_snapshot_start;
-        end = cur_snapshot_end;
+        start = cur_snapshot_start.load();
+        end = cur_snapshot_end.load();
+    }
+
+    void getCurrentSnapshotEnd(uint64_t& end) {
+        end = cur_snapshot_end.load();
     }
 
     int getId(void) const { return id; }
@@ -451,9 +455,11 @@ private:
     Mutex pendingBGFetchesLock;
     vb_bgfetch_queue_t pendingBGFetches;
 
+    /* snapshotMutex is used to update/read the pair {start, end} atomically,
+       but not if reading a single field. */
     Mutex snapshotMutex;
-    uint64_t cur_snapshot_start;
-    uint64_t cur_snapshot_end;
+    AtomicValue<uint64_t> cur_snapshot_start;
+    AtomicValue<uint64_t> cur_snapshot_end;
 
     Mutex hpChksMutex;
     std::list<HighPriorityVBEntry> hpChks;