1 #+TITLE:     EP Stats
2 #+AUTHOR:    Dustin Sallings
3 #+EMAIL:     dustin@spy.net
4 #+DATE:      2010-02-08 Mon
5 #+DESCRIPTION:
6 #+KEYWORDS:
7 #+LANGUAGE:  en
8 #+OPTIONS:   H:3 num:t toc:t \n:nil @:t ::t |:t ^:nil -:t f:t *:t <:t
9 #+OPTIONS:   TeX:t LaTeX:nil skip:nil d:nil todo:t pri:nil tags:not-in-toc
10 #+INFOJS_OPT: view:nil toc:nil ltoc:t mouse:underline buttons:0 path:http://orgmode.org/org-info.js
11 #+EXPORT_SELECT_TAGS: export
12 #+EXPORT_EXCLUDE_TAGS: noexport
13 #+LINK_UP:
14 #+LINK_HOME:
15 #+STYLE:  <link rel="stylesheet" type="text/css" href="myorg.css" />
16
17 * Getting Started
18
19 For introductory information on stats within Couchbase, start with the
20 [[http://docs.couchbase.com/][Couchbase Server documentation]].
21
22 * Stats Definitions
23
24 ** Toplevel Stats
25
26 | Stat                               | Description                            |
27 |------------------------------------+----------------------------------------|
28 | uuid                               | The unique identifier for the bucket   |
29 | ep_version                         | Version number of ep_engine            |
30 | ep_storage_age                     | Seconds since most recently            |
31 |                                    | stored object was initially queued     |
32 | ep_storage_age_highwat             | ep_storage_age high water mark         |
33 | ep_startup_time                    | System-generated engine startup time   |
34 | ep_data_age                        | Seconds since most recently            |
35 |                                    | stored object was modified             |
36 | ep_data_age_highwat                | ep_data_age high water mark            |
37 | ep_num_workers                     | Global number of shared worker threads |
38 | ep_bucket_priority                 | Priority assigned to the bucket        |
39 | ep_total_enqueued                  | Total number of items queued for       |
40 |                                    | persistence                            |
41 | ep_total_new_items                 | Total number of persisted new items    |
42 | ep_total_del_items                 | Total number of persisted deletions    |
43 | ep_total_persisted                 | Total number of items persisted        |
44 | ep_item_flush_failed               | Number of times an item failed to      |
45 |                                    | flush due to storage errors            |
46 | ep_item_commit_failed              | Number of times a transaction failed   |
47 |                                    | to commit due to storage errors        |
48 | ep_item_begin_failed               | Number of times a transaction failed   |
49 |                                    | to start due to storage errors         |
50 | ep_expired_access                  | Number of times an item was expired on |
51 |                                    | application access.                    |
52 | ep_expired_compactor               | Number of times an item was expired by |
53 |                                    | the ep engine compactor                |
54 | ep_expired_pager                   | Number of times an item was expired by |
55 |                                    | ep engine item pager                   |
56 | ep_item_flush_expired              | Number of times an item is not flushed |
57 |                                    | due to the expiry of the item          |
58 | ep_queue_size                      | Number of items queued for storage     |
59 | ep_flusher_todo                    | Number of items currently being        |
60 |                                    | written                                |
61 | ep_flusher_state                   | Current state of the flusher thread    |
62 | ep_commit_num                      | Total number of write commits          |
63 | ep_commit_time                     | Number of milliseconds of most recent  |
64 |                                    | commit                                 |
65 | ep_commit_time_total               | Cumulative milliseconds spent          |
66 |                                    | committing                             |
67 | ep_vbucket_del                     | Number of vbucket deletion events      |
68 | ep_vbucket_del_fail                | Number of failed vbucket deletion      |
69 |                                    | events                                 |
70 | ep_vbucket_del_max_walltime        | Max wall time (µs) spent by deleting   |
71 |                                    | a vbucket                              |
72 | ep_vbucket_del_avg_walltime        | Avg wall time (µs) spent by deleting   |
73 |                                    | a vbucket                              |
74 | ep_pending_compactions             | Number of pending vbucket compactions  |
75 | ep_rollback_count                  | Number of rollbacks on consumer        |
76 | ep_flush_duration_total            | Cumulative seconds spent flushing      |
77 | ep_flush_all                       | True if disk flush_all is scheduled    |
78 | ep_num_ops_get_meta                | Number of getMeta operations           |
79 | ep_num_ops_set_meta                | Number of setWithMeta operations       |
80 | ep_num_ops_del_meta                | Number of delWithMeta operations       |
81 | ep_num_ops_set_meta_res_failed     | Number of setWithMeta ops that failed  |
82 |                                    | conflict resolution                    |
83 | ep_num_ops_del_meta_res_failed     | Number of delWithMeta ops that failed  |
84 |                                    | conflict resolution                    |
85 | ep_num_ops_set_ret_meta            | Number of setRetMeta operations        |
86 | ep_num_ops_del_ret_meta            | Number of delRetMeta operations        |
87 | ep_num_ops_get_meta_on_set_meta    | Num of background getMeta operations   |
88 |                                    | spawned due to setWithMeta operations  |
89 | curr_items                         | Num items in active vbuckets (temp +   |
90 |                                    | live)                                  |
91 | curr_temp_items                    | Num temp items in active vbuckets      |
92 | curr_items_tot                     | Num current items including those not  |
93 |                                    | active (replica, dead and pending      |
94 |                                    | states)                                |
95 | ep_kv_size                         | Memory used to store item metadata,    |
96 |                                    | keys and values, no matter the         |
97 |                                    | vbucket's state. If an item's value is |
98 |                                    | ejected, this stat will be             |
99 |                                    | decremented by the size of the item's  |
100 |                                    | value.                                 |
101 | ep_blob_num                        | The number of blob objects in the cache|
102 | ep_blob_overhead                   | The "unused" memory caused by the      |
103 |                                    | allocator returning bigger chunks than |
104 |                                    | requested                              |
105 | ep_value_size                      | Memory used to store values for        |
106 |                                    | resident keys                          |
107 | ep_storedval_size                  | Memory used by storedval objects       |
108 | ep_storedval_overhead              | The "unused" memory caused by the      |
109 |                                    | allocator returning bigger chunks than |
110 |                                    | requested                              |
111 | ep_storedval_num                   | The number of storedval objects        |
112 |                                    | allocated                              |
113 | ep_overhead                        | Extra memory used by transient data    |
114 |                                    | like persistence queues, replication   |
115 |                                    | queues, checkpoints, etc               |
116 | ep_item_num                        | The number of item objects allocated   |
117 | ep_mem_low_wat                     | Low water mark for auto-evictions      |
118 | ep_mem_low_wat_percent             | Low water mark (as a percentage)       |
119 | ep_mem_high_wat                    | High water mark for auto-evictions     |
120 | ep_mem_high_wat_percent            | High water mark (as a percentage)      |
121 | ep_total_cache_size                | The total byte size of all items, no   |
122 |                                    | matter the vbucket's state, no matter  |
123 |                                    | if an item's value is ejected          |
124 | ep_oom_errors                      | Number of times unrecoverable OOMs     |
125 |                                    | happened while processing operations   |
126 | ep_tmp_oom_errors                  | Number of times temporary OOMs         |
127 |                                    | happened while processing operations   |
128 | ep_mem_tracker_enabled             | True if memory usage tracker is        |
129 |                                    | enabled                                |
130 | ep_bg_fetched                      | Number of items fetched from disk      |
131 | ep_bg_meta_fetched                 | Number of meta items fetched from disk |
132 | ep_bg_remaining_jobs               | Number of remaining bg fetch jobs      |
133 | ep_max_bg_remaining_jobs           | Max number of remaining bg fetch jobs  |
134 |                                    | that we have seen in the queue so far  |
135 | ep_tap_bg_fetched                  | Number of tap disk fetches             |
136 | ep_tap_bg_fetch_requeued           | Number of times a tap bg fetch task is |
137 |                                    | requeued                               |
138 | ep_num_pager_runs                  | Number of times we ran pager loops     |
139 |                                    | to seek additional memory              |
140 | ep_num_expiry_pager_runs           | Number of times we ran expiry pager    |
141 |                                    | loops to purge expired items from      |
142 |                                    | memory/disk                            |
143 | ep_num_access_scanner_runs         | Number of times we ran access scanner  |
144 |                                    | to snapshot working set                |
145 | ep_num_access_scanner_skips        | Number of times access scanner task    |
146 |                                    | decided not to generate access log     |
147 | ep_access_scanner_num_items        | Number of items that last access       |
148 |                                    | scanner task swept to access log.      |
149 | ep_access_scanner_task_time        | Time of the next access scanner task   |
150 |                                    | (GMT), NOT_SCHEDULED if access scanner |
151 |                                    | has been disabled                      |
152 | ep_access_scanner_last_runtime     | Number of seconds that last access     |
153 |                                    | scanner task took to complete.         |
154 | ep_expiry_pager_task_time          | Time of the next expiry pager task     |
155 |                                    | (GMT), NOT_SCHEDULED if expiry pager   |
156 |                                    | has been disabled                      |
157 | ep_items_rm_from_checkpoints       | Number of items removed from closed    |
158 |                                    | unreferenced checkpoints               |
159 | ep_num_value_ejects                | Number of times item values got        |
160 |                                    | ejected from memory to disk            |
161 | ep_num_eject_failures              | Number of items that could not be      |
162 |                                    | ejected                                |
163 | ep_num_not_my_vbuckets             | Number of times Not My VBucket         |
164 |                                    | exception happened during runtime      |
165 | ep_tap_keepalive                   | Tap keepalive time                     |
166 | ep_dbname                          | DB path                                |
167 | ep_pending_ops                     | Number of ops awaiting pending         |
168 |                                    | vbuckets                               |
169 | ep_pending_ops_total               | Total blocked pending ops since reset  |
170 | ep_pending_ops_max                 | Max ops seen awaiting 1 pending        |
171 |                                    | vbucket                                |
172 | ep_pending_ops_max_duration        | Max time (µs) used waiting on pending  |
173 |                                    | vbuckets                               |
174 | ep_bg_num_samples                  | The number of samples included in the  |
175 |                                    | average                                |
176 | ep_bg_min_wait                     | The shortest time (µs) in the wait     |
177 |                                    | queue                                  |
178 | ep_bg_max_wait                     | The longest time (µs) in the wait      |
179 |                                    | queue                                  |
180 | ep_bg_wait_avg                     | The average wait time (µs) for an item |
181 |                                    | before it's serviced by the dispatcher |
182 | ep_bg_min_load                     | The shortest load time (µs)            |
183 | ep_bg_max_load                     | The longest load time (µs)             |
184 | ep_bg_load_avg                     | The average time (µs) for an item to   |
185 |                                    | be loaded from the persistence layer   |
186 | ep_num_non_resident                | The number of non-resident items       |
187 | ep_bg_wait                         | The total elapsed time for the wait    |
188 |                                    | queue                                  |
189 | ep_bg_load                         | The total elapsed time for items to be |
190 |                                    | loaded from the persistence layer      |
191 | ep_allow_data_loss_during_shutdown | Whether data loss is allowed during    |
192 |                                    | server shutdown                        |
193 | ep_alog_block_size                 | Access log block size                  |
194 | ep_alog_path                       | Path to the access log                 |
195 | ep_access_scanner_enabled          | Status of access scanner task          |
196 | ep_alog_sleep_time                 | Interval between access scanner runs   |
197 |                                    | in minutes                             |
198 | ep_alog_task_time                  | Hour in GMT time when access scanner   |
199 |                                    | task is scheduled to run               |
200 | ep_backend                         | The backend that is being used for     |
201 |                                    | data persistence                       |
202 | ep_backfill_mem_threshold          | The maximum percentage of memory that  |
203 |                                    | the backfill task can consume before   |
204 |                                    | it is made to back off.                |
205 | ep_bg_fetch_delay                  | The amount of time to wait before      |
206 |                                    | doing a background fetch               |
207 | ep_bfilter_enabled                 | Bloom filter use: enabled or disabled  |
208 | ep_bfilter_key_count               | Minimum key count that bloom filter    |
209 |                                    | will accommodate                       |
210 | ep_bfilter_fp_prob                 | Bloom filter's allowed false positive  |
211 |                                    | probability                            |
212 | ep_bfilter_residency_threshold     | Resident ratio threshold for full      |
213 |                                    | eviction policy, after which the bloom |
214 |                                    | filter switches from accounting just   |
215 |                                    | non-resident items and deletes to      |
216 |                                    | accounting all items                   |
217 | ep_chk_max_items                   | The number of items allowed in a       |
218 |                                    | checkpoint before a new one is created |
219 | ep_chk_period                      | The maximum lifetime of a checkpoint   |
220 |                                    | before a new one is created            |
221 | ep_chk_persistence_remains         | Number of remaining vbuckets for       |
222 |                                    | checkpoint persistence                 |
223 | ep_chk_persistence_timeout         | Timeout for vbucket checkpoint         |
224 |                                    | persistence                            |
225 | ep_chk_remover_stime               | The time interval for purging closed   |
226 |                                    | checkpoints from memory                |
227 | ep_config_file                     | The location of the ep-engine config   |
228 |                                    | file                                   |
229 | ep_couch_bucket                    | The name of this bucket                |
230 | ep_couch_host                      | The hostname that the couchdb views    |
231 |                                    | server is listening on                 |
232 | ep_couch_port                      | The port the couchdb views server is   |
233 |                                    | listening on                           |
234 | ep_couch_reconnect_sleeptime       | The amount of time to wait before      |
235 |                                    | reconnecting to couchdb                |
236 | ep_data_traffic_enabled            | Whether or not data traffic is enabled |
237 |                                    | for this bucket                        |
238 | ep_db_data_size                    | Total size of valid data in db files   |
239 | ep_db_file_size                    | Total size of the db files             |
240 | ep_degraded_mode                   | True if the engine is either warming   |
241 |                                    | up or data traffic is disabled         |
242 | ep_enable_chk_merge                | True if merging closed checkpoints is  |
243 |                                    | enabled.                               |
244 | ep_exp_pager_enabled               | True if the expiry pager is enabled    |
245 | ep_exp_pager_stime                 | The time interval for purging expired  |
246 |                                    | items from memory                      |
247 | ep_exp_pager_initial_run_time      | An initial start time for the expiry   |
248 |                                    | pager task in GMT                      |
249 | ep_flushall_enabled                | True if this bucket allows the use of  |
250 |                                    | the flush_all command                  |
251 | ep_getl_default_timeout            | The default getl lock duration         |
252 | ep_getl_max_timeout                | The maximum getl lock duration         |
253 | ep_ht_locks                        | The number of locks per vb hashtable   |
254 | ep_ht_size                         | The initial size of each vb hashtable  |
255 | ep_item_num_based_new_chk          | True if the number of items in the     |
256 |                                    | current checkpoint plays a role in a   |
257 |                                    | new checkpoint creation                |
258 | ep_keep_closed_chks                | True if we want to keep the closed     |
259 |                                    | checkpoints for each vbucket unless    |
260 |                                    | the memory usage is above high water   |
261 |                                    | mark                                   |
262 | ep_max_checkpoints                 | The maximum number of checkpoints that |
263 |                                    | can be in memory per vbucket           |
264 | ep_max_item_size                   | The maximum value size                 |
265 | ep_max_size                        | The maximum amount of memory this      |
266 |                                    | bucket can use                         |
267 | ep_max_vbuckets                    | The maximum number of vbuckets that    |
268 |                                    | can exist in this bucket               |
269 | ep_mutation_mem_threshold          | The ratio of total memory available    |
270 |                                    | at which we start sending temp oom or  |
271 |                                    | oom messages                           |
272 | ep_pager_active_vb_pcnt            | Active vbuckets paging percentage      |
273 | ep_tap_ack_grace_period            | The amount of time to wait for a tap   |
274 |                                    | ack before disconnecting               |
275 | ep_tap_ack_initial_sequence_number | The initial sequence number for a tap  |
276 |                                    | ack when a tap stream is created       |
277 | ep_tap_ack_interval                | The number of messages a tap producer  |
278 |                                    | should send before requesting an ack   |
279 | ep_tap_ack_window_size             | The maximum number of ack requests     |
280 |                                    | that can be sent before the consumer   |
281 |                                    | sends a response ack. When the window  |
282 |                                    | is full the tap stream is paused.      |
283 | ep_tap_backfill_resident           | The resident ratio for deciding how to |
284 |                                    | do backfill. If under the ratio we     |
285 |                                    | schedule full disk backfill. If above  |
286 |                                    | the ratio then we do bg fetches for    |
287 |                                    | non-resident items.                    |
288 | ep_tap_backlog_limit               | The maximum number of backfill items   |
289 |                                    | that can be in memory waiting to be    |
290 |                                    | sent to the tap consumer               |
291 | ep_tap_backoff_period              | The number of seconds the tap          |
292 |                                    | connection                             |
295 | ep_tap_bg_max_pending              | The maximum number of bg jobs a tap    |
296 |                                    | connection may have                    |
297 | ep_tap_noop_interval               | Number of seconds between noops sent   |
298 |                                    | on an idle connection                  |
299 | ep_tap_requeue_sleep_time          | The amount of time to wait before a    |
300 |                                    | failed tap item is requeued            |
301 | ep_replication_throttle_cap_pcnt   | Percentage of total items in write     |
302 |                                    | queue at which we throttle tap input   |
303 | ep_replication_throttle_queue_cap  | Max size of a write queue to throttle  |
304 |                                    | incoming tap input                     |
305 | ep_replication_throttle_threshold  | Percentage of max mem at which we      |
306 |                                    | begin NAKing tap input                 |
307 | ep_uncommitted_items               | The number of items that have not been |
308 |                                    | written to disk                        |
309 | ep_warmup                          | Shows if warmup is enabled / disabled  |
310 | ep_warmup_batch_size               | The size of each batch loaded during   |
311 |                                    | warmup                                 |
312 | ep_warmup_dups                     | Number of duplicate items encountered  |
313 |                                    | during warmup                          |
314 | ep_warmup_min_items_threshold      | Percentage of total items warmed up    |
315 |                                    | before we enable traffic               |
316 | ep_warmup_min_memory_threshold     | Percentage of max mem warmed up before |
317 |                                    | we enable traffic                      |
318 | ep_warmup_oom                      | The number of oom errors that occurred |
319 |                                    | during warmup                          |
320 | ep_warmup_thread                   | The status of the warmup thread        |
321 | ep_warmup_time                     | The amount of time warmup took         |
322 | ep_workload_pattern                | Workload pattern (mixed, read_heavy,   |
323 |                                    | write_heavy) monitored at runtime      |
324 | ep_defragmenter_interval           | How often defragmenter task should be  |
325 |                                    | run (in seconds).                      |
326 | ep_defragmenter_num_moved          | Number of items moved by the           |
327 |                                    | defragmenter task.                     |
328 | ep_defragmenter_num_visited        | Number of items visited (considered    |
329 |                                    | for defragmentation) by the            |
330 |                                    | defragmenter task.                     |
331 | ep_cursor_dropping_lower_threshold | Memory threshold below which checkpoint|
332 |                                    | remover will discontinue cursor        |
333 |                                    | dropping.                              |
334 | ep_cursor_dropping_upper_threshold | Memory threshold above which checkpoint|
335 |                                    | remover will start cursor dropping     |
336 | ep_cursors_dropped                 | Number of cursors dropped by the       |
337 |                                    | checkpoint remover                     |
338 | ep_active_hlc_drift                | The total absolute drift for all active|
339 |                                    | vbuckets. This is microsecond          |
340 |                                    | granularity.                           |
341 | ep_active_hlc_drift_count          | The number of updates applied to       |
342 |                                    | ep_active_hlc_drift.                   |
343 | ep_replica_hlc_drift               | The total absolute drift for all       |
344 |                                    | replica vbuckets. This is microsecond  |
345 |                                    | granularity.                           |
346 | ep_replica_hlc_drift_count         | The number of updates applied to       |
347 |                                    | ep_replica_hlc_drift.                  |
348 | ep_active_ahead_exceptions         | The total number of ahead exceptions   |
349 |                                    | for all active vbuckets.               |
350 | ep_active_behind_exceptions        | The total number of behind exceptions  |
351 |                                    | for all active vbuckets.               |
352 | ep_replica_ahead_exceptions        | The total number of ahead exceptions   |
353 |                                    | for all replica vbuckets.              |
354 | ep_replica_behind_exceptions       | The total number of behind exceptions  |
355 |                                    | for all replica vbuckets.              |
356 | ep_clock_cas_drift_threshold_exceeded | ep_active_ahead_exceptions +        |
357 |                                    | ep_replica_ahead_exceptions            |
358
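Several toplevel stats are totals paired with a sample or update count, so averages can be derived from them. The following is a minimal Python sketch, not part of ep-engine. It assumes the stats have already been collected into a dict of numbers, that =ep_bg_num_samples= is the sample count for both =ep_bg_wait= and =ep_bg_load=, and that dividing a total by its count is how the corresponding average is meant to be read. The function names and result keys are hypothetical.

```python
def safe_div(total, count):
    """Return total / count, or None when there are no samples."""
    return total / count if count else None


def derived_toplevel(stats):
    """Compute a few derived values from a dict of toplevel stats.

    `stats` maps stat names to numbers; collecting them is left to the caller.
    """
    samples = stats["ep_bg_num_samples"]
    return {
        # Assumption: average = total elapsed time / number of samples.
        "bg_wait_avg": safe_div(stats["ep_bg_wait"], samples),
        "bg_load_avg": safe_div(stats["ep_bg_load"], samples),
        # Assumption: mean drift per update = total absolute drift / updates.
        "active_hlc_drift_avg": safe_div(
            stats["ep_active_hlc_drift"], stats["ep_active_hlc_drift_count"]),
        "replica_hlc_drift_avg": safe_div(
            stats["ep_replica_hlc_drift"], stats["ep_replica_hlc_drift_count"]),
        # Stated in the table: active plus replica ahead exceptions.
        "cas_drift_threshold_exceeded": (
            stats["ep_active_ahead_exceptions"]
            + stats["ep_replica_ahead_exceptions"]),
    }
```

A count of zero yields =None= rather than a division error, which keeps a freshly started or recently reset bucket from breaking the calculation.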
359 ** vBucket total stats
360
361 | Stat                     | Description                                    |
362 |--------------------------+------------------------------------------------|
363 | ep_vb_total              | Total vBuckets (count)                         |
364 | curr_items_tot           | Total number of items                          |
365 | curr_items               | Number of active items in memory               |
366 | curr_temp_items          | Number of temporary items in memory            |
367 | vb_dead_num              | Number of dead vBuckets                        |
368 | ep_diskqueue_items       | Total items in disk queue                      |
369 | ep_diskqueue_memory      | Total memory used in disk queue                |
370 | ep_diskqueue_fill        | Total enqueued items on disk queue             |
371 | ep_diskqueue_drain       | Total drained items on disk queue              |
372 | ep_diskqueue_pending     | Total bytes of pending writes                  |
373 | ep_persist_vbstate_total | Total VB persist state to disk                 |
374 | ep_meta_data_memory      | Total memory used by meta data                 |
375 | ep_meta_data_disk        | Total disk used by meta data                   |
376
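Because =ep_diskqueue_fill= and =ep_diskqueue_drain= are described as totals, comparing two snapshots gives an approximate enqueue and drain rate. The sketch below is illustrative only. It assumes both counters are cumulative and were not reset between the two snapshots, and the function name, parameters and result keys are hypothetical.

```python
def diskqueue_rates(prev, curr, interval_s):
    """Items per second enqueued and drained between two stat snapshots.

    `prev` and `curr` are dicts of stats taken `interval_s` seconds apart.
    Assumption: the fill and drain counters only grow between snapshots.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    fill = (curr["ep_diskqueue_fill"] - prev["ep_diskqueue_fill"]) / interval_s
    drain = (curr["ep_diskqueue_drain"] - prev["ep_diskqueue_drain"]) / interval_s
    return {
        "fill_per_s": fill,
        "drain_per_s": drain,
        # Positive when items are enqueued faster than they are drained.
        "net_growth_per_s": fill - drain,
    }
```

A sustained positive =net_growth_per_s= would suggest the disk queue is falling behind the incoming write load.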
377 *** Active vBucket class stats
378
379 | Stat                          | Description                                |
380 |-------------------------------+--------------------------------------------|
381 | vb_active_num                 | Number of active vBuckets                  |
382 | vb_active_curr_items          | Number of in memory items                  |
383 | vb_active_num_non_resident    | Number of non-resident items               |
384 | vb_active_perc_mem_resident   | % memory resident                          |
385 | vb_active_eject               | Number of times item values got ejected    |
386 | vb_active_expired             | Number of times an item was expired        |
387 | vb_active_ht_memory           | Memory overhead of the hashtable           |
388 | vb_active_itm_memory          | Total item memory                          |
389 | vb_active_meta_data_memory    | Total metadata memory                      |
390 | vb_active_meta_data_disk      | Total metadata disk                        |
391 | vb_active_ops_create          | Number of create operations                |
392 | vb_active_ops_update          | Number of update operations                |
393 | vb_active_ops_delete          | Number of delete operations                |
394 | vb_active_ops_reject          | Number of rejected operations              |
395 | vb_active_queue_size          | Active items in disk queue                 |
396 | vb_active_queue_memory        | Memory used for disk queue                 |
397 | vb_active_queue_age           | Sum of disk queue item age in milliseconds |
398 | vb_active_queue_pending       | Total bytes of pending writes              |
399 | vb_active_queue_fill          | Total enqueued items                       |
400 | vb_active_queue_drain         | Total drained items                        |
401 | vb_active_rollback_item_count | Num of items rolled back                   |
402
403 *** Replica vBucket stats
404
405 | Stat                          | Description                                |
406 |-------------------------------+--------------------------------------------|
407 | vb_replica_num                | Number of replica vBuckets                 |
408 | vb_replica_curr_items         | Number of in memory items                  |
409 | vb_replica_num_non_resident   | Number of non-resident items               |
410 | vb_replica_perc_mem_resident  | % memory resident                          |
411 | vb_replica_eject              | Number of times item values got ejected    |
412 | vb_replica_expired            | Number of times an item was expired        |
413 | vb_replica_ht_memory          | Memory overhead of the hashtable           |
414 | vb_replica_itm_memory         | Total item memory                          |
415 | vb_replica_meta_data_memory   | Total metadata memory                      |
416 | vb_replica_meta_data_disk     | Total metadata disk                        |
417 | vb_replica_ops_create         | Number of create operations                |
418 | vb_replica_ops_update         | Number of update operations                |
419 | vb_replica_ops_delete         | Number of delete operations                |
420 | vb_replica_ops_reject         | Number of rejected operations              |
421 | vb_replica_queue_size         | Replica items in disk queue                |
422 | vb_replica_queue_memory       | Memory used for disk queue                 |
423 | vb_replica_queue_age          | Sum of disk queue item age in milliseconds |
424 | vb_replica_queue_pending      | Total bytes of pending writes              |
425 | vb_replica_queue_fill         | Total enqueued items                       |
426 | vb_replica_queue_drain        | Total drained items                        |
427 | vb_replica_rollback_item_count| Num of items rolled back                   |
428
429 *** Pending vBucket stats
430
431 | Stat                          | Description                                |
432 |-------------------------------+--------------------------------------------|
433 | vb_pending_num                | Number of pending vBuckets                 |
434 | vb_pending_curr_items         | Number of in memory items                  |
435 | vb_pending_num_non_resident   | Number of non-resident items               |
436 | vb_pending_perc_mem_resident  | % memory resident                          |
437 | vb_pending_eject              | Number of times item values got ejected    |
438 | vb_pending_expired            | Number of times an item was expired        |
439 | vb_pending_ht_memory          | Memory overhead of the hashtable           |
440 | vb_pending_itm_memory         | Total item memory                          |
441 | vb_pending_meta_data_memory   | Total metadata memory                      |
442 | vb_pending_meta_data_disk     | Total metadata disk                        |
443 | vb_pending_ops_create         | Number of create operations                |
444 | vb_pending_ops_update         | Number of update operations                |
445 | vb_pending_ops_delete         | Number of delete operations                |
446 | vb_pending_ops_reject         | Number of rejected operations              |
447 | vb_pending_queue_size         | Pending items in disk queue                |
448 | vb_pending_queue_memory       | Memory used for disk queue                 |
449 | vb_pending_queue_age          | Sum of disk queue item age in milliseconds |
450 | vb_pending_queue_pending      | Total bytes of pending writes              |
451 | vb_pending_queue_fill         | Total enqueued items                       |
452 | vb_pending_queue_drain        | Total drained items                        |
453 | vb_pending_rollback_item_count| Num of items rolled back                   |
454
455
456 ** vBucket detail stats
457
458 The stats below are listed for each vbucket.
459
460 | Stat                          | Description                                |
461 |-------------------------------+--------------------------------------------|
462 | num_items                     | Number of items in this vbucket            |
463 | num_tmp_items                 | Number of temporary items in memory        |
464 | num_non_resident              | Number of non-resident items               |
465 | vb_pending_perc_mem_resident  | % memory resident                          |
466 | vb_pending_eject              | Number of times item values got ejected    |
467 | vb_pending_expired            | Number of times an item was expired        |
468 | ht_memory                     | Memory overhead of the hashtable           |
469 | ht_item_memory                | Total item memory                          |
470 | ht_cache_size                 | Total size of cache (Includes non resident |
471 |                               | items)                                     |
472 | num_ejects                    | Number of times an item was ejected from   |
473 |                               | memory                                     |
474 | ops_create                    | Number of create operations                |
475 | ops_update                    | Number of update operations                |
476 | ops_delete                    | Number of delete operations                |
477 | ops_reject                    | Number of rejected operations              |
478 | queue_size                    | Pending items in disk queue                |
479 | queue_memory                  | Memory used for disk queue                 |
480 | queue_age                     | Sum of disk queue item age in milliseconds |
481 | queue_fill                    | Total enqueued items                       |
482 | queue_drain                   | Total drained items                        |
483 | pending_writes                | Total bytes of pending writes              |
484 | db_data_size                  | Total size of valid data on disk           |
485 | db_file_size                  | Total size of the db file                  |
486 | high_seqno                    | The last seqno assigned by this vbucket    |
487 | purge_seqno                   | The last seqno purged by the compactor     |
488 | bloom_filter                  | Status of the vbucket's bloom filter       |
489 | bloom_filter_size             | Size of the bloom filter bit array         |
490 | bloom_filter_key_count        | Number of keys inserted into the bloom     |
491 |                               | filter, considers overlapped items as one, |
492 |                               | so this may not be accurate at times.      |
493 | uuid                          | The current vbucket uuid                   |
494 | rollback_item_count           | Num of items rolled back                   |
495 | max_cas                       | Maximum CAS of all items in the vbucket.   |
496 |                               | This is a hybrid logical clock value in    |
497 |                               | nanoseconds.                               |
498 | max_cas_str                   | max_cas as a time stamp string (seconds    |
499 |                               | since epoch).                              |
500 | total_abs_drift               | The accumulated absolute drift for this    |
501 |                               | vbucket's hybrid logical clock in          |
502 |                               | microseconds.                              |
503 | total_abs_drift_count         | The number of updates applied to           |
504 |                               | total_abs_drift.                           |
505 | drift_ahead_threshold_exceeded| The number of HLC updates that had a value |
506 |                               | ahead of the local HLC and were over the   |
507 |                               | drift_ahead_threshold.                     |
508 | drift_ahead_threshold         | The ahead threshold in ns.                 |
509 |drift_behind_threshold_exceeded| The number of HLC updates that had a value |
510 |                               | behind the local HLC and were over the     |
511 |                               | drift_behind_threshold.                    |
512 | drift_behind_threshold        | The behind threshold in ns.                |
513 | logical_clock_ticks           | How many times this vbucket's HLC has      |
514 |                               | returned logical clock ticks.              |
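
As a rough illustration of how the clock-related stats above might be
combined, the following Python sketch converts a nanosecond =max_cas=
into whole seconds since the epoch and derives a mean drift per update
from =total_abs_drift= and =total_abs_drift_count=.  The helper names
and the plain dict of stat name to value are assumptions made for this
example; they are not part of ep-engine.

```python
NS_PER_SEC = 1_000_000_000


def cas_to_epoch_seconds(max_cas):
    """Convert a nanosecond max_cas value to whole seconds since the epoch.

    Assumes max_cas can be read as nanoseconds since the epoch, as the
    max_cas_str description suggests.
    """
    return int(max_cas) // NS_PER_SEC


def mean_abs_drift(stats):
    """Return total_abs_drift / total_abs_drift_count.

    The result is in whatever unit total_abs_drift is reported in.
    `stats` is a hypothetical dict of stat name -> value for one vbucket.
    """
    count = int(stats["total_abs_drift_count"])
    if count == 0:
        return 0.0  # no updates recorded yet
    return int(stats["total_abs_drift"]) / count
```

Dividing by the update count gives a per-update average, which is
usually easier to compare across vbuckets than the raw accumulated
total.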
515
516 ** vBucket seqno stats
517
518 | Stat                          | Description                                |
519 |-------------------------------+--------------------------------------------|
520 | abs_high_seqno                | The last seqno assigned by this vbucket    |
521 | high_seqno                    | The last seqno assigned by this vbucket;   |
522 |                               | for a replica, the end seqno of the last   |
523 |                               | closed checkpoint.                         |
524 | last_persisted_seqno          | The last persisted seqno for the vbucket   |
525 | purge_seqno                   | The last seqno purged by the compactor     |
526 | uuid                          | The current vbucket uuid                   |
527 | last_persisted_snap_start     | The last persisted snapshot start seqno for|
528 |                               | the vbucket                                |
529 | last_persisted_snap_end       | The last persisted snapshot end seqno for  |
530 |                               | the vbucket                                |
531
532 ** vBucket failover stats
533
534 | Stat                          | Description                                |
535 |-------------------------------+--------------------------------------------|
536 | num_entries                   | Number of entries in the failover table of |
537 |                               | this vbucket                               |
538 | erroneous_entries_erased      | Number of erroneous entries erased in the  |
539 |                               | failover table of this vbucket             |
540 | n:id                          | vb_uuid of nth failover entry in the       |
541 |                               | failover table of this vbucket             |
542 | n:seq                         | seqno of nth failover entry in the         |
543 |                               | failover table of this vbucket             |
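
The =n:id= and =n:seq= keys above pair up into one failover entry per
index.  A minimal Python sketch of regrouping them is shown here.  It
assumes the stats have already been collected into a dict of strings,
that =n= is a decimal index, and that both values are decimal integers;
the function name is hypothetical.

```python
def failover_entries(stats):
    """Group n:id / n:seq keys into a list of entries ordered by n.

    `stats` is a hypothetical dict of stat name -> string value.
    Keys that do not look like "<digits>:id" or "<digits>:seq"
    (for example num_entries) are ignored.
    """
    entries = {}
    for key, value in stats.items():
        index, sep, field = key.partition(":")
        if sep and index.isdigit() and field in ("id", "seq"):
            entries.setdefault(int(index), {})[field] = int(value)
    return [entries[i] for i in sorted(entries)]
```

Sorting by the numeric index rather than by the key string keeps entry
10 after entry 9.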
544
545 ** Tap stats
546
547 | ep_tap_ack_grace_period          | The amount of time to wait for a tap ack  |
548 |                                  | before disconnecting                      |
549 | ep_tap_ack_interval              | The number of messages a tap producer     |
550 |                                  | should send before requesting an ack      |
551 | ep_tap_ack_window_size           | The maximum number of ack requests that   |
552 |                                  | can be sent before the consumer sends a   |
553 |                                  | response ack. When the window is full the |
554 |                                  | tap stream is paused                      |
555 | ep_tap_queue_backfillremaining   | Number of items needing to be backfilled  |
556 | ep_tap_total_backlog_size        | Number of remaining items for replication |
557 | ep_tap_total_queue               | Sum of tap queue sizes on the current     |
558 |                                  | tap queues                                |
559 | ep_tap_total_fetched             | Sum of all tap messages sent              |
560 | ep_tap_bg_max_pending            | The maximum number of bg jobs a tap       |
561 |                                  | connection may have                       |
562 | ep_tap_bg_fetched                | Number of tap disk fetches                |
563 | ep_tap_bg_fetch_requeued         | Number of times a tap bg fetch task is    |
564 |                                  | requeued                                  |
565 | ep_tap_fg_fetched                | Number of tap memory fetches              |
566 | ep_tap_deletes                   | Number of tap deletion messages sent      |
567 | ep_replication_throttled         | Number of tap messages refused due to     |
568 |                                  | throttling                                |
569 | ep_tap_count                     | Number of tap connections                 |
570 | ep_tap_bg_num_samples            | The number of tap bg fetch samples        |
571 |                                  | included in the avg                       |
572 | ep_tap_bg_min_wait               | The shortest time (µs) for a tap item     |
573 |                                  | before it is serviced by the dispatcher   |
574 | ep_tap_bg_max_wait               | The longest time (µs) for a tap item      |
575 |                                  | before it is serviced by the dispatcher   |
576 | ep_tap_bg_wait_avg               | The average wait time (µs) for a tap item |
577 |                                  | before it is serviced by the dispatcher   |
578 | ep_tap_bg_min_load               | The shortest time (µs) for a tap item to  |
579 |                                  | be loaded from the persistence layer      |
580 | ep_tap_bg_max_load               | The longest time (µs) for a tap item to   |
581 |                                  | be loaded from the persistence layer      |
582 | ep_tap_bg_load_avg               | The average time (µs) for a tap item to   |
583 |                                  | be loaded from the persistence layer      |
584 | ep_tap_noop_interval             | Seconds between noops added to an idle    |
585 |                                  | connection                                |
586 | ep_tap_backoff_period            | The number of seconds the tap connection  |
587 |                                  | should back off after receiving ETMPFAIL  |
588 | ep_tap_queue_fill                | Total enqueued items                      |
589 | ep_tap_queue_drain               | Total drained items                       |
590 | ep_tap_queue_backoff             | Total back-off items                      |
591 | ep_tap_queue_backfill            | Number of backfill remaining              |
592 | ep_tap_queue_itemondisk          | Number of items remaining on disk         |
593 | ep_replication_throttle_threshold| Percentage of memory in use before we     |
594 |                                  | throttle tap streams                      |
595 | ep_replication_throttle_queue_cap| Disk write queue cap to throttle          |
596 |                                  | tap streams                               |
597
598
599 *** Per Tap Client Stats
600
601 Each stat begins with =ep_tapq:= followed by a unique /client_id/ and
602 another colon.  For example, if your client is named =slave1=, the
603 =qlen= stat would be =ep_tapq:slave1:qlen=.
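
The key layout described above can be split back into its parts with a
few lines of Python.  This is only a sketch: the function name is
hypothetical, and it assumes the stat name is whatever follows the last
colon, so that a /client_id/ containing colons still parses.

```python
def split_conn_stat(key, prefix="ep_tapq"):
    """Split "<prefix>:<client_id>:<stat>" into (client_id, stat).

    Returns None when the key does not follow that layout.
    The client_id may itself contain colons; the stat name is taken
    to be the text after the last colon (an assumption of this sketch).
    """
    head, sep, rest = key.partition(":")
    if head != prefix or not sep:
        return None
    client_id, sep, stat = rest.rpartition(":")
    if not sep or not client_id or not stat:
        return None
    return client_id, stat
```

For example, =split_conn_stat("ep_tapq:slave1:qlen")= yields the pair
=slave1= and =qlen=, while a toplevel key such as =ep_tap_count= yields
=None=.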
604
605 | type                        | The kind of tap connection (producer or  | PC |
606 |                             | consumer)                                |    |
607 | created                     | Creation time for the tap connection     | PC |
608 | supports_ack                | true if the connection uses acks         | PC |
609 | connected                   | true if this client is connected         | PC |
610 | disconnects                 | Number of disconnects from this client   | PC |
611 | reserved                    | true if the tap stream is reserved       | P  |
612 | suspended                   | true if the tap stream is suspended      | P  |
613 | qlen                        | Queue size for the given client_id       | P  |
614 | qlen_high_pri               | High priority tap queue items            | P  |
615 | qlen_low_pri                | Low priority tap queue items             | P  |
616 | vb_filters                  | Size of connection vbucket filter set    | P  |
617 | vb_filter                   | The content of the vbucket filter        | P  |
618 | rec_fetched                 | Tap messages sent to the client          | P  |
619 | rec_skipped                 | Number of messages skipped due to        | P  |
620 |                             | tap reconnect with a different filter    | P  |
621 | idle                        | True if this connection is idle          | P  |
622 | has_queued_item             | True if there are any remaining items    | P  |
623 |                             | from hash table or disk                  |    |
624 | bg_result_size              | Number of ready background results       | P  |
625 | bg_jobs_issued              | Number of background jobs started        | P  |
626 | bg_jobs_completed           | Number of background jobs completed      | P  |
627 | flags                       | Connection flags set by the client       | P  |
628 | pending_disconnect          | true if we're hanging up on this client  | P  |
629 | paused                      | true if this client is blocked           | P  |
630 | pending_backfill            | true if we're still backfilling keys     | P  |
631 |                             | for this connection                      | P  |
632 | pending_disk_backfill       | true if we're still backfilling keys     | P  |
633 |                             | from disk for this connection            | P  |
634 | backfill_completed          | true if all items from backfill are      | P  |
635 |                             | successfully transmitted to the client   | P  |
636 | backfill_start_timestamp    | Timestamp of backfill start              | P  |
637 | reconnects                  | Number of reconnects from this client    | P  |
638 | backfill_age                | The age of the start of the backfill     | P  |
639 | ack_seqno                   | The current tap ACK sequence number      | P  |
640 | recv_ack_seqno              | Last receive tap ACK sequence number     | P  |
641 | ack_log_size                | Tap ACK backlog size                     | P  |
642 | ack_window_full             | true if our tap ACK window is full       | P  |
643 | seqno_ack_requested         | The seqno of the ack message that the    | P  |
644 |                             | producer wants to get a response for     |    |
645 | expires                     | When this ACK backlog expires            | P  |
646 | queue_memory                | Memory used for tap queue                | P  |
647 | queue_fill                  | Total queued items                       | P  |
648 | queue_drain                 | Total drained items                      | P  |
649 | queue_backoff               | Total back-off items                     | P  |
650 | queue_backfillremaining     | Number of backfill remaining             | P  |
651 | queue_itemondisk            | Number of items remaining on disk        | P  |
652 | total_backlog_size          | Num of remaining items for replication   | P  |
653 | total_noops                 | Number of NOOP messages sent             | P  |
654 | num_checkpoint_end          | Number of chkpoint end operations        |  C |
655 | num_checkpoint_end_failed   | Number of chkpoint end operations failed |  C |
656 | num_checkpoint_start        | Number of chkpoint start operations      |  C |
657 | num_checkpoint_start_failed | Num of chkpoint start operations failed  |  C |
658 | num_delete                  | Number of delete operations              |  C |
659 | num_delete_failed           | Number of failed delete operations       |  C |
660 | num_flush                   | Number of flush operations               |  C |
661 | num_flush_failed            | Number of failed flush operations        |  C |
662 | num_mutation                | Number of mutation operations            |  C |
663 | num_mutation_failed         | Number of failed mutation operations     |  C |
664 | num_opaque                  | Number of opaque operations              |  C |
665 | num_opaque_failed           | Number of failed opaque operations       |  C |
666 | num_vbucket_set             | Number of vbucket set operations         |  C |
667 | num_vbucket_set_failed      | Number of failed vbucket set operations  |  C |
668 | num_unknown                 | Number of unknown operations             |  C |
669
670 ** Tap Aggregated Stats
671
672 Aggregated tap stats allow named tap connections to be logically
673 grouped and aggregated together by prefixes.
674
675 For example, if all of your tap connections started with =rebalance_=
676 or =replication_=, you could call =stats tapagg _= to request stats
677 grouped by everything before the first =_= character, giving you a set
678 for =rebalance= and a set for =replication=.
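
To make the grouping rule concrete, the Python sketch below groups
connection names by everything before the first separator and counts
each group.  It only mimics the rule locally and is not the server's
implementation; the function name and the sample connection names are
made up for illustration.

```python
from collections import Counter


def count_by_prefix(names, sep="_"):
    """Count connection names by the text before the first `sep`.

    A name that does not contain `sep` is counted under its full name
    (a choice made for this sketch).
    """
    return Counter(name.split(sep, 1)[0] for name in names)
```

With the names =rebalance_1=, =rebalance_2= and =replication_a= this
produces one group for =rebalance= with two members and one group for
=replication= with a single member.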
679
680 *** Results
681
682 | [prefix]:count              | Number of connections matching this prefix |
683 | [prefix]:qlen               | Total length of queues with this prefix    |
684 | [prefix]:backfill_remaining | Number of items needing to be backfilled   |
685 | [prefix]:backoff            | Total number of backoff events             |
686 | [prefix]:drain              | Total number of items drained              |
687 | [prefix]:fill               | Total number of items filled               |
688 | [prefix]:itemondisk         | Number of items remaining on disk          |
689 | [prefix]:total_backlog_size | Num of remaining items for replication     |
690
691 ** Dcp Stats
692
693 Each stat begins with =ep_dcpq:= followed by a unique /client_id/ and
694 another colon.  For example, if your client is named, =slave1=, the
695 =created= stat would be =ep_dcpq:slave1:created=.
696
697 *** Consumer Connections
698
699 | connected          | True if this client is connected                            |
700 | created            | Creation time for the dcp connection                        |
701 | pending_disconnect | True if we're hanging up on this client                     |
702 | reserved           | True if the dcp stream is reserved                          |
703 | supports_ack       | True if the connection uses flow control                    |
704 | total_acked_bytes  | The amount of bytes that the consumer has acked             |
705 | unacked_bytes      | The amount of bytes the consumer has processed but not acked|
706 | type               | The connection type (producer, consumer, or notifier)       |
707 | max_buffer_bytes   | Size of flow control buffer                                 |
708
709 **** Per Stream Stats
710
711 | buffer_bytes       | The amount of unprocessed bytes                       |
712 | buffer_items       | The amount of unprocessed items                       |
713 | end_seqno          | The seqno where this stream should end                |
714 | flags              | The flags used to create this stream                  |
715 | items_ready        | Whether the stream has messages ready to send         |
716 | ready_queue_memory | Memory occupied by elements in the DCP readyQ         |
717 | opaque             | The unique stream identifier                          |
718 | snap_end_seqno     | The end seqno of the last snapshot received           |
719 | snap_start_seqno   | The start seqno of the last snapshot received         |
720 | start_seqno        | The start seqno used to create this stream            |
721 | state              | The stream state (pending, reading, or dead)          |
722 | vb_uuid            | The vb uuid used to create this stream                |
723
724 *** Producer/Notifier Connections
725
726 | buf_backfill_bytes    | The amount of bytes backfilled but not sent            |
727 | buf_backfill_items    | The amount of items backfilled but not sent            |
728 | bytes_sent            | The amount of unacked bytes sent to the consumer       |
729 | connected             | True if this client is connected                       |
730 | created               | Creation time for the dcp connection                   |
731 | flow_control          | True if the connection uses flow control               |
732 | items_remaining       | The amount of items remaining to be sent               |
733 | items_sent            | The amount of items already sent to the consumer       |
734 | last_sent_time        | The last time this connection sent a message           |
735 | max_buffer_bytes      | The maximum amount of bytes that can be sent without   |
736 |                       | receiving an ack from the consumer                     |
737 | noop_enabled          | Whether or not this connection sends noops             |
738 | noop_wait             | Whether or not this connection is waiting for a        |
739 |                       | noop response from the consumer                        |
740 | pending_disconnect    | True if we're hanging up on this client                |
741 | priority              | The connection priority for streaming data             |
742 | num_streams           | Total number of streams in the connection in any state |
743 | reserved              | True if the dcp stream is reserved                     |
744 | supports_ack          | True if the connection uses flow control               |
745 | total_acked_bytes     | The amount of bytes that have been acked by the        |
746 |                       | consumer when flow control is enabled                  |
747 | total_bytes_sent      | The amount of bytes already sent to the consumer       |
748 | type                  | The connection type (producer, consumer, or notifier)  |
749 | unacked_bytes         | The amount of bytes the consumer has not acked         |
750 | backfill_num_active   | Number of active (running) backfills                   |
751 | backfill_num_snoozing | Number of snoozing (running) backfills                 |
752 | backfill_num_pending  | Number of pending (not running) backfills              |
753
754 **** Per Stream Stats
755
756 | backfill_disk_items      | The amount of items read during backfill from disk    |
757 | backfill_mem_items       | The amount of items read during backfill from memory  |
758 | backfill_sent            | Items sent to the consumer during the backfill phase  |
759 | end_seqno                | The seqno to send mutations up to                     |
760 | flags                    | The flags supplied in the stream request              |
761 | items_ready              | Whether the stream has items ready to send            |
762 | last_sent_seqno          | The last seqno sent by this stream                    |
763 | last_sent_snap_end_seqno | The last snapshot end seqno sent by active stream     |
764 | last_read_seqno          | The last seqno read by this stream from disk or memory|
765 | ready_queue_memory       | Memory occupied by elements in the DCP readyQ         |
766 | memory_phase             | The amount of items sent during the memory phase      |
767 | opaque                   | The unique stream identifier                          |
768 | snap_end_seqno           | The last snapshot end seqno (Used if a consumer is    |
769 |                          | resuming a stream)                                    |
770 | snap_start_seqno         | The last snapshot start seqno (Used if a consumer is  |
771 |                          | resuming a stream)                                    |
772 | start_seqno              | The seqno to start sending mutations from             |
773 | state                    | The stream state (pending, backfilling, in-memory,    |
774 |                          | takeover-send, takeover-wait, or dead)                |
775 | vb_uuid                  | The vb uuid used in the stream request                |
776 | cur_snapshot_type        | The type of the current snapshot being received       |
777 | cur_snapshot_start       | The start seqno of the current snapshot being         |
778 |                          | received                                              |
779 | cur_snapshot_end         | The end seqno of the current snapshot being received  |
780
781 ** Dcp Aggregated Stats
782
783 Aggregated dcp stats allow dcp connections to be logically grouped and
784 aggregated together by prefixes.
785
786 For example, if all of your dcp connections started with =xdcr:= or
787 =replication:=, you could call =stats dcpagg := to request stats grouped by
788 everything before the first =:= character, giving you a set for =xdcr= and a
789 set for =replication=.
790
791 *** Results
792
793 | [prefix]:count              | Number of connections matching this prefix   |
794 | [prefix]:producer_count     | Total producer connections with this prefix  |
795 | [prefix]:items_sent         | Total items sent with this prefix            |
796 | [prefix]:items_remaining    | Total items remaining to be sent with this   |
797 |                             | prefix                                       |
798 | [prefix]:total_bytes        | Total number of bytes sent with this prefix  |
799 | [prefix]:total_backlog_size | Total backfill items remaining to be sent    |
800 |                             | with this prefix                             |
801 | ep_dcp_num_running_backfills| Total number of running backfills across all |
802 |                             | dcp connections                              |
803 | ep_dcp_max_running_backfills| Max running backfills we can have across all |
804 |                             | dcp connections                              |
805 | ep_dcp_dead_conn_count      | Total dead connections                       |
806
807 ** Timing Stats
808
809 Timing stats provide histogram data from high resolution timers over
810 various operations within the system.
811
812 *** General Form
813
814 As this data is multi-dimensional, some parsing may be required for
815 machine processing.  It's somewhat human readable, but the =stats=
816 script mentioned in the Getting Started section above will do fancier
817 formatting for you.
818
819 Consider the following sample stats:
820
821 : STAT disk_insert_8,16 9488
822 : STAT disk_insert_16,32 290
823 : STAT disk_insert_32,64 73
824 : STAT disk_insert_64,128 86
825 : STAT disk_insert_128,256 48
826 : STAT disk_insert_256,512 2
827 : STAT disk_insert_512,1024 12
828 : STAT disk_insert_1024,2048 1
829
830 This tells you that =disk_insert= took 8-16µs 9,488 times, 16-32µs
831 290 times, and so on.
832
833 The same stats displayed through the =stats= CLI tool would look like
834 this:
835
836 : disk_insert (10008 total)
837 :    8us - 16us    : ( 94.80%) 9488 ###########################################
838 :    16us - 32us   : ( 97.70%)  290 #
839 :    32us - 64us   : ( 98.43%)   73
840 :    64us - 128us  : ( 99.29%)   86
841 :    128us - 256us : ( 99.77%)   48
842 :    256us - 512us : ( 99.79%)    2
843 :    512us - 1ms   : ( 99.91%)   12
844 :    1ms - 2ms     : ( 99.92%)    1
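
The raw =STAT= lines above can be turned into buckets and cumulative
percentages with a short script.  The Python sketch below is one way
to do it, not the code used by the =stats= tool; the function names are
hypothetical.  Note that the CLI output reports 10008 samples while the
eight sample lines sum to 10000, so the total is passed in explicitly
to reproduce the percentages shown.

```python
SAMPLE = """\
STAT disk_insert_8,16 9488
STAT disk_insert_16,32 290
STAT disk_insert_32,64 73
STAT disk_insert_64,128 86
STAT disk_insert_128,256 48
STAT disk_insert_256,512 2
STAT disk_insert_512,1024 12
STAT disk_insert_1024,2048 1
"""


def parse_histogram(lines, name):
    """Return sorted (low_us, high_us, count) tuples for one histogram."""
    marker = name + "_"
    buckets = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or parts[0] != "STAT":
            continue
        if not parts[1].startswith(marker):
            continue
        # The text after "<name>_" is "<low>,<high>".
        low, sep, high = parts[1][len(marker):].partition(",")
        if not sep or not (low.isdigit() and high.isdigit()):
            continue
        buckets.append((int(low), int(high), int(parts[2])))
    return sorted(buckets)


def cumulative_percent(buckets, total=None):
    """Return (low, high, cumulative %) rounded to two decimals.

    `total` defaults to the sum of the given buckets.
    """
    if total is None:
        total = sum(count for _, _, count in buckets)
    if not total:
        return []
    running = 0
    result = []
    for low, high, count in buckets:
        running += count
        result.append((low, high, round(100.0 * running / total, 2)))
    return result


if __name__ == "__main__":
    parsed = parse_histogram(SAMPLE.splitlines(), "disk_insert")
    for low, high, pct in cumulative_percent(parsed, total=10008):
        print(f"{low}us - {high}us: {pct:.2f}%")
```

Sorting the buckets numerically matters because the wire order of
=STAT= lines is not something this sketch relies on.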
845
846
847 *** Available Stats
848
849 The following histograms are available from "timings" in the above
850 form to describe when time was spent doing various things:
851
852 | bg_wait               | bg fetches waiting in the dispatcher queue     |
853 | bg_load               | bg fetches waiting for disk                    |
854 | set_with_meta         | set_with_meta latencies                        |
855 | access_scanner        | access scanner run times                       |
856 | checkpoint_remover    | checkpoint remover run times                   |
857 | item_pager            | item pager run times                           |
858 | expiry_pager          | expiry pager run times                         |
859 | bg_tap_wait           | tap bg fetches waiting in the dispatcher queue |
860 | bg_tap_load           | tap bg fetches waiting for disk                |
861 | pending_ops           | client connections blocked for operations      |
862 |                       | in pending vbuckets                            |
863 | storage_age           | Analogous to ep_storage_age in main stats      |
864 | data_age              | Analogous to ep_data_age in main stats         |
865 | get_cmd               | servicing get requests                         |
866 | arith_cmd             | servicing incr/decr requests                   |
867 | get_stats_cmd         | servicing get_stats requests                   |
868 | get_vb_cmd            | servicing vbucket status requests              |
869 | set_vb_cmd            | servicing vbucket set state commands           |
870 | del_vb_cmd            | servicing vbucket deletion commands            |
871 | chk_persistence_cmd   | waiting for checkpoint persistence             |
872 | tap_vb_set            | servicing tap vbucket set state commands       |
873 | tap_vb_reset          | servicing tap vbucket reset commands           |
874 | tap_mutation          | servicing tap mutations                        |
875 | notify_io             | waking blocked connections                     |
876 | paged_out_time        | time (in seconds) objects are non-resident     |
877 | disk_insert           | waiting for disk to store a new item           |
878 | disk_update           | waiting for disk to modify an existing item    |
879 | disk_del              | waiting for disk to delete an item             |
880 | disk_vb_del           | waiting for disk to delete a vbucket           |
881 | disk_commit           | waiting for a commit after a batch of updates  |
882 | item_alloc_sizes      | Item allocation size counters (in bytes)       |
883
884 The following histograms are available from "scheduler" and "runtimes"
885 describing the scheduling overhead times and task runtimes incurred by various
886 IO and Non-IO tasks respectively:
887
888 | READ tasks                  |                                          |
889 | bg_fetcher_tasks            | histogram of scheduling overhead/task    |
890 |                             | runtimes for background fetch tasks      |
891 | bg_fetcher_meta_tasks       | histogram of scheduling overhead/task    |
892 |                             | runtimes for background fetch meta tasks |
893 | vkey_stat_bg_fetcher_tasks  | histogram of scheduling overhead/task    |
894 |                             | runtimes for fetching item from disk for |
895 |                             | vkey stat tasks                          |
896 | warmup_tasks                | histogram of scheduling overhead/task    |
897 |                             | runtimes for warmup tasks                |
898 |-----------------------------+------------------------------------------|
899 | WRITE tasks                 |                                          |
900 | vbucket_persist_high_tasks  | histogram of scheduling overhead/task    |
901 |                             | runtimes for snapshot vbucket state in   |
902 |                             | high priority tasks                      |
903 | vbucket_persist_low_tasks   | histogram of scheduling overhead/task    |
904 |                             | runtimes for snapshot vbucket state in   |
905 |                             | low priority tasks                       |
906 | vbucket_deletion_tasks      | histogram of scheduling overhead/task    |
907 |                             | runtimes for vbucket deletion tasks      |
908 | flusher_tasks               | histogram of scheduling overhead/task    |
909 |                             | runtimes for flusher tasks               |
910 | flush_all_tasks             | histogram of scheduling overhead/task    |
911 |                             | runtimes for flush all tasks             |
912 | compactor_tasks             | histogram of scheduling overhead/task    |
913 |                             | runtimes for vbucket level compaction    |
914 |                             | tasks                                    |
915 | statsnap_tasks              | histogram of scheduling overhead/task    |
916 |                             | runtimes for stats snapshot tasks        |
917 | mutation_log_compactor_tasks| histogram of scheduling overhead/task    |
918 |                             | runtimes for access log compaction tasks |
919 |-----------------------------+------------------------------------------|
920 | AUXIO tasks                 |                                          |
921 | tap_bg_fetcher_tasks        | histogram of scheduling overhead/task    |
922 |                             | runtimes for tap background fetch tasks  |
923 | access_scanner_tasks        | histogram of scheduling overhead/task    |
924 |                             | runtimes for access scanner tasks        |
925 | backfill_tasks              | histogram of scheduling overhead/task    |
926 |                             | runtimes for backfill tasks              |
927 |-----------------------------+------------------------------------------|
928 | NONIO tasks                 |                                          |
929 | conn_notification_tasks     | histogram of scheduling overhead/task    |
930 |                             | runtimes for connection notification     |
931 |                             | tasks                                    |
932 | checkpoint_remover_tasks    | histogram of scheduling overhead/task    |
933 |                             | runtimes for checkpoint removal tasks    |
934 | vb_memory_deletion_tasks    | histogram of scheduling overhead/task    |
935 |                             | runtimes for memory deletion of vbucket  |
936 |                             | tasks                                    |
937 | checkpoint_stats_tasks      | histogram of scheduling overhead/task    |
938 |                             | runtimes for checkpoint stats tasks      |
939 | item_pager_tasks            | histogram of scheduling overhead/task    |
940 |                             | runtimes for item pager tasks            |
941 | tap_resume_tasks            | histogram of scheduling overhead/task    |
942 |                             | runtimes for resume suspended tap        |
943 |                             | connection tasks                         |
944 | tapconnection_reaper_tasks  | histogram of scheduling overhead/task    |
945 |                             | runtimes for tap/dcp connection reaper   |
946 |                             | tasks                                    |
947 | hashtable_resize_tasks      | histogram of scheduling overhead/task    |
948 |                             | runtimes for hash table resizer tasks    |
949 | pending_ops_tasks           | histogram of scheduling overhead/task    |
|                             | runtimes for processing dcp buffered     |
951 |                             | items tasks                              |
952 | conn_manager_tasks          | histogram of scheduling overhead/task    |
953 |                             | runtimes for dcp/tap connection manager  |
954 |                             | tasks                                    |
955 | defragmenter_tasks          | histogram of scheduling overhead/task    |
956 |                             | runtimes for the in-memory defragmenter  |
957 |                             | tasks                                    |
958 | workload_monitor_tasks      | histogram of scheduling overhead/task    |
959 |                             | runtimes for the workload monitor which  |
960 |                             | detects and sets the workload pattern    |
961
962 ** Hash Stats
963
964 Hash stats provide information on your vbucket hash tables.
965
966 Requesting these stats does affect performance, so don't do it too
967 regularly, but it's useful for debugging certain types of performance
968 issues.  For example, if your hash table is tuned to have too few
969 buckets for the data load within it, the =max_depth= will be too large
970 and performance will suffer.
971
| avg_count    | The average number of items per vbucket              |
| avg_max      | The average max depth of a vbucket hash table        |
| avg_min      | The average min depth of a vbucket hash table        |
| largest_max  | The largest maximum hash table depth of all vbuckets |
| largest_min  | The largest minimum hash table depth of all vbuckets |
| max_count    | The largest number of items in a vbucket             |
| min_count    | The smallest number of items in a vbucket            |
| total_counts | The total number of items in all vbuckets            |
980
It is also possible to get more detailed hash table stats by using
'hash detail'. This will print per-vbucket stats.
983
984 Each stat is prefixed with =vb_= followed by a number, a colon, then
985 the individual stat name.
986
987 For example, the stat representing the size of the hash table for
988 vbucket 0 is =vb_0:size=.
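
The naming scheme above can be taken apart mechanically. The following Python sketch groups a flat mapping of stat names to values by vbucket. =group_by_vbucket= is a name invented for this example, and the values shown are made up rather than taken from a real server.

```python
import re
from collections import defaultdict

# "vb_" followed by the vbucket number, a colon, then the stat name.
_VB_STAT_RE = re.compile(r"^vb_(\d+):(.+)$")


def group_by_vbucket(stats):
    """Turn {"vb_0:size": "3079", ...} into {0: {"size": "3079"}, ...}."""
    grouped = defaultdict(dict)
    for key, value in stats.items():
        match = _VB_STAT_RE.match(key)
        if match is None:
            continue  # not a per-vbucket stat
        grouped[int(match.group(1))][match.group(2)] = value
    return dict(grouped)


# Illustrative values only.
example = {
    "vb_0:size": "3079",
    "vb_0:max_depth": "2",
    "vb_1:size": "3079",
}
print(group_by_vbucket(example))
```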
989
990 | state            | The current state of this vbucket                |
991 | size             | Number of hash buckets                           |
992 | locks            | Number of locks covering hash table operations   |
993 | min_depth        | Minimum number of items found in a bucket        |
994 | max_depth        | Maximum number of items found in a bucket        |
995 | reported         | Number of items this hash table reports having   |
996 | counted          | Number of items found while walking the table    |
997 | resized          | Number of times the hash table resized           |
998 | mem_size         | Running sum of memory used by each item          |
999 | mem_size_counted | Counted sum of current memory used by each item  |
1000
1001 ** Checkpoint Stats
1002
Checkpoint stats provide detailed information on the per-vbucket checkpoint
data structures.
1005
1006 Like Hash stats, requesting these stats has some impact on performance.
1007 Therefore, please do not poll them from the server frequently.
1008 Each stat is prefixed with =vb_= followed by a number, a colon, and then
1009 each stat name.
1010
| cursor_name:cursor_checkpoint_id | Checkpoint ID at which the cursor named   |
|                                  | 'cursor_name' is pointing now             |
| cursor_name:cursor_seqno         | The seqno at which the cursor             |
|                                  | 'cursor_name' is pointing now             |
| open_checkpoint_id               | ID of the current open checkpoint         |
| num_conn_cursors                 | Number of referencing dcp/tap cursors     |
| num_checkpoint_items             | Number of total items in a checkpoint     |
|                                  | data structure                            |
| num_open_checkpoint_items        | Number of items in the open checkpoint    |
| num_checkpoints                  | Number of checkpoints in a checkpoint     |
|                                  | data structure                            |
| num_items_for_persistence        | Number of items remaining for persistence |
| state                            | The state of the vbucket this checkpoint  |
|                                  | contains data for                         |
| last_closed_checkpoint_id        | The last closed checkpoint number         |
| persisted_checkpoint_id          | The last persisted checkpoint number      |
| mem_usage                        | Total memory taken up by items in all     |
|                                  | checkpoints under given manager           |
1029
1030 ** Memory Stats
1031
1032 This provides various memory-related stats including the stats from tcmalloc.
1033 Note that tcmalloc stats are not available on some operating systems
1034 (e.g., Windows) that do not support tcmalloc.
1035
1036 | mem_used (deprecated)               | Engine's total memory usage          |
1037 | bytes                               | Engine's total memory usage          |
1038 | ep_kv_size                          | Memory used to store item metadata,  |
1039 |                                     | keys and values, no matter the       |
1040 |                                     | vbucket's state. If an item's value  |
1041 |                                     | is ejected, this stat will be        |
1042 |                                     | decremented by the size of the       |
1043 |                                     | item's value.                        |
1044 | ep_value_size                       | Memory used to store values for      |
1045 |                                     | resident keys                        |
1046 | ep_overhead                         | Extra memory used by transient data  |
1047 |                                     | like persistence queue, replication  |
1048 |                                     | queues, checkpoints, etc             |
1049 | ep_max_size                         | Max amount of data allowed in memory |
1050 | ep_mem_low_wat                      | Low water mark for auto-evictions    |
1051 | ep_mem_low_wat_percent              | Low water mark (as a percentage)       |
1052 | ep_mem_high_wat                     | High water mark for auto-evictions   |
1053 | ep_mem_high_wat_percent             | High water mark (as a percentage)      |
1054 | ep_oom_errors                       | Number of times unrecoverable OOMs   |
1055 |                                     | happened while processing operations |
1056 | ep_tmp_oom_errors                   | Number of times temporary OOMs       |
1057 |                                     | happened while processing operations |
1058 | ep_blob_num                         | The number of blob objects in the    |
1059 |                                     | cache                                |
1060 | ep_blob_overhead                    | The "unused" memory caused by the    |
1061 |                                     | allocator returning bigger chunks    |
1062 |                                     | than requested                       |
1063 | ep_storedval_size                   | Memory used by storedval objects     |
1064 | ep_storedval_overhead               | The "unused" memory caused by the    |
1065 |                                     | allocator returning bigger chunks    |
1066 |                                     | than requested                       |
1067 | ep_storedval_num                    | The number of storedval objects      |
1068 |                                     | allocated                            |
1069 | ep_item_num                         | The number of item objects allocated |
1070 | ep_mem_tracker_enabled              | If smart memory tracking is enabled  |
1071 | total_allocated_bytes               | Engine's total memory usage reported |
1072 |                                     | from the underlying memory allocator |
1073 | total_heap_size                     | Bytes of system memory reserved by   |
1074 |                                     | the underlying memory allocator      |
1075 | total_free_mapped_bytes             | Number of bytes in free, mapped      |
1076 |                                     | pages in the underlying allocator's  |
1077 |                                     | page heap                            |
1078 | total_free_unmapped_bytes           | Number of bytes in free, unmapped    |
1079 |                                     | pages in page heap. These are bytes  |
1080 |                                     | that have been released back to OS   |
1081 |                                     | by the underlying memory allocator   |
1082 | total_fragmentation_bytes           | Bytes of the fragmented memory in    |
1083 |                                     | the underlying allocator. Note that  |
1084 |                                     | the free and mapped pages inside the |
1085 |                                     | allocator are not considered as the  |
1086 |                                     | fragmentation as they can be used    |
1087 |                                     | for incoming memory allocations.     |
1088 | tcmalloc_max_thread_cache_bytes     | A limit to how much memory the       |
1089 |                                     | underlying memory allocator TCMalloc |
1090 |                                     | dedicates for small objects          |
1091 | tcmalloc_current_thread_cache_bytes | A measure of some of the memory that |
1092 |                                     | the underlying allocator TCMalloc is |
1093 |                                     | using for small objects              |
1094
1095
1096 ** Stats Key and Vkey
| key_cas                       | The key's current CAS value            |KV|
| key_data_age                  | How long the key has waited for its    |KV|
|                               | value to be persisted (0 if clean)     |KV|
| key_exptime                   | Expiration time from the epoch         |KV|
| key_flags                     | Flags for this key                     |KV|
| key_is_dirty                  | If the value is not yet persisted      |KV|
| key_last_modified_time        | Last updated time                      |KV|
| key_valid                     | See description below                  | V|
| key_vb_state                  | The vbucket state of this key          |KV|
1106
1107 =key_valid= can have the following responses:
1108
| this_is_a_bug    | Some case we didn't take care of                          |
| dirty            | The value in memory has not been persisted yet            |
| length_mismatch  | The key length in memory doesn't match the length on disk |
| data_mismatch    | The data in memory doesn't match the data on disk         |
| flags_mismatch   | The flags in memory don't match the flags on disk         |
| valid            | The key is both on disk and in memory                     |
| ram_but_not_disk | The value doesn't exist yet on disk                       |
| item_deleted     | The item has been deleted                                 |
1117
1118 ** Warmup
1119
Stats =warmup= shows statistics related to the warmup logic.
1121
1122 | ep_warmup                       | Shows if warmup is enabled / disabled      |
1123 | ep_warmup_estimated_key_count   | Estimated number of keys in database       |
1124 | ep_warmup_estimated_value_count | Estimated number of values in database     |
1125 | ep_warmup_state                 | The current state of the warmup thread     |
1126 | ep_warmup_thread                | Warmup thread status                       |
1127 | ep_warmup_key_count             | Number of keys warmed up                   |
1128 | ep_warmup_value_count           | Number of values warmed up                 |
1129 | ep_warmup_dups                  | Duplicates encountered during warmup       |
1130 | ep_warmup_oom                   | OOMs encountered during warmup             |
| ep_warmup_time                  | Time (µs) spent warming data               |
| ep_warmup_keys_time             | Time (µs) spent warming keys               |
1133 | ep_warmup_mutation_log          | Number of keys present in mutation log     |
1134 | ep_warmup_access_log            | Number of keys present in access log       |
1135 | ep_warmup_min_items_threshold   | Percentage of total items warmed up        |
1136 |                                 | before we enable traffic                   |
1137 | ep_warmup_min_memory_threshold  | Percentage of max mem warmed up before     |
1138 |                                 | we enable traffic                          |
1139
1140
1141 ** KV Store Stats
1142
These provide various low-level stats and timings from the underlying KV
storage system and are useful for understanding the various states of the
storage system.

The following stats are available for all database engines:
1148
1149 | open              | Number of database open operations                 |
1150 | close             | Number of database close operations                |
1151 | readTime          | Time spent in read operations                      |
1152 | readSize          | Size of data in read operations                    |
1153 | writeTime         | Time spent in write operations                     |
1154 | writeSize         | Size of data in write operations                   |
| delete            | Time spent in delete() calls                       |
1156
1157 The following stats are available for the CouchStore database engine:
1158
1159 | backend_type              | Type of backend database engine                                                           |
1160 | commit                    | Time spent in CouchStore commit operation                                                 |
1161 | compaction                | Time spent in compacting vbucket database file                                            |
1162 | numLoadedVb               | Number of Vbuckets loaded into memory                                                     |
1163 | lastCommDocs              | Number of docs in the last commit                                                         |
| failure_set               | Number of failed set operations                                                           |
| failure_get               | Number of failed get operations                                                           |
| failure_vbset             | Number of failed vbucket set operations                                                   |
1167 | save_documents            | Time spent in CouchStore save documents operation                                         |
1168 | io_num_read               | Number of io read operations                                                              |
1169 | io_num_write              | Number of io write operations                                                             |
1170 | io_read_bytes             | Number of bytes read (key + values + rev_meta)                                            |
| io_write_bytes            | Number of bytes written (key + values + rev_meta)                                         |
1172 | io_total_read_bytes       | Number of bytes read (total, including Couchstore B-Tree and other overheads)             |
1173 | io_total_write_bytes      | Number of bytes written (total, including Couchstore B-Tree and other overheads)          |
1174 | io_compaction_read_bytes  | Number of bytes read (compaction only, includes Couchstore B-Tree and other overheads)    |
1175 | io_compaction_write_bytes | Number of bytes written (compaction only, includes Couchstore B-Tree and other overheads) |
1176
1177 ** KV Store Timing Stats
1178
KV Store Timing stats provide timing information from the underlying storage
system. These stats are reported at the shard (group of partitions) level.
1181
1182 *** Available Stats
The following histograms are available from "kvtimings" in the form
described in the Timings section above. Each stat is prefixed with
=rw_<Shard number>:= and describes the time spent in, or the sizes of,
various storage operations:
1186
1187 | commit                | time spent in commit operations                |
1188 | compact               | time spent in file compaction operations       |
1189 | snapshot              | time spent in VB state snapshot operations     |
1190 | delete                | time spent in delete operations                |
1191 | save_documents        | time spent in persisting documents in storage  |
1192 | writeTime             | time spent in writing to storage subsystem     |
1193 | writeSize             | sizes of writes given to storage subsystem     |
1194 | bulkSize              | batch sizes of the save documents calls        |
1195 | fsReadTime            | time spent in doing filesystem reads           |
1196 | fsWriteTime           | time spent in doing filesystem writes          |
1197 | fsSyncTime            | time spent in doing filesystem sync operations |
1198 | fsReadSize            | sizes of various filesystem reads issued       |
1199 | fsWriteSize           | sizes of various filesystem writes issued      |
1200 | fsReadSeek            | values of various seek operations in file      |
1201
1202
1203 ** Workload Raw Stats
These stats report the number of shards and the state of the executor pool.
They are available as "workload" stats:
1206
| ep_workload:num_shards   | number of shards or groups of partitions     |
| ep_workload:num_writers  | number of threads that prioritize write ops  |
| ep_workload:num_readers  | number of threads that prioritize read ops   |
| ep_workload:num_auxio    | number of threads that prioritize aux io ops |
| ep_workload:num_nonio    | number of threads that prioritize non io ops |
| ep_workload:max_writers  | max number of threads doing write ops        |
| ep_workload:max_readers  | max number of threads doing read ops         |
| ep_workload:max_auxio    | max number of threads doing aux io ops       |
| ep_workload:max_nonio    | max number of threads doing non io ops       |
| ep_workload:num_sleepers | number of threads that are sleeping          |
| ep_workload:ready_tasks  | number of global tasks that are ready to run |
1218
Additionally, the following stats on the current state of the TaskQueues are
also presented:

1221 | HiPrioQ_Writer:InQsize   | count high priority bucket writer tasks waiting  |
1222 | HiPrioQ_Writer:OutQsize  | count high priority bucket writer tasks runnable |
1223 | HiPrioQ_Reader:InQsize   | count high priority bucket reader tasks waiting  |
1224 | HiPrioQ_Reader:OutQsize  | count high priority bucket reader tasks runnable |
1225 | HiPrioQ_AuxIO:InQsize    | count high priority bucket auxio  tasks waiting  |
1226 | HiPrioQ_AuxIO:OutQsize   | count high priority bucket auxio  tasks runnable |
1227 | HiPrioQ_NonIO:InQsize    | count high priority bucket nonio  tasks waiting  |
1228 | HiPrioQ_NonIO:OutQsize   | count high priority bucket nonio  tasks runnable |
1229 | LowPrioQ_Writer:InQsize  | count low priority bucket writer tasks waiting   |
1230 | LowPrioQ_Writer:OutQsize | count low priority bucket writer tasks runnable  |
1231 | LowPrioQ_Reader:InQsize  | count low priority bucket reader tasks waiting   |
1232 | LowPrioQ_Reader:OutQsize | count low priority bucket reader tasks runnable  |
1233 | LowPrioQ_AuxIO:InQsize   | count low priority bucket auxio  tasks waiting   |
1234 | LowPrioQ_AuxIO:OutQsize  | count low priority bucket auxio  tasks runnable  |
1235 | LowPrioQ_NonIO:InQsize   | count low priority bucket nonio  tasks waiting   |
1236 | LowPrioQ_NonIO:OutQsize  | count low priority bucket nonio  tasks runnable  |
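
The sixteen queue stats in this table follow a regular pattern: priority, task type, then queue. As a small sketch (the helper names are invented for this example, and the sample values are made up), the names can be generated and the waiting or runnable tasks totalled like this:

```python
from itertools import product

PRIORITIES = ("HiPrioQ", "LowPrioQ")
TASK_TYPES = ("Writer", "Reader", "AuxIO", "NonIO")
QUEUES = ("InQsize", "OutQsize")  # waiting and runnable, per the table


def queue_stat_names():
    """List every TaskQueue stat name in the table above."""
    return [
        f"{prio}_{task}:{queue}"
        for prio, task, queue in product(PRIORITIES, TASK_TYPES, QUEUES)
    ]


def total_tasks(stats, queue):
    """Sum one queue kind over all priorities and task types."""
    return sum(
        int(stats.get(f"{prio}_{task}:{queue}", 0))
        for prio, task in product(PRIORITIES, TASK_TYPES)
    )


# Illustrative values only.
example = {
    "HiPrioQ_Writer:InQsize": "3",
    "LowPrioQ_AuxIO:InQsize": "2",
    "HiPrioQ_Reader:OutQsize": "5",
}
print(total_tasks(example, "InQsize"), total_tasks(example, "OutQsize"))
```

Stats missing from the input are treated as zero so that a partial snapshot can still be summed.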
1237
1238 ** Dispatcher Stats/JobLogs
1239
This provides the stats from the AUX dispatcher and the non-IO dispatcher,
and from all the reader and writer threads running for the specific bucket.
Along with the stats, the job logs for each of the dispatchers and worker
threads are also made available.
1244
1245 The following stats are available for the workers and dispatchers:
1246
| state             | Thread's current status: running, sleeping, etc.              |
1248 | runtime           | The amount of time since the thread started running           |
1249 | task              | The activity/job the thread is involved with at the moment    |
1250
1251 The following stats are for individual job logs:
1252
1253 | starttime         | The timestamp when the job started                            |
1254 | runtime           | Time it took for the job to run                               |
1255 | task              | The activity/job the thread ran during that time              |
1256
1257
1258 ** Stats Reset
1259
1260 Resets the list of stats below.
1261
1262 Reset Stats:
1263
1264 | ep_bg_load                        |
1265 | ep_bg_wait                        |
1266 | ep_bg_max_load                    |
1267 | ep_bg_min_load                    |
1268 | ep_bg_max_wait                    |
1269 | ep_bg_min_wait                    |
1270 | ep_commit_time                    |
1271 | ep_flush_duration                 |
1272 | ep_flush_duration_highwat         |
1273 | ep_io_num_read                    |
1274 | ep_io_num_write                   |
1275 | ep_io_read_bytes                  |
1276 | ep_io_write_bytes                 |
1277 | ep_items_rm_from_checkpoints      |
1278 | ep_num_eject_failures             |
1279 | ep_num_pager_runs                 |
1280 | ep_num_not_my_vbuckets            |
1281 | ep_num_value_ejects               |
1282 | ep_pending_ops_max                |
1283 | ep_pending_ops_max_duration       |
1284 | ep_pending_ops_total              |
1285 | ep_storage_age                    |
1286 | ep_storage_age_highwat            |
1287 | ep_tap_bg_load_avg                |
1288 | ep_tap_bg_max_load                |
1289 | ep_tap_bg_max_wait                |
1290 | ep_tap_bg_min_load                |
1291 | ep_tap_bg_min_wait                |
1292 | ep_tap_bg_wait_avg                |
1293 | ep_replication_throttled          |
1294 | ep_tap_total_fetched              |
1295 | ep_vbucket_del_max_walltime       |
1296 | pending_ops                       |
1297
1298 Reset Histograms:
1299
1300 | bg_load                           |
1301 | bg_wait                           |
1302 | bg_tap_load                       |
1303 | bg_tap_wait                       |
1304 | chk_persistence_cmd               |
1305 | data_age                          |
1306 | del_vb_cmd                        |
1307 | disk_insert                       |
1308 | disk_update                       |
1309 | disk_del                          |
1310 | disk_vb_del                       |
1311 | disk_commit                       |
1312 | get_stats_cmd                     |
1313 | item_alloc_sizes                  |
1314 | get_vb_cmd                        |
1315 | notify_io                         |
1316 | pending_ops                       |
1317 | set_vb_cmd                        |
1318 | storage_age                       |
1319 | tap_mutation                      |
1320 | tap_vb_reset                      |
1321 | tap_vb_set                        |
1322
1323
1324 * Details
1325
1326 ** Ages
1327
1328 The difference between =ep_storage_age= and =ep_data_age= is somewhat
1329 subtle, but when you consider that a given record may be updated
1330 multiple times before hitting persistence, it starts to be clearer.
1331
1332 =ep_data_age= is how old the data we actually wrote is.
1333
1334 =ep_storage_age= is how long the object has been waiting to be
1335 persisted.
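
To make the distinction concrete, consider a record that is first queued at t=0s, updated again at t=5s and finally persisted at t=8s. Going by the definitions in the Toplevel Stats table, the storage age is measured from the initial queueing and the data age from the last modification. The helper below is purely illustrative: =ages_at_persist= is a made-up name and the timestamps are invented.

```python
def ages_at_persist(queued_at, modified_at, persisted_at):
    """Return (storage_age, data_age) in seconds for one persisted record."""
    storage_age = persisted_at - queued_at   # time spent waiting to be persisted
    data_age = persisted_at - modified_at    # age of the data actually written
    return storage_age, data_age


# Queued at t=0s, updated at t=5s, written to disk at t=8s.
storage_age, data_age = ages_at_persist(0, 5, 8)
print(storage_age, data_age)
```

With these numbers the storage age is 8 seconds and the data age is 3 seconds. A record that is never updated while queued has equal ages.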
1336
1337 ** Warming Up
1338
1339 Opening the data store is broken into three distinct phases:
1340
1341 *** Initializing
1342
During the initialization phase, the server is not accepting
connections or otherwise functional.  This is often quick, but after a
server crash it can take some time to recover the underlying storage.
1347
1348 This time is made available via the =ep_dbinit= stat.
1349
1350 *** Warming Up
1351
1352 After initialization, warmup begins.  At this point, the server is
1353 capable of taking new writes and responding to reads.  However, only
1354 records that have been pulled out of the storage or have been updated
1355 from other clients will be available for request.
1356
(Note that records read from persistence will not overwrite new
records captured from the network.)
1359
1360 During this phase, =ep_warmup_thread= will report =running= and
1361 =ep_warmed_up= will be increasing as records are being read.
1362
1363 *** Complete
1364
1365 Once complete, =ep_warmed_up= will stop increasing and
1366 =ep_warmup_thread= will report =complete=.
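
A monitoring script could tell the last two phases apart by comparing successive stats snapshots. The sketch below assumes each snapshot is already available as a dict of stat names to string values; how the snapshot is fetched is left out, and =warmup_status= is a hypothetical helper rather than an existing tool.

```python
def warmup_status(previous, current):
    """Summarize warmup progress from two successive stats snapshots."""
    loaded = int(current.get("ep_warmed_up", 0)) - int(previous.get("ep_warmed_up", 0))
    return {
        # ep_warmup_thread reports "running" during warmup, "complete" after.
        "complete": current.get("ep_warmup_thread") == "complete",
        # How many more records were read between the two snapshots.
        "records_loaded_since_previous": loaded,
    }


# Illustrative snapshots only.
earlier = {"ep_warmup_thread": "running", "ep_warmed_up": "1000"}
later = {"ep_warmup_thread": "complete", "ep_warmed_up": "2500"}
print(warmup_status(earlier, later))
```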
1367
1368 * Uuid
The =uuid= stat exposes the unique identifier that is created and assigned
to the bucket when the bucket is created. By comparing it against a
previously seen value, a client can verify that the bucket hasn't been
recreated since it was last used.
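
A client-side check along these lines might look as follows. This is a minimal sketch: =bucket_was_recreated= is a hypothetical helper, the stats are assumed to be available as a dict, and the uuid values are invented.

```python
def bucket_was_recreated(remembered_uuid, stats):
    """Return True if the bucket's uuid differs from the one seen earlier."""
    return stats["uuid"] != remembered_uuid


# Illustrative values only.
remembered = "2b6f3c1e"
print(bucket_was_recreated(remembered, {"uuid": "2b6f3c1e"}))
print(bucket_was_recreated(remembered, {"uuid": "9d41a7f0"}))
```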