Cassandra Monitoring: A Best Practice Guide

Cassandra Monitoring

Monitoring Terminologies

JVM Based Monitoring

Metrics management in Cassandra — Cassandra Monitoring

Metrics

Metrics Types

  • Gauge: A single value representing a metric at a specific point in time, e.g. the amount of memory allocated or the number of active tasks.
  • Counter: Like a gauge, a counter holds a single value, but it is used to compare values over time. A counter is generally only incremented, and it is reset when the process is disrupted, for example by a node restart. An example is the cache hit count.
  • Histogram: A histogram counts the data elements of a stream grouped into intervals, giving a statistical distribution of values. It reports the min, max, mean, and median, along with the 75th, 90th, 95th, 98th, 99th, and 99.9th percentiles.
  • Timer: A timer tracks both the rate of execution and a histogram of durations for a metric.
  • Latency: A special type for measuring latency. It includes a Timer, and latency is reported in microseconds. Each latency metric also has a TotalLatency counter, which is the sum of all latency recorded since the node started.
  • Meter: A meter measures throughput. It reports the mean rate along with one-, five-, and fifteen-minute exponentially weighted moving averages.
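
To make the relationship between these types concrete, here is a minimal Python sketch of a counter, a histogram, and a latency metric built from the two. It is an illustration only, not Cassandra's implementation: the class names, the sample values, and the nearest-rank percentile method are assumptions chosen for readability, and the rate-tracking part of a Timer is left out.

```python
import math


class Counter:
    """Monotonically increasing count; a real one restarts from zero with the node."""

    def __init__(self):
        self.count = 0

    def inc(self, n=1):
        self.count += n


class Histogram:
    """Keeps raw samples and reports distribution statistics on demand.

    Simplification: production histograms use bounded reservoirs or buckets
    rather than storing every sample.
    """

    def __init__(self):
        self.samples = []

    def update(self, value):
        self.samples.append(value)

    def percentile(self, p):
        # Nearest-rank method: the smallest sample that has at least
        # p percent of all samples at or below it.
        ordered = sorted(self.samples)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]


class Latency:
    """Histogram of durations plus a running total, both in microseconds."""

    def __init__(self):
        self.histogram = Histogram()
        self.total_latency = Counter()

    def record(self, micros):
        self.histogram.update(micros)
        self.total_latency.inc(micros)


# Hypothetical read latencies in microseconds, with one slow outlier.
read_latency = Latency()
for micros in [120, 95, 400, 150, 110, 105, 2300, 130, 98, 101]:
    read_latency.record(micros)

p50 = read_latency.histogram.percentile(50)
p99 = read_latency.histogram.percentile(99)
total = read_latency.total_latency.count
print(p50, p99, total)
```

In this sample the median stays close to typical request times while the 99th percentile is dominated by the single slow request, which is why the higher percentiles are usually the ones worth alerting on. The total only grows, so dividing the change in total latency by the change in request count between two scrapes gives an average latency for that interval.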

Metrics Categories

Metrics Format

Essential Metrics

Cassandra Metrics

Node Status

Client Request Metrics

Compaction Statistics

Garbage Collector Metrics

Memory Metrics

Threadpool Metrics

Table Metrics

Partition Size

Tombstone Scanned

SSTable Per Read

Additional Metrics

Dropped Messages

Caches For Tables

Data Streaming

Hinted Handoff

CQL and Batch

System Metrics

Disk Usage

CPU Usage

Monitoring tools

Prometheus

Prometheus - time-series based Cassandra monitoring

Grafana

Grafana - Time series metrics visualization

Cassandra Exporter

Conclusion

  • Cassandra Exporter is an open source tool well suited to collecting metrics efficiently on large Cassandra clusters.
  • The Instaclustr managed service for Cassandra includes comprehensive monitoring and alerting with 24x7 support. It is a good option for outsourcing all Cassandra operations, and it comes with a free trial.
