Maintenance and Monitoring

Maintenance

Reaper

Reaper is an open source tool that aims to schedule and orchestrate repairs of Apache Cassandra clusters.

It improves the existing nodetool repair process by:

  • Splitting repair jobs into smaller tunable segments.
  • Handling back-pressure through monitoring running repairs and pending compactions.
  • Adding the ability to pause or cancel repairs and track progress precisely.

Download

  • Go to the download page at http://cassandra-reaper.io/docs/download/
  • Click on Download Tarball 2.2.2
  • You are redirected to the Reaper page on JFrog Bintray
  • Select the Files tab
  • Click on cassandra-reaper-2.2.2.tar.gz to start downloading.

Install

  • cd /opt
  • tar -xvzf cassandra-reaper-2.2.2.tar.gz
  • cd cassandra-reaper-2.2.2
  • bin/cassandra-reaper
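
The install steps above assume the tarball has already been copied to /opt. A minimal sketch of the same steps with basic checks is shown here. The install_reaper function is a hypothetical helper, not part of Reaper, and it assumes the archive unpacks into a directory named after the tarball.

```shell
#!/bin/bash
# Sketch only: install_reaper is a hypothetical helper, not part of Reaper.
# Usage: install_reaper <tarball> [destination directory, default /opt]
install_reaper() {
    local tarball="$1"
    local dest="${2:-/opt}"

    # Stop early if the tarball is not where we expect it.
    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi

    # Unpack into the destination directory.
    tar -xzf "$tarball" -C "$dest" || return 1

    # Assumption: cassandra-reaper-2.2.2.tar.gz unpacks into
    # a directory called cassandra-reaper-2.2.2.
    local dir_name
    dir_name="$(basename "$tarball" .tar.gz)"
    local launcher="$dest/$dir_name/bin/cassandra-reaper"

    if [ ! -x "$launcher" ]; then
        echo "launcher not found or not executable: $launcher" >&2
        return 1
    fi

    # Print the launcher path so the caller can start Reaper with it.
    echo "$launcher"
}
```

For example, install_reaper /opt/cassandra-reaper-2.2.2.tar.gz prints the path of the launcher script, which can then be run to start Reaper.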

Recommended Settings

The recommended starting settings for cassandra-reaper.yaml are listed below. Leave all other settings at their defaults.

  • segmentCountPerNode: 16
  • repairIntensity: 0.20
  • repairParallelism: Parallel
  • maxPendingCompactions: 20
  • repairThreadCount: 1
  • repairRunThreadCount: 8
  • hangingRepairTimeoutMins: 30
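
One way to confirm that a node's configuration matches the values listed above is a small check script. This is a sketch under stated assumptions: check_setting and check_reaper_settings are hypothetical helpers, each setting is assumed to appear as a top-level "key: value" line, and values are compared as case-insensitive strings (so 0.2 would not match 0.20).

```shell
#!/bin/bash
# Sketch only: these helpers are hypothetical, not part of Reaper.

# check_setting <yaml file> <key> <expected value>
# Compares the value of a top-level "key: value" line with the expected one.
check_setting() {
    local file="$1" key="$2" expected="$3"
    local actual actual_lc expected_lc

    # Take the first line starting with "key:", then strip the key,
    # any trailing comment, and trailing whitespace.
    actual="$(sed -n "s/^${key}:[[:space:]]*//p" "$file" \
        | head -n 1 \
        | sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//')"

    # Compare case-insensitively so that Parallel and PARALLEL both match.
    actual_lc="$(printf '%s' "$actual" | tr '[:upper:]' '[:lower:]')"
    expected_lc="$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')"

    if [ "$actual_lc" = "$expected_lc" ]; then
        echo "OK       $key: $actual"
    else
        echo "MISMATCH $key: found '${actual:-<unset>}', expected '$expected'"
        return 1
    fi
}

# check_reaper_settings <yaml file>
# Checks every recommended setting and returns non-zero if any differ.
check_reaper_settings() {
    local file="$1" status=0
    check_setting "$file" segmentCountPerNode 16 || status=1
    check_setting "$file" repairIntensity 0.20 || status=1
    check_setting "$file" repairParallelism Parallel || status=1
    check_setting "$file" maxPendingCompactions 20 || status=1
    check_setting "$file" repairThreadCount 1 || status=1
    check_setting "$file" repairRunThreadCount 8 || status=1
    check_setting "$file" hangingRepairTimeoutMins 30 || status=1
    return "$status"
}
```

Calling check_reaper_settings with the path to cassandra-reaper.yaml prints one OK or MISMATCH line per setting.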

Documentation for Reaper is at http://cassandra-reaper.io/docs/

Esop

Esop is a backup and restore tool from Instaclustr.

  • Go to Maven Central at https://search.maven.org/artifact/com.instaclustr/esop
  • Click on version 1.09. You will be redirected to the download page
  • Click on Downloads (in the upper right corner) and select JAR file from the pull down menu.
  • Rename the jar file to instaclustr_esop.jar
  • Copy the jar file to all nodes in the Cassandra cluster. You can use any location designated by your organization for jar files, or you can use the directory where Cassandra stores jar files. That directory depends on the type of Cassandra install you performed:
  • Package install: /usr/share/cassandra
  • Tarball install: $INSTALL_DIR/tools/bin

Below is a sample bash script that can be used to run an Esop backup.

#!/bin/bash
# Script to run Esop backups.
# BACKUP_DIRECTORY must have the form
#   /<path to backup share>/<name of your cluster>/<datacenter>/<node name>
# where <node name> is the name of the local host, for example:
#   /backups/cassandra/kineticdata/datacenter1/node1

DATA_DIRECTORY="/opt/cassandra/data"
BACKUP_DIRECTORY="/backups/cassandra/local_cassandra/datacenter1/cassandra01"   #replace with backup directory
ESOP_PATH="/scripts"      #location of esop jar
JMX_PORT=7199
NODE_IP=127.0.0.1

java -jar "$ESOP_PATH/instaclustr_esop.jar" backup \
   --jmx-service "$NODE_IP:$JMX_PORT" \
   --storage-location=file://"$BACKUP_DIRECTORY" \
   --data-directory="$DATA_DIRECTORY" \
   --snapshot-tag=kd_backup
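
The backup directory convention described in the script comments can be assembled from its parts instead of being typed by hand on every node. The sketch below assumes a hypothetical helper named build_backup_path. It falls back to the output of hostname when no node name is given.

```shell
#!/bin/bash
# Sketch only: build_backup_path is a hypothetical helper, not part of Esop.
# Usage: build_backup_path <backup share> <cluster name> <datacenter> [node name]
build_backup_path() {
    local share="$1" cluster="$2" datacenter="$3"
    # Default the node name to the local host name.
    local node="${4:-$(hostname)}"

    # Drop one trailing slash from the share so the result has no "//".
    echo "${share%/}/$cluster/$datacenter/$node"
}
```

For example, build_backup_path /backups/cassandra kineticdata datacenter1 node1 prints /backups/cassandra/kineticdata/datacenter1/node1, which could then be assigned to BACKUP_DIRECTORY.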

Documentation for Esop is at https://github.com/instaclustr/esop

Monitoring

At a minimum, the following metrics should be monitored for each Cassandra node:

  • CPU Usage
  • Filesystem Usage
  • OS Load
  • Average Memory Use
  • Time the CPU Spends Waiting for I/O to Complete
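
In practice these metrics are usually gathered by a monitoring agent. As a rough illustration of where the numbers come from on a Linux node, the sketch below reads them from /proc and df. All function names are hypothetical, the script is Linux-only, and it reports point-in-time values; averaging over time is left to the monitoring system.

```shell
#!/bin/bash
# Linux-only sketch: function names are hypothetical.

# Print the user, nice, system, idle, iowait, irq, softirq and steal
# counters from the aggregate "cpu" line of /proc/stat.
read_cpu_counters() {
    awk '/^cpu / {print $2, $3, $4, $5, $6, $7, $8, $9}' /proc/stat
}

# cpu_usage_and_iowait [interval in seconds, default 1]
# Samples the counters twice and prints "<busy %> <iowait %>".
cpu_usage_and_iowait() {
    local interval="${1:-1}"
    local first second
    first="$(read_cpu_counters)"
    sleep "$interval"
    second="$(read_cpu_counters)"
    echo "$first $second" | awk '{
        for (i = 1; i <= 8; i++) {
            d[i] = $(i + 8) - $i          # change in each counter
            if (d[i] < 0) d[i] = 0        # guard against counter glitches
            total += d[i]
        }
        if (total <= 0) { print "0.0 0.0"; exit }
        idle = d[4]; iowait = d[5]
        printf "%.1f %.1f\n", 100 * (total - idle - iowait) / total, 100 * iowait / total
    }'
}

# Percentage of space used on the filesystem that holds the given path.
filesystem_usage() {
    df -P "${1:-/}" | awk 'NR == 2 {sub(/%/, "", $5); print $5}'
}

# One-minute load average.
os_load() {
    awk '{print $1}' /proc/loadavg
}

# Memory in use as a percentage, from MemTotal and MemAvailable.
memory_usage() {
    awk '/^MemTotal:/ {t = $2} /^MemAvailable:/ {a = $2}
         END {if (t > 0) printf "%.1f\n", 100 * (t - a) / t}' /proc/meminfo
}
```

For a Cassandra node, filesystem_usage would typically be pointed at the data directory, for example filesystem_usage /opt/cassandra/data.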