Maintenance and Monitoring
Maintenance
Reaper
Reaper is an open source tool that aims to schedule and orchestrate repairs of Apache Cassandra clusters.
It improves on the existing nodetool repair process by:
- Splitting repair jobs into smaller tunable segments.
- Handling back-pressure through monitoring running repairs and pending compactions.
- Adding ability to pause or cancel repairs and track progress precisely.
Download
- Go to the download page at http://cassandra-reaper.io/docs/download/
- Click on Download Tarball 2.2.2
- You are redirected to the Reaper page on JFrog Bintray
- Select the Files tab
- Click on cassandra-reaper-2.2.2.tar.gz to start downloading.
Install
cd /opt
tar -xvzf cassandra-reaper-2.2.2.tar.gz
cd cassandra-reaper-2.2.2
bin/cassandra-reaper
Recommended Settings
Below are the recommended starting settings for cassandra-reaper.yaml. Leave all other settings at their defaults.
segmentCountPerNode: 16
repairIntensity: 0.20
repairParallelism: Parallel
maxPendingCompactions: 20
repairThreadCount: 1
repairRunThreadCount: 8
hangingRepairTimeoutMins: 30
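To confirm that a deployed cassandra-reaper.yaml actually carries the recommended values above, a small check along the following lines may help. This is a minimal sketch, not part of Reaper: the function names `yaml_value` and `check_reaper_settings` are hypothetical, and it assumes each setting appears as a top-level `key: value` line. Values are compared as strings, so `0.2` would be reported as different from `0.20`.

```shell
#!/bin/bash
# Sketch: compare a cassandra-reaper.yaml against the recommended starting values.
# Assumption: each setting is a top-level "key: value" line in the file.

# Print the value of a top-level key (empty if the key is absent).
yaml_value() {
  local file="$1" key="$2"
  sed -n "s/^${key}:[[:space:]]*\([^#]*\).*/\1/p" "$file" \
    | head -n 1 \
    | sed 's/[[:space:]]*$//'
}

# Print one line per setting that differs; return non-zero if any differ.
check_reaper_settings() {
  local file="$1" status=0 key want have
  while read -r key want; do
    have="$(yaml_value "$file" "$key")"
    if [ "$have" != "$want" ]; then
      echo "MISMATCH $key: expected '$want', found '$have'"
      status=1
    fi
  done <<'EOF'
segmentCountPerNode 16
repairIntensity 0.20
repairParallelism Parallel
maxPendingCompactions 20
repairThreadCount 1
repairRunThreadCount 8
hangingRepairTimeoutMins 30
EOF
  return "$status"
}
```

For example, `check_reaper_settings /path/to/cassandra-reaper.yaml` prints nothing and returns 0 when every listed setting matches; the path shown is a placeholder for wherever your install keeps the file.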
Documentation for Reaper is at http://cassandra-reaper.io/docs/
Esop
Esop is a backup and restore tool from Instaclustr.
- Go to Maven Central at https://search.maven.org/artifact/com.instaclustr/esop
- Click on version 1.09. You will be redirected to the download page
- Click on Downloads (in the upper right corner) and select JAR file from the pull down menu.
- Rename the jar file to instaclustr_esop.jar
- Copy the jar file to all nodes in the Cassandra cluster. You can use any location designated by your organization for jar files or you can use the directory where Cassandra stores jar files. This location depends on the type of Cassandra install you performed.
- Package install: /usr/share/cassandra
- Tarball install: $INSTALL_DIR/tools/bin
Below is a sample bash script that can be used to run an Esop backup.
#!/bin/bash
# Script to run Esop backups.
# The backup directory must have the form
#   /<path to backup share>/<name of your cluster>/<datacenter>/<node name>
# where <node name> is the name of the local host, for example
#   /backups/cassandra/kineticdata/datacenter1/node1
DATA_DIRECTORY="/opt/cassandra/data"
BACKUP_DIRECTORY="/backups/cassandra/local_cassandra/datacenter1/cassandra01" # replace with your backup directory
ESOP_PATH="/scripts" # directory containing the Esop jar
JMX_PORT=7199
NODE_IP=127.0.0.1
java -jar "$ESOP_PATH/instaclustr_esop.jar" backup \
--jmx-service "$NODE_IP:$JMX_PORT" \
--storage-location="file://$BACKUP_DIRECTORY" \
--data-directory="$DATA_DIRECTORY" \
--snapshot-tag=kd_backup
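Because the backup directory has to follow the share/cluster/datacenter/node layout described in the script comments, it can be less error-prone to assemble it from its parts than to type it by hand on every node. The helper below is a minimal sketch of that idea. The name `build_backup_dir` is hypothetical, and using `uname -n` as the default node name is an assumption about how your hosts are named.

```shell
#!/bin/bash
# Sketch: assemble the per-node backup directory in the form
#   /<path to backup share>/<name of your cluster>/<datacenter>/<node name>
# build_backup_dir is a hypothetical helper, not part of Esop.
build_backup_dir() {
  local share="${1%/}"   # drop one trailing slash, if present
  local cluster="$2"
  local datacenter="$3"
  local node="${4:-$(uname -n)}"   # assumption: default to the local host name

  if [ -z "$share" ] || [ -z "$cluster" ] || [ -z "$datacenter" ]; then
    echo "usage: build_backup_dir <share> <cluster> <datacenter> [node]" >&2
    return 1
  fi
  printf '%s/%s/%s/%s\n' "$share" "$cluster" "$datacenter" "$node"
}
```

In the sample script, the fixed assignment could then become something like `BACKUP_DIRECTORY="$(build_backup_dir /backups/cassandra kineticdata datacenter1)"`, which picks up each node's own host name.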
Documentation for Esop is at https://github.com/instaclustr/esop
Monitoring
At a minimum, the following metrics should be monitored for each Cassandra node:
- CPU Usage
- Filesystem Usage
- OS Load
- Average Memory Use
- Time CPU Spent Waiting for IO to complete
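Where no monitoring agent is in place yet, the metrics listed above can be sampled by hand on a Linux node. The functions below are a rough sketch under several assumptions: the function names are hypothetical, the node exposes the standard Linux `/proc` files, and the CPU figures are averages since boot, not a live reading. A dedicated monitoring tool is likely a better long-term choice.

```shell
#!/bin/bash
# Sketch: sample the basic per-node metrics on Linux using /proc and df.
# All function names are hypothetical; the figures are point-in-time samples.

# OS load: the 1-minute load average.
os_load() {
  cut -d ' ' -f 1 /proc/loadavg
}

# Filesystem usage: percent used on the filesystem holding the given path.
fs_usage_pct() {
  df -P "$1" | LC_ALL=C awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Memory use: percent of MemTotal that is not MemAvailable.
mem_used_pct() {
  LC_ALL=C awk '/^MemTotal:/ { t = $2 } /^MemAvailable:/ { a = $2 }
    END { printf "%.1f\n", (t - a) * 100 / t }' /proc/meminfo
}

# CPU usage and I/O wait: percent of CPU time since boot.
# Fields on the "cpu" line: user nice system idle iowait irq softirq steal.
cpu_pct() {
  LC_ALL=C awk '/^cpu / {
      total = 0
      for (i = 2; i <= 9; i++) total += $i
      printf "busy=%.1f iowait=%.1f\n", (total - $5 - $6) * 100 / total, $6 * 100 / total
    }' /proc/stat
}
```

For instance, `fs_usage_pct /opt/cassandra/data` reports how full the filesystem holding the data directory from the backup script is. To get a current CPU reading instead of a since-boot average, take two samples of `/proc/stat` a few seconds apart and compare the differences.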