Requirements

Hardware Requirements

For an on-premises production installation, you will need three physical or virtual machines, each with the following specifications:

Category | Recommended Minimum
CPU | 4 x 2.0 GHz or higher
Memory | 16 GB
Storage | 300 GB direct attached storage for data. SSD preferred.
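
As a rough illustration, the minimums in the table above could be checked with a short script. This is a minimal sketch, not part of the product: the function names are hypothetical, the 5% memory tolerance is an assumption (the kernel usually reports slightly less than the nominal RAM size), and CPU clock speed is not checked.

```python
import os
import shutil

# Minimums taken from the hardware table above.
MIN_CPUS = 4
MIN_MEMORY_GB = 16
MIN_DATA_GB = 300

# Assumption: accept memory within 5% of the minimum, because the kernel
# reserves some RAM and reports slightly less than the nominal size.
MEMORY_TOLERANCE = 0.05


def check_hardware(cpus, memory_gb, data_gb):
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    if cpus < MIN_CPUS:
        problems.append(f"CPU: found {cpus} cores, need at least {MIN_CPUS}")
    if memory_gb < MIN_MEMORY_GB * (1 - MEMORY_TOLERANCE):
        problems.append(f"Memory: found {memory_gb:.1f} GB, need {MIN_MEMORY_GB} GB")
    if data_gb < MIN_DATA_GB:
        problems.append(f"Storage: found {data_gb:.0f} GB, need {MIN_DATA_GB} GB")
    return problems


def probe_local_machine(data_path="/cassandra"):
    """Read CPU count, physical memory, and data volume size on a Linux host."""
    cpus = os.cpu_count() or 0
    memory_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    data_bytes = shutil.disk_usage(data_path).total
    return cpus, memory_bytes / 1024**3, data_bytes / 1024**3
```

On a Linux server, `check_hardware(*probe_local_machine())` would return the list of unmet minimums for that machine.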

🚧

Dedicated Cassandra Servers

Do not run other applications on the same servers as Cassandra. If you choose to do so, you will need to increase the specifications to accommodate those applications.

Storage Requirements

  • 24 GB direct attached storage for the commit log. An SSD on a different filesystem than the data is preferred.
  • Solid State Drives (SSD) are preferred for production clusters. However, if hard disk drives are used:
    • Use RAID-0 or JBOD rather than RAID-1 or RAID-5 if using a server with multiple disk drives.
    • Place the data, saved_caches, and hints directories on one disk (or set of disk drives) and the commit logs on another.
  • Avoid using NFS or a SAN for data directories. Shared storage is a single point of failure and can cause performance issues in some instances.
  • Keep 50% of the space free on the disks used to store Cassandra data. Cassandra has background processes, such as compaction, that require sufficient disk space.
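
The 50% free space guideline above can be monitored with a few lines of code. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the path you pass in should be your own data directory.

```python
import shutil


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of the filesystem at path that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def has_headroom(path, minimum=0.5):
    """True if at least `minimum` of the filesystem is free (50% by default)."""
    return free_fraction(path) >= minimum
```

For example, `has_headroom("/cassandra")` would report whether the data filesystem described later in this document still has half of its space free.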

OS Requirements

Kinetic supports running Cassandra on one of the following Linux distributions:

  • Ubuntu 20.04 or higher
  • Debian 8 and 9
  • CentOS 7 or higher
  • Red Hat Enterprise Linux (RHEL) 8 or higher

On all Linux distributions, you should:

  • Disable swap
  • Set maximum open files to 100000
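
The two settings above can be verified programmatically. This is a minimal sketch, assuming a Linux host where active swap devices are listed in /proc/swaps; the function names are hypothetical. Note that the open-file limit is read for the process running the script, so it should be run as the user that will run Cassandra.

```python
import resource

# Value taken from the list above.
REQUIRED_OPEN_FILES = 100000


def active_swap_devices(proc_swaps_text):
    """Parse the contents of /proc/swaps and return the active swap devices."""
    lines = proc_swaps_text.strip().splitlines()
    # The first line is the column header; each following line is one device.
    return [line.split()[0] for line in lines[1:] if line.strip()]


def open_file_limit_ok(required=REQUIRED_OPEN_FILES):
    """True if the soft open-file limit of this process is at least `required`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= required
```

To check a server, read /proc/swaps with `open("/proc/swaps").read()` and pass the text to `active_swap_devices`; an empty list means swap is disabled.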

Important: The following practices are not recommended:

  • Running Cassandra on any of the less popular Linux distributions without conducting extensive testing
  • Deploying Cassandra on older OS versions unless you have previous experience with the older distribution in a production environment
  • Running Cassandra on a Windows host

Software Requirements

On each of the servers that will run Cassandra, verify or install the following:

  • Java 8: The latest version of Java 8, either Oracle Java Standard Edition 8 or OpenJDK 8. To verify that you have the correct version of Java installed, type java -version.
  • Python 3: The latest version of Python 3 (currently 3.9). To verify that you have the correct version of Python installed, type python --version.
  • Package Installer (optional):
    • APT for Debian and Ubuntu systems
    • YUM for RHEL and CentOS systems
    • Otherwise, a binary tarball installation can be used as outlined in Installation
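
When checking many servers, the Java version check above can be scripted. The sketch below parses the text printed by java -version, which reports Java 8 as 1.8.x and writes its output to standard error. The function names are hypothetical, and the parsing assumes the usual `version "..."` banner format.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_292).
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major():
    """Run `java -version` and return the major version, or None if missing."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None
    # `java -version` prints its banner to standard error.
    return java_major_version(result.stderr)
```

A server meets the Java requirement when `installed_java_major() == 8`.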

Other Requirements

  • If you are installing Cassandra using a binary tarball:
    • Create a user and group named cassandra
    • Create a folder named /var/log/cassandra
    • Make cassandra the owner of /var/log/cassandra
  • Create an XFS filesystem mounted at /cassandra with 300 GB allocated. If you are installing Cassandra using a binary tarball, this will also be your installation directory.
  • Make cassandra:cassandra the owner of /cassandra
  • Create a folder named /cassandra/tmp if you prohibit applications from mounting /tmp
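
The steps above depend on two choices: whether you install from a binary tarball, and whether /tmp is restricted. The sketch below only builds the list of commands for review and does not run anything. The function name is hypothetical, the use of chown -R (so that /cassandra/tmp is also owned by cassandra) is an assumption, and creating the XFS filesystem is left out because it depends on your storage device.

```python
def preparation_commands(tarball_install, tmp_is_restricted):
    """Return the preparation commands as argument lists, without running them."""
    commands = []
    if tarball_install:
        commands += [
            ["groupadd", "cassandra"],
            ["useradd", "-g", "cassandra", "cassandra"],
            ["mkdir", "-p", "/var/log/cassandra"],
            ["chown", "cassandra", "/var/log/cassandra"],
        ]
    if tmp_is_restricted:
        commands.append(["mkdir", "-p", "/cassandra/tmp"])
    # Assumption: -R so that anything created under /cassandra is covered too.
    commands.append(["chown", "-R", "cassandra:cassandra", "/cassandra"])
    return commands


if __name__ == "__main__":
    # Print the plan for a tarball installation on a host with a restricted /tmp.
    for command in preparation_commands(True, True):
        print(" ".join(command))
```

Printing the plan first makes it easy to review the commands before running them as root.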

Firewall Settings

Cassandra uses the following TCP ports for inbound and outbound traffic:

Port | Use
7000 | cluster communication
7001 | cluster communication if using internode encryption (recommended)
9042 | native protocol clients (CQL)
9142 | encrypted native protocol clients
7199 | JMX (nodetool)
80 | YUM and APT
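
Once Cassandra is running, the firewall rules for the ports above can be tested from another node. This is a minimal sketch with hypothetical function names. A port only accepts connections when a service is listening on it, so a failure before installation does not by itself indicate a firewall problem. Port 80 is omitted because it is used for outbound package downloads.

```python
import socket

# Ports taken from the table above (port 80 is outbound only and is omitted).
CASSANDRA_PORTS = {
    7000: "cluster communication",
    7001: "cluster communication with internode encryption",
    9042: "native protocol clients (CQL)",
    9142: "encrypted native protocol clients",
    7199: "JMX (nodetool)",
}


def port_is_reachable(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_ports(host, ports=CASSANDRA_PORTS):
    """Return the ports on host that did not accept a TCP connection."""
    return [port for port in ports if not port_is_reachable(host, port)]
```

Calling `unreachable_ports` with the address of another node returns the ports that could not be reached from the machine running the script. Ports for features you have not enabled, such as 7001 without internode encryption, are expected to appear in that list.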

Sizing Your Cassandra Cluster

Production

For a production cluster, the minimum number of nodes required for high availability is three (3). A three-node cluster will be sufficient for most Kinetic Data Platform workloads.

Development

For initial development work, a one-node cluster will work. For example, you can set up a one-node cluster on a desktop using products such as Oracle VirtualBox or Docker Desktop.