For an on-premises production installation, you will need three physical or virtual machines with the following specifications:
|CPU|4 x 2.0 GHz or higher|
|Storage|300 GB direct-attached storage for data. SSD preferred.|
Dedicated Cassandra Servers
It is advised not to run other applications alongside Cassandra. If you choose to do so, you will need to adjust the specifications to accommodate the application.
- 24 GB direct-attached storage for the commit log. An SSD on a different filesystem than the data is preferred.
- Solid State Drives (SSD) are preferred for production clusters. However, if hard disk drives are used:
- Use RAID-0 or JBOD rather than RAID-1 or RAID-5 if using a server with multiple disk drives.
- Place the data, saved_caches, and hints directories on one disk (or set of disk drives) and the commit logs on another.
- Avoid using NFS or a SAN for data directories. Shared storage is a single point of failure and can, in some instances, cause performance issues.
- Keep 50% of the space free on the disks used to store Cassandra data. Cassandra runs background processes, such as compaction, that need this headroom.
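The 50% free-space guideline above can be checked with a short script. This is a minimal sketch, not part of the Kinetic or Cassandra tooling: the function names are hypothetical, and the `/cassandra` path in the usage note is taken from the filesystem described later on this page.

```python
import shutil


def meets_free_space(total_bytes: int, free_bytes: int, min_free: float = 0.5) -> bool:
    """Return True when at least `min_free` (a fraction) of the disk is free."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes >= min_free


def data_disk_has_headroom(path: str, min_free: float = 0.5) -> bool:
    """Check the filesystem that holds `path` against the free-space guideline."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return meets_free_space(usage.total, usage.free, min_free)
```

For example, `data_disk_has_headroom("/cassandra")` could be run from a monitoring job on each node, assuming the data directory lives on that filesystem.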
Kinetic supports running Cassandra on one of the following Linux distributions:
- Ubuntu 20.04 or higher
- Debian 8 and 9
- CentOS 7 or higher
- RedHat Enterprise Linux (RHEL) 8 or higher
On all Linux distributions, you should:
- Disable swap
- Set maximum open files to 100000
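Both settings can be verified from Python on a Linux host. The sketch below assumes swap status is read from `/proc/swaps` (one header line followed by one line per active swap device) and that the open-files limit of interest is the soft `RLIMIT_NOFILE` of the current process; the function names are hypothetical.

```python
import resource

REQUIRED_OPEN_FILES = 100_000  # value from the list above


def swap_active(proc_swaps_text: str) -> bool:
    """Return True if /proc/swaps lists any device after its header line."""
    lines = [line for line in proc_swaps_text.splitlines() if line.strip()]
    return len(lines) > 1


def nofile_limit_ok(soft_limit: int, required: int = REQUIRED_OPEN_FILES) -> bool:
    """Return True if the soft open-files limit is unlimited or high enough."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= required


def current_nofile_soft_limit() -> int:
    """Read the soft open-files limit of the running process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft
```

Note that the limit reported here applies to the process running the script, so it should be run as the same user that runs Cassandra.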
Important: The following practices are not recommended:
- Running Cassandra on any of the less popular Linux distributions without conducting extensive testing
- Deploying Cassandra on older OS versions unless you have previous experience with the older distribution in a production environment
- Running Cassandra on a Windows host
On each of the servers which will be running Cassandra, verify or install:
- Java 8: The latest version of Java 8, either the Oracle Java Standard Edition 8 or OpenJDK 8. To verify that you have the correct version of Java installed, type
- Python 3: The latest version of Python 3 (currently 3.9). To verify that you have the correct version of Python installed, type
- Package Installer (optional):
- APT for Debian and Ubuntu systems
- YUM for RHEL systems
- Otherwise, a binary tarball installation can be used as outlined in Installation
- If you are installing Cassandra using a binary tarball:
- Create a user and group named
- Create a folder
- Make cassandra the owner of
- Create a user and group named
- Create an xfs filesystem named /cassandra with 300 GB allocated. If you are installing Cassandra using a binary tarball, this will also be your installation directory.
- Make cassandra:cassandra the owner of
- Create a folder named /cassandra/tmp if you prohibit applications from mounting
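The Java prerequisite above can also be checked programmatically. This is a minimal sketch that assumes the usual `java -version` command (which writes its banner to stderr) and relies on Java 8 reporting itself with the legacy `1.8.x` scheme; the helper names are hypothetical.

```python
import re
import subprocess


def parse_java_major(version_output: str):
    """Return the Java major version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier use the "1.x" scheme, for example 1.8.0_292.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def is_java8(version_output: str) -> bool:
    """Return True when the output describes a Java 8 runtime."""
    return parse_java_major(version_output) == 8


def read_java_version() -> str:
    """Run `java -version` and return its banner (printed on stderr)."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=False
    )
    return result.stderr
```

On a node with Java on the `PATH`, `is_java8(read_java_version())` would then report whether the default runtime is Java 8.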
Cassandra uses the following TCP ports for inbound and outbound traffic:
|7001|Internode communication, if using internode encryption (recommended)|
|9042|Native protocol clients (CQL)|
|9142|Encrypted native protocol clients|
|80|Yum and Apt|
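Once a node is running, reachability of the Cassandra ports listed above can be spot-checked from another machine. This is a rough sketch using plain TCP connects; the function names and the host name in the usage note are hypothetical, and port 80 is left out because it is only used to reach package repositories.

```python
import socket

# Ports taken from the table above.
CASSANDRA_PORTS = {
    7001: "encrypted internode communication",
    9042: "native protocol clients (CQL)",
    9142: "encrypted native protocol clients",
}


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


def unreachable_ports(host: str, ports=CASSANDRA_PORTS, timeout: float = 2.0):
    """Return the sorted list of ports on `host` that refuse or time out."""
    return [p for p in sorted(ports) if not can_connect(host, p, timeout)]
```

For instance, `unreachable_ports("cassandra-node-1")` returning an empty list would suggest the firewall allows these ports. A port only answers if Cassandra is actually listening on it, so 9142 and 7001 will appear unreachable unless encryption is enabled.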
For a production cluster, the minimum number of nodes required for high availability is three (3). A three-node cluster will be sufficient for most Kinetic Data Platform workloads.
For initial development work, a one-node cluster will work. For example, you can set up a one-node cluster on a desktop using products such as Oracle VirtualBox or Docker Desktop.
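One way to see why three is the minimum is the quorum arithmetic Cassandra uses for QUORUM reads and writes. The sketch below assumes a replication factor equal to the number of nodes and QUORUM consistency; neither setting is specified on this page, so treat it as an illustration and not as the platform's configuration.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM operation: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (1, 2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

Under these assumptions, one or two replicas tolerate no failures, while three replicas tolerate the loss of one node.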