Requirements
Hardware Requirements
For an on-premises production installation, you will need three physical or virtual machines with the following specifications:
Category | Recommended Minimum |
---|---|
CPU | 4 x 2.0 GHz or higher |
Memory | 16 GB |
Storage | 300 GB direct attached storage for data. SSD preferred. |
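
As a rough aid, the CPU and memory minimums in the table above could be checked on a candidate Linux host with a short script. This is only a sketch: the function names and the 10% memory slack are assumptions made for illustration (a machine sold as 16 GB usually reports slightly less to the operating system), and clock speed and storage are not checked here.

```python
import os

MIN_CPU_CORES = 4      # from the table above
MIN_MEMORY_GB = 16     # from the table above
MEMORY_SLACK = 0.90    # assumption: tolerate RAM reserved by firmware or kernel


def check_hardware(cpu_cores, memory_bytes):
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU: {cpu_cores} cores found, {MIN_CPU_CORES} required")
    required_bytes = MIN_MEMORY_GB * 1024**3 * MEMORY_SLACK
    if memory_bytes < required_bytes:
        found_gb = memory_bytes / 1024**3
        problems.append(f"Memory: {found_gb:.1f} GB found, {MIN_MEMORY_GB} GB required")
    return problems


def detect_hardware():
    """Read the core count and physical memory of the local Linux host."""
    cpu_cores = os.cpu_count() or 0
    memory_bytes = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    return cpu_cores, memory_bytes
```

Calling `check_hardware(*detect_hardware())` on each of the three machines returns an empty list when both minimums are satisfied.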
Dedicated Cassandra Servers
Running other applications alongside Cassandra is not advised. If you choose to do so, you will need to increase the specifications above to accommodate those applications.
Storage Requirements
- 24 GB direct attached storage for the commit log. An SSD on a different filesystem than the data is preferred.
- Solid State Drives (SSD) are preferred for production clusters. However, if hard disk drives are used:
  - Use RAID-0 or JBOD rather than RAID-1 or RAID-5 if using a server with multiple disk drives.
  - Place the data, saved_caches, and hints directories on one disk (or set of disk drives) and the commit logs on another.
- Avoid using NFS or a SAN for data directories. Shared storage is a single point of failure and can, in some instances, cause performance issues.
- Keep at least 50% free space on the disks used to store Cassandra data. Cassandra has background processes, such as compaction, that require sufficient disk space.
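
The 50% free-space guideline can be monitored with a few lines of Python. The sketch below uses the standard library's `shutil.disk_usage`; the function names are hypothetical, and the path you pass should be wherever your Cassandra data directory actually lives.

```python
import shutil

MIN_FREE_FRACTION = 0.5  # the 50% guideline from the list above


def free_fraction(path):
    """Fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def has_enough_headroom(path, minimum=MIN_FREE_FRACTION):
    """True when the filesystem holding `path` meets the free-space guideline."""
    return free_fraction(path) >= minimum
```

For example, `has_enough_headroom("/cassandra")` would report whether the filesystem mounted at that path still has at least half of its capacity free.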
OS Requirements
Kinetic supports running Cassandra on one of the following Linux distributions:
- Ubuntu 20.04 or higher
- Debian 8 and 9
- CentOS 7 or higher
- Red Hat Enterprise Linux (RHEL) 8 or higher
On all Linux distributions, you should:
- Disable swap
- Set maximum open files to 100000
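
Both settings can be verified from a script. The following is a minimal sketch, assuming a Linux host where active swap devices are listed in `/proc/swaps` (one header line, then one line per device). The helper names are made up for illustration. Note that `resource.getrlimit` reports the limit of the process running the check, so it should be run as the user that will run Cassandra.

```python
import resource

REQUIRED_OPEN_FILES = 100000  # from the list above


def swap_is_disabled(proc_swaps_text):
    """True when the contents of /proc/swaps list no active swap devices."""
    lines = [line for line in proc_swaps_text.splitlines() if line.strip()]
    return len(lines) <= 1  # only the header line remains


def open_files_ok(soft_limit):
    """True when the soft open-file limit is unlimited or high enough."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= REQUIRED_OPEN_FILES


def current_open_file_limit():
    """Soft RLIMIT_NOFILE of the current process."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft_limit
```

A check on a live host might read `open("/proc/swaps").read()` and pass the text to `swap_is_disabled`, then pass `current_open_file_limit()` to `open_files_ok`.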
Important: The following practices are not recommended:
- Running Cassandra on any of the less popular Linux distributions without conducting extensive testing
- Deploying Cassandra on older OS versions unless you have previous experience with the older distribution in a production environment
- Running Cassandra on a Windows host
Software Requirements
On each of the servers which will be running Cassandra, verify or install:
- Java 8: The latest version of Java 8, either the Oracle Java Standard Edition 8 or OpenJDK 8. To verify that you have the correct version of Java installed, type `java -version`.
- Python 3: The latest version of Python 3 (currently 3.9). To verify that you have the correct version of Python installed, type `python --version`.
- Package Installer (optional):
  - APT for Debian and Ubuntu systems
  - YUM for RHEL systems
  - Otherwise, a binary tarball installation can be used as outlined in Installation
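
If you want to automate the two version checks, the command output can be parsed as sketched below. The function names are hypothetical. The sketch relies on two conventions: Java 8 and earlier report their version as `1.x` (for example `1.8.0_392`), and `java -version` writes to standard error rather than standard output.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier use the "1.x" scheme, so the major version is x.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def python_major_minor(version_output):
    """Extract (major, minor) from the text printed by `python --version`."""
    match = re.match(r"Python (\d+)\.(\d+)", version_output.strip())
    return (int(match.group(1)), int(match.group(2))) if match else None


def read_java_version_output():
    """Run `java -version`; the version text is written to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr
```

A host passes the Java check when `java_major_version(read_java_version_output())` equals 8.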
Other Requirements
- If you are installing Cassandra using a binary tarball:
  - Create a user and group named `cassandra`.
  - Create a folder `/var/log/cassandra`.
  - Make `cassandra` the owner of `/var/log/cassandra`.
- Create an XFS filesystem mounted at `/cassandra` with 300 GB allocated. If you are installing Cassandra using a binary tarball, this will also be your installation directory.
- Make `cassandra:cassandra` the owner of `/cassandra`.
- Create a folder named `/cassandra/tmp` if you prohibit applications from mounting `/tmp`.
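
Once the directories exist, their ownership can be double-checked with a small script. This is an illustrative sketch only: `describe_directory` is a made-up helper, and it verifies the result rather than performing the setup itself.

```python
from pathlib import Path


def describe_directory(path, user="cassandra", group="cassandra"):
    """Report whether `path` exists as a directory owned by user:group."""
    directory = Path(path)
    if not directory.is_dir():
        return f"{path}: missing"
    try:
        owner, owning_group = directory.owner(), directory.group()
    except KeyError:
        # The numeric uid or gid has no entry in the local account database.
        return f"{path}: owner or group has no name on this host"
    if (owner, owning_group) != (user, group):
        return f"{path}: owned by {owner}:{owning_group}, expected {user}:{group}"
    return f"{path}: ok"
```

Running it over `/var/log/cassandra`, `/cassandra`, and (if used) `/cassandra/tmp` gives a one-line status for each directory.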
Firewall Settings
Cassandra uses the following TCP ports for inbound and outbound traffic:
Port | Use |
---|---|
7000 | cluster communication |
7001 | cluster communication when using internode encryption (recommended) |
9042 | native protocol clients (CQL) |
9142 | encrypted native protocol clients |
7199 | JMX (nodetool) |
80 | YUM and APT package repositories |
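
To confirm that the firewall actually permits these connections, you can attempt a TCP connection from one node to another. The sketch below uses only the standard `socket` module; the port descriptions are taken from the table above, while the function name and the two-second timeout are assumptions. Port 80 is left out because it is used to reach package repositories rather than other Cassandra nodes.

```python
import socket

CASSANDRA_PORTS = {
    7000: "cluster communication",
    7001: "encrypted cluster communication",
    9042: "native protocol clients (CQL)",
    9142: "encrypted native protocol clients",
    7199: "JMX (nodetool)",
}


def port_is_open(host, port, timeout=2.0):
    """True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A port only answers when a service is listening on it, so a closed result before Cassandra is started does not by itself indicate a firewall problem.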
Sizing Your Cassandra Cluster
Production
For a production cluster, the minimum number of nodes required for high availability is three (3). A three-node cluster is sufficient for most Kinetic Data Platform workloads.
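
One way to see why three nodes is the minimum is the standard quorum arithmetic. The sketch below assumes a replication factor equal to the node count and quorum-level reads and writes; neither setting is stated in this document, so treat both as illustrative assumptions.

```python
def quorum(replication_factor):
    """Replicas that must respond for a quorum read or write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)
```

Under these assumptions, one or two replicas cannot lose a node without losing quorum, while three replicas can lose one.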
Development
For initial development work, a one-node cluster will work. For example, you can set up a one-node cluster on a desktop using products such as Oracle VirtualBox or Docker Desktop.