How To Configure a Multi-Node Cluster with Cassandra on a Ubuntu VPS

Published on September 11, 2013

By Henrique Pinheiro

Introduction

This tutorial shows you how to configure a multi-node cluster with Cassandra on a VPS. Cassandra is a highly scalable open source database system that performs well when set up with multiple nodes, even across different data centers.

Installing Cassandra on Each Node

Before configuring each node, you need to have Cassandra installed on every one of them. We have an easy tutorial on how to do that on a VPS. After you've installed Cassandra on every node, make sure it isn't running. To check for a running Cassandra process, type:

sudo ps auwx | grep cassandra

If a process other than the "grep" command itself appears, copy its process ID and kill it:

sudo kill -9 PID
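The manual check above can also be scripted. The sketch below assumes the Cassandra JVM shows CassandraDaemon (the name of its main class) in its command line; cassandra_pids is a hypothetical helper name, not something that ships with Cassandra. It reads ps output and prints only the PIDs of Cassandra processes, skipping the grep line.

```shell
# cassandra_pids: hypothetical helper that reads `ps auwx`-style output on
# stdin and prints the PID (column 2) of every Cassandra JVM it finds.
# Assumption: the JVM's command line contains "CassandraDaemon".
cassandra_pids() {
    awk '/CassandraDaemon/ && !/grep/ { print $2 }'
}

# Typical use on a node (commented out so nothing is stopped by accident):
#   ps auwx | cassandra_pids                 # list the PIDs
#   sudo kill $(ps auwx | cassandra_pids)    # ask them to stop (SIGTERM)
```

A plain kill sends SIGTERM and gives the process a chance to shut down cleanly; kill -9 is best kept as a fallback for a process that does not exit.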

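You'll also need to clear any existing data. This permanently deletes everything under /var/lib/cassandra, so only run it on nodes whose data you don't need: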

sudo rm -rf /var/lib/cassandra/*

Configuring Cassandra

To configure Cassandra for multiple nodes, you need to know in advance how many nodes you're going to use, and calculate a token for each one. We've developed a tool to do this, and you can get it here. Enter the number of nodes you're dealing with and it gives you a token for each node. For example, with three nodes you'd get these numbers:

Node 0: 0
Node 1: 3074457345618258602
Node 2: 6148914691236517205
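The three values above are consistent with dividing 2^63 evenly among the nodes, so that node i of N gets floor(i × 2^63 / N). That formula is inferred from the example numbers, not taken from the tool itself, so treat the following as a sketch under that assumption. token_for_node is a hypothetical helper name; it writes 2^63 as 2 × 2^62 so the arithmetic stays within the shell's signed 64-bit integers.

```shell
# token_for_node INDEX COUNT
# Prints floor(INDEX * 2^63 / COUNT) for 0 <= INDEX < COUNT.
# Assumption: this formula is inferred from the three-node example values.
token_for_node() {
    index=$1
    count=$2
    half=4611686018427387904          # 2^62; 2^63 itself would overflow
    quotient=$((half / count))
    remainder=$((half % count))
    # 2^63 = 2 * (quotient * count + remainder), so the division splits cleanly.
    echo $((2 * index * quotient + (2 * index * remainder) / count))
}

# Print one line per node for a three-node cluster.
nodes=3
i=0
while [ "$i" -lt "$nodes" ]; do
    echo "Node $i: $(token_for_node "$i" "$nodes")"
    i=$((i + 1))
done
```

On a shell with 64-bit integer arithmetic, running this with nodes=3 prints the same three values as the list above. Change nodes to match the size of your own cluster.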

Now you'll need to edit your configuration file for each node. To do so, open the nano text editor by running:

nano ~/cassandra/conf/cassandra.yaml

Some of the settings you'll edit are the same on all nodes (cluster_name, seed_provider, rpc_address and endpoint_snitch), while others differ for each node (initial_token and listen_address). Choose one node to be your seed node, then find the lines in the configuration file that refer to each of these attributes and modify them to suit your setup:

cluster_name: 'Name'
initial_token: Token
seed_provider:
    - seeds:  "Seed IP"
listen_address: Droplet's IP
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch

Replace “Name” with your cluster name, “Token” with the number you generated earlier for that node, “Seed IP” with your seed node’s IP address, and “Droplet’s IP” with the IP address of the droplet you are configuring. Do this for each node. Here is an example filled in for a 3-node setup:

Node 0
cluster_name: 'MyDigitalOceanCluster'
initial_token: 0
seed_provider:
    - seeds:  "198.211.xxx.0"
listen_address: 198.211.xxx.0
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch

Node 1
cluster_name: 'MyDigitalOceanCluster'
initial_token: 3074457345618258602
seed_provider:
    - seeds:  "198.211.xxx.0"
listen_address: 192.241.xxx.0
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch

Node 2
cluster_name: 'MyDigitalOceanCluster'
initial_token: 6148914691236517205
seed_provider:
    - seeds:  "198.211.xxx.0"
listen_address: 37.139.xxx.0
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
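Before starting a node, it can help to print the settings you edited and compare them with the examples above. This sketch uses only grep; show_cluster_settings is a hypothetical helper name, and the path in the usage comment is the one used earlier in this tutorial.

```shell
# show_cluster_settings FILE
# Prints the lines of a cassandra.yaml that this tutorial edits, ignoring
# commented-out lines, so each node's values can be checked at a glance.
show_cluster_settings() {
    grep -E '^[[:space:]-]*(cluster_name|initial_token|seeds|listen_address|rpc_address|endpoint_snitch):' "$1"
}

# Typical use on a node:
#   show_cluster_settings ~/cassandra/conf/cassandra.yaml
```

The cluster_name, seeds, rpc_address and endpoint_snitch lines should be identical on every node, while initial_token and listen_address should be unique to each one.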

To start Cassandra, run the following on the seed node:

sudo sh ~/cassandra/bin/cassandra

When it has finished starting up, repeat this process on each of the other nodes. If you don't see any errors, your multi-node Cassandra setup should be successfully deployed.
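To confirm that the nodes have actually joined, you can ask any node for its view of the cluster. The sketch below assumes your Cassandra release provides nodetool status, in whose output a line starting with UN marks a node that is Up and Normal. count_up_nodes is a hypothetical helper name, and the nodetool path in the usage comment is assumed from the install location used in this tutorial.

```shell
# count_up_nodes: hypothetical helper that reads `nodetool status` output on
# stdin and prints how many nodes are reported as "UN" (Up and Normal).
count_up_nodes() {
    awk '$1 == "UN" { up++ } END { print up + 0 }'
}

# Typical use on any node, assuming the install path used in this tutorial:
#   ~/cassandra/bin/nodetool status | count_up_nodes
```

For the three-node example, a result of 3 would suggest that every node has joined the cluster.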


dsalcedo691755

November 20, 2014

Was very useful! Thank you

victoraprea

December 19, 2014

http://db.tt/S5wHPN4f is a broken link… can you update it?

To configure Cassandra for multiple nodes, you'll need to know beforehand how many nodes you're going to use, and calculate token numbers for each. We've developed a tool to do this, and you can get it here.

mrsachinsharma12

August 18, 2017

Hi,

I have one question when setup multi cluster node, we have only cluster name unique for all node but we have not configured ip of all node in any cassandra.yaml file. In this case how it decides to which node it has to connect?

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
