How To Create a Riak Cluster on an Ubuntu VPS

Published on July 16, 2013

Status: Deprecated

This article covers a version of Ubuntu that is no longer supported. If you currently operate a server running Ubuntu 12.04, we highly recommend upgrading or migrating to a supported version of Ubuntu.

Reason: Ubuntu 12.04 reached end of life (EOL) on April 28, 2017 and no longer receives security patches or updates. This guide is no longer maintained.

See Instead:
This guide might still be useful as a reference, but may not work on other Ubuntu releases. If available, we strongly recommend using a guide written for the version of Ubuntu you are using. You can use the search functionality at the top of the page to find a more recent version.

Introduction

Riak is a distributed database that offers highly available, fault-tolerant, scalable data management.

This guide will cover how to install and configure a Riak cluster using 64-bit Ubuntu 12.04 VPS instances. We will be using 5 separate cloud servers.

The Riak website recommends using machines with a minimum of 4 GB of RAM for best performance, so we will be using cloud servers of that size.

We will configure each VPS as the root user. Be sure to log into each VPS as root or use "su" to obtain the appropriate privileges.

Installing

The following installation steps will be required on each node you will be setting up.

There are pre-compiled binary packages available for Ubuntu that can be downloaded from Riak's website.

First, we will configure apt-get to trust the Riak apt repository and add it to our sources:

<pre>curl http://apt.basho.com/gpg/basho.apt.key | apt-key add -

bash -c "echo deb http://apt.basho.com $(lsb_release -sc) main > /etc/apt/sources.list.d/basho.list"</pre>

We can now update the apt-get package index and install Riak:

<pre>apt-get update

apt-get install riak</pre>

We now have Riak installed. Remember to repeat this step on the other machines you will be using.
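Because the installation has to be repeated on every server, a small loop can save some typing. The following is a minimal sketch, assuming root SSH access to each node. The `NODES` addresses and the `on_each_node` helper are hypothetical and are not part of Riak. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical list of node addresses; replace with your own.
NODES="10.0.0.1 10.0.0.2 10.0.0.3 10.0.0.4 10.0.0.5"

# on_each_node MODE COMMAND...
# MODE "apply" runs COMMAND on every node over SSH;
# any other MODE only prints what would be run.
on_each_node() {
    mode="$1"
    shift
    for ip in $NODES; do
        if [ "$mode" = "apply" ]; then
            ssh "root@$ip" "$@"
        else
            echo "ssh root@$ip $*"
        fi
    done
}
```

For example, `on_each_node dry-run apt-get update` prints one SSH command per node, and `on_each_node apply apt-get update` actually runs it.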

Configuring Riak

Now that Riak has been installed, each node will need to be configured. We will complete the following steps on each machine.

Modifying app.config

Ensure that there are no instances of Riak currently running, change into the Riak configuration directory, and open the primary configuration file:

<pre>riak stop

cd /etc/riak

nano app.config</pre>

We will be changing two values to reflect the network settings of this machine.

Search for the line that reads "{pb, [ {"127.0.0.1", 8087 } ]}". Change "127.0.0.1" to the IP address of your machine.

<pre>{pb, [ {"<span class="highlight">Your.IP.Address</span>", 8087 } ]},</pre>

Next, perform a similar replacement on the line that reads "{http, [ {"127.0.0.1", 8098 } ]}". Again, use the IP address of your machine.

<pre>{http, [ {"<span class="highlight">Your.IP.Address</span>", 8098 } ]},</pre>

Save and close the file.
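If you would rather script the two replacements than edit by hand, something like the following could work. It is only a sketch: `set_listen_ip` is a hypothetical helper, it relies on GNU sed (`-i` and `-E`), and it assumes the `pb` and `http` lines are formatted exactly as shown above. A backup copy is saved next to the original file first.

```shell
#!/bin/sh
# set_listen_ip CONFIG_FILE NODE_IP
# Hypothetical helper: replaces the loopback address on the pb and
# http listener lines only, leaving every other line untouched.
set_listen_ip() {
    config_file="$1"   # for example /etc/riak/app.config
    node_ip="$2"       # the IP address of this machine
    cp "$config_file" "$config_file.bak"
    sed -i -E "/\{(pb|http), \[/ s/\"127\.0\.0\.1\"/\"$node_ip\"/" "$config_file"
}
```

It would be called as `set_listen_ip /etc/riak/app.config Your.IP.Address`. Comparing the result against the `.bak` copy with `diff` is a quick way to confirm that only the two intended lines changed.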

Modifying vm.args

Next, we will be modifying the "vm.args" file:

<pre>nano vm.args</pre>

Find and modify the line specifying the node name. It should read "-name riak@127.0.0.1". Keep everything the same but the IP Address:

<pre>-name riak@Your.IP.Address</pre>

Save and close the file.
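The node name change can be scripted in the same spirit. This is a sketch under the assumption that "vm.args" contains a single line beginning with `-name`; `set_node_name` is a hypothetical helper and requires GNU sed.

```shell
#!/bin/sh
# set_node_name ARGS_FILE NODE_IP
# Hypothetical helper: rewrites the "-name" line so the node is
# named riak@NODE_IP, after saving a backup of the file.
set_node_name() {
    args_file="$1"   # for example /etc/riak/vm.args
    node_ip="$2"     # the IP address of this machine
    cp "$args_file" "$args_file.bak"
    sed -i -E "s/^-name[[:space:]]+.*/-name riak@$node_ip/" "$args_file"
}
```

The other nodes address this machine by the name set here, so it should use the same IP address that was entered in "app.config".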

Starting Riak

Starting the Riak nodes is simple:

<pre>riak start</pre>

<pre>!!!!

!!! WARNING: ulimit -n is 1024; 4096 is the recommended minimum. !!!</pre>

You will probably get the warning above. Let's fix that now temporarily. We will make this permanent later:

<pre>riak stop

ulimit -n 65536</pre>

Now we can restart Riak to see if the ulimit warning goes away.

<pre>riak start</pre>
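To confirm the limit in the current shell before starting Riak, a quick check like the one below may help. It is a sketch: `check_open_files` is a hypothetical helper, and the default threshold of 4096 is simply the minimum quoted in the warning.

```shell
#!/bin/sh
# check_open_files [MINIMUM]
# Hypothetical helper: compares the shell's open-file limit with
# MINIMUM (default 4096) and returns non-zero when it is too low.
check_open_files() {
    minimum="${1:-4096}"
    current="$(ulimit -n)"
    if [ "$current" = "unlimited" ] || [ "$current" -ge "$minimum" ]; then
        echo "ok: ulimit -n is $current"
    else
        echo "low: ulimit -n is $current, want at least $minimum"
        return 1
    fi
}
```

Because `ulimit` applies to the current shell and its children, the check is only meaningful when run in the same session that starts Riak.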

Creating a Cluster

If you have been following the guide, you should now have five nodes configured and running.

However, they are currently operating independently. Each node handles 100% of its own data set and does not communicate with the others. We will merge them into a cluster in this section.

The following steps will join all of the Riak nodes to our first node. Riak will redistribute the data between them automatically when complete.

On our second node, tell the local Riak instance to join the first Riak node:

<pre>riak-admin cluster join riak@First.Riak.IP</pre>
<pre>Success: staged join request for 'riak@Second.Riak.IP' to 'riak@First.Riak.IP'</pre>

This will set up the action of joining, but it will not execute yet. We must view the planned changes first:

<pre>riak-admin cluster plan</pre>

This will show you the results of the planned change. Riak makes you view the proposed changes before it executes the action.

If the proposal looks correct, commit the changes:

<pre>riak-admin cluster commit</pre>
<pre>Cluster changes committed</pre>

We can see the new cluster group by typing:

<pre>riak-admin member-status</pre>

Repeat the procedure for the other nodes to form a full cluster group.
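Since the same join and plan commands are run on every remaining node, they can be wrapped in a small function. This is a sketch only: `join_first_node` is a hypothetical helper that prints the commands by default and runs them only when its second argument is `apply`. The commit step is deliberately left out so that the plan is reviewed by a person first.

```shell
#!/bin/sh
# join_first_node FIRST_NODE_IP [MODE]
# Hypothetical helper, meant to be run on each additional node.
# MODE "apply" stages the join and shows the plan; any other MODE
# (the default) only prints the commands.
join_first_node() {
    first_ip="$1"
    mode="${2:-dry-run}"
    for step in "cluster join riak@$first_ip" "cluster plan"; do
        if [ "$mode" = "apply" ]; then
            # $step is intentionally unquoted so it splits into arguments.
            riak-admin $step || return 1
        else
            echo "riak-admin $step"
        fi
    done
}
```

After `join_first_node First.Riak.IP apply`, check the printed plan and then run `riak-admin cluster commit` by hand if it looks correct.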

Optimizing Settings

Now that we are set up, it is important that we go back and fix some settings that are not ideal for our purposes.

One thing we need to change is the "ulimit" setting that we were warned about when starting Riak. We will create a file to permanently change this setting:

<pre>nano /etc/default/riak</pre>

Add the following line, which will be executed each time Riak is started:

<pre>ulimit -n 65536</pre>

Save and close the file.
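When this step is scripted across several machines, it helps to avoid adding the same line twice. The sketch below uses a hypothetical `ensure_line` helper that appends a line only if the file does not already contain it.

```shell
#!/bin/sh
# ensure_line FILE LINE
# Hypothetical helper: appends LINE to FILE unless an identical
# line is already present, creating FILE if necessary.
ensure_line() {
    file="$1"
    line="$2"
    touch "$file"
    # -x matches whole lines only, -F treats LINE as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For this guide it would be called as `ensure_line /etc/default/riak "ulimit -n 65536"`. Running it a second time leaves the file unchanged.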

Next, we need to see what Riak thinks we should optimize:

<pre>riak-admin diag</pre>
<pre>[critical] vm.swappiness is 60, should be no more than 0
[critical] net.core.wmem_default is 229376, should be at least 8388608
[critical] net.core.rmem_default is 229376, should be at least 8388608
[critical] net.core.netdev_max_backlog is 1000, should be at least 10000
[critical] net.core.somaxconn is 128, should be at least 4000
[critical] net.ipv4.tcp_max_syn_backlog is 2048, should be at least 40000
[critical] net.ipv4.tcp_fin_timeout is 60, should be no more than 15
[critical] net.ipv4.tcp_tw_reuse is 0, should be 1
[notice] Data directory /var/lib/riak/bitcask is not mounted with 'noatime'. Please remount its disk with the 'noatime' flag to improve performance.</pre>

There is a chance that you will also see a large list of messages, the first of which starts with:

[warning] The following preflists do not satisfy the n_val:

This means that your cluster does not have enough nodes to spread your data out correctly. If we join more nodes to our cluster, these messages will disappear.

We will work on adjusting all of the "critical" notices. They can all be adjusted like this:

<pre>sysctl <span class="highlight">setting</span>=<span class="highlight">value</span></pre>

Each command will depend on the output of the "riak-admin diag" program, but will follow the same format.
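Typing one command per warning gets tedious, so the diagnostic output can be turned into commands mechanically. The following is a sketch that assumes the "critical" lines look exactly like the sample output in this guide; `diag_to_sysctl` is a hypothetical helper. It only prints `sysctl -w` commands (the `-w` flag writes a setting) and changes nothing itself.

```shell
#!/bin/sh
# diag_to_sysctl
# Hypothetical helper: reads "riak-admin diag" output on standard
# input and prints one "sysctl -w key=value" command for every
# [critical] line, using the recommended value from that line.
diag_to_sysctl() {
    sed -nE 's/^\[critical\] ([a-z0-9_.]+) is [0-9]+, should be (at least |no more than )?([0-9]+)[[:space:]]*$/sysctl -w \1=\3/p'
}
```

It would be used as `riak-admin diag | diag_to_sysctl`. Reading the generated commands before running them is a good idea, since lines in any other format are silently skipped.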

Re-run the diagnostic command to see if the values are fixed:

<pre>riak-admin diag</pre>
<pre>[notice] Data directory /var/lib/riak/bitcask is not mounted with 'noatime'. Please remount its disk with the 'noatime' flag to improve performance.</pre>

We can safely ignore the notice message. Our new values have fixed the issues with our node.

These values will only exist for the current session. To make the values persist, we need to edit the "sysctl.conf" file:

<pre>nano /etc/sysctl.conf</pre>

Search for each of the different keys and adjust the values as suggested by the "riak-admin diag" command. If the settings don't exist, add them to the bottom of the list.

<pre><span class="highlight">setting</span>=<span class="highlight">value</span></pre>

Our node is now configured correctly. Repeat the above steps on each machine to continue.
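When repeating the "sysctl.conf" edits on each machine, a helper that either updates an existing key or appends a new one keeps the file tidy. This is a sketch: `persist_sysctl` is a hypothetical helper, it requires GNU sed, and it only looks at uncommented lines that begin with the key.

```shell
#!/bin/sh
# persist_sysctl CONF_FILE KEY VALUE
# Hypothetical helper: sets KEY=VALUE in a sysctl.conf-style file,
# replacing an existing uncommented entry or appending a new one.
persist_sysctl() {
    conf_file="$1"
    key="$2"
    value="$3"
    # Escape the dots in KEY so they match literally in the regex.
    pattern="$(printf '%s' "$key" | sed 's/\./\\./g')"
    if grep -Eq "^$pattern[[:space:]]*=" "$conf_file"; then
        sed -i -E "s|^$pattern[[:space:]]*=.*|$key=$value|" "$conf_file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$conf_file"
    fi
}
```

An example call is `persist_sysctl /etc/sysctl.conf vm.swappiness 0`. Running `sysctl -p` afterwards reloads the file, so the saved values can be checked without rebooting.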

Testing the Cluster

We can add a file to test our cluster easily. First, get an image you'd like to use. We will use an image off of the DigitalOcean website:

<pre>cd ~

wget https://www.digitalocean.com/assets/v2/footer_mascott.png</pre>

Now we can put the image into our cluster with the following command.

Replace the IP address placeholder with your node's IP address and the port with the http port from the "/etc/riak/app.config" file. By default, it should be "8098":

<pre>curl -XPUT http://<span class="highlight">IPAddress</span>:<span class="highlight">Port</span>/riak/images/sammy.png -H "Content-type: image/png" --data-binary @footer_mascott.png</pre>

Now, you should be able to see your image by pointing your browser to the URL from the command:

<pre>http://<span class="highlight">IPAddress</span>:<span class="highlight">Port</span>/riak/images/sammy.png</pre>

If the image loads, Riak stored the object and served it back over HTTP.
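A browser check only exercises one node. To see that the data really is shared, the file can be written through one node and read back through another. The sketch below assumes the default HTTP port, the same `images` bucket used above, and a PNG file; `riak_url` and `roundtrip` are hypothetical helpers built on `curl` and `cmp`.

```shell
#!/bin/sh
# riak_url HOST PORT BUCKET KEY
# Builds the object URL in the same form used earlier in this guide.
riak_url() {
    printf 'http://%s:%s/riak/%s/%s\n' "$1" "$2" "$3" "$4"
}

# roundtrip WRITE_IP READ_IP FILE [PORT]
# Hypothetical helper: stores FILE through one node, fetches it
# through another, and reports whether the two copies are identical.
roundtrip() {
    write_ip="$1"
    read_ip="$2"
    file="$3"
    port="${4:-8098}"
    key="$(basename "$file")"
    copy="$(mktemp)"
    curl -fsS -X PUT "$(riak_url "$write_ip" "$port" images "$key")" \
        -H "Content-Type: image/png" --data-binary "@$file" || return 1
    curl -fsS -o "$copy" "$(riak_url "$read_ip" "$port" images "$key")" || return 1
    if cmp -s "$file" "$copy"; then echo "match"; else echo "mismatch"; fi
    rm -f "$copy"
}
```

For example, `roundtrip First.Riak.IP Second.Riak.IP footer_mascott.png` should print "match" once both nodes are members of the same cluster.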

Conclusion

You should now have a Riak cluster installed and configured correctly. Your cluster will now automatically distribute your data among the configured nodes.

By Justin Ellingwood

4 Comments



Nevermind. I discovered that riak doesn’t have a package that supports Ubuntu 12.10 (64-bit) yet.

When I attempted the first apt-update to get riak, I saw the following error:

W: Failed to fetch http://apt.basho.com/dists/quantal/main/binary-amd64/Packages 403 Forbidden

E: Some index files failed to download. They have been ignored, or old ones used instead.

Kamal Nasser
DigitalOcean Employee
July 26, 2013

@Robin: I believe this setup authenticates via the IP address, see the “Modifying app.config” section.

how to create User Accounts so that i can also use this with an S3 Browser e.g???
