How To Set Up CouchDB with ElasticSearch on an Ubuntu 13.10 VPS

Published on December 30, 2013
Introduction


CouchDB


CouchDB is a NoSQL database that stores data as JSON documents. It is especially useful when a rigid schema would get in the way and a flexible data model is required. CouchDB also supports master-master continuous replication, which means data can be replicated continuously between two databases without having to set up a complex system of master and slave databases.

ElasticSearch


ElasticSearch is a full-text search engine that indexes the documents you give it and makes their fields searchable. This complements CouchDB well, because one of CouchDB's limitations is that every query requires you either to know the document ID or to use map/reduce.

Installing CouchDB


We will be installing CouchDB from source in order to get the latest version. A more thorough tutorial on this can be viewed here.

Setting up the Environment


Update the package manager:

apt-get update

Install the tools needed to compile CouchDB:

apt-get install -y build-essential

Install Erlang, the programming language that CouchDB is written in:

apt-get install -y erlang-base erlang-dev erlang-nox erlang-eunit

Install the rest of the libraries that CouchDB needs:

apt-get install -y libmozjs185-dev libicu-dev libcurl4-gnutls-dev libtool
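
CouchDB's configure script checks the installed Erlang version and stops if it is outside the supported range; for 1.5.0 the reported requirement is at least R14B and below R17. As a minimal sketch, the helper below (erlang_release_ok is a hypothetical name, not part of CouchDB or Erlang) tests a release string against that range before you start compiling:

```shell
# Hypothetical helper: succeeds when the OTP release string passed as $1
# is in the range CouchDB 1.5.0 accepts (>= R14B and < R17).
erlang_release_ok() {
    case "$1" in
        R14B*|R15*|R16*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (requires Erlang to be installed):
#   release="$(erl -noshell -eval 'io:format("~s", [erlang:system_info(otp_release)]), halt().')"
#   erlang_release_ok "$release" || echo "Unsupported Erlang release: $release"
```

The pattern match relies on pre-17 releases being named with a leading R (for example R16B02); anything else is treated as unsupported.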

Acquire the Source Files


Go to the directory where the CouchDB source files will reside:

cd /usr/local/src

Get the source files:

curl -O http://apache.mirrors.tds.net/couchdb/source/1.5.0/apache-couchdb-1.5.0.tar.gz

Untar the source files:

tar xvzf apache-couchdb-1.5.0.tar.gz

Go to the new directory:

cd apache-couchdb-1.5.0

Configure the source and install it:

./configure
make && make install

Note: This step can take a while. Once it is done, CouchDB will be installed. Now we need to create the appropriate user and assign permissions.

Finalizing the CouchDB Installation


Create a CouchDB user:

adduser --disabled-login --disabled-password --no-create-home couchdb

Note: The prompts asking for details such as Name can be ignored. You can accept the default value for each one.

Assign the appropriate permissions to the CouchDB user:

chown -R couchdb:couchdb /usr/local/var/log/couchdb /usr/local/var/lib/couchdb /usr/local/var/run/couchdb

Set up CouchDB as a service so that it does not have to be started manually:

ln -s /usr/local/etc/init.d/couchdb /etc/init.d
update-rc.d couchdb defaults

Start CouchDB:

service couchdb start

Verify that CouchDB is running:

curl localhost:5984

You should see a response that starts with:

{"couchdb":"Welcome"...
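
If you want to script this check rather than read the response by eye, one possible approach is to look for the welcome field in the response body. The helper name is_couchdb_welcome is an assumption made for this sketch:

```shell
# Hypothetical helper: reads an HTTP response body on stdin and succeeds
# only if it contains CouchDB's welcome field.
is_couchdb_welcome() {
    grep -q '"couchdb"[[:space:]]*:[[:space:]]*"Welcome"'
}

# Usage:
#   curl -s localhost:5984 | is_couchdb_welcome && echo "CouchDB is up"
```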

Installing ElasticSearch


Initial Setup


Install the headless OpenJDK runtime:

apt-get install openjdk-7-jre-headless

Get the latest version of ElasticSearch:

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.8.deb

Install the package:

dpkg -i elasticsearch-0.90.8.deb

Before continuing, configure Elasticsearch so that it is not accessible from the public Internet. Elasticsearch has no built-in security and can be controlled by anyone who can access the HTTP API. This is done by editing elasticsearch.yml. Assuming you installed from the package, open the configuration file with this command:

sudo vi /etc/elasticsearch/elasticsearch.yml

Find the line that specifies network.bind_host, uncomment it, and change the value to localhost so that it looks like the following:

network.bind_host: localhost

Then insert the following line somewhere in the file, to disable dynamic scripts:

script.disable_dynamic: true

Save and exit. Now restart Elasticsearch to put the changes into effect:

sudo service elasticsearch restart
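
Because an exposed Elasticsearch instance is easy to miss, it may be worth confirming that both settings are present as active, uncommented lines. The following is a rough sketch; es_config_hardened is a hypothetical helper name, and it only inspects the file, not the running service:

```shell
# Hypothetical helper: succeeds when the YAML file given as $1 contains
# both settings from this section as active (uncommented) lines.
es_config_hardened() {
    grep -Eq '^[[:space:]]*network\.bind_host:[[:space:]]*localhost[[:space:]]*$' "$1" &&
    grep -Eq '^[[:space:]]*script\.disable_dynamic:[[:space:]]*true[[:space:]]*$' "$1"
}

# Usage:
#   es_config_hardened /etc/elasticsearch/elasticsearch.yml \
#       || echo "elasticsearch.yml is missing a setting"
```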

Verify that ElasticSearch is running. If the request fails the first time, try again; it can take a moment to start:

curl http://127.0.0.1:9200

You should see a response that starts with:

{ "ok" : true, "status" : 200, 
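
Since the service can take a few seconds to come up, retrying by hand can be replaced with a small polling loop. This is only a sketch: wait_for_http is a hypothetical helper, and the default of 10 attempts with a one-second pause is an arbitrary assumption:

```shell
# Hypothetical helper: polls the URL in $1 until it answers successfully,
# trying at most $2 times (default 10) with a one-second pause in between.
wait_for_http() {
    url="$1"
    attempts="${2:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl return a non-zero status on HTTP errors
        if curl -fsS "$url" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage:
#   wait_for_http http://127.0.0.1:9200 30 && echo "ElasticSearch is up"
```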

Change Where ElasticSearch Stores Indices


Stop ElasticSearch:

/etc/init.d/elasticsearch stop

Create the new directory:

mkdir -p /var/data/elasticsearch

Change ownership of the directory to the ‘elasticsearch’ user:

chown elasticsearch /var/data/elasticsearch

Change the group:

chgrp elasticsearch /var/data/elasticsearch

Change the ElasticSearch configuration file to reflect the new data directory


Use nano to open the ElasticSearch configuration file:

nano /etc/default/elasticsearch

Change the line containing:

DATA_DIR=

to

DATA_DIR=/var/data/elasticsearch

Save and close the file.
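
The same change can be made without an editor. The sketch below assumes the file uses plain NAME=value lines, possibly commented out with a leading #; set_data_dir is a hypothetical helper name, and the directory path must not contain the characters | or &, which sed would interpret:

```shell
# Hypothetical helper: sets DATA_DIR in the defaults file $1 to the path $2.
# An existing (possibly commented-out) DATA_DIR line is replaced;
# otherwise a new line is appended.
set_data_dir() {
    conf_file="$1"
    data_dir="$2"
    tmp_file="$(mktemp)" || return 1
    if grep -q '^#\{0,1\}[[:space:]]*DATA_DIR=' "$conf_file"; then
        sed "s|^#\{0,1\}[[:space:]]*DATA_DIR=.*|DATA_DIR=$data_dir|" "$conf_file" > "$tmp_file"
    else
        cat "$conf_file" > "$tmp_file"
        printf 'DATA_DIR=%s\n' "$data_dir" >> "$tmp_file"
    fi
    # Write back with cat so the original file keeps its owner and mode
    cat "$tmp_file" > "$conf_file" && rm -f "$tmp_file"
}

# Usage:
#   set_data_dir /etc/default/elasticsearch /var/data/elasticsearch
```

Note that there is no space after the equals sign: the file is read as shell variable assignments, so a space there would leave DATA_DIR empty.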

Make the Two Work Together


Install the CouchDB River Plugin for ElasticSearch


Navigate to the ElasticSearch directory:

cd /usr/share/elasticsearch/

Install the plugin:

./bin/plugin -install elasticsearch/elasticsearch-river-couchdb/1.2.0

Start ElasticSearch Back Up


Start ElasticSearch:

/etc/init.d/elasticsearch start

Create the CouchDB Database and ElasticSearch Index


Put Some Stuff into CouchDB


Create the CouchDB database:

curl -X PUT http://127.0.0.1:5984/testdb

Create some test documents:

curl -X PUT 'http://127.0.0.1:5984/testdb/1' -d '{"name":"My Name 1"}' 
curl -X PUT 'http://127.0.0.1:5984/testdb/2' -d '{"name":"My Name 2"}' 
curl -X PUT 'http://127.0.0.1:5984/testdb/3' -d '{"name":"My Name 3"}' 
curl -X PUT 'http://127.0.0.1:5984/testdb/4' -d '{"name":"My Name 4"}'
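
For more than a handful of documents, the four commands above can be generated in a loop. This is a minimal sketch; create_test_docs is a hypothetical helper, and it assumes the numeric IDs and the "My Name N" naming pattern used above:

```shell
# Hypothetical helper: creates documents 1..$2 in the CouchDB database
# whose URL is $1, each with a "name" field of the form "My Name N".
create_test_docs() {
    db_url="$1"
    count="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        curl -s -X PUT "$db_url/$i" -d "{\"name\":\"My Name $i\"}"
        i=$((i + 1))
    done
}

# Usage:
#   create_test_docs http://127.0.0.1:5984/testdb 4
```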

Set Up ElasticSearch with the Database


Create the river, which tells ElasticSearch to pull changes from the CouchDB database into an index:

curl -X PUT '127.0.0.1:9200/_river/testdb/_meta' -d '{ "type" : "couchdb", "couchdb" : { "host" : "localhost", "port" : 5984, "db" : "testdb", "filter" : null }, "index" : { "index" : "testdb", "type" : "testdb", "bulk_size" : "100", "bulk_timeout" : "10ms" } }'
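
The one-line JSON in the command above is hard to read and to adapt to another database. One way to make it easier to edit is to generate it from a here-document. In this sketch, river_meta is a hypothetical helper, and it assumes the database name, index name, and type are all the same, as they are in the command above:

```shell
# Hypothetical helper: prints the river definition for the CouchDB
# database named in $1, using that name for the index and type as well.
river_meta() {
    db="$1"
    cat <<EOF
{
  "type": "couchdb",
  "couchdb": {
    "host": "localhost",
    "port": 5984,
    "db": "$db",
    "filter": null
  },
  "index": {
    "index": "$db",
    "type": "$db",
    "bulk_size": "100",
    "bulk_timeout": "10ms"
  }
}
EOF
}

# Usage (curl reads the request body from stdin with -d @-):
#   river_meta testdb | curl -X PUT '127.0.0.1:9200/_river/testdb/_meta' -d @-
```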

Test it!


Do a test query with ElasticSearch:

curl 'http://127.0.0.1:9200/testdb/testdb/_search?pretty=true'

You should see something similar to this:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "testdb",
      "_type" : "testdb",
      "_id" : "4",
      "_score" : 1.0, "_source" : {"_rev":"1-7e9376fc8bfa6b8c8788b0f408154584","_id":"4","name":"My Name 4"}
    }, {
      "_index" : "testdb",
      "_type" : "testdb",
      "_id" : "1",
      "_score" : 1.0, "_source" : {"_rev":"1-87386bd54c821354a93cf62add449d31","_id":"1","name":"My Name 1"}
    }, {
      "_index" : "testdb",
      "_type" : "testdb",
      "_id" : "2",
      "_score" : 1.0, "_source" : {"_rev":"1-194582c1e02d84ae36e59f568a459633","_id":"2","name":"My Name 2"}
    }, {
      "_index" : "testdb",
      "_type" : "testdb",
      "_id" : "3",
      "_score" : 1.0, "_source" : {"_rev":"1-62a53c50e7df02ec22973fc802fb9fc0","_id":"3","name":"My Name 3"}
    } ]
  }
}

Now, rather than being limited to using map/reduce or the _id of each document, you can run full-text queries on your data by using ElasticSearch.
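
As an illustration of such a query, the sketch below searches a single field using the q URI parameter. The helper name es_search is an assumption, as is the host and port, which are hard-coded to the values used throughout this tutorial:

```shell
# Hypothetical helper: runs a query-string search against index $1 and
# type $2, with the query in $3 (for example 'name:Name').
es_search() {
    index="$1"
    type="$2"
    query="$3"
    # -G sends the url-encoded data as query-string parameters on a GET
    curl -s -G "http://127.0.0.1:9200/$index/$type/_search" \
        --data-urlencode "q=$query" \
        --data-urlencode "pretty=true"
}

# Usage:
#   es_search testdb testdb 'name:Name'
```

Letting curl encode the parameters avoids having to escape spaces and quotes in the query by hand.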

Submitted by: Cooper Thompson (http://blog.opendev.io)

5 Comments

:( Hello!

It seems that this tutorial is not valid anymore or is incompatible with an Ubuntu 14.10 x64 droplet.

When I get to the ./configure step I get the following error:

Erlang version compatibility… configure: error: The installed Erlang version must be >= R14B (erts-5.8.1) and <R17 (erts-5.11)

Then, if I go on, at the make && make install step I get the following error:

make: *** No targets specified and no makefile found. Stop.

Does anybody know how to fix this, or is there maybe an updated version of this tutorial?

Thanks!

Great tutorial! Works like a charm on the droplet. BUT… I can’t work out how to make a query from JavaScript. I’ve tried the elasticsearch.js library and also a plain old HTTP request. That results in a cross-domain error. I don’t have a lot of knowledge about this, but after reading a bit on Wikipedia and elsewhere I found out that a different port on the same URL is considered a different domain. So I tried to allow requests in Nginx. Elasticsearch has CORS enabled by default and I’m still stuck. What am I missing?

I am new to this Elasticsearch thing, and sorry if the question seems stupid, but can you describe a scenario where this is useful? According to this http://db-engines.com/en/system/CouchDB%3BElasticsearch Elasticsearch can do almost the same as CouchDB. Thanks in advance! :)

This tutorial was really helpful. Works perfect! Thanks!

Awesome tutorial!!
