How To Install and Use Beanstalkd Work Queue on a VPS
Clearly defining the responsibilities of each element of an application deployment stack brings many benefits, including simpler diagnosis of problems when they occur, the capacity to scale rapidly, and a clearer scope of management for the components involved.
In today's world of web services engineering, messaging and work (or task) queues are a key component for achieving this. These applications are usually resilient, flexible, and easy to set up. They are well suited to splitting the business logic between different parts of your application bundle in production.
In this DigitalOcean article, continuing our series on application level communication solutions, we will be looking at Beanstalkd to create this separation of pieces.
Beanstalkd was first developed to solve the needs of a popular web application (Causes on Facebook). Today, it is a reliable, easy-to-install messaging service that is well suited to getting started with work queues.
As mentioned earlier, Beanstalkd's main use case is to manage the workflow between different parts and workers of your application deployment stack through work queues and messages, similar to other popular solutions such as RabbitMQ. However, the way Beanstalkd is created to work sets it apart from the rest.
Since its inception, unlike other solutions, Beanstalkd was intended to be a work queue and not an umbrella tool to cover many needs. To achieve this purpose, it was built as a lightweight and fast application written in the C programming language. Its lean architecture also allows it to be installed and used very simply, making it a good fit for the majority of use cases.
Being able to monitor a job through the ID returned when it is created is only one of the features that set Beanstalkd apart from the rest. Some other interesting features offered are:
Persistence - Beanstalkd operates in-memory but offers persistence support as well.
Prioritisation - unlike most alternatives, Beanstalkd lets you assign priorities to tasks so that urgent ones are handled first.
Distribution - different server instances can be distributed similarly to how Memcached works.
Burying - it is possible to indefinitely postpone a job (i.e. a task) by burying it.
Third party tools - Beanstalkd comes with a variety of third-party tools including CLIs and web-based management consoles.
Expiry - each job has a time limit for processing (TTR - Time To Run); if a worker does not finish within it, the job expires and is automatically queued again.
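The prioritisation and burying features listed above can be pictured with a small in-memory model. This is only a sketch in plain Python, not how Beanstalkd is implemented: the class `ToyTube`, its methods, and the default priority of 1024 are hypothetical names and values chosen for illustration. It does assume Beanstalkd's convention that a smaller priority number means a more urgent job.

```python
import heapq
import itertools


class ToyTube:
    """Hypothetical in-memory model of one tube (illustration only)."""

    def __init__(self):
        self._ready = []                 # heap of (priority, job_id, body)
        self._buried = {}                # job_id -> (priority, body)
        self._ids = itertools.count(1)   # job IDs handed back to producers

    def put(self, body, priority=1024):
        """Queue a job and return its ID; a smaller priority is more urgent."""
        job_id = next(self._ids)
        heapq.heappush(self._ready, (priority, job_id, body))
        return job_id

    def reserve(self):
        """Pop the most urgent ready job; ties go to the oldest job."""
        priority, job_id, body = heapq.heappop(self._ready)
        return job_id, priority, body

    def bury(self, job_id, priority, body):
        """Set a reserved job aside until it is explicitly kicked."""
        self._buried[job_id] = (priority, body)

    def kick(self, job_id):
        """Move a buried job back to the ready heap."""
        priority, body = self._buried.pop(job_id)
        heapq.heappush(self._ready, (priority, job_id, body))


tube = ToyTube()
tube.put("send newsletter", priority=2000)
urgent_id = tube.put("reset password e-mail", priority=10)

first = tube.reserve()      # the urgent job comes out first
tube.bury(*first)           # postpone it indefinitely
second = tube.reserve()     # now the newsletter job is served
tube.kick(urgent_id)        # bring the buried job back
third = tube.reserve()
print(first[2], "|", second[2], "|", third[2])
```

The job put second is served first because of its lower priority number, and once buried it stays out of the way until it is kicked back into the ready state.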
Beanstalkd Use-case Examples
Some example use-cases for Beanstalkd are:
Allowing web servers to respond to requests quickly instead of being forced to perform resource-heavy procedures on the spot
Performing certain jobs at certain intervals (e.g. crawling the web)
Distributing a job to multiple workers for processing
Letting offline clients (e.g. a disconnected user) fetch data through a worker at a later time instead of having it lost permanently
Introducing fully asynchronous functionality to the backend systems
Ordering and prioritising tasks
Balancing application load between different workers
Greatly increasing the reliability and uptime of your application
Processing CPU intensive jobs (videos, images etc.) later
Sending e-mails to your lists
Just like most applications, Beanstalkd comes with its own jargon to explain its parts.
Tubes / Queues
Beanstalkd tubes correspond to queues in other messaging applications. Jobs (or messages) travel through them to reach consumers (i.e. workers).
Jobs / Messages
Since Beanstalkd is a "work queue", what is transferred through tubes is referred to as jobs, which are similar to messages being sent.
Producers / Senders
Producers, similar to the Advanced Message Queuing Protocol's definition, are applications which create and send a job (or a message). These jobs are then picked up by consumers.
Consumers / Receivers
Consumers (or receivers) are the applications in the stack which take a job created by a producer out of a tube and process it.
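To make the vocabulary concrete, here is a minimal sketch of a producer and a consumer exchanging jobs through a named tube. It runs without a Beanstalkd server: the `tubes` dictionary, the tube name `emails`, and both functions are hypothetical stand-ins built on Python's standard `queue` module.

```python
import queue
import threading

# Hypothetical stand-in for a Beanstalkd server: tube name -> FIFO of job bodies.
tubes = {"emails": queue.Queue()}
processed = []


def producer(tube_name, bodies):
    """Create jobs and place them in the named tube."""
    for body in bodies:
        tubes[tube_name].put(body)


def consumer(tube_name, expected_jobs):
    """Take jobs out of the named tube and process them."""
    for _ in range(expected_jobs):
        body = tubes[tube_name].get(timeout=5)
        processed.append(body.upper())   # placeholder for real work


# The consumer runs in its own thread, independent of the producer.
worker = threading.Thread(target=consumer, args=("emails", 2))
worker.start()
producer("emails", ["welcome alice", "welcome bob"])
worker.join()
print(processed)  # ['WELCOME ALICE', 'WELCOME BOB']
```

The producer never calls the consumer directly. Both sides only know the tube name, which is the separation that a work queue provides.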
Installing Beanstalkd on Ubuntu 13
It is possible to obtain Beanstalkd very simply through the package manager aptitude and get started. However, in a few commands, you can also download and install it from source.
Note: We will perform the installation and the actions listed here on a fresh, newly created droplet. If you are actively serving clients and might have modified your system, you are strongly advised to try the following instructions on a new system first, so as not to break anything that is working or run into issues.
Installing Using aptitude
Run the following command to download and install Beanstalkd:
aptitude install -y beanstalkd
Edit the default configuration using nano so that Beanstalkd launches at system boot:
nano /etc/default/beanstalkd
After opening the file, scroll down to the bottom and find the line #START=yes. Change it to:
START=yes
Press CTRL+X and confirm with Y to save and exit.
To start using the application, please skip to the next section or follow along to see how to install Beanstalkd from source.
Installing from Source
We are going to need a key tool for the installation process from source - Git.
Run the following to get Git on your droplet:
aptitude install -y git
Download the essential development tools package:
aptitude install -y build-essential
Using Git, let's clone (download) the official repository:
git clone https://github.com/kr/beanstalkd
Enter the downloaded directory:
cd beanstalkd
Build and install the application from source:
make
make install
Upon installing, you can start working with the Beanstalkd server. Here are the options for running the daemon:
-b DIR     wal directory
-f MS      fsync at most once every MS milliseconds (use -f0 for "always fsync")
-F         never fsync (default)
-l ADDR    listen on address (default is 0.0.0.0)
-p PORT    listen on port (default is 11300)
-u USER    become user and group
-z BYTES   set the maximum job size in bytes (default is 65535)
-s BYTES   set the size of each wal file (default is 10485760)
           (will be rounded up to a multiple of 512 bytes)
-c         compact the binlog (default)
-n         do not compact the binlog
-v         show version information
-V         increase verbosity
-h         show this help
# Usage: beanstalkd -l [ip address] -p [port #]
# For local only access:
beanstalkd -l 127.0.0.1 -p 11301 &
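If you start the daemon from a script instead of typing the command by hand, the options above can be assembled programmatically. The helper below is a hypothetical sketch: the function name, its parameters, and the `/var/lib/beanstalkd` directory are assumptions for illustration, and only the `-l`, `-p`, `-b` and `-z` flags listed above are used.

```python
import shlex


def beanstalkd_command(listen="127.0.0.1", port=11300, wal_dir=None, max_job_bytes=None):
    """Build an argument list for the beanstalkd binary from the options above."""
    argv = ["beanstalkd", "-l", listen, "-p", str(port)]
    if wal_dir is not None:
        argv += ["-b", wal_dir]              # wal directory (persistence)
    if max_job_bytes is not None:
        argv += ["-z", str(max_job_bytes)]   # maximum job size in bytes
    return argv


# The wal directory here is an assumed path; use one that exists on your system.
argv = beanstalkd_command(port=11301, wal_dir="/var/lib/beanstalkd")
print(" ".join(shlex.quote(arg) for arg in argv))
# beanstalkd -l 127.0.0.1 -p 11301 -b /var/lib/beanstalkd
```

The resulting list can be handed to `subprocess.Popen(argv)` to launch the daemon in the background, which avoids shell quoting issues.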
Managing The Service:
If installed through the package manager (i.e. aptitude), you will be able to manage the Beanstalkd daemon as a service.
# To start the service:
service beanstalkd start
# To stop the service:
service beanstalkd stop
# To restart the service:
service beanstalkd restart
# To check the status:
service beanstalkd status
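After starting the daemon, it can be useful to confirm that it is accepting connections before pointing workers at it. The following is a minimal sketch using only Python's standard library. The function name `is_listening` is hypothetical, and the host and port are assumptions that should match how you started Beanstalkd (11300 is the default port listed earlier).

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# Adjust host and port to match your own setup.
print(is_listening("127.0.0.1", 11300))
```

This only checks that something is listening on the port. It does not verify that the listener is actually Beanstalkd.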
Obtaining Beanstalkd Client Libraries
Beanstalkd has a long list of supported client libraries for working with many different application deployments, covering a range of languages and frameworks.
For the full list of supported languages, and installation instructions for your favourite, check out the client libraries page on GitHub for Beanstalkd.
Working with Beanstalkd
In this section - before completing the article - let's quickly go over basic usage of Beanstalkd. In our examples, we will be working with the Python language and Beanstalkd's Python bindings - beanstalkc.
To install beanstalkc, run the following commands:
pip install pyyaml
pip install beanstalkc
In all your Python files in which you are thinking of working with Beanstalkd, you need to import beanstalkc and connect:
import beanstalkc

# Connection
beanstalk = beanstalkc.Connection(host='localhost', port=11301)
To enqueue a job:
beanstalk.put('job_one')
To receive a job:
job = beanstalk.reserve()
# job.body == 'job_one'
To delete a job after processing it:
job.delete()
To use a specific tube (i.e. queue / list):
beanstalk.use('tube_a')
To list all available tubes:
beanstalk.tubes()
# ['default', 'tube_a']
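Client libraries such as beanstalkc hide the fact that Beanstalkd speaks a simple text protocol over TCP. As a rough sketch of what happens when a job is enqueued, the helpers below encode a `put` command and read the job ID out of an `INSERTED` reply. The function names and the default priority and TTR values are illustrative assumptions, and nothing is sent over the network here.

```python
def encode_put(body, priority=2**31, delay=0, ttr=120):
    """Encode a `put` command: a header line, then the body, each ending in CRLF."""
    data = body.encode("utf-8")
    # Header fields: priority, delay in seconds, time to run in seconds, body size in bytes.
    header = "put {} {} {} {}\r\n".format(priority, delay, ttr, len(data))
    return header.encode("ascii") + data + b"\r\n"


def parse_put_reply(line):
    """Return the job ID from an `INSERTED <id>` reply, or raise ValueError."""
    parts = line.decode("ascii").strip().split()
    if len(parts) == 2 and parts[0] == "INSERTED":
        return int(parts[1])
    raise ValueError("unexpected reply: {!r}".format(line))


print(encode_put("job_one"))                 # b'put 2147483648 0 120 7\r\njob_one\r\n'
print(parse_put_reply(b"INSERTED 42\r\n"))   # 42
```

The ID parsed from the reply is the same job ID mentioned earlier as a way to monitor a job after it has been created.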
Final example (saved as btc_ex.py):
import beanstalkc

# Connect
beanstalk = beanstalkc.Connection(host='localhost', port=11301)

# See all tubes:
beanstalk.tubes()

# Switch to the default (tube):
beanstalk.use('default')

# To enqueue a job:
beanstalk.put('job_one')

# To receive a job:
job = beanstalk.reserve()

# Work with the job:
print job.body

# Delete the job:
job.delete()
Press CTRL+X and confirm with Y to save and exit.
When you run the above script, you should see the job's body being printed:
python btc_ex.py
# job_one
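A real worker usually runs the reserve, process, delete cycle in a loop and needs a policy for jobs that fail. The sketch below shows one possible policy: delete on success, bury on failure. So that it runs without a server, `FakeConnection` and `FakeJob` are hypothetical stand-ins that mimic only the `reserve`, `body`, `delete` and `bury` parts of the beanstalkc interface; with a live server you would pass a `beanstalkc.Connection` instead.

```python
class FakeJob:
    """Hypothetical stand-in exposing body, delete() and bury() like a beanstalkc job."""

    def __init__(self, body):
        self.body = body
        self.state = "reserved"

    def delete(self):
        self.state = "deleted"

    def bury(self):
        self.state = "buried"


class FakeConnection:
    """Hypothetical stand-in for a connection; reserve() returns None when empty."""

    def __init__(self, bodies):
        self._jobs = [FakeJob(body) for body in bodies]

    def reserve(self, timeout=None):
        return self._jobs.pop(0) if self._jobs else None


def work(conn, handler):
    """Reserve jobs until none are left; delete on success, bury on failure."""
    finished = []
    while True:
        job = conn.reserve(timeout=0)
        if job is None:
            break
        try:
            handler(job.body)
            job.delete()
        except Exception:
            job.bury()   # keep the failing job for later inspection
        finished.append((job.body, job.state))
    return finished


def handler(body):
    """Example handler that fails on one specific job body."""
    if body == "bad_job":
        raise ValueError("cannot process " + body)


results = work(FakeConnection(["job_one", "bad_job"]), handler)
print(results)  # [('job_one', 'deleted'), ('bad_job', 'buried')]
```

Burying failed jobs instead of deleting them keeps them available for inspection, at the cost of having to kick or delete them manually later.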
To see more about beanstalkd (and beanstalkc) operations, check out its Getting Started tutorial.