
How to Improve Website Performance With Caching in Rails


Ruby on Rails

By Ilya Bodrov


This tutorial is out of date and no longer maintained.

Introduction

Caching is an important yet often overlooked technique that can boost a website’s performance significantly. In short, caching means storing the results of a complex (or not so complex) computation in some storage and later returning them right away without re-computing everything. The word itself comes from French: “cacher” means “to hide”, and “cache-cache” is a game of hide-and-seek.

In this article we are going to discuss various types of caching in Ruby on Rails and see them in action:

Model caching
Fragment caching
Action caching
Page caching
HTTP caching

The source code for the article can be found on GitHub. So, shall we start?

Laying Foundations

I usually prefer to demonstrate all the concepts in practice, so go ahead and create a new Rails application without a testing suite:

rails new Booster -T

I will be using Rails 5.1 for this demo but most of the core concepts can be applied to earlier versions as well.

Some Configuration

Before proceeding to the main part we need to do some groundwork. First and foremost, let’s take a look at the config/environments/development.rb file, which holds the configuration for the development environment. We are interested in the following piece of code:

# config/environments/development.rb
# ...
  if Rails.root.join('tmp/caching-dev.txt').exist?
    config.action_controller.perform_caching = true

    config.cache_store = :memory_store
    config.public_file_server.headers = {
      'Cache-Control' => "public, max-age=#{2.days.seconds.to_i}"
    }
  else
    config.action_controller.perform_caching = false

    config.cache_store = :null_store
  end

You can see that if the tmp/caching-dev.txt file exists then caching is enabled (this does not apply to low-level caching though). By default, this file does not exist so we need to create it either manually or by running the following command:

rails dev:cache

Also note that cache_store is set to :memory_store, which is fine for small websites but not suitable for large applications. There are other cache store options that you may use, such as file storage or Memcached. More detailed information can be found in the official Rails guide.

Now that we have tackled the configuration, let’s also prepare the ground for our experiments by creating a new controller, view, model, and route.

Preparing the Application

Suppose we are creating an application to keep track of the employees. So, go ahead and create a new model called Employee with the following fields:

full_name (string)
email (string) — indexed field
role (string)
salary (integer)

rails g model Employee full_name email:string:index role salary:integer
rails db:migrate

I don’t want to populate all sample database records by hand, so let’s take advantage of the Faker gem and the seeds.rb file. This will greatly simplify things for us because Faker can generate sample data of different kinds: names, emails, numbers, addresses, and even Star Wars-related stuff. First, add the new gem to the Gemfile:

# Gemfile
# ...
group :development, :test do
  gem 'faker'
end

Run:

bundle install

And then modify db/seeds.rb file to create 100 employees:

# db/seeds.rb
100.times do
  Employee.create full_name: Faker::StarWars.character,
                  email: Faker::Internet.unique.email,
                  role: (rand > 0.5 ? 'manager' : 'employee'),
                  salary: Faker::Number.between(1, 5000)
end

Here we are using:

StarWars module to generate random names for our employees (which means that even Darth Vader himself may work for us!).
Internet module to generate emails. Note the unique method which should guarantee that the returned values do not repeat.
Number module to generate an integer between 1 and 5000.
As for the role, we have only two options so rand is used to pick one of them.

Now let’s populate the database:

rails db:seed

What I’d like to do next is create an EmployeesController with a sole index action that is going to fetch employees from the database based on some criteria:

# controllers/employees_controller.rb
class EmployeesController < ApplicationController
  def index
    @employees = Employee.by_role params[:role]
  end
end

We are using the by_role class method to request employees based on the value of the role GET param. Let’s code the method itself now (it is going to be defined as a scope):

# models/employee.rb
class Employee < ApplicationRecord
# ...
  VALID_ROLE = %w(manager employee)

  scope :by_role, ->(role) do
    if VALID_ROLE.include?(role)
      where role: role
    else
      all
    end
  end
end

The idea is simple: if the requested role is valid, then get all managers or non-managerial employees. Otherwise simply load all the people from the database.

Now create a view and a partial:

<!-- views/employees/index.html.erb -->
<h1>Our brilliant employees</h1>

<ul>
  <%= render @employees %>
</ul>

<!-- views/employees/_employee.html.erb -->
<li>
  <%= employee.full_name %> (<%= employee.role %>)<br>
  <%= mail_to employee.email %><br>
  <b>Salary</b>: <%= employee.salary %>
</li>

Lastly, add the route:

# config/routes.rb
Rails.application.routes.draw do
  resources :employees, only: [:index]
end

Now you may boot the server

rails s

And navigate to http://localhost:3000/employees to make sure everything is working fine.

Setting the role GET param to either manager or employee (for example, ?role=manager) should limit the scope of the query.

Low-Level Caching

The first type of caching I’d like to discuss is called low level or model caching. In my opinion, this is the simplest type of caching that still can (when used properly) boost the performance of some particular page quite significantly. The idea is that we cache the results of a complex query and return them without re-running the same query over and over again.

To some extent, Rails automatically uses this type of caching but only for the same queries performed in the same controller action. After the action finishes its job, the cached data are no longer available. So, in our case, we need more long-term storage. Luckily, Rails’ core already has all the necessary methods to read and write cached data. There are three main methods to do that:

Rails.cache.write 'some_key', 5 # => true
Rails.cache.read 'some_key' # => 5
Rails.cache.fetch('some_other_key') { 50 + 50 } #=> 100 (saved to cache)
Rails.cache.fetch('some_other_key') { 50 + 50 } #=> 100 (fetched from cache)

The read and write methods are pretty self-explanatory. write accepts a key and a value, whereas read accepts a key and returns the corresponding value (if any). The fetch method is a bit more complex:

It tries to find a value under the given key.
If it exists, the corresponding value is returned right away.
If it does not exist, the given block is evaluated. The returned value is then stored under the key and is also returned as a result of the method call.

This behavior is illustrated above. On the first call to fetch, some_other_key does not exist, so the 50 + 50 expression is evaluated and 100 is saved to the cache. On the subsequent call, some_other_key is already present and 100 is returned right away, without evaluating the block. We are going to use this approach to cache the queries.
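To make the hit and miss paths more tangible, here is a minimal plain-Ruby sketch of the same read/write/fetch contract. TinyCache is a hypothetical name I made up for illustration; it is not how Rails implements its cache stores, only a model of the behavior described above.

```ruby
# A tiny in-memory store that mimics the read/write/fetch behavior.
# TinyCache is a hypothetical class, not part of Rails.
class TinyCache
  def initialize
    @data = {}
  end

  def write(key, value)
    @data[key] = value
    true
  end

  def read(key)
    @data[key]
  end

  # Returns the stored value on a hit; on a miss, runs the block,
  # stores its result under the key, and returns that result.
  def fetch(key)
    return @data[key] if @data.key?(key)

    @data[key] = yield
  end
end

cache = TinyCache.new
evaluations = 0

first  = cache.fetch('some_other_key') { evaluations += 1; 50 + 50 }
second = cache.fetch('some_other_key') { evaluations += 1; 50 + 50 }

puts first        # computed by the block
puts second       # served from the store
puts evaluations  # how many times the block actually ran
```

The counter makes it easy to confirm that the block is evaluated only on the miss.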

Tweak the model like this:

# models/employee.rb
class Employee < ApplicationRecord
# ...
  scope :by_role, ->(role) do
    if VALID_ROLE.include?(role)
      Rails.cache.fetch("employees_#{role}") { puts 'evaluating...' ; where role: role }
    else
      Rails.cache.fetch('all_employees') { puts 'evaluating...' ; all }
    end
  end
end

If the requested role is valid, we need to generate the cache key on the fly — it is going to be either employees_manager or employees_employee. For all employees, the cache name is static.

Try reloading http://localhost:3000/employees a couple of times and watch the server console. You will note that on the first request the evaluating... string is printed out, whereas on the second request it is not, which means the cached result was used.

Nice! Of course, you will not see any major performance boost as you are working locally and don’t have any complex queries. But when the same technique is applied in the real world the benefits can be really significant.

But what if one of the records is modified later? Let’s try doing that by opening the console and changing an email for the first employee (which happens to be Mace Windu in my case):

rails c
mace = Employee.first
mace.email = 'macejedi@example.com'
mace.save

Now reload the page… and the old email is still displayed.

Well, this is explainable as we are not invalidating the cache anywhere. Let’s take care of that by using the after_commit callback. It is going to be run whenever we are committing something to the table:

# models/employee.rb
class Employee < ApplicationRecord
# ...
  after_commit :flush_cache!

  private

  def flush_cache!
    puts 'flushing the cache...'
    Rails.cache.delete 'all_employees'
    Rails.cache.delete "employees_#{role}"
  end
end

So, we are always deleting the all_employees cache as well as the data under the employees_manager or employees_employee key.
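The stale read and the effect of the flush can be reproduced without a database. The following is only a sketch under my own assumptions: STORE stands in for Rails.cache, Employee is a plain Struct rather than an ActiveRecord model, and emails_for is a hypothetical helper.

```ruby
# Hypothetical stand-ins: STORE plays the role of Rails.cache, and
# Employee is a plain Struct rather than an ActiveRecord model.
STORE = {}
Employee = Struct.new(:full_name, :email, :role)

EMPLOYEES = [
  Employee.new('Mace Windu', 'mace@example.com', 'manager'),
  Employee.new('Yoda', 'yoda@example.com', 'employee')
]

# Cached "query": the list of emails for one role, or for everyone.
def emails_for(role = nil)
  key = role ? "employees_#{role}" : 'all_employees'
  STORE[key] ||= EMPLOYEES.select { |e| role.nil? || e.role == role }.map(&:email)
end

# Mirrors the callback: drop the two keys that may contain the record.
def flush_cache!(employee)
  STORE.delete('all_employees')
  STORE.delete("employees_#{employee.role}")
end

emails_for('manager')              # warms the cache
mace = EMPLOYEES.first
mace.email = 'mwindu@example.com'

stale = emails_for('manager')      # still the old email
flush_cache!(mace)
fresh = emails_for('manager')      # recomputed after the flush

puts stale.inspect
puts fresh.inspect
```

One limitation worth keeping in mind: the key is built from the record’s current role, so if the role itself changes, the entry stored under the previous role is left untouched by this approach.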

Now let’s try changing the e-mail for Mace Windu again:

rails c
mace = Employee.first
mace.email = 'mwindu@example.com'
mace.save

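You should see the flushing the cache... message printed by the callback, which means that the cache was flushed. Note that :memory_store keeps its data inside a single process, so the console and the server do not share a cache. After restarting the server and reloading the page, the e-mail should have the proper value.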

Note About Cache Keys

Under some circumstances, we can simplify the process of invalidating the cache by employing special cache keys. Let’s add a new instance method that applies some random tax to the employee’s salary and returns the result. In this example I’m going to introduce a new method cache_key:

# models/employee.rb
class Employee < ApplicationRecord
# ...
  def final_salary
    Rails.cache.fetch("#{cache_key}/tax") { puts 'calculating tax...' ; salary - salary * 0.13 }
  end

  private
  # ... private methods here
end

The cache_key method generates a unique key based on the record’s id and updated_at attributes (though you may specify other attributes as well):

Employee.first.cache_key #=> employees/1-20171220151513642857

It means that whenever the record is updated, the updated_at column changes and so the cache is invalidated automatically.
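Here is a small plain-Ruby sketch of why a timestamped key makes explicit invalidation unnecessary. Record and the lambda are hypothetical stand-ins, and the key format is only modeled on the example output above.

```ruby
# Hypothetical stand-in for a model; only id, updated_at and salary matter.
Record = Struct.new(:id, :updated_at, :salary) do
  # Roughly the shape of the key shown above: name/id-timestamp.
  def cache_key
    "employees/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S%6N')}"
  end
end

store = {}
calculations = 0

# Computes the value on a miss and remembers it under "<cache_key>/tax".
final_salary = lambda do |record|
  store.fetch("#{record.cache_key}/tax") do |key|
    calculations += 1
    store[key] = record.salary - record.salary * 0.13
  end
end

employee = Record.new(1, Time.utc(2017, 12, 20, 15, 15, 13, 642_857), 1000)
old_key = employee.cache_key
final_salary.call(employee)            # computed
final_salary.call(employee)            # served from the store

# "Updating" the record bumps the timestamp, which changes the key.
employee.salary = 2000
employee.updated_at = Time.utc(2017, 12, 21, 9, 0, 0)
new_key = employee.cache_key
updated = final_salary.call(employee)  # computed again under the new key

puts old_key
puts new_key
puts calculations
```

Note that the old entry is never deleted here; it simply stops being read. In practice this relies on the cache store evicting or expiring unused entries over time.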

Fragment Caching

The next type of caching we are going to discuss is called fragment caching, which means that only part of the page is cached. It is a popular type of caching that may come in really handy. Using it can be as simple as wrapping your code with the cache method:

<!-- views/employees/_employee.html.erb -->
<% cache employee do %>
  <li>
    <%= employee.full_name %> (<%= employee.role %>)<br>
    <%= mail_to employee.email %><br>
    <b>Salary</b>: <%= employee.salary %><br>
    <b>Final salary</b>: <%= employee.final_salary %>
  </li>
<% end %>

This is going to cache all the markup produced inside the block. As for the key, it will be generated using the cache_key method called on the employee object. Interestingly, the cache is invalidated when the record changes or when the markup itself changes. This is possible because the resulting cache key also contains a digest of the markup. Note that you may clear the cache manually with the help of the expire_fragment method.
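To illustrate why both kinds of change lead to a cache miss, here is a simplified sketch. fragment_key is a hypothetical helper and the key layout is my own invention; the real key format used by Rails is different, but the principle of combining a template digest with the record’s key is the same.

```ruby
require 'digest/md5'

# Hypothetical helper: combines a digest of the template source with the
# record's own key. This is NOT the actual Rails key format.
def fragment_key(template_source, record_key)
  "views/#{Digest::MD5.hexdigest(template_source)}/#{record_key}"
end

template_v1 = '<li><%= employee.full_name %></li>'
template_v2 = '<li><strong><%= employee.full_name %></strong></li>'

key_a = fragment_key(template_v1, 'employees/1-20171220151513642857')
key_b = fragment_key(template_v2, 'employees/1-20171220151513642857') # markup changed
key_c = fragment_key(template_v1, 'employees/1-20171221090000000000') # record changed

puts key_a == key_b  # different template, different key
puts key_a == key_c  # different record version, different key
```

Since a changed key can never match an existing entry, neither case requires deleting anything by hand.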

If, for some reason, you’d like to customize the key’s name, simply pass it as the first argument:

<!-- views/employees/_employee.html.erb -->
<% cache "employee_#{employee.email}" do %>
<!-- ... other content here -->
<% end %>

Also, you may take advantage of the cache_if and cache_unless methods that accept the condition as the first argument. They may come in handy if the caching should occur only under specific circumstances.

The above code can be simplified even further by setting the cached option to true when rendering the collection:

<!-- views/employees/index.html.erb -->
<ul>
  <%= render @employees, cached: true %>
</ul>

Having this in place, the cache method can be removed from the partial.

Interestingly, fragment caches can be nested (this is called Russian doll caching). It means that the following code is totally valid:

<% cache @employees do %>
  <ul>
    <%= render @employees, cached: true %>
  </ul>
<% end %>
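The benefit of nesting is easier to see with counters. Below is a plain-Ruby sketch under my own assumptions: Item and Renderer are hypothetical, and a version number plays the role of a changing updated_at.

```ruby
# Hypothetical stand-ins: Item plays the role of a record, and the two
# counters show how often each level is actually rendered.
Item = Struct.new(:id, :name, :version) do
  def cache_key
    "items/#{id}-#{version}"
  end
end

class Renderer
  attr_reader :inner_renders, :outer_renders

  def initialize
    @store = {}
    @inner_renders = 0
    @outer_renders = 0
  end

  # Outer fragment: keyed by every item's key, so any change misses it.
  def render_list(items)
    outer_key = "list/#{items.map(&:cache_key).join(',')}"
    @store[outer_key] ||= begin
      @outer_renders += 1
      "<ul>#{items.map { |item| render_item(item) }.join}</ul>"
    end
  end

  # Inner fragment: keyed by a single item.
  def render_item(item)
    @store[item.cache_key] ||= begin
      @inner_renders += 1
      "<li>#{item.name}</li>"
    end
  end
end

items = [Item.new(1, 'Mace', 1), Item.new(2, 'Yoda', 1), Item.new(3, 'Leia', 1)]
view = Renderer.new

view.render_list(items)  # cold: one outer render, three inner renders
view.render_list(items)  # warm: nothing is rendered again

items[1].name = 'Master Yoda'
items[1].version += 1    # plays the role of a changed updated_at
html = view.render_list(items)  # outer rebuilt, only the changed item re-rendered

puts html
puts view.outer_renders
puts view.inner_renders
```

When one record changes, the outer fragment has to be rebuilt, but the untouched inner fragments are reused.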

Action Caching

While fragment caching works with individual parts of the page, action caching is used to cache the page as a whole. This may offer a really nice performance boost, because the HTML markup is sent to the client nearly right away, without the need to interpret all the embedded Ruby code. Why nearly? Well, because prior to that all before_action callbacks are run as usual. This is really convenient in situations when you first need to perform authentication or authorization.

Let’s see action caching in action (duh!). Since Rails 4 this functionality has been extracted from the framework’s core into a separate gem, so add it now:

# Gemfile
# ...
gem 'actionpack-action_caching'

Then install it:

bundle install

Now suppose we have a ridiculously simple authentication system that simply checks whether the admin GET param is set or not. This stunt was performed by a professional so never try doing it yourself at home (that is, for real-world apps):

# controllers/employees_controller.rb
class EmployeesController < ApplicationController
  before_action :stupid_authentication!, only: [:index]

  # your actions here...

  private

  def stupid_authentication!
    redirect_to '/404' and return unless params[:admin].present?
  end
end

Next let’s do some caching which is as simple as adding the caches_action method:

# controllers/employees_controller.rb
class EmployeesController < ApplicationController
  caches_action :index

  def index
    @employees = Employee.by_role params[:role]
  end

  # ... private methods here
end

Now simply try navigating to http://localhost:3000/employees with and without the admin GET param. You will note that on every request the before_action runs and only after that is the cached version of the page fetched. This is possible because under the hood the gem uses around filters and fragment caching, with the requested path set as the cache key.
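The order of operations can be modeled in a few lines of plain Ruby. This is a hypothetical simulation (TinyActionCache is not part of the gem), meant only to show that the filter runs on every request while the action body runs once per path.

```ruby
# Hypothetical simulation of the request flow: the filter always runs,
# and only then is the body looked up by request path.
class TinyActionCache
  attr_reader :filter_runs, :action_runs

  def initialize
    @store = {}
    @filter_runs = 0
    @action_runs = 0
  end

  def handle(path, params)
    # The "before_action": executed for every request.
    @filter_runs += 1
    return [302, 'redirect to /404'] if params[:admin].to_s.empty?

    # The "action": executed only when the path is not cached yet.
    body = @store[path] ||= begin
      @action_runs += 1
      '<h1>Our brilliant employees</h1>'
    end
    [200, body]
  end
end

app = TinyActionCache.new
first  = app.handle('/employees', { admin: 'true' })
second = app.handle('/employees', { admin: 'true' })
denied = app.handle('/employees', {})

puts first.inspect
puts denied.inspect
puts app.filter_runs
puts app.action_runs
```

Even with a warm cache, the unauthenticated request is still turned away, which is the whole point of keeping the filters in front of the cache.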

caches_action accepts a bunch of options including :if, :unless, :expires_in, and :cache_path (to rename the cache key). You may find examples of using these options on the gem’s official page.

The cached actions can be flushed by utilizing the expire_action method:

expire_action controller: "employees", action: "index"

Page Caching

Page caching is very similar to action caching, but it does not run any before_action callbacks and instead caches the page in full. Subsequent requests are not processed by the Rails stack at all, and a static page is served instead. This type of caching can be rocket-fast, but of course, it has somewhat limited usage. In many modern applications, visitors should be treated differently and so page caching is not really suitable. Still, it shines on wiki resources and blogs.
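Conceptually, the mechanism boils down to writing the rendered HTML to disk once and answering later requests from that file. The sketch below is a hypothetical plain-Ruby simulation (TinyPageCache is my own name); in a real setup it is the web server, not Ruby code, that serves the static file.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical simulation: the first request runs the "app" and writes a
# static file; later requests are answered from disk without touching it.
class TinyPageCache
  attr_reader :app_calls

  def initialize(cache_dir)
    @cache_dir = cache_dir
    @app_calls = 0
  end

  def get(path)
    file = File.join(@cache_dir, "#{path}.html")
    # In production this branch is handled by the web server itself.
    return File.read(file) if File.exist?(file)

    @app_calls += 1
    html = '<h1>Some info</h1>'
    FileUtils.mkdir_p(File.dirname(file))
    File.write(file, html)
    html
  end
end

first = second = app_calls = nil

Dir.mktmpdir do |dir|
  site = TinyPageCache.new(dir)
  first  = site.get('employees/info')   # rendered and written to disk
  second = site.get('employees/info')   # read back from the static file
  app_calls = site.app_calls
end

puts first
puts app_calls
```

Since the application is bypassed entirely on a hit, nothing like authentication can happen there, which is why this type of caching suits only pages that look the same for everyone.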

Page caching was also extracted as a separate gem so let’s add it now:

# Gemfile
# ...
gem "actionpack-page_caching"

Install it:

bundle install

Now we need to do some configuration:

# config/environments/development.rb
# ...
config.action_controller.page_cache_directory = "#{Rails.root}/public/cached_pages"

The pages will be cached as static HTML, so the public directory is the best place to store them. This setting can be overridden for individual controllers:

class SomeController < ApplicationController
  self.page_cache_directory = -> { Rails.root.join("public", request.domain) }
end

Now let’s add a non-standard action that will render an informational page with some statistics (you may use model caching here as well):

# controllers/employees_controller.rb
class EmployeesController < ApplicationController
  # ... other actions here
  def info
    @employees = Employee.all
  end

  private

  # ... private methods here
end

Add a route:

# config/routes.rb
Rails.application.routes.draw do
  resources :employees, only: [:index] do
    collection do
      get 'info'
    end
  end
end

Create a view:

<!-- views/employees/info.html.erb -->

<h1>Some info</h1>
<p>Greetings! We have the best team in the world.</p>
<p>There are <strong><%= @employees.count %></strong> employees in total.</p>
<p>There are <strong><%= @employees.where(role: 'manager').count %></strong> managerial employees.</p>
<p>There are <strong><%= @employees.where(role: 'employee').count %></strong> non-managerial employees.</p>

I’ve added some queries right to the view for simplicity though, of course, that’s not the best practice.

Now let’s cache this page by adding the following code:

# controllers/employees_controller.rb
class EmployeesController < ApplicationController
  caches_page :info
  # actions here...
end

This is pretty much it! Now the public/cached_pages/employees directory should have an info.html file with the static HTML markup. You may find sample configuration for Apache and Nginx to serve the cached pages in the docs. Of course, it is also important to have a fast and reliable hosting provider, because it plays the main role in serving responses to your users.

In order to expire the cache, utilize the expire_page method:

expire_page controller: 'employees', action: 'info'

You may also utilize the rails-observers gem to set up a cache sweeper.

HTTP Caching

HTTP caching is yet another common type of caching that relies on HTTP headers, specifically If-None-Match and If-Modified-Since (exposed to Rack applications as HTTP_IF_NONE_MATCH and HTTP_IF_MODIFIED_SINCE). Using these headers, the client’s browser can check when the page was last modified and what the value of its unique identifier (called an ETag) is.

The idea is not that complex. On the first request, the browser records the ETag of the page and caches it to the disk. On subsequent requests, the browser sends the recorded ETag to the server. If the tag returned by the server and the one sent by the client do not match, it means that the page was modified and, obviously, should be requested again. If the ETags are the same, a 304 status code (“Not modified”) is returned and the browser can display the page from the cache.
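The exchange described above can be sketched in plain Ruby. The respond method is hypothetical, and deriving the ETag from an MD5 digest of the body is my assumption for the sake of the example, not a claim about how Rails computes it.

```ruby
require 'digest/md5'

# Hypothetical server-side handler: returns [status, headers, body].
def respond(body, request_headers)
  etag = %("#{Digest::MD5.hexdigest(body)}")

  if request_headers['If-None-Match'] == etag
    [304, { 'ETag' => etag }, '']     # the client's copy is still valid
  else
    [200, { 'ETag' => etag }, body]   # send the full page
  end
end

page = '<h1>Our brilliant employees</h1>'

# First visit: no ETag is known yet, so the full page is returned.
status1, headers1, _body1 = respond(page, {})

# Repeat visit: the browser sends back the ETag it recorded.
status2, _headers2, body2 = respond(page, { 'If-None-Match' => headers1['ETag'] })

# The page changed on the server, so the recorded ETag no longer matches.
status3, _headers3, _body3 = respond(page + '<p>New hire!</p>',
                                     { 'If-None-Match' => headers1['ETag'] })

puts status1
puts status2
puts status3
```

The 304 response carries no body, which is where the savings come from: the browser reuses the copy it already has.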

One thing to note is that there are two types of ETags: strong and weak. Weak tags are prefixed with the W/ part and they allow the page to have some minor changes. Strong tags require the response to be completely identical, otherwise, the page is downloaded again.
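The difference between the two kinds of tags shows up in how they are compared. Here is a small sketch of the strong and weak comparison rules as I understand them from the HTTP specification (RFC 7232, section 2.3.2); the helper names are hypothetical.

```ruby
# Hypothetical helpers modeling the two comparison functions.
def weak?(etag)
  etag.start_with?('W/')
end

# The part of the tag without the weakness marker.
def opaque(etag)
  etag.sub(/\AW\//, '')
end

# Strong comparison: neither tag is weak and they match exactly.
def strong_match?(a, b)
  !weak?(a) && !weak?(b) && a == b
end

# Weak comparison: the opaque parts match, whether the tags are weak or not.
def weak_match?(a, b)
  opaque(a) == opaque(b)
end

puts strong_match?('"abc"', '"abc"')      # both strong and identical
puts strong_match?('W/"abc"', 'W/"abc"')  # weak tags never match strongly
puts weak_match?('W/"abc"', '"abc"')      # the marker is ignored
puts weak_match?('W/"abc"', 'W/"xyz"')    # different opaque parts
```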

To enable support for HTTP caching in Rails, you may utilize one of two methods: stale? and fresh_when. stale? returns true or false, so it is typically used as a condition around the rendering code (which is useful in conjunction with respond_to). fresh_when is simpler and can be utilized when you don’t need to respond with various formats.

So, let’s now set the information about the last page’s update and provide its ETag:

# controllers/employees_controller.rb
class EmployeesController < ApplicationController
# ...
  def index
    @employees = Employee.by_role params[:role]
    fresh_when etag: @employees, last_modified: @employees.first.updated_at
  end
  # other actions here...
end

Also, tweak our scope by introducing a custom ordering so that the newly edited records appear first:

# models/employee.rb
class Employee < ApplicationRecord
  scope :by_role, ->(role) do
    if VALID_ROLE.include?(role)
      Rails.cache.fetch("employees_#{role}") { where(role: role).order(updated_at: :desc) }
    else
      Rails.cache.fetch('all_employees') { all.order(updated_at: :desc) }
    end
  end
end

Now disable action caching for the index action, reload the page, and open the Network tab in your browser’s developer tools (which can usually be opened by pressing F12). The response headers for the page should now include ETag and Last-Modified.

In the server console you will note that on repeated requests the status code is 304, not 200 as usual.

It means that both the ETag and Last-Modified headers are sent properly. Note that by default Rails generates a weak ETag. You may set a strong one instead by specifying the :strong_etag option.

Both stale? and fresh_when may also accept an ActiveRecord object. In this case, all the necessary options will be set automatically:

fresh_when @employee

Conclusion

In this article we have discussed ways to boost our web application’s performance by implementing various types of caching:

Low-level (model) to cache complex queries
Fragment to cache parts of the page
Action to cache the whole page while still allowing before filters to be executed as usual
Page caching to directly respond with static HTML
HTTP caching to allow the browser to cache the response based on Last-Modified and ETag headers

Hopefully, you are now ready to put these techniques into practice! Having a fast website is really important because users, as you probably know, don’t like to wait, so it’s our job to please them. On the other hand, premature optimization is often evil, so introducing every type of caching on each page is generally a bad idea. Assess the site’s performance, detect the bottlenecks, and only then decide which type of caching may solve the problem.

I wish you good luck and see you soon!
