Question

Navigating High Availability with Keepalived

  • Posted on September 16, 2023 • Last validated on September 16, 2023
  • Linux Basics
  • Asked by KFSys

What is Keepalived?

Keepalived is open-source software for Linux that provides high availability using the Virtual Router Redundancy Protocol (VRRP). Its primary use is to keep a service reachable by routing network traffic to a backup server if the primary server fails.



KFSys
Site Moderator
September 16, 2023
Accepted Answer

Part 1: Unpacking the Architecture and Theory

The Mechanics of VRRP:

The Virtual Router Redundancy Protocol (VRRP) is at the heart of keepalived. This protocol facilitates the creation of a virtual router, an abstracted set of machines that appear as a single entity to other network participants. This abstraction ensures uninterrupted service even if one of the participants (or nodes) becomes unavailable.

The VRRP setup includes:

  • Master: The primary node that handles traffic routed to the Virtual IP Address (VIP).
  • Backup(s): One or more nodes ready to take over should the Master fail.

Keepalived’s Dual Roles:

  1. High Availability: This is what keepalived is best known for. Nodes exchange VRRP advertisements, and when the Master stops advertising, keepalived moves the VIP from the failed Master to a Backup.

  2. Load Balancing: Via integration with the Linux Virtual Server (LVS), keepalived can also distribute inbound traffic to optimize resource utilization and maximize throughput.

Part 2: Installation and Basic Configuration of Keepalived for High Availability

Update Your System: Before installing any new software, it’s a good practice to update the system packages.

sudo apt update && sudo apt upgrade -y

Use the package manager to install keepalived.

sudo apt install keepalived -y

Now, let’s delve into the basic configuration.

Define the VRRP Instance: Let’s set up a basic VRRP instance. This example assumes you are setting up the master server. For backup servers, adjust the state and priority fields.

vrrp_instance VI_1 {
    interface eth0                 # Change to your active network interface, e.g., ens33
    state MASTER
    virtual_router_id 51          # A unique number [1-255] for this VRRP instance
    priority 100                  # 100 for master, 50 for backup
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass mysecretpass   # A password for authentication, should be the same on all servers
    }
    virtual_ipaddress {
        192.168.1.10             # The virtual IP address shared between master and backup
    }
}

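Save the configuration above to /etc/keepalived/keepalived.conf (the default path) using your preferred text editor, then start the keepalived service and enable it at boot:

sudo systemctl start keepalived
sudo systemctl enable keepalived

If you edit the configuration file after the service is running, apply the change with sudo systemctl restart keepalived.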

Outcome of the Configuration

  1. Establishment of a Virtual Router:

    • This configuration creates a virtual router through the VRRP instance named VI_1.
    • This “router” isn’t a physical device; instead, it’s an abstracted entity that’s represented by the machine running this keepalived configuration.
  2. Virtual IP Ownership:

    • The server with this configuration is set to act as the initial MASTER for the virtual IP 192.168.1.10.
    • As long as this server is healthy and operational, any network traffic directed towards 192.168.1.10 will be received by it.
  3. Automatic Failover:

    • If there are other servers in the network set as BACKUP with a similar keepalived configuration (and sharing the same virtual_router_id and authentication details), they’ll be in standby mode.
    • Should this server (initially set as MASTER) encounter an issue or go down, one of the BACKUP servers will detect the absence of the regular VRRP advertisements and promote itself to MASTER status, thereby taking over the virtual IP 192.168.1.10.
    • The decision of which backup server takes over is determined by the priority value. The backup server with the highest priority will become the new MASTER.
  4. Protected Communication:

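    • VRRP advertisements between the MASTER and BACKUP servers carry a simple shared password (auth_type PASS). The password is sent in plain text, so it protects against accidental misconfiguration (for example, an unrelated VRRP group on the same network segment) rather than against an attacker. Only servers configured with the same password (mysecretpass in this configuration) participate in the VRRP group for this virtual router. Note that keepalived uses only the first eight characters of auth_pass.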
  5. Regular Health Announcements:

    • The server, when acting as the MASTER, sends out health announcements or “advertisements” every second (advert_int 1). This lets all other participating servers know that it’s active and healthy.
    • These regular updates act as a heartbeat. If the backup servers stop receiving them, they’ll initiate a failover process.
  6. Network Presence:

    • The keepalived process will ensure that the virtual IP (192.168.1.10) is attached to the specified interface (eth0) whenever this server is in the MASTER state. If it transitions to BACKUP state (e.g., another server with a higher priority comes online), the virtual IP will be relinquished.
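
The heartbeat timing described in items 3 and 5 can be made concrete. Assuming VRRP version 2 (RFC 3768), a Backup declares the Master down after 3 * advert_int plus a skew of (256 - priority) / 256 seconds, so higher-priority Backups time out slightly sooner. The sketch below computes that interval; the function name master_down_interval is made up for illustration and is not part of keepalived.

```shell
# master_down_interval ADVERT_INT PRIORITY
# Prints how long (in seconds) a Backup waits without advertisements
# before promoting itself, per the VRRPv2 formula in RFC 3768:
#   3 * advert_int + (256 - priority) / 256
master_down_interval() {
    awk -v a="$1" -v p="$2" 'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

master_down_interval 1 50    # a Backup with priority 50: prints 3.805
```

With advert_int 1, failover therefore takes a little under four seconds for a Backup with priority 50.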

In essence, this configuration keeps traffic flowing to the virtual IP (192.168.1.10) as long as at least one participating server is healthy. This forms the crux of high availability setups: downtime is limited to the few seconds a backup needs to notice the missing advertisements and take over the address.

Further Configuration and Testing:

  • Setting Up a Backup Node: For the backup node, copy the same configuration, but change state to BACKUP and priority to a lower value, like 50.

  • Testing Failover: To test the failover mechanism, you can temporarily bring down the master node’s networking or stop its keepalived service. Monitor the backup server to see if it takes over the virtual IP.

  • Firewall Considerations: Ensure that VRRP (IP protocol number 112, sent by default to the multicast address 224.0.0.18) is allowed in your firewall settings on both master and backup servers. This is crucial for the servers to communicate their statuses.
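
The backup-node and failover steps above can be sketched in shell. The helper names make_backup_conf and holds_vip are hypothetical, the file names in the usage comments are placeholders, and the commands assume the eth0 interface and 192.168.1.10 VIP from the example configuration.

```shell
# make_backup_conf [PRIORITY]
# Reads a MASTER keepalived.conf on stdin and writes a BACKUP variant
# to stdout: "state MASTER" becomes "state BACKUP" and the priority
# value is replaced (default 50). Everything else is left untouched.
make_backup_conf() {
    awk -v prio="${1:-50}" '
        $1 == "state"    { sub(/MASTER/, "BACKUP") }
        $1 == "priority" { sub(/[0-9]+/, prio) }
        { print }
    '
}

# holds_vip INTERFACE ADDRESS
# Succeeds (exit 0) if ADDRESS is currently assigned to INTERFACE.
holds_vip() {
    ip -o -4 addr show dev "$1" | grep -qF "inet $2/"
}

# Example usage (not executed here):
#   make_backup_conf < keepalived-master.conf > keepalived-backup.conf
#   sudo systemctl stop keepalived            # on the master, to trigger failover
#   holds_vip eth0 192.168.1.10 && echo "this node holds the VIP"   # on the backup
#   sudo iptables -I INPUT -p 112 -j ACCEPT   # allow VRRP, if iptables is your firewall
```

Restart keepalived on the master afterwards; with the default preemption behavior it should reclaim the VIP because of its higher priority.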

KFSys
Site Moderator
September 16, 2023

Part 3: Dive into Configuration

Deeper into VRRP Configuration:

  1. SMTP Notifications: Integrate with an SMTP server to receive notifications regarding the state of the keepalived instance:
global_defs {
   notification_email {
       admin@example.com
   }
   notification_email_from notify@example.com
   smtp_server 192.168.1.1
   smtp_connect_timeout 30
}
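  2. Preempting Behavior: By default, when a failed Master recovers it reclaims the Master role (and hence the VIP) because of its higher priority. If this behavior is not desired, use the nopreempt directive. Note that nopreempt only takes effect when the instance's initial state is BACKUP, so configure state BACKUP on every node and let priority decide the first Master: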
vrrp_instance VI_1 {
   ...
   nopreempt
}

Advanced Load Balancing Configuration:

Defining Virtual Servers:

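virtual_server 192.168.1.10 80 {
   delay_loop 5                 # Seconds between health check rounds
   lb_algo rr                   # Scheduling algorithm: round-robin
   lb_kind DR                   # LVS forwarding method: Direct Routing
   protocol TCP

   real_server 192.168.1.2 80 {
      weight 100
      HTTP_GET {
         url {
            path /healthcheck
            status_code 200     # Expected HTTP status from a healthy backend
         }
         connect_timeout 3      # Seconds to wait for the connection
         nb_get_retry 2         # Retries before marking the server down (keepalived 2.x prefers the name "retry")
         delay_before_retry 2   # Seconds between retries
      }
   }
}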

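Here, lb_algo is set to round-robin (rr) and the LVS forwarding mode is Direct Routing (DR). The real_server section describes a backend server, with a health check against the /healthcheck endpoint. In DR mode the backends reply to clients directly, so each real server must also hold the VIP on an interface that does not answer ARP for it (typically the loopback interface).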
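
To illustrate what lb_algo rr means, here is a minimal sketch of round-robin selection over equally weighted backends. It is not how LVS is implemented (scheduling happens in the kernel); pick_rr and the second backend address 192.168.1.3 are hypothetical and only show the rotation order.

```shell
# pick_rr REQUEST_INDEX SERVER...
# Prints the server that a plain round-robin scheduler would choose
# for the given zero-based request index.
pick_rr() {
    idx=$1
    shift
    [ "$#" -gt 0 ] || return 1          # no servers to choose from
    shift $(( idx % $# ))               # skip ahead to the chosen server
    printf '%s\n' "$1"
}

for i in 0 1 2 3; do
    pick_rr "$i" 192.168.1.2 192.168.1.3
done
# prints 192.168.1.2, 192.168.1.3, 192.168.1.2, 192.168.1.3 (one per line)
```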

Persistent Connections: For some applications, ensuring a user remains connected to the same backend server is essential. This can be achieved with the persistence_timeout setting in the virtual_server block, which takes a timeout in seconds.

Part 4: Best Practices and Troubleshooting

  1. Configuration Validation: Always check your keepalived configuration for syntax errors before restarting the service. Keepalived 2.x provides a config-test mode:
keepalived -t
  2. Logging: Monitor /var/log/syslog (on Debian and Ubuntu) or run journalctl -u keepalived for keepalived logs. For deeper insights, increase keepalived's verbosity, for example by starting it with -D (--log-detail).

  3. Priority Management: The priority setting in your VRRP instance is crucial. In setups with multiple backups, ensure each has a unique priority to dictate the failover order.

  4. Optimized Health Checks: Design lightweight health check endpoints for backend servers. Overly complex health checks can add unnecessary load.
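
As a small aid for the priority advice above, the following sketch reports any priority value used by more than one node. The function name duplicate_priorities is hypothetical; you would pass it the priority values collected from each node's vrrp_instance block.

```shell
# duplicate_priorities PRIORITY...
# Prints each priority value that appears more than once (one per line).
# Prints nothing when every node has a unique priority.
duplicate_priorities() {
    printf '%s\n' "$@" | sort -n | uniq -d
}

duplicate_priorities 100 50 50 30    # prints 50
```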

Conclusion

Keepalived offers a blend of simplicity and functionality, making it a staple in many high-availability setups. By deeply understanding its mechanics and nuances, administrators can craft resilient infrastructure landscapes that gracefully handle node failures and maintain service continuity.
