I have configured a failover load balancer that acts as a backup whenever my primary goes down. I have set up Keepalived to move the floating virtual IP address to the other machine whenever it is unable to find the HAProxy service running on the other machine. The IP addresses mentioned in the conf file are present on my eth1 interface.
On my primary load balancer I am getting
systemctl status keepalived
`● keepalived.service - Keepalive Daemon (LVS and VRRP)
Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2022-05-15 18:06:32 UTC; 21min ago
Main PID: 659 (keepalived)
Tasks: 2 (limit: 1131)
Memory: 4.7M
CGroup: /system.slice/keepalived.service
├─659 /usr/sbin/keepalived --dont-fork
└─711 /usr/sbin/keepalived --dont-fork
May 15 18:27:57 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2250]: only one argument, a signal number, allowed
May 15 18:28:01 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2252]: only one argument, a signal number, allowed
May 15 18:28:03 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2253]: only one argument, a signal number, allowed
May 15 18:28:05 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2256]: only one argument, a signal number, allowed
May 15 18:28:07 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2259]: only one argument, a signal number, allowed
May 15 18:28:09 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2260]: only one argument, a signal number, allowed
May 15 18:28:11 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2261]: only one argument, a signal number, allowed
May 15 18:28:13 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2262]: only one argument, a signal number, allowed
May 15 18:28:15 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2263]: only one argument, a signal number, allowed
May 15 18:28:17 ubuntu-s-1vcpu-1gb-blr1-01 killall5[2264]: only one argument, a signal number, allowed`
sudo nano /etc/keepalived/keepalived.conf
`vrrp_script chk_haproxy {
script "pidof haproxy"
interval 2
}
vrrp_instance VI_1 {
interface eth1
state MASTER
priority 200
virtual_router_id 33
unicast_src_ip 10.122.0.2
unicast_peer {
10.122.0.3
}
authentication {
auth_type PASS
auth_pass password
}
track_script {
chk_haproxy
}
notify_master /etc/keepalived/master.sh
}`
On my secondary load balancer
systemctl status keepalived
`` ● keepalived.service - Keepalive Daemon (LVS and VRRP)
Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2022-05-15 17:57:16 UTC; 36min ago
Main PID: 329993 (keepalived)
Tasks: 2 (limit: 4677)
Memory: 1.9M
CGroup: /system.slice/keepalived.service
├─329993 /usr/sbin/keepalived --dont-fork
└─330005 /usr/sbin/keepalived --dont-fork
May 15 17:57:16 ubuntu-s-2vcpu-4gb-blr1-01 Keepalived_vrrp[330005]: Script `chk_haproxy` now returning 1
May 15 17:57:16 ubuntu-s-2vcpu-4gb-blr1-01 Keepalived_vrrp[330005]: VRRP_Script(chk_haproxy) failed (exited with status 1)
May 15 17:57:16 ubuntu-s-2vcpu-4gb-blr1-01 Keepalived_vrrp[330005]: (VI_1) Entering FAULT STATE
May 15 18:05:21 ubuntu-s-2vcpu-4gb-blr1-01 killall5[330439]: only one argument, a signal number, allowed
May 15 18:10:13 ubuntu-s-2vcpu-4gb-blr1-01 killall5[330679]: only one argument, a signal number, allowed
May 15 18:11:37 ubuntu-s-2vcpu-4gb-blr1-01 killall5[330750]: only one argument, a signal number, allowed
May 15 18:17:53 ubuntu-s-2vcpu-4gb-blr1-01 killall5[331070]: only one argument, a signal number, allowed
May 15 18:24:21 ubuntu-s-2vcpu-4gb-blr1-01 killall5[331386]: only one argument, a signal number, allowed
May 15 18:28:11 ubuntu-s-2vcpu-4gb-blr1-01 killall5[331552]: only one argument, a signal number, allowed
May 15 18:30:31 ubuntu-s-2vcpu-4gb-blr1-01 killall5[331649]: only one argument, a signal number, allowed``
sudo nano /etc/keepalived/keepalived.conf
`vrrp_script chk_haproxy {
script "pidof haproxy"
interval 2
}
vrrp_instance VI_1 {
interface eth1
state BACKUP
priority 100
virtual_router_id 33
unicast_src_ip 10.122.0.3
unicast_peer {
10.122.0.2
}
authentication {
auth_type PASS
auth_pass password
}
track_script {
chk_haproxy
}
notify_master /etc/keepalived/master.sh
}`
Output of pidof haproxy
Primary
root@ubuntu-s-1vcpu-1gb-blr1-01:~# pidof haproxy
726 719
Secondary
root@ubuntu-s-2vcpu-4gb-blr1-01:~# pidof haproxy
328842 328841
Note: I ran the /etc/keepalived/master.sh script manually and it worked.
EDIT1: It does not work even when I use pidof -s haproxy.
Hi there,
I might be missing something, but this looks less like Keepalived not running the check and more like how the check script exit code is handled.
vrrp_script only cares about the exit status. pidof haproxy returns 0 if it finds a PID and 1 if it doesn’t, which is fine in theory, but in practice this can break if the command behaves slightly differently under Keepalived’s execution environment.
A couple of things that usually trip people up here:
Keepalived runs scripts with a very minimal environment. Using a bare command like pidof without a full path can fail. Try /bin/pidof haproxy.
It’s often safer to wrap the check in a small shell script that explicitly returns 0 or 1, rather than calling pidof inline.
Make sure the script is executable and owned by root.
The repeated killall5 messages are usually unrelated noise from systemd shutdown checks, not the root cause.
In most setups, changing the script to something like a simple shell check (pgrep haproxy > /dev/null || exit 1) with a full path resolves this. If it still behaves oddly, enabling Keepalived debug logging can help confirm whether the script is actually being executed and what exit code it returns.
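The wrapper idea described above could look something like the following. This is a minimal sketch under stated assumptions, not something confirmed in this thread: the file name /etc/keepalived/check_process.sh and the function name is_running are made up for illustration, and it assumes pgrep is installed in one of the directories listed in PATH.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /etc/keepalived/check_process.sh.
# Keepalived only looks at the exit status: 0 = healthy, non-zero = failed.

# Set an explicit PATH so the check does not depend on the caller's environment.
PATH=/usr/sbin:/usr/bin:/sbin:/bin
export PATH

# Return 0 if a process with exactly this name exists, 1 otherwise.
is_running() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

# When called with a process name, run the check and exit with its status.
if [ "$#" -gt 0 ]; then
    is_running "$1"
    exit $?
fi
```

Assuming it is saved at that hypothetical path, owned by root, and made executable, the vrrp_script line would become script "/etc/keepalived/check_process.sh haproxy". Called without an argument the script does nothing and exits 0, so the process name has to be passed.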
Heya,
Those killall5: only one argument, a signal number, allowed lines aren’t coming from Keepalived’s VRRP logic itself — they’re almost always coming from your notify_master /etc/keepalived/master.sh script being triggered and then calling killall5 incorrectly (for example, passing it a PID or a process name instead of a signal number).
So you’ve basically got two separate things going on:
First, fix master.sh. Don’t use killall5 here. If you’re trying to restart HAProxy, use systemctl restart haproxy (or service haproxy restart). If you’re trying to kill PIDs from pidof, use kill -TERM $(pidof haproxy) (or pkill -TERM -x haproxy). Once you remove the bad killall5 usage, that spam should stop.
Second, your secondary entering FAULT means chk_haproxy returned non-zero at the moment Keepalived ran it. Even if pidof haproxy works when you run it manually, Keepalived can still fail it if the command path/environment differs. The simplest fix is to use an explicit full path and a more predictable check, e.g.:
script "/usr/bin/pgrep -x haproxy >/dev/null 2>&1"
(or /bin/pidof haproxy >/dev/null 2>&1 if that’s where pidof lives on your distro).
After changing those, restart Keepalived on both nodes and watch journalctl -u keepalived -f. You should see the script stop failing and the FAULT state go away.
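A few quick checks may help tell the explanations in this thread apart before changing anything. The killall5 lines on the primary arrive every two seconds, the same as interval 2 in chk_haproxy, so it seems worth checking what pidof really resolves to. On Debian and Ubuntu it is commonly a symlink to killall5, but treat that as an assumption to verify on your own nodes; nothing in the thread confirms it. The sketch below also checks whether master.sh mentions killall5 and what exit status the check returns with an empty environment. The helper names real_path_of and status_in_clean_env are made up for illustration.

```shell
#!/bin/sh
# Print the path a command name resolves to after following symlinks.
# If "pidof" resolves to a binary with a different name, that binary may
# behave differently depending on the name it is invoked under.
real_path_of() {
    cmd_path=$(command -v "$1") || return 1
    readlink -f "$cmd_path"
}

# Run a command with an empty environment and print its exit status,
# as a rough approximation of a minimal execution environment.
status_in_clean_env() {
    status=0
    env -i "$@" >/dev/null 2>&1 || status=$?
    echo "$status"
}

# 1. Where do the candidate check commands actually live?
for name in pidof pgrep; do
    printf '%s -> %s\n' "$name" "$(real_path_of "$name" || echo 'not found')"
done

# 2. Does the notify script call killall5 at all?
grep -n killall5 /etc/keepalived/master.sh 2>/dev/null \
    || echo 'killall5 not found in master.sh (or file not readable)'

# 3. What does the check return without any inherited environment?
printf 'pgrep exit status: %s\n' "$(status_in_clean_env /usr/bin/pgrep -x haproxy)"
```

If pidof resolves to a differently named binary and master.sh never mentions killall5, the check command itself is the more likely source of the log lines, and a wrapper script or pgrep with a full path sidesteps the question either way.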