Question

Stopping Pacemaker doesn't move resources to the other node

Hi Team,

First time user, please forgive my lack of knowledge about the service.

I installed Pacemaker and Corosync, but I am having trouble getting failover to work properly. To test failover, I stopped the cluster service on the primary node with the command “# pcs cluster stop ip-172-31-0-11-primary” to see whether the resources would move to the other node (ip-172-31-0-10-backup). Instead, the resource group members are in the Stopped state on the backup node, as shown below:

 Resource Group: networking-group
     privip (ocf::heartbeat:awsvip): Stopped
     vip (ocf::heartbeat:IPaddr2): Stopped
     elastic (ocf::heartbeat:awseip): Stopped

When I start the primary node again, the resource group starts without issues on both nodes.

Is there any reason why the failover is not working the way it should be?

[root@ip-172-31-0-11 bin]# pcs status
Cluster name: vpc-xxxxx
Stack: corosync
Current DC: ip-172-31-0-10 (version 1.1.23-1.amzn2.1-9acf116022) - partition with quorum
Last updated: Sun Apr 25 02:59:26 2021
Last change: Sun Apr 25 02:33:00 2021 by root via crm_resource on ip-172-31-0-10

2 nodes configured
4 resource instances configured

Online: [ ip-172-31-0-10 ip-172-31-0-11 ]

Full list of resources:

 ec2_fencing (stonith:fence_aws): Started ip-172-31-0-11
 Resource Group: networking-group
     privip (ocf::heartbeat:awsvip): Started ip-172-31-0-11
     vip (ocf::heartbeat:IPaddr2): Started ip-172-31-0-11
     elastic (ocf::heartbeat:awseip): Started ip-172-31-0-11 (Monitoring)

++++++++++++++++++++++++++++++++++++

Cluster Name: vpc-xxxxxxx
Corosync Nodes:
 ip-172-31-0-11 ip-172-31-0-10
Pacemaker Nodes:
 ip-172-31-0-10 ip-172-31-0-11

Resources:
 Group: networking-group
  Resource: privip (class=ocf provider=heartbeat type=awsvip)
   Attributes: secondary_private_ip=172.31.0.55
   Operations: migrate_from interval=0s timeout=30 (privip-migrate_from-interval-0s)
               migrate_to interval=0s timeout=30 (privip-migrate_to-interval-0s)
               monitor interval=20 timeout=30 (privip-monitor-interval-20)
               start interval=0s timeout=30 (privip-start-interval-0s)
               stop interval=0s timeout=30 (privip-stop-interval-0s)
               validate interval=0s timeout=10 (privip-validate-interval-0s)
  Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=172.31.0.55
   Operations: monitor interval=10s timeout=20s (vip-monitor-interval-10s)
               start interval=0s timeout=20s (vip-start-interval-0s)
               stop interval=0s timeout=20s (vip-stop-interval-0s)
  Resource: elastic (class=ocf provider=heartbeat type=awseip)
   Attributes: allocation_id=eipalloc-03e9d2c115c34e6ea elastic_ip=54.x.x.72
   Operations: migrate_from interval=0s timeout=30 (elastic-migrate_from-interval-0s)
               migrate_to interval=0s timeout=30 (elastic-migrate_to-interval-0s)
               monitor interval=20 timeout=30 (elastic-monitor-interval-20)
               start interval=0s timeout=30 (elastic-start-interval-0s)
               stop interval=0s timeout=30 (elastic-stop-interval-0s)
               validate interval=0s timeout=10 (elastic-validate-interval-0s)

Location Constraints:
  Resource: ec2_fencing
    Enabled on: ip-172-31-0-11 (score:INFINITY) (role: Started) (id:cli-prefer-ec2_fencing)
  Resource: networking-group
    Enabled on: ip-172-31-0-10 (score:INFINITY) (role: Started) (id:cli-prefer-networking-group)
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
 migration-threshold=10
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: vpc-xxxxxxx
 dc-version: 1.1.23-1.amzn2.1-9acf116022
 have-watchdog: false
 last-lrm-refresh: 1619275715

Quorum:
  Options:

Accepted Answer

Found the issue.

It was a bug in 0.9.167. I upgraded to 0.9.169 and failover is now fully functional.
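To confirm the fix after an upgrade, the failover test from the question can be repeated. The following is only a minimal sketch under stated assumptions: the function names `stopped_resources` and `failover_test` and the 30-second wait are made up for illustration, and it assumes it is run as root on the node that stays up.

```shell
#!/usr/bin/env bash
# Sketch for repeating the failover test from the question.
# Function names and the default wait time are illustrative assumptions.

# Read `pcs status` text on stdin and print the name of every resource
# whose state column says "Stopped".
stopped_resources() {
    awk '/\):[ \t]*Stopped/ { print $1 }'
}

# Stop cluster services on the given node, wait for the cluster to react,
# then list any resources that are still stopped.
# $1 = node to stop, $2 = seconds to wait (assumed default: 30).
failover_test() {
    local node="$1"
    local wait_secs="${2:-30}"
    pcs cluster stop "$node" || return 1
    sleep "$wait_secs"
    pcs status | stopped_resources
}
```

For example, `failover_test ip-172-31-0-11` prints nothing if every resource came up on the remaining node. If it prints privip, vip and elastic, the group did not fail over.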

Ta

If you are using the Amazon Linux 2 AMI, bear in mind that it installs 0.9.167, and failover won’t work with that version, especially with fencing enabled.
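A quick way to see whether a node is still on the affected version is to list the installed cluster packages and compare. This is a hedged sketch, not an official procedure: `cluster_versions` and `version_lt` are hypothetical helper names, it assumes an rpm-based system and GNU `sort -V`, and it lists all three cluster packages because the version number above is not tied to a package name here (the status output in the question shows Pacemaker itself at 1.1.23).

```shell
#!/usr/bin/env bash
# Sketch for checking installed cluster package versions.
# Helper names are illustrative assumptions.

# Print the installed versions of the cluster packages (rpm-based systems).
cluster_versions() {
    rpm -q pcs pacemaker corosync
}

# Succeed only if version $1 is strictly older than version $2.
# Relies on GNU sort's -V (version sort) option.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}
```

With the installed version in a variable, `version_lt "$installed" 0.9.169 && echo "upgrade needed"` flags a node that is still on the older release.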

Upgrade to the most recent version: https://github.com/ClusterLabs/pacemaker/releases