Hello!
Here is a quick summary. I have a namespace “apps” with three apps deployed with horizontal pod autoscalers. The metrics server is running and the top commands work fine. As a test, I bumped the autoscalers’ min replicas, and that produced the desired effect:
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 Insufficient cpu.
I have pod disruption budgets, but with a max unavailable of 1, and I currently have three pods in the Pending state that can’t be scheduled. So I would expect cluster-autoscaler (CA) to launch a new node. Here is the status from the configmap:
Cluster-wide:
  Health: Healthy (ready=1 unready=0 notStarted=0 longNotStarted=0 registered=1 longUnregistered=0)
    LastProbeTime: 2020-05-07 07:35:46.149337873 +0000 UTC m=+302096.048832692
    LastTransitionTime: 2020-05-03 19:42:01.523276381 +0000 UTC m=+71.422771221
  ScaleUp: NoActivity (ready=1 registered=1)
    LastProbeTime: 2020-05-07 07:35:46.149337873 +0000 UTC m=+302096.048832692
    LastTransitionTime: 2020-05-03 19:42:01.523276381 +0000 UTC m=+71.422771221
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime: 2020-05-07 05:46:12.682186809 +0000 UTC m=+295522.581681643
    LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
NodeGroups:
  Name: 3bc1e609-708b-4cc3-a820-fe41079a702a
  Health: Healthy (ready=1 unready=0 notStarted=0 longNotStarted=0 registered=1 longUnregistered=0 cloudProviderTarget=1 (minSize=1, maxSize=2))
    LastProbeTime: 2020-05-07 07:35:46.149337873 +0000 UTC m=+302096.048832692
    LastTransitionTime: 2020-05-07 05:55:15.403329346 +0000 UTC m=+296065.302824136
  ScaleUp: NoActivity (ready=1 cloudProviderTarget=1)
    LastProbeTime: 2020-05-07 07:35:46.149337873 +0000 UTC m=+302096.048832692
    LastTransitionTime: 2020-05-07 05:55:15.403329346 +0000 UTC m=+296065.302824136
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime: 2020-05-07 05:46:12.682186809 +0000 UTC m=+295522.581681643
    LastTransitionTime: 2020-05-07 05:46:12.682186809 +0000 UTC m=+295522.581681643
  Name: 3ade42aa-f164-469f-916e-76112164a22e
  Health: Healthy (ready=0 unready=0 notStarted=0 longNotStarted=0 registered=0 longUnregistered=0 cloudProviderTarget=0 (minSize=0, maxSize=3))
    LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
    LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
  ScaleUp: NoActivity (ready=0 cloudProviderTarget=0)
    LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
    LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime: 2020-05-07 05:46:12.682186809 +0000 UTC m=+295522.581681643
    LastTransitionTime: 2020-05-03 19:54:05.185322073 +0000 UTC m=+795.084816916
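For anyone who wants to read the per-pool numbers out of this status without squinting, here is a minimal Python sketch. It is not part of cluster-autoscaler: the function name parse_node_groups and the regular expression are assumptions of mine, written only against the Health lines shown in this post, so other versions of the status format may not match.

```python
import re

# Sample copied from the NodeGroups section above (Name and Health lines only).
STATUS = """\
Name: 3bc1e609-708b-4cc3-a820-fe41079a702a
Health: Healthy (ready=1 unready=0 notStarted=0 longNotStarted=0 registered=1 longUnregistered=0 cloudProviderTarget=1 (minSize=1, maxSize=2))
Name: 3ade42aa-f164-469f-916e-76112164a22e
Health: Healthy (ready=0 unready=0 notStarted=0 longNotStarted=0 registered=0 longUnregistered=0 cloudProviderTarget=0 (minSize=0, maxSize=3))
"""

# Assumed pattern: matches the layout of the Health lines in this post only.
HEALTH_RE = re.compile(
    r"\bready=(?P<ready>\d+).*?cloudProviderTarget=(?P<target>\d+) "
    r"\(minSize=(?P<min>\d+), maxSize=(?P<max>\d+)\)"
)


def parse_node_groups(status_text):
    """Return one dict per node group found in the status text."""
    groups = []
    name = None
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("Health:") and name is not None:
            match = HEALTH_RE.search(line)
            if match:
                groups.append({
                    "name": name,
                    "ready": int(match["ready"]),
                    "target": int(match["target"]),
                    "min": int(match["min"]),
                    "max": int(match["max"]),
                })
            name = None  # wait for the next "Name:" line
    return groups


groups = parse_node_groups(STATUS)
for g in groups:
    # A pool has room to grow when its current target is below maxSize.
    headroom = g["max"] - g["target"]
    print(f'{g["name"]}: target={g["target"]} min={g["min"]} max={g["max"]} headroom={headroom}')
```

On the status above, both pools report a target below their maxSize, so on paper either one could take a new node, even though ScaleUp shows NoActivity.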
I added a second node pool with autoscaling just to be sure, but the pods don’t even seem to get an event from cluster-autoscaler. I would expect a TriggeredScaleUp event, or at least a failure from cluster-autoscaler, but it’s completely silent. Any ideas?
So, after playing with it a bit, it seems that when I delete the second node pool it works as expected. I don’t think this behavior should be expected, and I believe it is a bug or an undocumented limitation.
Good catch! I was in the same situation: I also had a pool set with 0 min nodes. So I guess that must be it.
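If you want to check whether your own cluster has the configuration suspected in this thread (a pool with a minimum of 0 nodes that currently has no registered nodes), a small Python sketch like the one below can flag it. The Pool type and the empty_zero_min_pools function are hypothetical names of mine, and the values are typed in by hand from the status posted in the question. It only flags the configuration; it does not prove that this is what silences the autoscaler.

```python
from typing import List, NamedTuple


class Pool(NamedTuple):
    """Hypothetical container for the per-pool values from the status output."""
    name: str
    registered: int
    min_size: int
    max_size: int


def empty_zero_min_pools(pools: List[Pool]) -> List[str]:
    """Return names of pools with minSize=0 and no registered nodes."""
    return [p.name for p in pools if p.min_size == 0 and p.registered == 0]


# Values copied from the NodeGroups section of the status in the question.
pools = [
    Pool("3bc1e609-708b-4cc3-a820-fe41079a702a", registered=1, min_size=1, max_size=2),
    Pool("3ade42aa-f164-469f-916e-76112164a22e", registered=0, min_size=0, max_size=3),
]

suspects = empty_zero_min_pools(pools)
print("Pools to look at first:", suspects)
```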