Question

Pods stuck in Pending state despite an abundance of resources

Reposting from here: https://stackoverflow.com/questions/67542665/kubernetes-pod-stuck-pending-but-lacks-events-that-tell-me-why

I have a simple alpine:node Kubernetes pod attempting to start from a deployment on a cluster with a large surplus of resources on every node. It's failing to move out of the Pending status. When I run kubectl describe, I get no events that explain why. What are the next steps for debugging a problem like this?

Here are some commands and their output:

kubectl get events

60m         Normal   SuccessfulCreate    replicaset/frontend-r0ktmgn9-dcc95dfd8    Created pod: frontend-r0ktmgn9-dcc95dfd8-8wn9j
36m         Normal   ScalingReplicaSet   deployment/frontend-r0ktmgn9              Scaled down replica set frontend-r0ktmgn9-6d57cb8698 to 0
36m         Normal   SuccessfulDelete    replicaset/frontend-r0ktmgn9-6d57cb8698   Deleted pod: frontend-r0ktmgn9-6d57cb8698-q52h8
36m         Normal   ScalingReplicaSet   deployment/frontend-r0ktmgn9              Scaled up replica set frontend-r0ktmgn9-58cd8f4c79 to 1
36m         Normal   SuccessfulCreate    replicaset/frontend-r0ktmgn9-58cd8f4c79   Created pod: frontend-r0ktmgn9-58cd8f4c79-fn5q4

kubectl describe po/frontend-r0ktmgn9-58cd8f4c79-fn5q4 (some parts redacted)

Name:           frontend-r0ktmgn9-58cd8f4c79-fn5q4
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=frontend
                pod-template-hash=58cd8f4c79
Annotations:    kubectl.kubernetes.io/restartedAt: 2021-05-14T20:02:11-05:00
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/frontend-r0ktmgn9-58cd8f4c79
Containers:
  frontend:
    Image:      [Redacted]
    Port:       3000/TCP
    Host Port:  0/TCP
    Environment: [Redacted]
    Mounts:                 <none>
Volumes:                    <none>
QoS Class:                  BestEffort
Node-Selectors:             <none>
Tolerations:                node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                            node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                     <none>

I use Loft virtual clusters, so the above commands were run in a virtual-cluster context, where this pod's deployment is the only resource. The following was run against the main cluster itself:

kubectl describe nodes

Name:               autoscale-pool-01-8bwo1
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=g-8vcpu-32gb
                    beta.kubernetes.io/os=linux
                    doks.digitalocean.com/node-id=d7c71f70-35bd-4854-9527-28f56adfb4c4
                    doks.digitalocean.com/node-pool=autoscale-pool-01
                    doks.digitalocean.com/node-pool-id=c31388cc-29c8-4fb9-9c52-c309dba972d3
                    doks.digitalocean.com/version=1.20.2-do.0
                    failure-domain.beta.kubernetes.io/region=nyc1
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=autoscale-pool-01-8bwo1
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=g-8vcpu-32gb
                    region=nyc1
                    topology.kubernetes.io/region=nyc1
                    wireguard_capable=false
Annotations:        alpha.kubernetes.io/provided-node-ip: 10.116.0.3
                    csi.volume.kubernetes.io/nodeid: {"dobs.csi.digitalocean.com":"246129007"}
                    io.cilium.network.ipv4-cilium-host: 10.244.0.171
                    io.cilium.network.ipv4-health-ip: 10.244.0.198
                    io.cilium.network.ipv4-pod-cidr: 10.244.0.128/25
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 14 May 2021 19:56:44 -0500
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  autoscale-pool-01-8bwo1
  AcquireTime:     <unset>
  RenewTime:       Fri, 14 May 2021 21:33:44 -0500
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Fri, 14 May 2021 19:57:01 -0500   Fri, 14 May 2021 19:57:01 -0500   CiliumIsUp                   Cilium is running on this node
  MemoryPressure       False   Fri, 14 May 2021 21:30:33 -0500   Fri, 14 May 2021 19:56:44 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Fri, 14 May 2021 21:30:33 -0500   Fri, 14 May 2021 19:56:44 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Fri, 14 May 2021 21:30:33 -0500   Fri, 14 May 2021 19:56:44 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Fri, 14 May 2021 21:30:33 -0500   Fri, 14 May 2021 19:57:04 -0500   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  Hostname:    autoscale-pool-01-8bwo1
  InternalIP:  10.116.0.3
  ExternalIP:  134.122.31.92
Capacity:
  cpu:                8
  ephemeral-storage:  103176100Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32941864Ki
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  95087093603
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             29222Mi
  pods:               110
System Info:
  Machine ID:                 a98e294e721847469503cd531b9bc88e
  System UUID:                a98e294e-7218-4746-9503-cd531b9bc88e
  Boot ID:                    a16de75d-7532-441d-885a-de90fb2cb286
  Kernel Version:             4.19.0-11-amd64
  OS Image:                   Debian GNU/Linux 10 (buster)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.4.3
  Kubelet Version:            v1.20.2
  Kube-Proxy Version:         v1.20.2
ProviderID:                   digitalocean://246129007
Non-terminated Pods:          (28 in total) [Redacted]
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests          Limits
  --------           --------          ------
  cpu                2727m (34%)       3202m (40%)
  memory             9288341376 (30%)  3680Mi (12%)
  ephemeral-storage  0 (0%)            0 (0%)
  hugepages-1Gi      0 (0%)            0 (0%)
  hugepages-2Mi      0 (0%)            0 (0%)
Events:              <none>


Name:               autoscale-pool-02-8mly8
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m-2vcpu-16gb
                    beta.kubernetes.io/os=linux
                    doks.digitalocean.com/node-id=eb0f7d72-d183-4953-af0c-36a88bc64921
                    doks.digitalocean.com/node-pool=autoscale-pool-02
                    doks.digitalocean.com/node-pool-id=18a37926-d208-4ab9-b17d-b3f9acb3ce0f
                    doks.digitalocean.com/version=1.20.2-do.0
                    failure-domain.beta.kubernetes.io/region=nyc1
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=autoscale-pool-02-8mly8
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m-2vcpu-16gb
                    region=nyc1
                    topology.kubernetes.io/region=nyc1
                    wireguard_capable=true
Annotations:        alpha.kubernetes.io/provided-node-ip: 10.116.0.12
                    csi.volume.kubernetes.io/nodeid: {"dobs.csi.digitalocean.com":"237830322"}
                    io.cilium.network.ipv4-cilium-host: 10.244.3.115
                    io.cilium.network.ipv4-health-ip: 10.244.3.96
                    io.cilium.network.ipv4-pod-cidr: 10.244.3.0/25
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sat, 20 Mar 2021 18:14:37 -0500
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  autoscale-pool-02-8mly8
  AcquireTime:     <unset>
  RenewTime:       Fri, 14 May 2021 21:33:44 -0500
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Tue, 06 Apr 2021 16:24:45 -0500   Tue, 06 Apr 2021 16:24:45 -0500   CiliumIsUp                   Cilium is running on this node
  MemoryPressure       False   Fri, 14 May 2021 21:33:35 -0500   Tue, 13 Apr 2021 18:40:21 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Fri, 14 May 2021 21:33:35 -0500   Wed, 05 May 2021 15:16:08 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Fri, 14 May 2021 21:33:35 -0500   Tue, 06 Apr 2021 16:24:40 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Fri, 14 May 2021 21:33:35 -0500   Tue, 06 Apr 2021 16:24:49 -0500   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  Hostname:    autoscale-pool-02-8mly8
  InternalIP:  10.116.0.12
  ExternalIP:  157.230.208.24
Capacity:
  cpu:                2
  ephemeral-storage:  51570124Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16427892Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  47527026200
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             13862Mi
  pods:               110
System Info:
  Machine ID:                 7c8d577266284fa09f84afe03296abe8
  System UUID:                cf5f4cc0-17a8-4fae-b1ab-e0488675ae06
  Boot ID:                    6698c614-76a0-484c-bb23-11d540e0e6f3
  Kernel Version:             4.19.0-16-amd64
  OS Image:                   Debian GNU/Linux 10 (buster)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.4.4
  Kubelet Version:            v1.20.5
  Kube-Proxy Version:         v1.20.5
ProviderID:                   digitalocean://237830322
Non-terminated Pods:          (73 in total) [Redacted]
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                1202m (60%)   202m (10%)
  memory             2135Mi (15%)  5170Mi (37%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
Events:              <none>


Hi there,

I think this could be due to the host port that you've defined in your pod template. A hostPort requires the scheduler to find a node where that exact port is free on the host, so it can keep a pod Pending even when CPU and memory are plentiful.

I would suggest removing the host port from the template and giving it another try; see the sketch below.
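
For illustration, here is a minimal sketch of what the deployment might look like with the host port removed. The deployment name, labels, image, and port are placeholders pieced together from the describe output above, since your actual template wasn't shared:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-r0ktmgn9           # placeholder, taken from the output above
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: my-frontend-image  # placeholder; your image was redacted
          ports:
            - containerPort: 3000   # keep the container port
              # hostPort: 3000      # remove any hostPort line like this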

If that does not help, feel free to share your YAML template here so that I can take another look.
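
In the meantime, a couple of generic checks can sometimes surface why the scheduler has not placed a pod, even when describe shows no events. This is only a sketch, and you may need to run it in both the virtual-cluster and the main-cluster context:

# Full pod object; status.conditions may include a PodScheduled
# condition with a reason even when no events are recorded
kubectl get pod frontend-r0ktmgn9-58cd8f4c79-fn5q4 -o yaml

# Recent events across all namespaces, sorted by creation time, in case
# the scheduler recorded them outside the pod's namespace
kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp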

Regards, Bobby