Question
Experiencing random sporadic 504 timeouts running Laravel on DigitalOcean Kubernetes
I’m running a Laravel application on a DigitalOcean Kubernetes cluster. The application consists of three pods, each running on one of the three cluster nodes, with traffic routed to it via nginx-ingress
and a DigitalOcean Load Balancer. The backend uses DigitalOcean managed databases, and I’m using Grafana/Prometheus for monitoring and log collation.
Each application pod consists of three containers:
- Laravel Application (PHP-FPM and Nginx sidecar) - I’m aware it’s bad practice to group multiple services together in a single pod, but I did this for simplicity until I get my head around the K8s architecture. I was experiencing some major performance issues when using shared volumes for the nginx and fpm containers to access the codebase: pods were taking minutes to spin up because the files had to be copied from the Docker image to the shared volume. (If anyone has a better solution to this then I’m all ears!) A rough sketch of the pod spec is below.
- Nginx Metrics
- FPM Metrics
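For reference, the containers section of the application Deployment looks roughly like the sketch below. The image names and versions are placeholders rather than what I actually run, but the exporter ports line up with the Service further down:

containers:
  - name: app                     # Laravel codebase served by PHP-FPM with an nginx sidecar process
    image: registry.example.com/laravel-app:latest     # placeholder image
    ports:
      - containerPort: 80
        name: http
  - name: nginx-metrics           # nginx exporter scraped by Prometheus
    image: nginx/nginx-prometheus-exporter:0.11.0      # assumed exporter image/version
    ports:
      - containerPort: 9113
        name: nginxmetrics
  - name: fpm-metrics             # PHP-FPM exporter scraped by Prometheus
    image: hipages/php-fpm_exporter:2.2.0              # assumed exporter image/version
    ports:
      - containerPort: 9253
        name: fpmmetrics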
I’m using Helm3/Flux to deploy the application, and the .yaml files look like this:
Ingress:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: {{ template "app.fullname" . }}
  labels:
    app: {{ template "app.name" . }}
    chart: {{ template "app.chart" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
spec:
  tls:
    - hosts:
        - [REDACTED].com
      secretName: app-tls
  rules:
    - host: {{ .Values.ingress.host }}
      http:
        paths:
          - path: {{ .Values.ingress.path }}
            backend:
              serviceName: {{ template "app.fullname" . }}
              servicePort: http
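The ingress.host and ingress.path values referenced above come from the chart’s values.yaml; they’re just the (redacted) hostname and, I’m assuming for the purposes of this question, a single catch-all path, along the lines of:

ingress:
  host: app.[REDACTED].com   # real hostname redacted
  path: /                    # assuming one catch-all path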
Service:
apiVersion: v1
kind: Service
metadata:
  name: {{ template "app.fullname" . }}
  labels:
    app: {{ template "app.name" . }}
    chart: {{ template "app.chart" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
      name: http
    - port: 9253
      targetPort: 9253
      protocol: TCP
      name: fpmmetrics
    - port: 9113
      targetPort: 9113
      protocol: TCP
      name: nginxmetrics
  selector:
    app: {{ template "app.name" . }}
    release: {{ .Release.Name }}
With the above config I’m able to access the site via the public hostname. The vast majority of the time the site performs as expected; however, around one in ten requests never resolves and eventually returns a 504 if I leave it long enough. If I initiate a refresh before the 504 is returned, chances are that the page loads almost instantly.
I’m aware that I haven’t given enough detail for anyone to diagnose this, but I’m at a loss as to where to start debugging. I’ve checked the logs and nothing stands out as abnormal, and when the site does load it performs as expected, which leads me to believe it’s an ingress/networking issue.
How would I go about starting to diagnose this?
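I assume the sensible starting point is the nginx-ingress controller and the path between it and the pods rather than the application itself, so something like the commands below (the namespace, labels, and resource names are guesses based on a default nginx-ingress install), but I’m not sure what else I should be looking for:

# Tail the ingress controller logs around a failing request, looking for upstream timeouts
kubectl -n ingress-nginx logs -l app.kubernetes.io/name=ingress-nginx --tail=200 | grep -Ei 'upstream timed out|504'

# Confirm the Service selector is actually resolving to all three application pods
kubectl get endpoints <app-service-name>

# Bypass the load balancer and ingress entirely and hit one pod directly
kubectl port-forward pod/<app-pod-name> 8080:80
# (in a separate terminal)
curl -s -o /dev/null -w '%{http_code} %{time_total}s\n' http://localhost:8080/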