A project I am working on is hosted in an EKS cluster with the NGINX ingress controller (the one maintained by the Kubernetes project). It is deployed using its official Helm chart, which, as I realized after a lengthy debugging session, was a mistake.
The initial setup I aimed to improve had several flaws. First, we had an AWS Classic Load Balancer in front of the nginx ingress in the cluster; Classic Load Balancers have been deprecated for some time (years?), so continuing to use one made little sense to us.
The second issue was that we were running only one(!) nginx pod, which is quite sketchy: the exposed web services had essentially no high availability.
Switching to the Network Load Balancer (NLB) was straightforward: I just needed to change the ingress-nginx service annotation to specify the load balancer type as NLB:
service.beta.kubernetes.io/aws-load-balancer-type: nlb
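With the Helm deployment, that annotation goes onto the controller's Service through the values file. A minimal sketch of the relevant section, assuming the ingress-nginx chart's controller.service layout:

controller:
  service:
    annotations:
      # Ask the AWS cloud controller to provision an NLB instead of a Classic LB.
      service.beta.kubernetes.io/aws-load-balancer-type: nlb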
However, increasing the replica count turned out to be tricky. When I bumped it up to two, I began to experience sporadic timeouts on new connections to the cluster services, but the logs showed no errors.
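For reference, the bump itself is just another chart value; a sketch assuming the ingress-nginx chart's controller.replicaCount setting:

controller:
  # Run two controller pods instead of the single-pod default.
  replicaCount: 2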
My first thought was that it could be related to leader election, since the controller pods apparently coordinate through one. Running two pods could potentially lead to a split-brain scenario that disrupts service, or so I thought.
Ultimately, it was not that issue. After semi-blindly experimenting with different configurations without success, I joined the Kubernetes Slack channel and started searching through the message history, as LLMs were not providing the help I needed.
I eventually found the fix. The ingress service of type LoadBalancer (the thing that receives incoming traffic from the internet-facing AWS NLB) had a misconfiguration: its externalTrafficPolicy was set to Cluster. With that policy, kube-proxy adds a load-spreading step, so traffic that lands on any node may be forwarded across nodes before reaching an nginx pod; with Local, a node only accepts traffic if it runs an nginx pod itself.
And the fix is again a one-liner in the Helm values file:
externalTrafficPolicy: Local
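In the chart's values this sits right next to the NLB annotation from before; again a sketch of the controller.service section:

controller:
  service:
    # Only nodes that actually run an nginx pod accept traffic from the NLB.
    externalTrafficPolicy: Local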
Although I don't fully understand why this change resolved the issue, it worked. It likely has to do with the node security groups; it may be that we are simply blocking the NodePort range that the extra Cluster-mode hop uses for node-to-node forwarding.
As already mentioned, the root cause of this blunder is that we used the Helm chart to deploy the NGINX ingress. The official ingress documentation does not recommend this; instead, it provides a basic set of manifests for deployment. In those manifests, the externalTrafficPolicy is correctly set to Local. There are even open issues in the GitHub repository warning about the risks of Helm deployments in cloud-provided clusters.
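For comparison, the Service in those reference manifests has roughly this shape (a trimmed sketch, not a verbatim copy; the exact manifest differs between ingress-nginx releases):

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  # Shipped with Local, which avoids the extra cross-node hop.
  externalTrafficPolicy: Local
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx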