How We Architected Karpenter NodePools for a Multi-Workload EKS Cluster
A practical guide to designing NodePool tiers, weights, taints, and disruption policies based on what actually worked for us in production.

Why I Am Writing This

Most Karpenter content online falls into two buckets. One is the AWS documentation, which is accurate but does not help you make architectural decisions. The other is the "Hello World" blog post that shows a single NodePool with one instance category and stops there.

Neither helped us when we had to run a cluster with mixed workloads: some pods needed memory-heavy nodes for Elasticsearch and JVM apps, some needed compute-heavy nodes for batch processing, some were latency-sensitive web services, and a few were GPU workloads that we did not want anywhere near the rest of the cluster.

So we ended up designing a five-tier NodePool architecture. It has been running in production for a while now and has survived enough incidents that I trust it. This post walks through the design, why each decision was made,...