Tuesday, March 14, 2023

Azure @ Enterprise - Windows containers in AKS - Memory allocation black magic

I have made many predictions on this blog: the death of Silverlight and of WPF (admittedly, WPF is still available), and a Windows Phone prediction that I never got around to posting separately. Now I am predicting the future of the Windows OS. The server side has already been taken over by Linux, even within the Microsoft Azure infrastructure, from 2015 onwards. The client side will soon be dominated by Chrome OS, once 5G is available to everyone and the browser gains more power through WebAssembly.

If we have a choice, do not go for Windows containers. Use a Windows container only if there is absolutely no way to use a Linux container. That may be due to budget constraints that prevent migrating from .NET Framework to .NET Core, a dependency on legacy COM components that have no alternative, etc.

Kubernetes memory management

Kubernetes orchestrates containers by scheduling them onto nodes, which are mainly virtual machines. If a node has 32 GB of memory, will all of it be available to containers?
No. It is shared among the following participants:
  • The node operating system, which is itself software running on the node.
  • To manage the containers, Kubernetes needs to run its own software on every node: kubelet, kube-proxy, etc.
  • If we have installed third-party CSI drivers and monitoring agents, they also need to run on every node.
  • Kubernetes sets some memory aside to absorb spikes. That memory is not counted when allocating memory to containers.
From the OS's point of view they are all processes, and they all require node resources.

Not all memory, CPU and other node resources are available to containers

That's fine. Let's understand how much of these resources the infrastructure needs. Since memory is the easiest to digest, let's calculate how much memory is available to the containers.
The calculation is documented in the official Kubernetes docs, so there is no need to repeat it here. What is missing there is the difference between Windows and Linux containers, and the memory used by drivers.

Windows nodes in AKS resource reservations

AKS (Azure Kubernetes Service) is simply a managed Kubernetes service hosted by Azure, something like a laundry shop that manages the washers and dryers for its customers.

The node memory reservation in AKS is well documented.
Windows nodes are treated separately in AKS, and they need an additional 2 GB:
  • Eviction threshold
    • 750Mi
  • Kubelet
    • 25% of the first 4 GB of memory
    • 20% of the next 4 GB of memory (up to 8 GB)
    • 10% of the next 8 GB of memory (up to 16 GB)
    • 6% of the next 112 GB of memory (up to 128 GB)
    • 2% of any memory above 128 GB
  • Windows node
    • Additional 2 GB
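
The rules above can be sketched as a small Python function. This is only an illustration of the tiered arithmetic as listed in this post, not an official AKS tool; the names `KUBELET_TIERS`, `kubelet_reserved_gb` and `reserved_memory_gb` are made up for this example, and the 750Mi eviction threshold is rounded to 0.75 GB, as in the calculations in this post.

```python
# Regressive kubelet reservation tiers from the list above:
# (tier size in GB, fraction reserved). None means "all remaining memory".
KUBELET_TIERS = [(4, 0.25), (4, 0.20), (8, 0.10), (112, 0.06), (None, 0.02)]
EVICTION_THRESHOLD_GB = 0.75  # 750Mi, treated as 0.75 GB in this post
WINDOWS_EXTRA_GB = 2.0        # additional reservation for Windows nodes


def kubelet_reserved_gb(node_gb):
    """Walk the tiers and reserve the given fraction of each slice."""
    reserved = 0.0
    remaining = node_gb
    for size, rate in KUBELET_TIERS:
        if remaining <= 0:
            break
        portion = remaining if size is None else min(remaining, size)
        reserved += portion * rate
        remaining -= portion
    return reserved


def reserved_memory_gb(node_gb, windows=False):
    """Eviction threshold + kubelet reservation (+ 2 GB on Windows nodes)."""
    total = EVICTION_THRESHOLD_GB + kubelet_reserved_gb(node_gb)
    if windows:
        total += WINDOWS_EXTRA_GB
    return total


# Compare a 7 GB Linux node with a 7 GB Windows node.
for is_windows in (False, True):
    gb = reserved_memory_gb(7, windows=is_windows)
    label = "Windows" if is_windows else "Linux"
    print(f"7 GB {label} node: {gb:.2f} GB reserved ({gb / 7:.2%})")
```

Keeping the tiers in a list makes it easy to adjust the numbers if the documented defaults change.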

The resource reservation calculation for 7GB nodes 

7 GB Linux node memory reservations.
0.75 + (0.25*4) + (0.20*3) = 0.75GB + 1GB + 0.6GB = 2.35GB  ~33.57%

7 GB Windows node memory reservations.
0.75 + (0.25*4) + (0.20*3) + 2 = 0.75GB + 1GB + 0.6GB + 2 GB = 4.35GB ~62.14%

Previously we saw that there can be drivers running on the node that also require memory. Below is a screenshot from a running Windows node with a decent number of drivers.

[Screenshot: 7 GB Windows node memory reservations]
Please note that the memory requirement for the same drivers is lower on Linux nodes in the same cluster. Also, the number of drivers varies with workload requirements.
Let us do the math again.

7 GB Windows node with some drivers running

Considering only the memory requests, as that is the minimum required:

Driver memory = 300+200+600+50+120+120+120+512 = 2022 MB ≈ 2.02 GB

Total reserved memory = 4.35 GB + 2.02 GB = 6.37 GB, so ~91% of the node's memory is reserved.
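
The same arithmetic as a quick Python check. This is a rough sketch: it assumes the driver requests are in MB and, as this post does, treats 1000 MB as 1 GB. The variable names are made up for this example.

```python
# Memory requests of the drivers on the Windows node, as summed in this post.
# Assumption: the values are in MB, and 1000 MB is treated as 1 GB.
driver_requests_mb = [300, 200, 600, 50, 120, 120, 120, 512]

node_gb = 7
base_reserved_gb = 4.35  # 7 GB Windows node reservation calculated earlier

driver_gb = sum(driver_requests_mb) / 1000
total_reserved_gb = base_reserved_gb + driver_gb
reserved_share = total_reserved_gb / node_gb
left_for_containers_gb = node_gb - total_reserved_gb

print(f"Drivers:             {driver_gb:.2f} GB")
print(f"Total reserved:      {total_reserved_gb:.2f} GB ({reserved_share:.0%})")
print(f"Left for containers: {left_for_containers_gb:.2f} GB")
```

In other words, well under 1 GB of a 7 GB Windows node is left for the actual workload.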

The resource reservation calculation for 32 GB Windows nodes

Without drivers 0.75 + (0.25*4) + (0.20*4) + (0.10*8) + (.06*16) + 2 = 0.75GB + 1GB + 0.8GB + 0.8GB + 0.96GB + 2 GB = 6.31GB

With drivers 6.31GB + 2.02GB = 8.33GB ~26.03% reserved 

The resource reservation calculation for 128 GB Windows nodes

Without drivers 0.75 + (0.25*4) + (0.20*4) + (0.10*8) + (.06*112) + 2 = 0.75GB + 1GB + 0.8GB + 0.8GB + 6.72GB + 2 GB = 12.07GB

With drivers 12.07GB + 2.02GB = 14.09GB ~11.01% reserved => 113.91GB to containers
All these calculations are done with the default recommended values.
If you find a mistake in any of the calculations, please respond in the comments section.
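
To compare node sizes side by side, the whole calculation can be put in a loop. Again, this is only a sketch of the arithmetic in this post: `reserved_gb` and `TIERS` are names invented for the example, the 2.02 GB driver figure is the one measured on the Windows node above, and it is an assumption that the same drivers would run unchanged on bigger nodes.

```python
# (tier size in GB, fraction reserved for the kubelet), as listed in this post
TIERS = [(4, 0.25), (4, 0.20), (8, 0.10), (112, 0.06), (float("inf"), 0.02)]


def reserved_gb(node_gb, windows, drivers_gb=0.0):
    """Eviction threshold + kubelet tiers + optional Windows and driver memory."""
    reserved = 0.75  # eviction threshold, 750Mi taken as 0.75 GB
    remaining = node_gb
    for size, rate in TIERS:
        portion = min(remaining, size)
        reserved += portion * rate
        remaining -= portion
    if windows:
        reserved += 2.0  # additional reservation for Windows nodes
    return reserved + drivers_gb


DRIVERS_GB = 2.02  # driver requests on the Windows node in this post

for node_gb in (7, 32, 128):
    used = reserved_gb(node_gb, windows=True, drivers_gb=DRIVERS_GB)
    print(f"{node_gb:>4} GB Windows node: {used:.2f} GB reserved "
          f"({used / node_gb:.1%}), {node_gb - used:.2f} GB left for containers")
```

The fixed costs (eviction threshold, Windows extra, drivers) stay the same, so their share shrinks quickly as the node grows.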

Lessons learned

  • Avoid Windows containers if possible.
  • If Windows containers are inevitable, use bigger nodes, such as 128 GB.

