CPU Overcommitment

While I know there are a thousand blog posts out there talking about CPU overcommitment, I felt compelled to write my own following a conversation I had recently.

Me: “I’ve noticed you’re running a 6.25:1 overcommitment ratio. Have you looked into how your VMs are performing recently?”

Customer: “No, the host’s CPU doesn’t look busy and we like to get value for money from our processors.”

And they were, to some degree, right. At first inspection the host doesn’t look busy so you might assume the VMs are happy…

However, when you dig a little deeper to see what’s actually going on with the VMs, you suddenly see that not everything is quite as happy as it seems.

Now, this is obviously a snapshot in time, but it’s showing some average figures; I had seen %RDY times up in the 90s during peak load. So even though the host looks quiet, you’re in fact still seeing potentially poorly performing VMs.
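To put a number like that in context, here is a rough sketch in Python. It assumes the %RDY figure is the VM-level value, which esxtop sums across all of a VM's vCPUs, so dividing by the vCPU count gives a per-vCPU figure. The function name and the 4-vCPU example are hypothetical and not taken from the environment above.

```python
def ready_percent_per_vcpu(vm_ready_percent: float, vcpus: int) -> float:
    """Normalise a VM-level %RDY value to a per-vCPU percentage.

    Assumption: vm_ready_percent is the group-level figure, i.e. the
    sum of ready time across every vCPU of the VM.
    """
    if vcpus < 1:
        raise ValueError("a VM needs at least one vCPU")
    return vm_ready_percent / vcpus


# Hypothetical example: a 4-vCPU VM showing 90% ready at the VM level.
per_vcpu = ready_percent_per_vcpu(90.0, 4)
print(f"{per_vcpu:.1f}% ready per vCPU")
```

Under those assumptions, each vCPU would be spending more than a fifth of its time waiting to be scheduled, even though the host itself looks quiet.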

So the moral of the story is: understand your environment and your design decisions, such as overcommitment ratios. Just because you can do something doesn’t always mean you should. Now, if this was a dev environment you might be happy to overcommit further, because ultimately it’s just a dev environment. But if you’re talking about customer-facing servers, your databases and so on, then 20+% ready time can be devastating.
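For anyone wanting to sanity-check their own hosts, the arithmetic is simple enough to sketch in Python. This assumes the ratio means allocated vCPUs per physical core (some people count logical threads instead). The host size and the per-environment tolerances are made-up illustrative values, not the customer's real numbers and not recommendations.

```python
def overcommit_ratio(total_vcpus: int, physical_cores: int) -> float:
    """Return allocated vCPUs per physical core on a host."""
    if physical_cores < 1:
        raise ValueError("a host needs at least one physical core")
    return total_vcpus / physical_cores


def ready_time_acceptable(ready_pct_per_vcpu: float, tolerance_pct: float) -> bool:
    """True if per-vCPU ready time is below what the workload can tolerate."""
    return ready_pct_per_vcpu < tolerance_pct


# Hypothetical host: 200 vCPUs allocated across 32 physical cores.
ratio = overcommit_ratio(200, 32)
print(f"{ratio}:1 overcommitment")

# Illustrative tolerances only; choose your own per environment.
tolerances = {"dev": 20.0, "production": 5.0}
for env, limit in tolerances.items():
    ok = ready_time_acceptable(12.0, limit)
    print(env, "ok" if ok else "too much ready time")
```

The point of keeping the tolerance as a parameter is that the same ready figure can be fine for one class of workload and unacceptable for another.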

