The Value of “Software Defined Management” for Data Centers

Currently data centers operate in a mode called "always-on", meaning that all servers, networking, and cooling equipment run 24×7, 365 days a year, irrespective of need or demand. Furthermore, 30% of all servers are not even used. Jonathan Koomey and the Anthesis Group recently published a study¹ with the latest data from McKinsey and others indicating that server utilization "rarely exceeds 6%".

There are reasons for such a low average utilization. First, every production environment is set up in two redundant locations, so that even when one goes offline or is hit by a catastrophic event, the other can carry the full load. Second, because organizations only have historical data on utilization, they apply an 80% rule: capacity is added before load ever exceeds 80% of what is installed.

However, that 80% threshold applies at peak utilization, not during average or low-demand periods. While utilization depends heavily on the applications running on the servers, monitoring them shows what is actually happening, and it is then up to everyone to act on that intelligence "properly": doing so can reduce cost, help stabilize the power grid, and, most importantly, make the applications more reliable.
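To see how these two factors compound into single-digit average utilization, here is a back-of-the-envelope sketch in Python. The peak and average demand figures are assumptions chosen for illustration only, not data from the study.

```python
# Back-of-the-envelope sketch: redundancy plus the 80% rule suppress average
# utilization. Peak and average demand figures are illustrative assumptions.
peak_load = 100.0        # arbitrary units of compute demand at peak
avg_load = 15.0          # assumed average demand across the year
headroom_rule = 0.80     # provision so peak never exceeds 80% of capacity
redundant_sites = 2      # each site must be able to carry the full load alone

capacity_per_site = peak_load / headroom_rule
total_capacity = redundant_sites * capacity_per_site

print(f"Average utilization: {avg_load / total_capacity:.0%}")  # prints 6%
```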

"Properly" acting on it – what does that mean? Data centers are built and operated to keep applications up and online. Performing any action that could take an application down is a big no-no. With application outages costing a lot of money and reputational damages, neither energy savings nor other benefits are of any interest today. However, even when everything is on, applications do go down, power outages happen, failover configurations to the backup data center do not work, and so on. We see the results all the time when applications we like to use is not available.

Let's assume we can shift applications from one location to another at the push of a button, turn IT and cooling equipment on and off on the fly, and run just enough IT infrastructure to support the current load PLUS a safety buffer. What would that do for application reliability? Could we in fact prevent a significant share of the causes of today's application outages?

Yes, we can! When we at Tier44 Technology talk about "properly" acting on the intelligence, we mean exactly that. We mean shifting applications on the fly between various locations, turning equipment on and off as needed based on application demand, and moving applications proactively during maintenance or in response to forecasts of external events and warnings from utility companies. This is all done with management software, called Dynamic Power Management (DPM), that ties multiple data centers together, defines what needs to be done under which conditions, and executes the procedures accordingly.
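As a rough illustration of what such a policy engine might look like, here is a minimal Python sketch. The site attributes, action functions, and selection rule are hypothetical assumptions for illustration, not the Tier44 DPM product or its API.

```python
# Minimal sketch of a DPM-style policy loop. All names and actions here are
# hypothetical; a real system would drive orchestration, PDUs, and cooling.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    healthy: bool        # monitoring status, maintenance windows, utility warnings
    energy_price: float  # current $/kWh in the local market

def power_on(site: Site) -> None:
    print(f"Powering on IT and cooling at {site.name}")

def power_off(site: Site) -> None:
    print(f"Powering off IT and cooling at {site.name}")

def shift_applications(src: Site, dst: Site) -> None:
    print(f"Shifting applications from {src.name} to {dst.name}")

def apply_policy(sites: list[Site], active: Site) -> Site:
    """Keep the load in the cheapest healthy location."""
    target = min((s for s in sites if s.healthy), key=lambda s: s.energy_price)
    if target is not active:
        power_on(target)                    # bring up capacity at the target first
        shift_applications(active, target)  # then move the workloads
        power_off(active)                   # finally power down the now-idle site
    return target

site1 = Site("Site 1", healthy=True, energy_price=0.12)
site2 = Site("Site 2", healthy=True, energy_price=0.08)
apply_policy([site1, site2], active=site1)  # shifts the load to Site 2
```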

Why wait for a disaster? Why not shift the application away, knowing it runs in a safe location, before anything bad can happen? Why not shift it to a location where energy pricing is more favorable? Software-defined management is what data centers need.

Utility companies have always practiced dynamic management, matching generation capacity with demand based on forecasts and actual consumption. Data centers should do the same.

Let's assume an organization actually implements Dynamic Power Management. What does that mean, and what opportunities does it open up? First, applications become location-independent, and instead of all equipment being on all the time, a large portion of it is turned off most of the time. Second, turning off at least half of the equipment reduces power consumption dramatically. Third, the data center is no longer a guaranteed base load for the local utility company: power consumption can be low when the applications run in the other location, or high when there is a peak event and all applications run in this location.

The following picture shows the impact on the secondary data center. In "always-on" mode, running idle, every rack in this example consumes 4,840W. With DPM turned on and the backup equipment powered off most of the time, the savings are close to 50%.
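As a back-of-the-envelope check of that figure, the sketch below annualizes the 4,840W idle draw per rack; the fraction of the year the backup rack stays powered off under DPM is an assumption chosen for illustration.

```python
# Per-rack arithmetic for the secondary site. The 4,840 W idle draw comes from
# the example above; the assumed 50% powered-off time under DPM is illustrative.
idle_rack_w = 4840
hours_per_year = 24 * 365
off_fraction = 0.5   # assumed share of the year the backup rack is powered off

always_on_kwh = idle_rack_w / 1000 * hours_per_year
dpm_kwh = always_on_kwh * (1 - off_fraction)

print(f"Always-on: {always_on_kwh:,.0f} kWh per rack per year")   # ~42,398 kWh
print(f"With DPM : {dpm_kwh:,.0f} kWh per rack per year ({off_fraction:.0%} saved)")
```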

Utility companies can benefit from this as well. When power load is high, they could ask the data center to shift load out of their territory; when utilization is low, they can offer the data center a low rate to bring all load into their territory. In fact, they could treat a data center as a virtual power plant generating "negawatts" of power. If you add the data center's backup generators into the mix, these up/down swings can be huge.

Here is a picture with two sites. Assume Site 1 runs the applications, equipment, and cooling at around 5MW, while Site 2, historically also running at 5MW for backup purposes, has been turned off. Both sites are constructed for 10MW of total power consumption and have 12MW in backup generation (N+1: five 2MW generators plus one spare). The impact on the power grid of shifting all applications from Site 1 to Site 2 can be huge.

  1. There will be a 5MW drop in consumption at Site 1 and a 5MW increase at Site 2 for the actual application load.
  2. Turning on the generators at Site 1 (assuming EPA approval) adds another 12MW drop there.

The result is a 17MW drop at Site 1 and a 5MW increase at Site 2. Shifting applications takes less than 5 minutes, so it is a massive swing at each location within a very short timeframe.
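The same arithmetic, written out as a small Python sketch using the figures from the text:

```python
# Grid-impact arithmetic for the shift described above, using the article's
# figures: a 5 MW application load and 12 MW of on-site backup generation.
site1_load_mw = 5.0    # application/equipment/cooling load at Site 1 before the shift
generators_mw = 12.0   # N+1 backup generation at Site 1 (six 2 MW units)

site1_grid_change = -(site1_load_mw + generators_mw)  # load leaves, generators run
site2_grid_change = +site1_load_mw                    # Site 2 now carries the load

print(f"Site 1 grid change: {site1_grid_change:+.0f} MW")  # -17 MW
print(f"Site 2 grid change: {site2_grid_change:+.0f} MW")  # +5 MW
```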

As a data center operator, you can leverage this for pricing advantages across different locations and time zones, sign up for demand response participation, or engage in real-time or day-ahead energy trading, depending on local market conditions.

As a utility company, you can leverage the data center to absorb excess wind and solar generation, reduce demand during shortages, or even call on its backup generation during emergencies.

Having established the benefits of Dynamic Power Management, it’s time to act, and it’s time to change the way organizations run data centers. It’s also time for utility companies and grid operators to provide incentives to kick-start adoption, both for energy efficiency (remember the ~50% savings) and for grid stability (adding and removing grid load).

Reference:

¹ http://anthesisgroup.com/wp-content/uploads/2015/06/Case-Study_DataSupports30PercentComatoseEstimate-FINAL_06032015.pdf
