The Case for the Performance Indicator: Answers to Your Questions

Mark Seymour and Lance Rütimann

It’s a fact: to use resources more efficiently throughout a data center, facility owners and operators need a more holistic view.

To help them gain this view and measure the efficiency of their cooling efforts, The Green Grid has developed the Performance Indicator.

The Performance Indicator offers a visualization of the balance of three cooling performance metrics and enables a greater understanding of target performance ranges as determined by facility goals. This helps data center owners/operators improve their companies’ abilities to assess data center energy efficiency while better understanding performance and the business impact.

Many in the industry have expressed interest in the Performance Indicator and, as with any new metric, they have questions. Here we answer some that arose during our webcast, “The Case for the Performance Indicator,” which is also available for viewing on demand.

In a colo, there may be customers fitting in the three different triangles. What happens then?

If a Performance Indicator view were to be created for the whole facility, it would be advisable to set targets based on the most demanding areas in the entire data hall.

However, while the energy efficiency target and performance will probably always have to be calculated and displayed for the entire data hall, it would be perfectly reasonable to calculate IT Thermal Conformance and IT Thermal Resilience for selected IT equipment in order to display the performance for a specific customer.

How does the Performance Indicator metric accommodate, plan for, and facilitate today’s dynamic/moving/software-managed loads? How does one of the most important thermal efficiency drivers/tools, thermal control and automation, come into the equation?

In principle, there isn’t much difference between what you call today’s dynamic/moving/software-managed loads and the loads in a traditional facility. In many traditional enterprise facilities, physical hardware is added and removed daily, and loads vary depending on which applications are running. A more modern data center probably benefits, since it is much more likely that monitored data can be collected automatically, allowing regular or even continuous update and display. Further, the Performance Indicator can be used to look at different possible load distributions, providing cooling performance data alongside the IT data to guide deployment choices in the software-driven system.

I don't understand how using the PUEr seven-level PUE ratings, A-G, is necessary, or how it differs from simply taking a baseline PUE measurement and comparing it with the PUE after making a change.

We can’t argue that it is necessary when looking at the PUE change alone. Because the change is relative, it makes absolutely no difference to the PUE assessment whether or not you normalize the PUE data to your target.
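
To make the point about relative change concrete, here is a minimal Python sketch. It assumes PUEr is computed as a target PUE divided by the measured PUE; that formula, the function names, and all the numbers are illustrative assumptions rather than values taken from the white paper.

```python
import math


def pue_ratio(measured_pue: float, target_pue: float) -> float:
    """PUEr as assumed here: target PUE divided by measured PUE (higher is better)."""
    if measured_pue < 1.0 or target_pue < 1.0:
        raise ValueError("PUE values cannot be below 1.0")
    return target_pue / measured_pue


# Hypothetical before/after measurements for one facility.
baseline_pue = 1.8
improved_pue = 1.6

# Improvement factor on the raw PUE scale (lower PUE is better).
raw_factor = baseline_pue / improved_pue

# The same improvement expressed through PUEr, for several hypothetical targets.
for target_pue in (1.2, 1.5, 2.0):
    normalized_factor = (
        pue_ratio(improved_pue, target_pue) / pue_ratio(baseline_pue, target_pue)
    )
    # The target cancels out, so the relative change matches the raw one.
    assert math.isclose(normalized_factor, raw_factor)
    print(f"target {target_pue}: improvement factor {normalized_factor:.3f}")
```

Under this assumption the target cancels out of the before/after comparison, which is why normalization does not change the assessment of a PUE improvement on its own.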

However, the group agreed that using DCiE displayed as a percentage would produce a view for legacy data centers that overemphasizes PUE and could hide the need to improve IT Thermal Conformance or IT Thermal Resilience.

Taking a moderate PUE, for example 1.5, would give a DCiE of 67%. For a legacy data center, this may be a good performance and all that is realistically achievable. From a thermal perspective, however, deviations of just a few percent from 100% could mean that many IT boxes are at risk because they are operating in higher-than-desirable temperatures. The idea of the ranges is simply to make the view easier to interpret, so that the scale does not distort a less technical person’s interpretation (e.g., when the views are reported up the management chain).
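
The arithmetic behind that figure can be sketched in a few lines of Python. DCiE is the reciprocal of PUE expressed as a percentage; the function names and the power readings below are hypothetical and chosen only to reproduce the PUE of 1.5 used in the example.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """PUE = total facility load / IT load."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_kw


def dcie_percent(pue_value: float) -> float:
    """DCiE is the reciprocal of PUE, expressed as a percentage."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 100.0 / pue_value


# Hypothetical readings: 1,500 kW at the facility level, 1,000 kW of IT load.
example_pue = pue(1500.0, 1000.0)
example_dcie = dcie_percent(example_pue)
print(f"PUE {example_pue:.2f} gives DCiE {example_dcie:.0f}%")
```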

For a data center that monitors Tin and calculates PUE, what is the extra benefit if Performance Indicator metrics were added? It seems that getting the Performance Indicator requires extensive time and expense for larger scale data centers in terms of a statistical IT inlet temperature overview.

We are not sure why, if you already monitor Tin, it would be expensive to calculate IT Thermal Conformance as you already have the necessary data and just need to do some simple calculations.
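
As a rough illustration of how simple the calculation can be, the Python sketch below computes a conformance percentage from inlet temperature readings. It assumes that IT Thermal Conformance is the share of IT load whose inlet temperature falls inside a recommended range, and it uses 18 to 27 °C as that range; the load weighting, the range, the function name, and the sample readings are all assumptions made for illustration, so check the white paper for the exact definition.

```python
from typing import Sequence, Tuple

# Assumed recommended inlet temperature range in degrees Celsius.
RECOMMENDED_RANGE_C: Tuple[float, float] = (18.0, 27.0)


def it_thermal_conformance(
    readings: Sequence[Tuple[float, float]],
    allowed_range_c: Tuple[float, float] = RECOMMENDED_RANGE_C,
) -> float:
    """Percentage of IT load whose inlet temperature is within the range.

    Each reading is an (inlet_temperature_c, it_load_kw) pair.
    """
    low, high = allowed_range_c
    total_load = sum(load for _, load in readings)
    if total_load <= 0:
        raise ValueError("Total IT load must be positive")
    conforming_load = sum(load for temp, load in readings if low <= temp <= high)
    return 100.0 * conforming_load / total_load


# Hypothetical monitored data: four cabinets at 4 kW each, one running hot.
sample_readings = [(22.0, 4.0), (24.5, 4.0), (26.0, 4.0), (29.5, 4.0)]
print(f"IT Thermal Conformance: {it_thermal_conformance(sample_readings):.1f}%")
```

Weighting by load rather than counting cabinets is a design choice in this sketch; a facility that only has per-cabinet temperatures could pass a load of 1.0 for every reading to get a simple count-based percentage.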

The Performance Indicator view of IT Thermal Conformance and PUEr displays the data in a balanced way that is easier to interpret because the targets are shown, so that business as well as technical assessments can be made. Clearly, calculating IT Thermal Resilience, or looking forward to future scenarios using simulation data to guide planning, does require additional effort. But this future performance is not covered by your monitored data, which only tells you about today and yesterday.

There is plenty of data to show that forward-looking operational planning and scientifically analyzing the impact of changes can avoid poor operational decisions. Whether the business decides that this additional effort is justifiable for forward planning or considering failure scenarios depends on the value perceived by the business.

For many legacy data centers, stranded capacity is often of the order of 30%, which could equate to millions of dollars. In large-scale data centers such as Microsoft’s, standardization of infrastructure and IT may make forward planning less important during operation. Experience suggests, however, that this is often because the design included many more details of the IT and load than are available for many enterprise data centers, so the evaluation has already been done in detail at the design stage. In any case, it is a business decision.

ASHRAE’s new standard 90.4 recommends using MLC and ELC instead of PUE. For the Performance Indicator, do we have to follow the MLC/ELC or old method PUE = facility load/IT load?

ASHRAE 90.4 is intended for design rather than operation. In that situation, it makes sense to require the designer to evaluate the efficiency of the mechanical system and the electrical system with a view to making sure both are efficient.

Key performance indicators, however, are used to understand whether your system is performing well or not. If it is, an understanding of where the issues are is not important. It therefore makes sense to use PUE, which provides an overview, for monitoring data center efficiency. If there is a problem, or the business decides to make a concerted effort to improve performance substantially, then it will probably require a more detailed assessment than just calculating MLC and ELC. So monitoring them individually (the process is not defined anyway), rather than monitoring PUE, which measures global power usage effectiveness, has no real benefit.

How do you prevent the Performance Indicator from limiting the improvement horizon of an organization that applies it to its data center, so that it only thinks inside the current “data center box”? For example, they have improved their PUE to the max of their PUE ratio, but are now not looking for further improvement choices outside their existing location.

The Performance Indicator is intended for assessment and ongoing improvement of a selected facility. The targets should be set to reasonable aspirational values that allow for the impact of specific characteristics, such as the local climate. If a facility in a different climate were being considered, then the business should consider setting a different target, for example, an improved PUEr target range. The Performance Indicator should not limit such consideration. Although the existing facility might be performing well compared with its target range, the warm climate may mean that it rates as PUEr(D), whereas the new facility may achieve PUEr(B) or even PUEr(A). Either way, the improved performance is explicitly indicated by the letter.

Do you have additional questions about the Performance Indicator? Download the white paper (it is free for members, $150 for non-members), watch “The Case for the Performance Indicator” or “The Performance Indicator: Assessing and Visualizing Data Center Cooling Performance” webcasts, or contact us. We are happy to answer your questions.