How to calculate efficiency across the AI power chain

Jeff Shepard
June 19, 2026

Like many aspects of artificial intelligence (AI), there’s not a single approach when calculating efficiency across the AI power chain. It depends on where you look. Are the calculations related to the facility layer, the compute and hardware layer, or the workload and algorithm layer?

At the facility level, efficiency is discussed in terms of power usage effectiveness (PUE) and its inverse, data center infrastructure efficiency (DCiE), where PUE = total facility power / IT equipment power and DCiE = 1/PUE.

A PUE of 1.0 indicates perfect efficiency with zero energy lost to non-compute functions. That’s not possible. But it is possible to get close, especially in modern hyperscale data centers that have PUEs as low as 1.05 (Table 1).

Table 1. Relationship between PUE and DCiE. (Table: Submer)
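
The two facility-level ratios can be sketched in a few lines of Python. The function names and the sample meter readings are illustrative assumptions, not values taken from Table 1. DCiE is returned as a fraction and printed as a percentage.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power cannot be below IT equipment power")
    return total_facility_kw / it_equipment_kw


def dcie(pue_value: float) -> float:
    """Data center infrastructure efficiency as a fraction: 1 / PUE."""
    return 1.0 / pue_value


# Hypothetical readings: 1,050 kW for the whole facility, 1,000 kW at the IT load.
example_pue = pue(1050.0, 1000.0)
example_dcie = dcie(example_pue)
print(f"PUE = {example_pue:.2f}, DCiE = {example_dcie:.1%}")
# prints "PUE = 1.05, DCiE = 95.2%"
```

With these assumed readings, about 95% of the facility power reaches the IT equipment and the rest goes to cooling, power conversion, and other overhead.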

There’s a relationship between PUE and the type of data center. Hyperscale data centers have the best PUEs. They are scalable and very large, with some reaching millions of square feet of computing space. They were developed to handle the needs of cloud computing, but increasingly, they are being dedicated to AI training and inference operations. High utilization rates are one of the keys to achieving the lowest PUEs.

Modern and traditional data centers have higher PUEs and are used for more variable activities like data analytics, media delivery, e-commerce, financial services, healthcare records, and so on. Inefficient data centers with a PUE of 2.0 and higher are legacy sites, usually built before the development of PUE efficiency metrics (Figure 1).

Figure 1. PUE continuum and types of data centers. (Image: Dgtl Infra)

Training vs inference efficiency

Tera floating-point operations per second per watt (TFLOPS/W) and tokens per watt (tokens/W) are alternative ways to measure the computing output or processing power efficiency of AI data centers. TFLOPS/W measures raw compute capacity, while tokens/W is often preferred for measuring AI productivity.

Both are based on the total energy consumption of the data center. The best metric depends on the compute activity:

TFLOPS/W is the preferred efficiency formula for continuous compute-intensive activities like AI training. TFLOPS measures peak performance without considering memory bandwidth or other system-level design limitations.
Tokens/W is the preferred metric for AI inference operations.

More TFLOPS can be great, but the compute hardware can also be starved for data by memory limitations. Waiting for data is not productive. Usable AI work is related to tokens, not TFLOPS.
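
A minimal sketch of the two ratios follows. It assumes tokens/W means token throughput (tokens per second) divided by average power, which is numerically the same as tokens per joule. The `Accelerator` class and both sets of figures are hypothetical, chosen only to show how a device with more peak TFLOPS can still deliver fewer tokens per watt when it is starved for data. The same ratios apply at the data center level if total facility power is used instead.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    peak_tflops: float        # datasheet peak, not sustained throughput
    tokens_per_second: float  # measured on a real inference workload
    avg_power_w: float        # average power attributed to the workload

    def tflops_per_watt(self) -> float:
        return self.peak_tflops / self.avg_power_w

    def tokens_per_watt(self) -> float:
        # (tokens/s) / W is numerically tokens per joule
        return self.tokens_per_second / self.avg_power_w


# Hypothetical figures, for illustration only.
chip_a = Accelerator("A (memory-starved)", 1000.0, 5000.0, 700.0)
chip_b = Accelerator("B (balanced)", 800.0, 7000.0, 700.0)

for chip in (chip_a, chip_b):
    print(f"{chip.name}: {chip.tflops_per_watt():.2f} TFLOPS/W, "
          f"{chip.tokens_per_watt():.2f} tokens/W")
```

In this made-up comparison, chip A wins on TFLOPS/W while chip B wins on tokens/W, which is why the choice of metric should follow the workload.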

Tokens/W is a newer metric and is increasingly preferred for three reasons.

First, tokens are the data format required by AI models. Tokens can be text strings, images, audio clips, and videos. They are delivered as descriptive logical pieces that can be processed by AI models.

Second, tokens are how users pay for using AI models. More efficient token processing supports more AI revenue.

Third, each new generation of AI inference processors usually brings an order-of-magnitude increase in performance along with an increase in power consumption. However, token processing capacity is increasing faster than power consumption, so tracking tokens/W gives a better picture of the performance improvement of successive generations of AI processors.
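
The generational argument can be illustrated with simple arithmetic. The function name and the throughput and power figures below are hypothetical and do not describe any specific processor.

```python
def generation_gain(old_tps: float, old_w: float,
                    new_tps: float, new_w: float) -> tuple[float, float, float]:
    """Return (throughput ratio, power ratio, tokens/W ratio) between generations."""
    throughput_ratio = new_tps / old_tps
    power_ratio = new_w / old_w
    # tokens/W improves only if throughput grows faster than power
    return throughput_ratio, power_ratio, throughput_ratio / power_ratio


# Hypothetical: 1,000 tokens/s at 400 W versus 10,000 tokens/s at 1,000 W.
tput, power, efficiency = generation_gain(1000.0, 400.0, 10000.0, 1000.0)
print(f"throughput x{tput:.1f}, power x{power:.1f}, tokens/W x{efficiency:.1f}")
# prints "throughput x10.0, power x2.5, tokens/W x4.0"
```

Here the newer part draws 2.5 times the power, yet each watt produces four times as many tokens. Power consumption alone would hide that gain.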

It’s not just hardware that impacts AI energy consumption; optimizing AI inference algorithms to maximize queries processed per unit of energy is another important tool.

Improving inference efficiency

While maximizing the efficiency of both training and inference is important, improvements in inference have a bigger long-term impact. Training can be energy-intensive, but once it is completed, inference tasks and the associated energy savings continue indefinitely.
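
A rough sketch of why inference dominates over time is shown here. Every number is a hypothetical placeholder and the function names are assumptions made for illustration. The point is only that a one-time training cost is eventually exceeded by a recurring per-query cost.

```python
import math


def queries_to_match_training(training_kwh: float, wh_per_query: float) -> int:
    """Number of inference queries whose total energy equals the training energy."""
    return math.ceil(training_kwh * 1000.0 / wh_per_query)


def lifetime_saving_kwh(queries: float, wh_before: float, wh_after: float) -> float:
    """Energy saved over a number of queries by a per-query efficiency gain."""
    return queries * (wh_before - wh_after) / 1000.0


# Hypothetical: 1,000,000 kWh to train, 0.5 Wh per query before optimization.
break_even = queries_to_match_training(1_000_000.0, 0.5)
saving = lifetime_saving_kwh(10_000_000_000, 0.5, 0.4)
print(f"inference matches training energy after {break_even:,} queries")
print(f"a 0.1 Wh/query improvement saves about {saving:,.0f} kWh over 10 billion queries")
```

Under these assumed values, a 20% per-query improvement recovers roughly the entire training energy over the model's deployed life.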

There are two basic approaches to improving AI energy efficiency. One deploys a predictive estimation tool while the other uses empirical testing designed to measure real-world hardware performance and energy efficiency.
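
The empirical approach can be sketched as follows: sample power while a workload runs, integrate the samples to get energy, and divide by the tokens produced. This is a generic illustration using trapezoidal integration, not the method of any tool cited in this article. The sample values are hypothetical. In practice, the readings would come from a power meter or accelerator telemetry.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate power samples over time with the trapezoidal rule."""
    if len(timestamps_s) != len(power_w) or len(timestamps_s) < 2:
        raise ValueError("need at least two matching time/power samples")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def joules_per_token(energy_j: float, tokens: int) -> float:
    """Energy cost of one generated token."""
    if tokens <= 0:
        raise ValueError("token count must be positive")
    return energy_j / tokens


# Hypothetical one-second samples taken during a short inference run.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
watts = [300.0, 500.0, 600.0, 500.0, 300.0]
run_energy = energy_joules(times, watts)
print(f"{run_energy:.0f} J total, {joules_per_token(run_energy, 950):.2f} J/token")
# prints "1900 J total, 2.00 J/token"
```

Joules per token is the reciprocal of tokens per joule, so the result maps directly onto the tokens/W metric discussed earlier.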

Tools for predictive optimization of energy efficiency at the AI workload and algorithm layers are rapidly maturing and can deliver an order of magnitude reduction in energy consumption for a given inference task, without sacrificing performance.

For example, researchers from MIT and the MIT-IBM Watson AI Lab developed a rapid prediction tool that estimates how much power will be consumed by running a particular AI workload on a certain processor or AI accelerator chip (Figure 2).

Figure 2. The EnergAIzer completes an end-to-end power estimation in about 1.8 seconds. (Image: arXiv)

Summary

AI energy efficiency can be calculated relative to the facility layer, the compute and hardware layer, or the workload and algorithm layer. Efficiency can differ for training versus inference activities. Software tools for estimating and optimizing AI energy consumption based on anticipated tasks and available hardware are being developed.

References

A faster way to estimate AI power consumption, MIT Office of Sustainability
AI energy use: New tools show which model consumes the most power, and why, Computer Science and Engineering, University of Michigan
EnergAIzer: Fast and Accurate GPU Power Estimation Framework for AI Workloads, arXiv
Energy Use of AI Inference: Efficiency Pathways and Test-Time Compute, Microsoft
How Next-Gen AI Data Centers Are Optimizing Power Efficiency With SiC, Microchip
How to Calculate the PUE of a Datacenter, Submer
How to Measure AI Model Energy Efficiency, Naitive
Optimize AI Data Center Power Integrity and Efficiency, Keysight
Power for AI Data Centers: Energy Demand, Grid Impacts, Challenges and Perspectives, MDPI energies
PUE (Power Usage Effectiveness): Optimizing Data Centers, Dgtl Infra
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization, 39th Conference on Neural Information Processing Systems
What Are the Power Requirements for AI Data Centers?, Hanwha Data Centers
Why ‘tokens per watt’ is crucial for measuring AI efficiency, Schneider Electric
