The new-age buildings coming up around us will be dominated by one type: the data centre. That's because many tasks, like search, coding, research, and audio and video generation, are moving to new-age tools such as ChatGPT, Gemini, Claude, Llama, DeepSeek and others.
Most of these tools started out as chatbots, but they can now carry out many services, from switching on your lights to writing 1,000+ lines of code.
To process these queries, the tools rely on GPUs hosted in a building called a data centre. If you are wondering how much energy tools like ChatGPT consume, refer to this blog.
In this blog, we want to work on a different question. We keep hearing about data centres coming up around us with specific capacities. What intrigued us is how many queries such a data centre can handle.
Since data centres come in different capacities, let's calculate it for a 1 GW data centre.
What does a 1 GW data centre contain?
Data centres mainly contain GPUs mounted in server racks. These racks are supported by cooling (HVAC) to keep the equipment from overheating, and by backup power in the form of UPS systems. Then there is other equipment like lighting, access controls, power distribution units and fire suppression systems.

Now, how much of the consumption in a 1 GW-rated data centre belongs to the GPUs?
A data centre's efficiency is measured by a metric called PUE (Power Usage Effectiveness): the total energy drawn by the facility divided by the energy consumed by the IT equipment alone.
The PUE of today's data centres varies anywhere between 1.12 and 2.4, depending on what they are built for and how sophisticated their power and cooling systems are.
Let's assume this 1 GW data centre is about average, with a PUE of 1.5.
So the IT equipment load would be 1 GW / 1.5 ≈ 0.67 GW, or about 670 MW.
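To make the arithmetic explicit, here is a minimal sketch of that step. The 1 GW facility rating and the 1.5 PUE are the assumptions from above, not measured figures:

```python
# Back-of-envelope: IT load of a 1 GW data centre at an assumed PUE of 1.5
facility_power_w = 1e9   # 1 GW total facility rating (assumption)
pue = 1.5                # assumed Power Usage Effectiveness

# PUE = total facility power / IT equipment power, so:
it_power_w = facility_power_w / pue
print(f"IT equipment load: {it_power_w / 1e6:.0f} MW")  # ~667 MW
```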
For GPU workloads, let's assume an 80% utilisation rate. That translates to 80% × 670 MW ≈ 536 MW of GPU load.
As the Nvidia H100 has been a prominent GPU in the market, let's assume this data centre is filled with them. The PCIe version comes with a power rating of 350 W.
At peak processing, 536 MW of GPU load works out to 536 MW / 350 W ≈ 1.53 million GPUs at this location.
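Continuing the sketch, the GPU count follows from the assumed 80% factor and the 350 W rating of an H100 PCIe card:

```python
# GPU count under the assumptions above
it_power_w = 1e9 / 1.5    # ~667 MW of IT load (from the PUE step)
gpu_factor = 0.80         # assumed 80% factor applied to the IT load
gpu_power_w = 350         # Nvidia H100 PCIe power rating in watts

gpu_load_w = it_power_w * gpu_factor     # ~536 MW of GPU load
gpu_count = gpu_load_w / gpu_power_w     # ~1.53 million GPUs
print(f"GPU load: {gpu_load_w / 1e6:.0f} MW, GPUs: {gpu_count / 1e6:.2f} million")
```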
Now that we know how many GPUs the data centre has, the next question is...
How many queries can a single GPU handle?
As per this publication, state-of-the-art inference systems achieve 450+ requests per second per GPU through techniques like request batching, quantisation, and intelligent routing.
Mid-range systems handle around 115 requests per second per GPU, while a conservative estimate would be 40 requests per second.
And for deep-reasoning requests, throughput comes down to 5–10 requests per second.
Considering the varied sizes of requests and, importantly, the different processing pipelines of the various LLM companies, let's assume a single GPU handles an average of 120 requests per second.
That translates to 1.53 million × 120 ≈ 183 million requests per second for this data centre.
Over a minute, that would be ~11 billion requests.
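Putting it together, here is a minimal sketch of the throughput estimate. The 120 requests per second per GPU is the assumed blended average from above:

```python
# Aggregate request throughput under the assumptions above
gpu_count = 1.53e6               # ~1.53 million GPUs (from the power step)
requests_per_gpu_per_sec = 120   # assumed blended average across workload types

total_rps = gpu_count * requests_per_gpu_per_sec  # ~183 million requests/second
total_rpm = total_rps * 60                        # ~11 billion requests/minute
print(f"{total_rps / 1e6:.1f} million requests/second")
print(f"{total_rpm / 1e9:.1f} billion requests/minute")
```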
Isn’t it huge? What do you think?
Please do visit our website for more information, and maybe tell your friends about it too. It will go a long way towards managing our energy and environment better.

