We’re entering a new era of generative AI, but that boom in development comes at a cost – taxing resources and causing environmental harm

Canada’s federal institutions are being advised to question whether generative AI is truly the answer. Photo illustration by The Globe and Mail/iStockPhoto/Getty Images

It’s bedtime, and your kid wants a story. You, however, recoil at the thought of reading a book you’ve been through 100 times. So you turn to a chatbot powered by artificial intelligence and type a prompt into your phone: “Write a children’s story about a friendly giraffe.”

Your request is pinged to a data centre somewhere in the world, a massive facility containing beastly supercomputers. The place is thrumming with electricity, drawing huge quantities of power, which could be generated by hydro or nuclear, or less environmentally friendly means such as natural gas or coal. Inside these supercomputers is an array of sophisticated chips made with silicon and copper, and more exotic materials including tantalum and palladium – all mined from the Earth, processed and shipped from somewhere else in the world.

These graphics processing units (GPUs), the workhorse of AI, radiate intense heat, and large volumes of water are needed to keep the place from overheating. An AI model hosted here grinds away on your request to predict a string of words that will coherently flow together. You wait, oblivious to the physical infrastructure at work.

Scenes such as this will become much more common, if the hype is to be believed. We are entering a new era of generative AI, with the ability to use machines to create and interpret text, audio, images and video set to change everything, suffusing countless products and services. For that to happen, the world will need more supercomputers, more data centres and more energy to power them. Data centres are already energy hogs, and generative AI is only fuelling the need. By 2026, AI could contribute to a doubling of global electricity use by data centres to 1,000 terawatt-hours, according to the International Energy Agency – roughly the annual electricity consumption of all of Japan. A study published in the journal Joule estimated that by 2027, AI server units built by Nvidia Corp. alone could consume between 85.4 and 134 terawatt-hours of electricity. At the high end, that’s just shy of Ontario’s annual electricity demand.

[Chart: Global electricity demand from data centres for AI and cryptocurrencies, 2019-2026, in terawatt-hours. Low, base and high cases on a scale of 0 to 1,200. The Globe and Mail; Source: IEA]

Some of the projected investments in AI infrastructure are approaching the absurd, such as the US$100-billion supercomputer reportedly planned by Microsoft Corp. and OpenAI, while Meta Platforms Inc. increased its capital-spending estimates this year to US$40-billion to support AI. Canada has more modest ambitions. The federal government announced a $2.4-billion package in March, most of which will go to help companies and researchers access AI computing services and build infrastructure.

These plans come with a cost: increasing energy needs, strain on power grids, and potentially more carbon emissions and water use. At issue is how to support the development of generative AI without taxing resources and causing environmental harm. Part of the answer lies in efficiency, reconfiguring data centres so that these supercomputers can still train and run AI models, but without drawing so much electricity and water. Large cloud providers and AI developers such as Microsoft, Google, Meta and Amazon.com Inc. are already fixated on that goal, and many startups in Canada are now bringing promising solutions to market.

But it might also be best if generative AI doesn’t spread everywhere all at once.


While you wait for the chatbot to write a bedtime story, you are most certainly not thinking about any of this. A few years ago, even people who worked in AI didn’t consider it. Sasha Luccioni joined the Mila AI institute in Montreal in 2019, and contributed to a lengthy paper brimming with ideas for how machine learning could fight climate change, from reimagining electricity distribution to blasting aerosols into the stratosphere to limit solar radiation.

That same year, another paper came out from researchers at the University of Massachusetts Amherst. The authors calculated that training a large language model (LLM) could produce more than 284 tonnes of carbon dioxide equivalent, roughly the same amount produced during the lifespans of five cars in the United States. Dr. Luccioni was surprised. “The carbon footprint wasn’t even on our radar,” she said. “Even if you’re doing AI research, you still perceive it to be far away.”

We actually don’t have a firm handle on how much greenhouse gas is emitted by training and running generative AI models. Emissions calculations lump AI in with the much broader information and communications technology sector, which accounts for 2 per cent to 6 per cent of global greenhouse gas emissions. That’s on par with – or exceeds – the aviation industry.


Sasha Luccioni, artificial intelligence researcher and climate lead at AI company Hugging Face, has made it part of her mission to reduce AI’s contribution to carbon emissions. Adil Boukind/The Globe and Mail

Dr. Luccioni, who is now at AI company Hugging Face, has made it part of her mission to tease out AI’s contribution to emissions. For one study, she tracked the process of training an open-source LLM and found it sucked up enough energy to power 30 homes for a year and emitted close to 25 tonnes of carbon dioxide equivalent, roughly the output from driving a car five times around the planet. The two versions of Meta’s latest LLM, released in April, emitted 2,290 tonnes of carbon dioxide equivalent during training. (Meta noted that it offsets these emissions through its sustainability program.)

Still, that number is pretty small compared to other sources of emissions, and training a model is a kind of one-time event. Meta isn’t churning out huge new models every day. Inference, which refers to using AI applications after a model is built, such as conversing with OpenAI’s ChatGPT, is another matter.

Along with two colleagues, Dr. Luccioni published a study last November that found that generating text with an LLM required 0.047 kilowatt-hours of electricity for every 1,000 inferences, which is about the same as fully charging a smartphone nearly four times. Image generation, on average, used 60 times more energy. One model used nearly as much juice to fully charge a smartphone every single time it produced a picture. Consider how often people could be using these applications down the road – ChatGPT alone had close to 1.6 billion visits in February – and you can see why some researchers are uneasy about generative AI.
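The smartphone comparison is back-of-envelope arithmetic, and easy to verify. In the sketch below, the battery capacity is an assumed figure for a typical modern phone (roughly 3,300 mAh at 3.85 volts), not a number taken from the study:

```python
# Rough check of the inference-energy figures above. The phone battery
# capacity is an assumption (~3,300 mAh at 3.85 V); the study's own
# baseline may differ slightly.
ENERGY_PER_1000_TEXT_GENERATIONS_KWH = 0.047  # from the study
PHONE_BATTERY_KWH = 0.0127                    # assumed typical phone

charges = ENERGY_PER_1000_TEXT_GENERATIONS_KWH / PHONE_BATTERY_KWH
print(f"1,000 text generations ~ {charges:.1f} full phone charges")
# -> about 3.7, i.e. "nearly four times"

# Image generation averaged about 60 times the energy of text:
image_kwh = 60 * ENERGY_PER_1000_TEXT_GENERATIONS_KWH
print(f"1,000 images ~ {image_kwh:.1f} kWh "
      f"(~{image_kwh / PHONE_BATTERY_KWH:.0f} phone charges)")
```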

Not everyone is concerned, though. A group of researchers, including from Microsoft and the University of California, published an article in Nature in April pointing out that AI processors installed in 2023 accounted for 0.04 per cent of global electricity use, and calculated that AI is responsible for just 0.01 per cent of emissions. More reliance on clean power will mitigate these effects even as AI use grows, they argued. “Although there could be challenges for local electricity grids in regions where many data centres are based, from a global perspective, AI should not lead directly to large, near-term increase in greenhouse gas emissions,” they wrote. (The indirect effects, such as using AI to enhance oil and gas extraction, are less clear, they cautioned.)

Much depends on where future data centres are located, and how the electricity is generated. A danger is that utilities “may be compelled to re-engage the use of fossil fuels for power generation” in order to quickly meet the surging demand for electricity driven by AI, according to a recent report from Morningstar DBRS. Former U.S. energy secretary Ernest Moniz, speaking at a conference in March, similarly warned that utilities may resort to natural gas and coal. “We’re not going to build 100 gigawatts of new renewables in a few years,” he said.

Energy isn’t the only resource consumed by data centres either. Water is crucial to prevent the equipment from overheating. One common approach uses water to carry heat from the data centre to a cooling tower. Along the way, some of the water is evaporated to dissipate heat into the atmosphere.

Google’s Southland data centre in Council Bluffs, Iowa, in 2019. In 2022 alone, Google’s data centres consumed 19.7 billion litres of water. Brian Snyder/Reuters

Google’s data centres consumed 19.7 billion litres of water in 2022, which is equivalent to the water consumption of 2.5 million people. (Google replenished about 6 per cent of its freshwater consumption that year.) Generative AI could make data centres even thirstier. “It’s one of the fastest growing workloads in data centres,” said Shaolei Ren, an associate professor at the University of California, Riverside, who studies water use in data centres.

By 2027, AI computing could be responsible for up to 6.6-billion cubic metres of water withdrawal – equivalent to the annual withdrawal of six Denmarks – according to a study conducted by Prof. Ren and his colleagues. Withdrawal assumes at least some portion of the water is returned to the original source, whereas consumption measures the portion lost to the local environment, such as through evaporation. Withdrawal is nevertheless an important consideration, as it exacerbates competition for water with other users, such as farms and municipalities.

Last year, Prof. Ren and his colleagues estimated the amount of water required to train and use OpenAI’s GPT-3, the precursor to GPT-4. Training the model consumed 5.4 million litres of water in total, while every 10 to 50 responses from ChatGPT required around 500 millilitres. “If you combine that with the huge user base of ChatGPT, it’s definitely not a small number,” he said. GPT-4, which is much larger than its predecessor, could use even more water, according to the study.

But that’s really just a guess, to some extent. Researchers such as Prof. Ren and Dr. Luccioni are operating in a vacuum. Some companies don’t disclose enough information to gauge power consumption, emissions and water use. “People have been asking me for over a year now how much energy ChatGPT consumes,” Dr. Luccioni said. “It’s impossible to answer that question.”
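For models that researchers can download and run themselves, though, the operational footprint can be logged directly. Below is a minimal sketch using CodeCarbon, an open-source Python library built for this kind of audit; the training function is a stand-in, and the numbers it reports are estimates derived from measured power draw and the local grid’s carbon intensity:

```python
# Minimal sketch: estimating the energy use and emissions of a
# training run with the open-source CodeCarbon library
# (pip install codecarbon). train_model() is a placeholder.
from codecarbon import EmissionsTracker

def train_model():
    ...  # stand-in for a real training loop

tracker = EmissionsTracker(project_name="llm-training-audit")
tracker.start()
try:
    train_model()
finally:
    emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent

print(f"Estimated emissions: {emissions_kg:.3f} kg CO2eq")
```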

There are even more unknowns when it comes to the carbon emissions from GPU manufacturing. More than 60 per cent of the world’s semiconductor production takes place in Taiwan, which derives 43 per cent of its power from coal and another 40 per cent from natural gas. As for water use, there’s a lack of clarity there, too.

The result is that at a time when generative AI is poised to grow, we’re somewhat in the dark.


The Toronto-based company CentML specializes in optimizing how AI applications run. Lowering cost is a big incentive, but there are environmental benefits, too. Shay Conroy/The Globe and Mail

If data centres can be configured to use less energy and water when, say, powering the chatbot that writes your bedtime story, or that does much more intensive tasks such as training large AI models, there may be less cause for concern. If not, “there will be a substantial escalation in emissions attributable to the use of AI,” analysts at Morningstar DBRS wrote in April.

Phil Harris, the chief executive of Ottawa-based Cerio, sees it this way: “We’re still living in a pre-Copernican age of understanding of data centres.” The places are put together all wrong for the era of AI, in his view. GPUs draw power even when not in use, and are crammed together into a rack as tightly as possible, which generates intense concentrations of heat.

Cerio’s solution is to network GPUs differently to spread them throughout the facility. That has a couple of advantages, according to Mr. Harris. First, it allows GPUs to be used by any server in the data centre at any time, so that they’re not sitting idle and drawing power. Spreading out also reduces heat pockets, easing cooling requirements. “By distributing things, you even out the use of resources,” he said. Cerio has lined up a handful of early-access customers to try its platform, including one of the largest cloud computing firms in the world, and boasts that it can reduce electricity and cooling needs by up to 50 per cent.

Another possibility is to limit the power flowing to GPUs. For Vijay Gadepally, a senior scientist at a Massachusetts Institute of Technology (MIT) research lab, the idea was sparked by a chance conversation with a colleague about the colleague’s custom video-gaming PC. The colleague was venting about how hot the machine ran, and the two wondered whether they could limit its power draw without affecting performance. The colleague figured out how to do just that, and Dr. Gadepally thought about applying the concept to data centres.

At MIT, he and his colleagues experimented with throttling power to GPUs training an AI model, and found that the process only took a few hours longer than usual. They also reduced power to systems running LLMs and found the increase in latency – how long it takes to get a response, essentially – was negligible. GPUs, he explained, are designed to run flat out. When they get too hot, they ramp down to cool off, then go full bore again. The approach he experimented with at MIT, in contrast, is designed for more consistent performance, like driving a car at a steady clip. “You get there a little bit later, but it gives you a lot better gas mileage,” he said.
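On NVIDIA hardware, this kind of cap can be set through the vendor’s own management library. The sketch below uses the pynvml Python bindings; the 250-watt limit is purely illustrative, not a value from the MIT experiments, and changing it typically requires administrator privileges:

```python
# Sketch: capping a GPU's power draw via NVIDIA's management library
# (pip install nvidia-ml-py). The 250 W cap is illustrative only.
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

    default_mw = pynvml.nvmlDeviceGetPowerManagementDefaultLimit(handle)
    print(f"Default power limit: {default_mw / 1000:.0f} W")

    # NVML expresses limits in milliwatts; this sets a 250 W cap.
    pynvml.nvmlDeviceSetPowerManagementLimit(handle, 250_000)
finally:
    pynvml.nvmlShutdown()
```

The same cap can be applied from the command line with nvidia-smi’s power-limit flag; either way, the hardware simply runs at a steadier, lower draw.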

Dr. Gadepally, who is also the chief technology officer at Toronto-based cloud services provider Radium, said the company is implementing approaches such as capping power, which saves customers money and reduces energy and cooling needs. Radium is a lot smaller than the cloud computing giants, but has a growing list of customers. “We are absolutely getting customers that are making business decisions off of the sustainability story,” he said.

Open this photo in gallery:

Gennady Pekhimenko, an associate professor at the University of Toronto, co-founded CentML in 2022.Shay Conroy/The Globe and Mail

Other companies are looking at optimizing how AI applications run. Gennady Pekhimenko, for one, has been studying that topic for years. An associate professor at the University of Toronto, he parlayed his expertise to co-found CentML in 2022, which raised US$27-million last fall. One of its offerings is a platform that allows companies to deploy AI models more efficiently, squeezing more juice out of computing equipment. Lowering cost is a big incentive, but there are environmental benefits, too. “You’ll use less compute to do the same job, so it means you burn less power,” Prof. Pekhimenko said. “If the workload finishes three times faster, you technically spend three times less power.”

The data centres built by Denvr Dataworks, meanwhile, consume no water at all. Denvr, a cloud services provider based in Calgary, has built out a footprint mostly in the United States to service AI workloads. Instead of water, the company employs something called immersion cooling, in which GPUs are literally submerged in a chemical concoction that transfers heat away from the chips. From there, the heat is sent to a dry cooler, where it is exchanged with outside air. “We’re the only company doing this in Canada,” said CEO and co-founder Geoff Gordon.

The substance Denvr uses to cool GPUs is called CompuZol and is manufactured by chemical giant Lubrizol Corp. Denvr even poached Amy Short, a chemistry PhD who was heavily involved in the development of CompuZol. (The design of the data centres and the software layer that operates them are proprietary to Denvr.) “We have operational advantages because it’s cheaper for us to operate these things,” Mr. Gordon said. “We can deploy them fast, so we have all kinds of efficiency advantages.” Denvr’s facilities are smaller, too, as the GPUs can be spaced closer together because they’re not giving off so much heat.

Denvr is betting that the increasing demands of AI hardware require new approaches to cooling. The latest GPU models are so powerful and give off so much heat that traditional methods are impractical. “Air-conditioning units are becoming more and more stressed as they’re trying to manage all of the heat,” said Ms. Short, Denvr’s senior officer of immersion. “What I’ve seen of immersion cooling is that the advantages far outweigh the consequences of a traditional air-cooled deployment, and I don’t see that changing any time soon.”

Immersion cooling is far from standard today, and Prof. Ren, speaking generally, said that data centres in warm climates may still need to use water-consuming cooling towers to transfer heat from the facility to the outside world. As for CompuZol, it’s a hydrocarbon-based substance. If immersion cooling becomes the norm and data centres are awash in chemical solutions, the downstream effects of that entire supply chain will have to be studied and considered, too.

Ms. Short contended that advances in science will provide a way forward. “Change is coming,” she said, “and my hope is that the change actually is a step towards a more sustainable path.”


John Calderon, a software developer at CentML, demonstrates how the company’s open-source product, DeepView, can track the energy consumption and environmental impact of a machine-learning model. Shay Conroy/The Globe and Mail

Let’s say the data centre handling your request for a bedtime story is working better these days. It’s more environmentally friendly, having been tricked out with the latest solutions to draw less power and water. There’s still a catch. As the world builds more data centres – whether for AI or anything else – the total amount of electricity and water consumed could still rise, regardless of technological improvements. “These advancements can trigger a rebound effect whereby increasing efficiency leads to increased demand for AI, escalating rather than reducing total resource use,” wrote Alex de Vries, founder of research firm Digiconomist, in a paper published in the academic journal Joule.
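The rebound arithmetic is simple to illustrate. In the toy calculation below, every number is assumed: if efficiency gains halve the energy per query while demand triples, total consumption still rises by half:

```python
# Toy illustration of the rebound effect. All figures are assumptions.
energy_per_query_wh = 3.0        # before efficiency gains
queries_per_day = 1_000_000_000  # before demand grows

baseline_wh = energy_per_query_wh * queries_per_day

# Efficiency halves the cost per query, but demand triples:
rebound_wh = (energy_per_query_wh / 2) * (queries_per_day * 3)

print(f"Total energy use changes by {rebound_wh / baseline_wh:.1f}x")  # 1.5x
```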

Take a look at water use. “Companies were working to improve their water efficiency in the last few years, but the total amount is still increasing,” said Prof. Ren. Google’s water usage in its data centres grew by 20 per cent in 2022 compared with the year before, while Microsoft reported a 34-per-cent surge, which is at least partly attributable to AI, he said.

When it comes to generative AI, then, businesses should really be asking whether it’s the right approach for whatever problem needs solving. Some companies, for example, are trying to replace traditional internet search with tools powered by generative AI, which requires 10 times as much energy per query, by one measure. Is the extra processing power necessary to craft those answers really worth it? (Not to mention all of the reliability and accuracy problems with LLMs.)
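At the scale of web search, that tenfold gap compounds quickly. The sketch below uses assumed figures (roughly 0.3 watt-hours for a conventional query, ten times that for a generative one, and nine billion queries a day); none of them come from the studies cited above:

```python
# Rough sketch of what swapping conventional search for generative AI
# could cost in energy. All figures are assumptions for illustration.
CONVENTIONAL_WH = 0.3                 # per conventional search query
GENERATIVE_WH = 10 * CONVENTIONAL_WH  # "10 times as much energy"
QUERIES_PER_DAY = 9_000_000_000

extra_wh_per_day = (GENERATIVE_WH - CONVENTIONAL_WH) * QUERIES_PER_DAY
extra_gwh_per_day = extra_wh_per_day / 1e9
extra_twh_per_year = extra_gwh_per_day * 365 / 1000

print(f"Extra demand: ~{extra_gwh_per_day:.0f} GWh/day, "
      f"~{extra_twh_per_year:.1f} TWh/year")
```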

Canada’s federal institutions, at least, are being advised to question whether generative AI is truly the answer before rolling out such tools. The Treasury Board Secretariat released a guide for federal institutions last September, advising that the use of generative AI should be “balanced against the need for swift and drastic action to reduce global greenhouse gas emissions and avert irreversible damage to the environment.”

For you, at home and in need of a bedtime story, weighing the environmental implications of such a simple request is frankly an unrealistic burden to place on any individual user. It’s just one of millions of questions asked of a vast, unseen and growing computing network powered by industry. The wait for the chatbot to write a story for your kid is just a second or two. You see the words appearing on your screen. “Once upon a time, in a lush, green forest, there lived a friendly giraffe named Gigi,” you read aloud. It’s adequate, succinct and surely something you could have thought of yourself.
