Truth Is Just Perception

The Three Shortages Affecting Generative Artificial Intelligence

Two of these shortages pose enormous environmental challenges.

The first is energy-related.

For instance, creating a single image with generative artificial intelligence consumes about as much energy as fully charging a smartphone. Producing text is far less energy-intensive: a thousand text generations use only 16% of the energy of a full charge. In total, the annual energy consumption of artificial intelligence is equivalent to that of a country like the Netherlands.

One might think that training large language models consumes even more energy than their inference. If you allocated a drop of water to each calculation needed to train Google’s PaLM model, for example, the resulting volume would be enough to fill the Pacific Ocean. Yet the use of these models, which field several million or even tens of millions of queries every day, is even more voracious than their training.

This is why, as part of its goal of consuming 100% carbon-free energy by 2030, Microsoft has signed the largest renewable energy purchase contract ever concluded by a company, at an estimated cost of 11.5 to 17 billion dollars. Amazon Web Services will pay up to 650 million dollars to set up a data center in Pennsylvania next to a nuclear power station. Artificial intelligence can, however, help solve the energy problem it creates: researchers at Princeton have used it to make a breakthrough in the control of fusion reactions, an important step toward making plasma fusion a virtually unlimited, non-polluting energy source.

There are two ways of reducing the power consumption of generative artificial intelligence: using it sensibly (as The Wall Street Journal put it a few months ago, using a chatbot to summarize an email “is like getting a Lamborghini to deliver a pizza”) and using smaller models, optimized for specific tasks, rather than large general-purpose ones. In this regard, Microsoft Research recently presented a new variant of language model whose weights are each encoded on a single bit.
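To give an idea of how such extreme quantization works, here is a minimal sketch, not Microsoft’s actual implementation: each floating-point weight is replaced by its sign (one bit of information), and a single shared scale factor per tensor preserves the overall magnitude.

```python
# Minimal sketch of 1-bit weight quantization (illustrative only,
# not the actual BitNet method from Microsoft Research).

def quantize_1bit(weights):
    """Replace each float weight with its sign, plus one shared scale.

    The scale is the mean absolute value of the original weights,
    so the dequantized tensor keeps roughly the same magnitude.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]  # one bit per weight
    return signs, scale

def dequantize(signs, scale):
    """Reconstruct approximate float weights from signs and scale."""
    return [s * scale for s in signs]

weights = [0.4, -0.2, 0.1, -0.3]
signs, scale = quantize_1bit(weights)
approx = dequantize(signs, scale)
```

Storing one bit per weight instead of 16 or 32 shrinks memory and energy use dramatically, at the cost of precision, which is why such models must be trained with the quantization in mind rather than quantized after the fact.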

Illustration created with DALL-E 3

These two measures would also help moderate the second environmental risk, which stems from the water consumed by artificial intelligence systems. While training a large language model costs 126,000 liters of water, a conversation with ChatGPT uses around half a liter. Here too, the major players in artificial intelligence, led by Google, Meta and Microsoft, plan to mitigate their ecological impact by striving, through various dedicated projects, to replenish more water than they consume by 2030.

The third shortage concerns data.

The latest “AI Index Report” from Stanford University predicts that the high-quality linguistic data needed to train generative artificial intelligence models will probably run out later this year. Until now, large language models have progressed largely by increasing the volume of data on which they are trained. If large models continue to dominate at the expense of more targeted ones (see above), they risk soon being partially trained on synthetic data, i.e. data created by other artificial intelligences, which would perpetuate biases, errors and misinformation.

A promising way to avoid this drift is retrieval-augmented generation (RAG), which gives artificial intelligence models access to data, usually private (belonging to a company), that is more relevant than what is freely accessible on the Internet. This is, for example, the logic behind the agreements concluded between the creators of major models and some media outlets, allowing the former to legally exploit the latter’s archives and news feeds. The technique is primarily used to reduce these tools’ tendency to hallucinate (i.e. to invent information they do not know), but it can also be used to optimize training by favoring the quality of training data over its quantity.
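The RAG idea described above can be sketched in a few lines. This is a deliberately naive illustration: real systems rank documents with vector embeddings rather than keyword overlap, and the final prompt would be sent to an actual language model, which is omitted here.

```python
# Minimal sketch of retrieval-augmented generation (RAG).
# Retrieval is naive keyword overlap for illustration; production
# systems use embedding-based similarity search instead.

def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query; return the top k."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: -len(query_words & set(doc.lower().split())),
    )
    return ranked[:k]

def build_prompt(query, documents):
    """Prepend the most relevant private documents to the user's question."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

company_docs = [
    "The 2023 annual report shows revenue grew 12 percent.",
    "The cafeteria menu changes every Monday.",
]
prompt = build_prompt("What does the annual report say about revenue?", company_docs)
```

Because the model answers from the retrieved context rather than from memory alone, grounded passages like this are what curbs hallucination: the relevant internal document ends up in the prompt, and the irrelevant one does not.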
