Monday, December 30, 2024

Energy and Information

by Pierluigi Contucci

a contribution to appear in the joint work 

"AI, Energy and Economy"

by Pierluigi Contucci, Fu Jun and Carlo Alberto Nucci

 

The first quarter of this millennium marks a new and intense relationship between the first and the latest industrial revolutions. If the Industrial Revolution of the 18th century transformed mechanical energy into productive power, freeing humanity from the limits of physical strength, the AI revolution is now emancipating human thought from the constraints of data processing and is proceeding, albeit in small steps, to transform information into knowledge. Both revolutions have represented moments of rupture, accelerating economic, social, and cultural transformations. However, artificial intelligence stands out for the speed of its diffusion and its ability to penetrate every aspect of daily life, outlining unprecedented scenarios of opportunities and risks. Just as energy was the driving force that powered machines and transformed raw materials into goods during the first industrial revolution, information is now the fundamental resource fueling AI systems, enabling the transformation of raw data into insights, predictions, and decisions. Both energy and information act as essential enablers of progress, one amplifying physical labor and the other expanding cognitive capabilities, marking parallel leaps in humanity’s ability to shape and control its environment.

Energy and Information are concepts that belong to the so-called hard sciences, those that have found in mathematics their natural language. Energy, with its millennia-long history, is a well-established concept whose rigorous modern foundations were laid by Newton and refined by Carnot with the advent of Thermodynamics. It was precisely Thermodynamics that, at the turn of the 18th and 19th centuries, transformed engine technology from a pre-scientific collection of techniques into a powerful scientific corpus within a few decades.

Information Theory, in contrast, is a young science whose foundations, laid by Shannon and Turing, date back only about eighty years. In that relatively short window of time, computer science developed its rigorous body of knowledge, built on code written by humans and culminating in the symbolic methods of classical AI. Its most advanced expression—the modern AI of deep learning and transformers—is currently experiencing its pre-scientific phase, much like engine technology did two centuries ago. Scientific research is working very hard to reach a full comprehension of the spectacular abilities that intelligent machines achieve nowadays. The quest for the “thermodynamics of learning”, as Yann LeCun describes it, is making progress with the tools of complexity theory invented by Giorgio Parisi and further developed by John Hopfield, Geoffrey Hinton, and others.

In this short chapter we aim to use the first industrial revolution as a proxy to better understand the current industrial revolution. We are, ultimately, interested in organizing our thoughts about what lies ahead from the encounter (or clash?) between these two revolutions and the respective entities at their core—energy and information. 

Let us start with our own organism. It has a power consumption of about one hundred watts (equivalently, about two thousand kilocalories per day), of which our brain uses between twenty and thirty percent, despite accounting for only two percent of our body weight. This remarkably high investment in cognition—unique among living entities—has proven invaluable, enabling us to survive and develop to our current level. With it, we discovered fire, allowing us to metabolize external energy, as well as the lever and the wheel to make our efforts more efficient. On these foundations, great civilizations arose, culminating in the modern era, which now possesses what we call science—a collective knowledge capable of greatly improving our survival chances while enhancing our quality of life.
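The equivalence between the two figures above can be checked with a back-of-the-envelope calculation; the numbers (two thousand kilocalories per day, a twenty to thirty percent brain share) come from the text, and the conversion constant is the standard 4184 joules per kilocalorie.

```python
# Convert a daily caloric intake into an average power in watts.
KCAL_TO_JOULES = 4184            # 1 kilocalorie = 4184 joules
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds

daily_intake_kcal = 2000
body_power_watts = daily_intake_kcal * KCAL_TO_JOULES / SECONDS_PER_DAY
print(round(body_power_watts))   # 97 -> about one hundred watts

# The brain's share of that budget: twenty to thirty percent.
brain_low = 0.20 * body_power_watts
brain_high = 0.30 * body_power_watts
print(round(brain_low), round(brain_high))   # 19 29 -> roughly 20-30 W
```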

In the sixteenth century, per capita energy consumption in Europe was already approximately ten times higher than our basal biological needs. This energy was derived primarily from biomass—wood and charcoal—for cooking, heating, and other basic artisanal activities. Additional sources included animal power, such as oxen, horses, and other animals, which were used for transportation and agriculture. Wind and water power also played a crucial role, with windmills and watermills employed for grinding grain, sawing wood, and similar tasks. This level of energy consumption remained relatively constant until the mid-nineteenth century, when the onset of the first industrial revolution brought about a dramatic transformation (Kanter et al., 2014). In less than a century, per capita energy consumption grew steadily, reaching levels almost thirty times the basal need—a growth interrupted only during the two world wars. From the 1950s to the present, consumption has grown to about one hundred times the basal need on a global average, nearly two hundred times in advanced economies, with peaks over three hundred.

Another critical factor to consider is the Economic Productivity of Energy (EPE)—the wealth produced per unit of energy. Measured, for instance, in USD/kWh, it reflects the efficiency of societal organization, including people, institutions, markets, and governments. A country with a high EPE is wealthier than one with a lower EPE that uses the same energy. Historical data from the industrial revolution reveal a fascinating trend: countries that industrialized rapidly experienced a sudden drop in EPE. In England and Wales, where industrialization was the most intense, the EPE fell from 0.21 to 0.11 in the early decades of the Industrial Revolution. By contrast, Italy, where industrialization arrived much later, maintained a high EPE of 0.28 during the same period but ended up with a GDP roughly half that of the UK (Malanima, 2016).

This phenomenon can be plausibly explained by several factors. First, the development of infrastructure lagged behind the explosive growth of industrial production. Second, excessive enthusiasm among financial investors during the industrialization gold rush led to risky behaviors, frequently resulting in financial bubbles (Perez, 2002). Compounding this was the societal "hype" surrounding technological advancements, amplified by misinformation, which created emotional peaks and troughs. 

The countries that ultimately capitalized most on the industrial revolution were those that established robust education systems linked to science and technology—especially polytechnic universities—and promoted sound information dissemination across society.

It is now natural to ask how EPE depends on GDP and energy. A remarkable finding (Jancovici; Fu Jun) is that, on a global average, EPE is roughly constant: doubling the energy intake most plausibly results in a doubling of GDP (Contucci, Osabutey and Zimmaro).
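The constant-EPE relation can be made explicit with a minimal sketch. EPE is the ratio of GDP to energy consumed (here in USD/kWh, as in the text); the GDP and energy figures below are purely illustrative, not data from the source.

```python
def epe(gdp_usd: float, energy_kwh: float) -> float:
    """Economic Productivity of Energy: wealth produced per unit of energy."""
    return gdp_usd / energy_kwh

# Hypothetical baseline economy (illustrative numbers only).
baseline_gdp = 1.0e12      # USD
baseline_energy = 5.0e12   # kWh
k = epe(baseline_gdp, baseline_energy)   # 0.2 USD/kWh

# Constant-EPE prediction: GDP scales linearly with energy intake.
doubled_energy = 2 * baseline_energy
predicted_gdp = k * doubled_energy
print(predicted_gdp / baseline_gdp)      # 2.0 -> doubling energy doubles GDP
```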

With these premises in place, we can now offer a few considerations on the ongoing AI industrial revolution. The energetic impact of information technology can be split into two distinct regimes—before and after the advent of deep learning. This transition can be approximately pinpointed to 2010–2012.

Before this period, data centers and computing infrastructures consumed about 1% of the world’s electricity, mostly for cooling systems and network hardware. Growth was driven by cloud computing, web services, and the expansion of the internet economy. Despite the growing demand for computation, increased efficiency and optimization of architectures slowed the rise in energy demand. Data transmission and broadband internet during the same period accounted for about 2% of global electricity consumption, with fiber optics, base stations, and routing infrastructure as key contributors.

Computational models before deep learning, including basic machine learning algorithms (e.g., low-dimensional models like decision trees and Bayesian networks), were computationally undemanding. Training these models generally required structured data and ran on standard CPUs rather than energy-hungry GPUs. High-Performance Computing (HPC) was probably the only high-energy-impact activity, but, being confined to scientific research and largely removed from large-scale industrial applications, it had a negligible global effect.

From 2012 to the present, the rise of deep learning has driven GPU adoption, resulting in exponential increases in energy demands. Natural language processing models like GPT-3 and GPT-4 are among the most energy-demanding due to their scale and complexity. Diffusion models for image and video generation consume significant power during training and inference. Multimodal systems like GPT-4 require even more resources due to their ability to process and integrate multiple data types. Models like AlphaGo highlight the computational cost of reinforcement learning in simulation-heavy tasks, while systems like AlphaFold show the cost of deep learning applied to large-scale scientific prediction.

The vertical increase in energy consumption, however, is conceptually clear and model-independent. Computer Science before Deep Learning relied on classical computation with standard programming. This process had a small energetic impact because it functioned somewhat like an intellectual lever or wheel, preserving the intelligence encoded by human programmers. Deep Learning, by contrast, is a completely new mental-like engine: it transforms raw, unstructured data into knowledge, incorporating its own (albeit still modest) intelligence.

Are these achievements scalable? And is the energetic effort required to achieve them sustainable?

In 2018, leading research and analysis think tanks predicted that, based on observed growth data, AI could consume half of the available energy by 2050. That prediction, distant and marked by large margins of uncertainty, was hastily downplayed and dismissed with indifference. Behind the scenes, however, the AI revolution continued its inexorable ascent. In rapid succession, the steps necessary to build the generative AI of large language models were completed in laboratories, culminating in the sensational breakthrough of ChatGPT at the end of 2022. Since then, OpenAI, the company behind ChatGPT, and its competitors have been growing at an impressive rate. With a user base now in the hundreds of millions, investors are eagerly lining up to board the AI train.

The energetic impact of the ongoing revolution is not just rumor or hype. From technical studies (DeepMind, 2022), tested and reproduced by different laboratories, new evidence has emerged: bigger is better! That is, if we build larger machines and provide them with more data to process, they will, in their own way, become proportionally more intelligent—at least within the limits of current tests.

It is difficult to predict how large the new machines will be, but under these competitive conditions there is a symbolic yet significant benchmark to match: the human brain—that is, a machine with a number of synaptic parameters comparable to our own synapses. Unfortunately, with machines of that scale, the earlier forecast for energy consumption by 2050 has now been brought forward to 2030. Physics, in fact, presents us with the bill: the energy requirements for a single machine of this magnitude are on the order of hundreds of millions of watts (compared to the human brain's roughly thirty watts).
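The gap between the two figures just quoted is worth making explicit; taking 300 MW as a representative value within "hundreds of millions of watts" (an assumption for illustration), the ratio works out as follows.

```python
# Orders of magnitude: a brain-scale machine versus the brain itself.
machine_power_w = 3e8   # assumed ~300 MW, within "hundreds of millions of watts"
brain_power_w = 30      # the human brain's power budget, from the text

ratio = machine_power_w / brain_power_w
print(f"{ratio:.0e}")   # 1e+07: roughly ten million brains' worth of power
```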

In fact, such a large-scale machine will not be satisfied with a pre-training phase followed by refinement, as current models operate. Instead, it will require continuous learning and continuous interaction. It is impossible to draw that level of energy from the commercial grid; a dedicated source will be needed—a power plant. Existing power plants, however, are already heavily utilized, and we cannot afford to sacrifice half of our current energy resources. It is not just a matter of losing half of what we do and have—it is about losing half of what we are.

The first solution that matches both the scale and the urgency of this energy need appears to be nuclear power. For instance, an SMR (Small Modular Reactor) would be a perfect fit to power one of these machines. As we know, this is no longer just a fantasy. Over the past year, digital giants have signed agreements with companies producing nuclear energy by fission, the old-fashioned kind. They are even making small, symbolic investments—more for publicity than substance—in nuclear fusion.

The bottom line is that we are about to face the highest growth in energy demand in our entire history, and the questions ahead of us urgently need answers. What will be the actual power that the AI market demands? In the turmoil that follows, will the AI revolution cause a drop in the EPE index? And how long will it take to rise again? It is exceedingly difficult to estimate the GDP produced per unit of energy in the future. If it were to follow a trajectory similar to that of the first industrial revolution, the consequences could unfold in a chain reaction. The demand for energy would increase dangerously, making efforts to combat climate change through green technologies—which often carry high costs—nearly impossible. The next step could be the loss of control over energy prices and mounting pressure to burn anything available just to produce energy—the nightmare scenario for every government and rational thinker.

Are these risks really inevitable? What is the best strategy to mitigate them? For now we do not have a recipe, only good ingredients. First and foremost, we must transform this technology into a science in order to optimize its energy consumption. To do so, we need massive investments in research and global development efforts that foster collaboration among universities, research centers, startups, industries, and society.