The rapid growth of large language models (LLMs), such as GPT-3 and BLOOM, has revolutionized the field of artificial intelligence (AI). These powerful models can potentially automate and enhance various aspects of human endeavour. However, there is a pressing concern regarding the environmental impact of these models. The training and operation of LLMs rely heavily on vast computational resources, resulting in a substantial carbon footprint.
In this article, we will delve into the implications of the growing carbon emissions associated with LLMs and explore strategies to mitigate their environmental impact.
The Hidden Emissions of Language Giants
The evolution of LLMs has been nothing short of meteoric. From GPT-3, with its 175 billion parameters, to GPT-4 and beyond, the complexity and capabilities of these models have soared. However, with great power comes an equally significant energy demand. The training and operation of LLMs are intrinsically tied to vast computational resources, which in turn are powered by electricity, much of which is still generated from fossil fuels.
The development and deployment of LLMs have led to a surge in energy consumption. The training process alone can emit a significant amount of carbon dioxide. For instance, Hugging Face’s BLOOM model emitted 25 metric tons of CO2 during training, and when considering the entire lifecycle, this figure doubled to 50 metric tons.
The 50-ton lifecycle figure is comparable to the carbon footprint of approximately 60 flights between London and New York. Emissions also vary with the energy grid used for training: the same training run produces far more CO2 in a region reliant on fossil fuels than in one supplied mostly by low-carbon electricity.
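To make the grid dependence concrete, here is a minimal sketch of the usual back-of-the-envelope estimate: energy consumed multiplied by the carbon intensity of the grid. The energy figure, the two grid intensities, and the function name are hypothetical values chosen for illustration; they are not measurements of BLOOM or of any real data centre.

```python
def training_emissions_tonnes(energy_mwh: float, grid_g_per_kwh: float) -> float:
    """Estimate CO2 in metric tons as energy (kWh) x grid intensity (gCO2/kWh)."""
    energy_kwh = energy_mwh * 1_000
    grams_co2 = energy_kwh * grid_g_per_kwh
    return grams_co2 / 1_000_000  # grams -> metric tons


# Hypothetical training run and grids (illustrative numbers only).
ENERGY_MWH = 400.0
GRIDS_G_PER_KWH = {
    "low_carbon_grid": 50.0,
    "fossil_heavy_grid": 500.0,
}

estimates = {
    name: training_emissions_tonnes(ENERGY_MWH, intensity)
    for name, intensity in GRIDS_G_PER_KWH.items()
}

for name, tonnes in estimates.items():
    print(f"{name}: {tonnes:.1f} t CO2")
```

Under these assumed numbers the same run differs by a factor of ten between the two grids, which is why the location of training matters as much as the size of the model.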
Google’s large language model, PaLM, illustrates the scale of the issue. With 540 billion parameters, PaLM required thousands of advanced high-performance chips for its training (Google reported using 6,144 TPU v4 chips), each contributing to the growing carbon emissions associated with LLMs.
The Underbelly of Innovation
The carbon emissions associated with LLMs extend beyond their operational phase. The manufacturing of the hardware required to support these models, the maintenance of data centres, and the disposal of electronic waste all contribute to their environmental impact.
Additionally, the post-training operation of LLMs continues to demand significant energy, resulting in ongoing emissions. For example, BLOOM emitted approximately 19 kilograms of CO2 daily post-launch, equivalent to the emissions generated by driving around 54 miles in an average new car.
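A quick worked calculation puts the daily figure in context. The sketch below uses only the numbers quoted in this article (25 metric tons for training, 19 kilograms per day in use, 54 miles of driving) and assumes, purely for illustration, that the daily rate stays constant over time.

```python
TRAINING_KG = 25_000     # BLOOM training emissions quoted above (25 metric tons)
DAILY_KG = 19            # post-launch emissions per day quoted above
EQUIVALENT_MILES = 54    # driving distance quoted above

# Assumption: the daily rate stays constant, which real traffic will not do.
yearly_tonnes = DAILY_KG * 365 / 1_000
days_to_match_training = TRAINING_KG / DAILY_KG
kg_per_mile = DAILY_KG / EQUIVALENT_MILES

print(f"Operation per year: {yearly_tonnes:.2f} t CO2")
print(f"Days of operation to equal training: {days_to_match_training:.0f}")
print(f"Implied car emissions: {kg_per_mile:.2f} kg CO2 per mile")
```

At that rate, operation adds roughly 6.9 metric tons per year and would take about 3.6 years to equal the training emissions, so inference is a slower but persistent contributor.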
Towards Greener Synapses
Efforts to address the carbon footprint of LLMs are gaining traction within the tech community. Several strategies have emerged to mitigate the environmental impact of these language giants:
1. Renewable Energy Procurement
One approach to sustainability in AI is to procure renewable energy and pair it with a demand-side intervention, load shifting, which reschedules electricity demand to align with renewable energy availability. Procuring renewable energy for training LLMs can result in emissions reductions of up to 40% compared to relying solely on fossil fuel-based grids. Load shifting is particularly feasible for workloads that are not latency-bound, such as ML training, because the compute can be moved to a different time or region without affecting system performance.
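As a rough illustration of load shifting in time, the sketch below picks the start hour that minimises the average grid carbon intensity over the duration of a training job. The hourly forecast values and the function name `greenest_start` are hypothetical; a real scheduler would pull a forecast from its grid operator or a carbon-intensity data provider.

```python
def greenest_start(forecast_g_per_kwh, job_hours):
    """Return (start_hour, mean_intensity) of the lowest-carbon contiguous window."""
    if job_hours <= 0 or job_hours > len(forecast_g_per_kwh):
        raise ValueError("job_hours must fit inside the forecast")

    # Sliding-window sum over the forecast.
    window = sum(forecast_g_per_kwh[:job_hours])
    best_start, best_sum = 0, window
    for start in range(1, len(forecast_g_per_kwh) - job_hours + 1):
        window += forecast_g_per_kwh[start + job_hours - 1]
        window -= forecast_g_per_kwh[start - 1]
        if window < best_sum:
            best_start, best_sum = start, window
    return best_start, best_sum / job_hours


# Hypothetical hourly forecast in gCO2/kWh (midday dip from solar generation).
forecast = [420, 410, 390, 300, 220, 180, 170, 210, 320, 400, 430, 450]
JOB_HOURS = 4

start_hour, shifted_mean = greenest_start(forecast, JOB_HOURS)
immediate_mean = sum(forecast[:JOB_HOURS]) / JOB_HOURS

print(f"Start at hour {start_hour}: {shifted_mean:.0f} gCO2/kWh on average")
print(f"Starting immediately: {immediate_mean:.0f} gCO2/kWh on average")
```

The same selection logic applies to shifting across regions: replace the hours with candidate data centres and choose the one with the lowest expected intensity.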
2. Energy Tracking
To optimize energy consumption, it is crucial to track and monitor the energy usage of LLMs during both training and operation. Accurately measuring the power draw of the GPUs and CPUs that host the workload makes it possible to determine the actual energy consumed. This information is vital for decisions about load shifting and migration to more energy-efficient data centres. However, precise quantification of CO2 emissions remains challenging because the necessary information, such as data centre details, hardware specifications, and energy mix, is rarely reported.
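The measurement step can be sketched as follows: periodic power readings are integrated into energy, then scaled by a data-centre overhead factor and the grid's carbon intensity. The sample readings, the overhead factor (commonly called PUE, power usage effectiveness), and the grid intensity below are all assumed values for illustration; in practice the readings would come from the hardware's own power telemetry.

```python
def energy_kwh(samples):
    """Integrate (seconds, watts) samples into kWh using the trapezoidal rule."""
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        joules += (p0 + p1) / 2 * (t1 - t0)
    return joules / 3_600_000  # 1 kWh = 3.6 million joules


def emissions_kg(samples, pue, grid_g_per_kwh):
    """Scale device energy by facility overhead (PUE) and grid intensity."""
    return energy_kwh(samples) * pue * grid_g_per_kwh / 1_000


# Hypothetical power readings for one device over one hour: (seconds, watts).
samples = [(0, 250.0), (1800, 350.0), (3600, 300.0)]

device_kwh = energy_kwh(samples)
co2_kg = emissions_kg(samples, pue=1.2, grid_g_per_kwh=400.0)

print(f"Device energy: {device_kwh:.4f} kWh")
print(f"Estimated emissions: {co2_kg:.3f} kg CO2")
```

The estimate is only as good as its inputs, which is the reporting gap described above: without the data centre's overhead and energy mix, the last two factors have to be guessed.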
3. Load Shifting Large Language Models
Demonstrating that load shifting works for LLMs is crucial to promoting sustainable AI practices. Real-world use cases, such as the load shifting of BERT (Bidirectional Encoder Representations from Transformers), have been implemented and evaluated. By automatically moving the compute load for training LLMs across different data centres based on the availability of renewable energy, carbon emissions can be effectively reduced. Saved model checkpoints allow training to resume in the new location, ensuring continuity and functionality throughout the load-shifting process.
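The checkpoint-and-resume pattern behind this kind of load shifting can be sketched in a few lines. This is a toy simulation under stated assumptions, not the setup used in the BERT case: the data-centre names, the renewable-share schedule, and the JSON checkpoint format are hypothetical, and a counter stands in for real optimizer steps.

```python
import json
import os
import tempfile


def greenest_region(renewable_share):
    """Pick the data centre with the highest share of renewable energy."""
    return max(renewable_share, key=renewable_share.get)


def save_checkpoint(path, state):
    with open(path, "w") as f:
        json.dump(state, f)


def load_checkpoint(path):
    """Resume from a saved state, or start fresh if none exists."""
    if not os.path.exists(path):
        return {"step": 0, "region_log": []}
    with open(path) as f:
        return json.load(f)


def train_with_load_shifting(schedule, steps_per_slot, ckpt_path):
    """For each time slot, move to the greenest region and resume from the checkpoint."""
    for renewable_share in schedule:
        region = greenest_region(renewable_share)
        state = load_checkpoint(ckpt_path)  # continue where the last slot stopped
        for _ in range(steps_per_slot):
            state["step"] += 1              # stand-in for one real training step
        state["region_log"].append(region)
        save_checkpoint(ckpt_path, state)   # hand-off point for the next region
    return load_checkpoint(ckpt_path)


# Hypothetical renewable share per data centre for three consecutive time slots.
schedule = [
    {"dc_north": 0.80, "dc_south": 0.35},
    {"dc_north": 0.40, "dc_south": 0.90},
    {"dc_north": 0.75, "dc_south": 0.60},
]

with tempfile.TemporaryDirectory() as tmp:
    final_state = train_with_load_shifting(
        schedule, steps_per_slot=100, ckpt_path=os.path.join(tmp, "ckpt.json")
    )

print(final_state)
```

In a real system the checkpoint would hold model weights and optimizer state in shared storage, and the cost of transferring it between sites would need to be weighed against the emissions saved.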
The carbon emissions resulting from the development and deployment of LLMs pose significant environmental challenges. The energy consumption associated with training and operating these models demands urgent attention. However, there are viable strategies to mitigate the carbon footprint of LLMs.
By leveraging renewable energy, implementing load-shifting techniques, and tracking energy usage, it is possible to reduce the environmental impact of LLMs while maintaining their functionality and performance. As the prevalence of large language models continues to grow, it is imperative to prioritize sustainable AI practices to ensure a greener future for this transformative technology.
Mohsin Iqbal is a student of Computer Science. His research interests revolve around Deep Learning, Medical Image Analysis using AI, and Large Language Models. With an unwavering commitment to spreading knowledge, Mohsin embodies his manifesto: benefit humanity through the power of Data Science and AI.