Generative AI is reshaping nearly every aspect of software development, from code completion and automated testing to documentation generation and architecture design. Behind this convenience, however, lies an increasingly serious issue: the environmental cost of AI-assisted programming.
Large language models (LLMs) consume substantial energy during both training and inference, resulting in significant carbon emissions. This raises further questions: is code generated by LLMs more energy-efficient than manually written code? Do generated machine learning models take energy-efficiency optimization into account? So far, these questions lack support from systematic research data.
The GenAI-GreenML dataset was created to fill this research gap. It provides a curated benchmark for evaluating the environmental impact and energy efficiency of generative AI in code generation tasks.