Sustainability

The AI Paradox

The Ecological Price of Progress

In the Face of Exponential Growth

While artificial intelligence (AI) has the potential to solve global challenges, from medical diagnostics to climate modeling, it is also a massive consumer of resources. The development and scaling of AI models has produced a new paradox: performance is growing exponentially, and so is the ecological cost. Estimates suggest that training a single large language model can emit as much CO₂ as five cars over their entire lifecycles. In the face of the climate crisis, this growth is alarming. At morev•o, we see this dynamic not as fate but as a challenge and a mission. We integrate the United Nations Sustainable Development Goals (SDGs) directly into our development cycle.

Innovation and Resilient Infrastructure (SDG 9)

Bigger is not always better, even though some scale is indispensable. Our focus is primarily on the development and implementation of Small Language Models (SLMs). For clearly defined tasks, these specialized models can match the performance of large language models while consuming only a fraction of the resources. A key lever here is model optimization. Through mathematical methods such as pruning (removing unnecessary connections in the neural network) and quantization (reducing the precision of the weights), we substantially lower model complexity. Studies show that accepting an accuracy loss of just one percent can cut a model’s energy demand by up to 77%. In practice, this means that an AI application responds faster, requires less hardware, and still delivers precise results. By creating specialized architectures that run on standard hardware instead of large, energy-hungry GPU clusters, we enable access to cutting-edge technology without requiring exorbitant infrastructure investments.

Sustainable Production through Fine-Tuning (SDG 12)

As much AI as needed, as little energy as possible. The wheel doesn’t need to be reinvented every time. We focus on adapting existing, proven models (fine-tuning) to avoid energy-intensive developments from scratch. The initial pre-training of a model like GPT-3 consumes about 1,300 megawatt-hours of electricity—equivalent to the annual demand of about 400 average households. Targeted fine-tuning, on the other hand, often requires only a tiny fraction of this energy. Through optimized fine-tuning methods, the required floating-point operations (FLOPs) can be reduced by up to 64%. Model precision remains intact, and the existing intelligence of large models is utilized and tailored precisely for the specific use case.
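
The comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted in the paragraph; the function names and the baseline FLOP count are illustrative assumptions, not measured values.

```python
# Figures quoted in the text above.
PRETRAINING_MWH = 1_300   # reported pre-training energy for a GPT-3-class model
HOUSEHOLDS = 400          # households said to use the same energy in one year
FLOP_REDUCTION = 0.64     # "up to 64%" fewer FLOPs with optimized fine-tuning


def implied_household_kwh(total_mwh: float, households: int) -> float:
    """Annual electricity use per household implied by the comparison, in kWh."""
    return total_mwh * 1_000 / households


def remaining_flops(baseline_flops: float, reduction: float) -> float:
    """FLOPs still required after applying a relative reduction (0.0 to 1.0)."""
    return baseline_flops * (1.0 - reduction)


print(implied_household_kwh(PRETRAINING_MWH, HOUSEHOLDS))  # 3250.0 kWh per household

# Hypothetical baseline of 1e15 FLOPs, chosen only to illustrate the ratio.
print(remaining_flops(1e15, FLOP_REDUCTION))
```

The household comparison therefore assumes an average consumption of about 3,250 kWh per household and year.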

Active Climate Protection & Local Sovereignty (SDG 13)

In software development, it is often forgotten that every line of code leaves an energy footprint. At morev•o, we take our responsibility for indirect emissions (Scope 3) seriously. A central point is the trade-off between cloud and local deployment. We analyze the ideal infrastructure for each project individually. Cloud platforms can be up to 93% more energy-efficient than classic data centers through economies of scale and the use of renewable energy. However, where data protection and data sovereignty are paramount, local on-premise and edge solutions offer a major advantage: they reduce CO₂-intensive data transfer over long distances. A complex, unnecessarily broad request to a generic model can cause up to 50 times more CO₂ than an optimized, specialized prompt to an efficient model.

Partnerships for the Goals (SDG 17)

Sustainability in the AI sector is not a solo discipline but a community task. SDG 17 reminds us that we can only achieve global goals through cooperation. We see ourselves as partners to our customers and regard ecological responsibility and digital competitiveness as complementary rather than opposed. Companies that embrace this insight lower their operating costs through reduced energy consumption and at the same time become more attractive to investors and customers who value ESG criteria (Environmental, Social, Governance). Optimized AI solutions can enable up to 33% higher productivity while reducing resource consumption. For morev•o, high-end AI is at its strongest when it achieves maximum impact with minimal resources.

Technical Considerations

Model Compression

To make AI models run on standard hardware, we use two primary methods. The first is post-training quantization (PTQ): while standard models typically store their weights as 32-bit floating-point numbers (FP32), we convert them to 8-bit integers (INT8) or even 4-bit formats. This cuts the memory required for the weights by a factor of four (INT8) to roughly eight (4-bit) and considerably accelerates inference. The second is structural pruning: we identify dead neurons or connections that do not contribute significantly to prediction quality. By removing these redundancies, we reduce the number of calculations (FLOPs) needed per request.
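
Both methods can be illustrated in a few lines of plain Python. This is a simplified sketch under stated assumptions, not the production pipeline: `quantize_int8` uses symmetric per-tensor scaling (one of several possible PTQ schemes), and `prune_neurons` treats each matrix row as one neuron and ranks neurons by their L2 norm, which is only one possible importance criterion. All function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integers and the shared scale."""
    return [v * scale for v in q]


def prune_neurons(matrix, keep_ratio):
    """Structured pruning: keep only the rows (neurons) with the largest L2 norm."""
    n_keep = max(1, round(len(matrix) * keep_ratio))
    norms = [sum(w * w for w in row) ** 0.5 for row in matrix]
    ranked = sorted(range(len(matrix)), key=lambda i: norms[i], reverse=True)
    keep = sorted(ranked[:n_keep])  # preserve the original row order
    return [matrix[i] for i in keep]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# The rounding error per weight stays within about half a quantization step.
print(max(abs(w - r) for w, r in zip(weights, restored)), scale / 2)

layer = [[0.5, -1.2], [0.001, 0.002], [2.0, 0.3], [0.0, 0.01]]
print(prune_neurons(layer, keep_ratio=0.5))  # [[0.5, -1.2], [2.0, 0.3]]
```

Removing whole rows shrinks the matrix itself, which is why structured pruning lowers the FLOP count on ordinary hardware without needing sparse kernels.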

Parameter-Efficient Fine-Tuning (PEFT)

Instead of updating all billions of parameters in a model, we use techniques like LoRA (Low-Rank Adaptation). Here, only small, additional matrices are trained while the main model remains frozen. The memory requirement during training drops drastically (often by over 90%), allowing training on a single GPU instead of a cluster. This saves energy and costs.
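
To see why so few parameters need training, consider a single weight matrix. LoRA leaves the frozen matrix W (d_out × d_in) untouched and learns two small factors, B (d_out × r) and A (r × d_in), whose product is added to W. The sketch below only counts parameters; the layer size of 4096 and the rank of 8 are illustrative assumptions, and the function names are hypothetical.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters updated when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


d_out = d_in = 4096   # assumed layer size, for illustration only
rank = 8              # assumed LoRA rank, for illustration only

full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(full, lora, f"{lora / full:.2%}")  # 16777216 65536 0.39%
```

The trainable share per adapted matrix is well under one percent in this example. The overall memory saving during training is smaller than that ratio suggests, because the frozen weights and the activations still have to be held in memory.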

Inference Optimization at the Edge

To minimize data transfer (Scope 3 emissions), we optimize models for edge computing. We reduce memory consumption during long text generations so that the AI can respond on local servers without added latency. By intelligently batching requests, we increase hardware utilization so that less energy is wasted while the hardware sits idle.
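
The batching idea can be sketched as follows. This is a deliberately simple static variant that groups waiting requests into fixed-size batches; production inference servers typically use dynamic or continuous batching with timeouts. The function names and the batch size are assumptions for illustration.

```python
def make_batches(requests, max_batch_size):
    """Group pending requests into batches of at most max_batch_size items."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        requests[i:i + max_batch_size]
        for i in range(0, len(requests), max_batch_size)
    ]


def batch_utilization(batches, max_batch_size):
    """Average fill level of the batches, between 0.0 and 1.0."""
    if not batches:
        return 0.0
    return sum(len(b) for b in batches) / (len(batches) * max_batch_size)


pending = [f"request-{i}" for i in range(10)]
batches = make_batches(pending, max_batch_size=4)
print(len(batches))                      # 3 model invocations instead of 10
print(batch_utilization(batches, 4))
```

Each batch is processed in one model invocation, so fixed per-invocation overhead is shared across several requests.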

Sustainability Metrics

Project success can be measured with specific KPIs. Energy per Inference (EPI) is the energy, in joules, consumed by a single user request. Carbon Intensity of Compute relates computation time to the energy mix at that moment and favors running data-center workloads during periods of high renewable energy feed-in.
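
Both KPIs reduce to simple formulas. The sketch below is one possible way to compute them, assuming that average power draw is measured over a time window and that the grid's carbon intensity is known in grams of CO₂ per kWh. The function names and all example values (300 W, 60 s, 50 requests, 150 and 450 g/kWh) are illustrative assumptions, not measurements.

```python
JOULES_PER_KWH = 3_600_000


def energy_per_inference_joules(avg_power_watts, window_seconds, requests):
    """EPI: energy drawn during the measurement window divided by requests served."""
    return avg_power_watts * window_seconds / requests


def emissions_grams(energy_joules, grid_g_co2_per_kwh):
    """CO2 emitted for a given energy use at a given grid carbon intensity."""
    return energy_joules / JOULES_PER_KWH * grid_g_co2_per_kwh


epi = energy_per_inference_joules(avg_power_watts=300, window_seconds=60, requests=50)
print(epi)  # 360.0 joules per request

# Same workload (1 kWh), scheduled in a low-carbon versus a high-carbon hour.
print(emissions_grams(JOULES_PER_KWH, 150))  # 150.0 g
print(emissions_grams(JOULES_PER_KWH, 450))  # 450.0 g
```

Shifting a deferrable job such as a fine-tuning run into the low-carbon window changes its emissions without changing its energy use, which is why the two metrics are tracked separately.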

A morev•o Case Simulation: Efficiency in Customer Service

The Project

Automated email classification and draft responses for an energy service provider.

The Initial Situation

The customer uses an API connection to a standard large language model (LLM) to sort approximately 50,000 customer inquiries per month. The problems are high ongoing costs per request, data protection concerns about cloud transmission, and a large energy overhead, since an all-purpose model is used for a specialized task (classification). The CO₂ equivalent is comparable to the daily charging of ~10 electric cars.

The morev•o Solution

I. Model Selection (SDG 9 & 12): Instead of the 175-billion-parameter model, a compact open-source base model with only 7 billion parameters is chosen.
II. Fine-Tuning with PEFT: Using LoRA, the model is specifically trained on the technical terminology of the energy sector (e.g., meter readings, tariff changes)—a few hours on a single workstation instead of days in a server farm.
III. Quantization: The model is compressed from FP16 to 4-bit (GGUF format) so that the AI now runs on a local server (on-premise).
IV. Prompt Optimization: Vague instructions are replaced by highly precise system prompts that reduce the number of tokens processed, and with it computation time, by 40%.
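
A rough estimate shows why steps I and III make local operation feasible. The sketch below counts only the memory needed for the weights; it ignores activations, caches, and the per-block scaling data that real 4-bit formats add, so actual files are somewhat larger. Storing the 175-billion-parameter model in FP16 is an assumption made for the comparison, and the function name is hypothetical.

```python
def weight_memory_gb(n_params: int, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


LARGE = 175_000_000_000   # general-purpose model named in step I
SMALL = 7_000_000_000     # compact base model named in step I

print(weight_memory_gb(LARGE, 16))  # about 350 GB at FP16 (assumed storage format)
print(weight_memory_gb(SMALL, 16))  # about 14 GB at FP16
print(weight_memory_gb(SMALL, 4))   # about 3.5 GB at 4 bits per weight
```

At a few gigabytes, the quantized model fits into the memory of a single local server.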

The Results

Metric	Before (Standard AI)	After (morev•o)	Improvement
Latency	~4.5 seconds	~0.8 seconds	82%
Energy Demand / Request	12.0 Wh	1.4 Wh	88%
Operating Costs / Month	~1,200 € (API Fees)	~150 € (Electricity/Maintenance)	87%
Data Sovereignty	Cloud (External)	Local (Internal)	Maximum
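
The improvement column can be reproduced from the before and after values. The following sketch recomputes the percentages and, using the 50,000 monthly inquiries from the initial situation, the total monthly energy demand. The function names are hypothetical.

```python
def improvement_pct(before: float, after: float) -> float:
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100


def monthly_kwh(wh_per_request: float, requests_per_month: int) -> float:
    """Total monthly energy demand in kWh for a given per-request demand in Wh."""
    return wh_per_request * requests_per_month / 1_000


rows = {
    "Latency (s)": (4.5, 0.8),
    "Energy demand per request (Wh)": (12.0, 1.4),
    "Operating costs per month (EUR)": (1200, 150),
}
for name, (before, after) in rows.items():
    print(f"{name}: {improvement_pct(before, after):.1f}% lower")

REQUESTS_PER_MONTH = 50_000  # from the initial situation
print(monthly_kwh(12.0, REQUESTS_PER_MONTH), monthly_kwh(1.4, REQUESTS_PER_MONTH))
```

The results come to about 82.2%, 88.3%, and 87.5%, in line with the rounded values in the table, and the monthly energy demand falls from roughly 600 kWh to roughly 70 kWh.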

Conclusion: Productivity increases, while energy demand per request falls by about 88%.
