The interface of the LLM DeepSeek

DeepSeek-R1 – a new era of open-source LLMs

14/02/2025 • 5 min read

Sophia Antonin, Senior UX Data Scientist • Technology Manager AI/Data Science

Björn Busch-Geertsema, Head of Development

Stefan Schulz, UX Director, Head of Site Munich

A turning point for the AI landscape?

A small start-up releases a new model—and suddenly stock markets and the tech world are in turmoil. Technology stocks such as Nvidia and Microsoft record significant losses. What exactly happened?

The reason for the excitement is DeepSeek-R1. This Chinese reasoning model can compete with leading AI models such as OpenAI’s o1. That alone is remarkable, but two aspects make this achievement even more outstanding:

  • DeepSeek claims to have trained its models using significantly fewer resources than Western competitors.

  • In deployment, the model requires substantially less computing power than comparable systems.

This challenges a long-held assumption: that better AI always requires more resources. Are there technological shortcuts that could break this trend after all?

However, efficiency is not the only topic of debate. Several countries, including Australia, South Korea, and Taiwan, have restricted or banned DeepSeek on government devices. The reason: concerns about data security and potential ties to the Chinese government.

DeepSeek-R1 therefore raises critical questions—about the future of AI training, China’s role in global AI research, and the geopolitical consequences of emerging technologies.

Technological Advances of DeepSeek-R1

DeepSeek-R1 is based on a Mixture-of-Experts (MoE) architecture. For each token it processes, only 37 billion of the model’s 671 billion parameters are active: instead of running the entire network, a routing layer activates only the specialized “experts” best suited to the current input. This approach combines the strengths of large, powerful models with the efficiency of smaller ones.
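To make the routing idea concrete, here is a minimal sketch of a top-k Mixture-of-Experts layer in PyTorch. It is not DeepSeek’s implementation: the expert count, hidden sizes, and k are invented for illustration, and production systems add load balancing, shared experts, and fused kernels on top.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Illustrative MoE layer: a router scores all experts per token,
    but only the top-k of them are actually executed."""

    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim); router scores: (tokens, n_experts)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):  # run only the selected experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TopKMoE(dim=64)
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

Only the selected experts consume compute, which is how a 671-billion-parameter model can answer at roughly the per-token cost of a 37-billion-parameter one.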

The R1 model was trained in four phases on top of the V3 base model, using a combination of supervised fine-tuning and reinforcement learning (RL). These methods taught the model to reason logically and draw inferences. As a result, it achieves outstanding performance, particularly in mathematics and programming.
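One reason this RL phase is comparatively cheap is that reasoning tasks such as math can be scored with simple rules instead of a learned reward model. The helpers below are a simplified sketch of such rule-based rewards; the accuracy check and the <think>-format bonus are modeled on DeepSeek’s published description of R1’s training, but the code itself is illustrative.

```python
import re

def accuracy_reward(completion: str, reference_answer: str) -> float:
    """1.0 if the model's boxed final answer matches the reference, else 0.0."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # no parseable final answer, no reward
    return 1.0 if match.group(1).strip() == reference_answer.strip() else 0.0

def format_reward(completion: str) -> float:
    """Small bonus when the chain of thought is wrapped in <think> tags."""
    return 0.2 if "<think>" in completion and "</think>" in completion else 0.0

sample = "<think>20*23 = 460, minus 3*23 = 69, gives 391.</think> \\boxed{391}"
print(accuracy_reward(sample, "391") + format_reward(sample))  # 1.2
```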

A key technique is Chain-of-Thought (CoT). The model breaks down complex problems into small, logical steps, reasoning its way to a solution instead of guessing an answer outright. This process happens automatically, without explicit prompting, making DeepSeek-R1 especially precise for tasks that require clear, traceable reasoning.
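This behavior is directly visible in DeepSeek’s API, which is OpenAI-compatible and returns the chain of thought separately from the final answer. The sketch below follows DeepSeek’s public documentation at the time of writing (model name deepseek-reasoner, base URL, and the reasoning_content field); verify these details against the current docs before relying on them.

```python
# Minimal call against DeepSeek's OpenAI-compatible endpoint.
# The API key is a placeholder; model name, base URL, and the separate
# `reasoning_content` field are taken from DeepSeek's docs.
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "What is 17 * 23? Show your reasoning."}],
)

message = response.choices[0].message
print(message.reasoning_content)  # the step-by-step chain of thought
print(message.content)            # the final answer only
```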

Opportunities and Challenges

One major advantage of DeepSeek-R1 is its open availability. DeepSeek has released both the source code and the model weights under the MIT license. This allows free use, modification, and commercial deployment. Even new models may be trained on R1’s output. This openness enables broader advancement of AI technology and represents an important step toward open AI models.
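Because the license permits it, the weights can be pulled straight from Hugging Face. The full 671B model requires a multi-GPU cluster, so the sketch below loads one of the small distilled variants DeepSeek trained on R1’s output; the model ID is real at the time of writing, but check it and your hardware budget before running.

```python
# Load a distilled R1 variant with the standard transformers API.
# Requires `transformers` and `accelerate`; a GPU is strongly advised.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # distilled from R1 output

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Prove that the sum of two even numbers is even."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```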

Compared to OpenAI’s o1 (and now o3-mini), R1 still lacks the extensively tested, large-scale enterprise integrations that established providers already offer. Nevertheless, it puts those providers under pressure: it delivers impressive performance at significantly lower training cost.

A phone screen with multiple LLM App Icons
The LLM landscape

Impact: New Players Are Reshaping the AI Ecosystem

DeepSeek-R1 accelerates the democratization of AI. It demonstrates that massive resources are not the only decisive factor; innovative approaches and efficient methods can also achieve top-tier performance. This encourages entrepreneurs to take risks and prioritize innovation over resource arms races.

In the medium term, this could expand the range of providers and solutions while curbing the energy-intensive resource demands of AI—a crucial factor for sustainability and the energy sector. At the same time, companies must carefully weigh geopolitical and data protection implications before integrating such models.

Sustainable AI Strategies: Innovation with Foresight

DeepSeek-R1 undoubtedly marks a significant innovation boost and deserves the attention of decision-makers seeking to embed modern AI methods within their organizations. But does it really change everything?

Given the high resource demands of LLMs, carefully balancing desired quality and efficiency remains essential. While conditions continue to evolve, time and cost efficiency remain central factors. Local deployment options are gaining relevance, but their implementation often remains complex. Lessons learned from the cloud computing trend can also be applied to AI.

Open-source solutions foster innovation, but for companies, ease of access and usability will ultimately be the key priorities. In adapting AI for specific applications, the trend toward democratization and innovation will give rise to new approaches. Nevertheless, the core principles described above remain the decisive levers for targeted specialization in an enterprise context.

This leaves one key question: how future-proof is an architectural decision made today? We don’t have a crystal ball—but we are convinced that decisions guided by these criteria will endure even in a highly dynamic environment.

Do you have questions? We look forward to hearing from you.

The future of AI will not only be shaped by powerful models, but also by the way they transform human interactions, technologies, and businesses. DeepSeek-R1 is a promising step in this direction.

Sophia Antonin

Sophia Antonin completed her Master’s degree in Computational Linguistics at LMU Munich in 2019. Since then, she has successfully delivered projects in the fields of Natural Language Processing and Generative AI. With her expertise and passion for artificial intelligence, she develops innovative solutions at Ergosign and helps shape the digital future.

Sophia Antonin, Technology Manager AI/Data Science