LLMs in Enterprises: What Decision-Makers Should Know
Sophia Antonin • Senior UX Data Scientist • Technology Manager AI/ Data Science
Björn Busch-Geertsema • Head of Development
Esther Barra • Lead Communication Manager
14/02/2025 • 5 minutes reading time
Potentials and Challenges — a summary by Ergosign
Large Language Models (LLMs) are increasingly revolutionizing the digital world. Recently, the Chinese LLM DeepSeek-R1 (and -V3) has made headlines: with unexpectedly high efficiency and performance, it has not only stormed the app charts but also stirred up tech giants and investors. In this two-part Insights article, we first look at LLMs in general before taking a closer look at DeepSeek.
LLMs form the basis for numerous AI applications that come onto the market almost daily. These models are trained for months on huge amounts of text data using machine learning. After this training, they can understand, complete, translate, and even write texts creatively. Well-known examples are ChatGPT, LLaMA, and Claude.
But what actually goes into such a model, and what should guide the decision to use a particular LLM in a company? And why are open-source LLMs like DeepSeek so interesting for companies?
Adapting LLMs for Specific Applications
To use an LLM for a company or a specific use case, it often needs additional information about the company, a specific domain, or the respective task. Depending on the context and use case, different methods or strategies come into question:
1. In-Context-Learning
If only a small amount of information is needed, it can be given directly in the prompt. In so-called in-context learning, the model learns the task directly from the context. For example:
“Read the following article and determine the appropriate newspaper section (e.g., politics, economy, panorama, sports, etc.): Berlin, March 2023 – In the middle of the Pacific, a freighter discovered a drifting wreck yesterday. Two survivors clung to the remains, seriously injured but alive. The cause of the accident remains unclear…”
The model uses the provided input to recognize the pattern and make a decision it was never explicitly trained on. This method is suitable, for example, for text or style adaptations such as personalized emails or product descriptions; a minimal code sketch follows below.
In-Context-Learning
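To make this concrete, the classification example above could be sent to a chat model roughly like this. This is a minimal sketch assuming the openai Python client and an OpenAI-compatible endpoint; the model name is only an example and not prescribed by this article.

```python
# Minimal in-context learning sketch: the task is described entirely in the prompt,
# the model is not fine-tuned for it.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

article = (
    "Berlin, March 2023 - In the middle of the Pacific, a freighter discovered "
    "a drifting wreck yesterday. Two survivors clung to the remains, seriously "
    "injured but alive. The cause of the accident remains unclear..."
)

prompt = (
    "Read the following article and determine the appropriate newspaper section "
    "(e.g., politics, economy, panorama, sports):\n\n"
    f"{article}\n\nSection:"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```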
2. Fine-Tuning
If a language model needs in-depth knowledge about a company, a piece of software, or a field of expertise, the prompt often does not offer enough space for the required context. One solution is fine-tuning: an already trained model is trained further with specific data. The language model already knows grammar and general language structures and is now enriched with context-specific content.
Although fine-tuning requires significantly less data than the original training, it still needs a sufficient amount of high-quality data in a standardized format. Incorrect information, poorly documented databases, or inferior image and table quality can degrade the model. A disadvantage of fine-tuning is that data, once integrated, can no longer be removed or easily updated.
Fine-tuning is useful, for example, for brand-specific text creation: companies train a model on their own marketing and communication guidelines to generate consistent content in the company's own style. A minimal training sketch follows below.
Fine-Tuning Process
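As an illustration, fine-tuning a base model on company texts might look roughly like this. This is a minimal sketch using the Hugging Face transformers and datasets libraries as an assumed stack; the base model and the file brand_guidelines.txt are placeholders, and a real project would use a much larger model and dataset.

```python
# Minimal fine-tuning sketch: continue training a pre-trained causal language model
# on company-specific text.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "distilgpt2"  # example base model; in practice a larger model is used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for padding during batching

# Company-specific texts (e.g., marketing and communication guidelines) as plain text.
dataset = load_dataset("text", data_files={"train": "brand_guidelines.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```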
3. Retrieval Augmented Generation (RAG)
A particularly effective method for providing relevant and, above all, factually correct information is Retrieval Augmented Generation (RAG). Here, relevant data sources such as documentation, workshop recordings, explanatory videos, or instructions are stored in a vector database and divided into smaller sections (chunks). When a user makes a request, the system searches the database for the most suitable chunks. These are provided to the language model as context in the prompt to generate well-founded answers.
Advantages of RAG:
Flexibility: Databases can be expanded and updated as required
Precision: The answers are based on specific information
Efficiency: No complex fine-tuning of the model necessary
Challenges:
Extraction: Finding the right information chunks
Relevance: Selecting suitable documents for the request
Semantic Matching: Correctly capturing the contextual meaning
The RAG approach is advantageous in a variety of use cases, e.g., dynamic knowledge bases for customer or IT support, or personalized product recommendations based on real-time product data and user behavior.
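To make the retrieval step more concrete, here is a minimal sketch in Python. The embedding model from sentence-transformers and the example chunks are illustrative assumptions, not part of this article; a production system would store the embeddings in a real vector database.

```python
# Minimal RAG retrieval sketch: embed chunks, retrieve the most similar ones,
# and put them into the prompt as context.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model

# 1. Offline: split documents into chunks and store their embeddings.
chunks = [
    "The support hotline is available Monday to Friday, 9:00-17:00.",
    "Password resets are handled via the self-service portal.",
    "Hardware orders require approval from the team lead.",
]
chunk_vectors = encoder.encode(chunks, normalize_embeddings=True)

# 2. Online: embed the user question and retrieve the most similar chunks.
question = "How do I reset my password?"
query_vector = encoder.encode([question], normalize_embeddings=True)[0]
scores = chunk_vectors @ query_vector            # cosine similarity (vectors normalized)
top_chunks = [chunks[i] for i in np.argsort(scores)[::-1][:2]]

# 3. The retrieved chunks become context in the prompt for the language model.
prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n".join(top_chunks) +
    f"\n\nQuestion: {question}\nAnswer:"
)
# `prompt` is then sent to an LLM of choice (API-based or self-hosted).
```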
When deciding which language model is suitable for use in a company, several factors should be considered.
Model Size
The model size describes the trade-off between efficiency, resource use, and quality. Depending on which end device the model is to run on and how good the output quality needs to be, different models come into question. Smaller models are more efficient and less resource-intensive but may be limited in the quality of their output. Larger models deliver better results but require more computing power and are more costly.
Large models: GPT-4, Llama-2-70B, Claude-Opus, Gemini-Ultra
Infrastructure & Data Protection
The choice of infrastructure is a fundamental decision that depends largely on the type of application and the data protection requirements. The more sensitive the data, the more important control over its processing becomes. Language models can either be used via a cloud API or hosted on a company's own servers; the sketch after the following comparison illustrates the difference in integration.
API-based models: These powerful models are very easy to integrate but incur ongoing costs per request. Since the data is sent to external servers, there is less control over its processing. Examples: ChatGPT, Claude, Gemini, Cohere
Self-hosted models: Open-source models can be operated locally or on a company's own cloud servers. This enables complete data control but requires powerful hardware. It is particularly relevant where data protection requirements such as GDPR or TISAX apply. Examples: Llama, Mistral, DeepSeek, Falcon
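The integration difference can be shown in a short sketch. It assumes the openai Python client; the local endpoint and model names are examples only, e.g., a self-hosted server such as vLLM or Ollama exposing an OpenAI-compatible API.

```python
# Sketch contrasting an API-based model with a self-hosted one behind an
# OpenAI-compatible endpoint.
from openai import OpenAI

prompt = "Summarize our return policy in two sentences."

# Option A: hosted API - easy to integrate, per-request costs,
# data leaves the company's infrastructure.
hosted = OpenAI()  # uses OPENAI_API_KEY
reply_a = hosted.chat.completions.create(
    model="gpt-4o-mini",  # example model
    messages=[{"role": "user", "content": prompt}],
)

# Option B: self-hosted open-source model - full data control,
# but hardware and operations are the company's responsibility.
local = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
reply_b = local.chat.completions.create(
    model="mistral-7b-instruct",  # example locally served model
    messages=[{"role": "user", "content": prompt}],
)

print(reply_a.choices[0].message.content)
print(reply_b.choices[0].message.content)
```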
Performance & Domain-Specific Requirements
Depending on the use case, specialized models can make sense: some are optimized for technical, legal, or scientific applications, while others handle multilingual or medical content particularly well. It is worthwhile to compare models and adapt them if necessary. Technical factors also play a role:
Inference Speed: How quickly can the model answer requests?
Context Window: How much text can the model process at once? A large context window is advantageous for complex requests and long conversations.
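A quick way to check the context window constraint is to count tokens before sending a request. This small sketch assumes the tiktoken tokenizer library; the encoding name, the window size of 8,192 tokens, and the file name are example values and vary by model and project.

```python
# Check whether a document fits into a model's context window by counting tokens.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # example encoding
context_window = 8192                            # example limit in tokens

document = open("product_manual.txt", encoding="utf-8").read()  # example file
num_tokens = len(encoding.encode(document))

if num_tokens > context_window:
    print(f"{num_tokens} tokens - too long, needs chunking or a larger context window")
else:
    print(f"{num_tokens} tokens - fits into the context window")
```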
Closed vs. Open Source
Open-source LLMs offer more transparency, adaptability, and control over the data, but require a powerful infrastructure and technical know-how. They are particularly suitable for companies that want to save costs in the long term and customize their models individually. (For simplicity, we also include open-weight models here, i.e., models where only the trained weights have been published, but not the source code or the training data.)
Closed-source LLMs, on the other hand, are often more powerful and easier to integrate, but involve dependencies on external providers and potential data protection risks. They are ideal for quick implementation and low maintenance. Since the models are not directly accessible, no one can operate, change, or develop them further independently; they are usually accessible only via API.
The choice of the right language model therefore depends on individual requirements, the budget, and the data protection guidelines. A balanced approach between quality, cost, and security is crucial. Companies today have many options for specifically selecting and adapting LLMs. But how sustainable are such decisions at a time when technology is developing rapidly and new hypes seem to “turn everything upside down” every week? Does it make sense to adapt an LLM now, or is it worth simply waiting for the next version? How relevant are data protection debates around API variants if local models could soon be just as powerful?
So how do technical decision-makers make informed decisions? Which trends are here to stay, and which are just short-term phenomena? In the next part, we discuss the example of DeepSeek, which has caused quite a stir in recent weeks.
Sophia Antonin completed her Master's degree in Computational Linguistics at LMU Munich in 2019. Since then, she has been successfully delivering projects in Generative AI and Natural Language Processing. With her expertise and passion for Artificial Intelligence, Sophia develops innovative solutions and actively co-creates the future at Ergosign.
Sophia Antonin • Technology Manager AI/ Data Science
Do you want to learn more about our services around AI?
With our AI workshops, you'll discover how AI can help you stand out from competitors, make products and processes more efficient, and tackle new challenges. You'll learn how to make your business more resilient and future-proof here.