Context engineering: why prompt engineering is no longer enough
The first waves of generative AI popularized “prompt engineering”, the art of formulating instructions to obtain the desired response from a model. Between 2023 and 2025, communities shared recipes: writing long prompts, using lists, assigning roles (“you are an accountant”) and so on. However, the technique has its limits: results are fragile, sensitive to small changes in wording, and take no account of memory or the wider context of an exchange. To deploy robust applications, companies are now turning to context engineering.
The limits of prompt engineering
An isolated prompt acts like a message in a vacuum. Large Language Models (LLMs) are non-deterministic in typical use: each output token is sampled from a probability distribution, so the same query can produce different answers from one run to the next. What’s more, a change in wording can cause the output to drift: tests show that a slightly different sentence or instruction can lead to an opposite result. Prompts also have no memory: once it has responded, the model retains nothing of the exchange unless the earlier context is repeated in the next prompt. This limits long or complex applications.
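The effect of the random draw can be illustrated with a toy sketch. The candidate tokens, their scores and the `sample_token` helper below are invented for illustration and do not come from a real model; the point is only that sampling from a softmax distribution gives varying outputs, and that a lower temperature makes the draw more predictable.

```python
import math
import random

def softmax(scores, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, scores, temperature, rng):
    """Draw one token at random according to the softmax probabilities."""
    probs = softmax(scores, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical next-token candidates and scores (not from a real model).
TOKENS = ["approved", "rejected", "pending"]
SCORES = [2.0, 1.5, 0.5]

if __name__ == "__main__":
    rng = random.Random()  # unseeded: output differs from run to run
    for temperature in (1.0, 0.1):
        draws = [sample_token(TOKENS, SCORES, temperature, rng) for _ in range(10)]
        print(temperature, draws)
```

At temperature 1.0 the three candidates all appear regularly across runs; at 0.1 the highest-scoring one dominates almost every draw.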
In addition, a simple prompt cannot integrate up-to-date business data (legal texts, internal databases). The answer is based solely on the model’s general knowledge, which may be obsolete or inadequate.
The concept of context engineering
Context engineering involves providing the model with a structured environment, enriched with relevant knowledge and metadata, enabling consistent and reliable interactions. According to MarketLiftUp, this approach establishes a reasoning framework where the model can draw on continuity, memory and domain-specific rules. Unlike simple prompts, the context includes the previous conversation, external data and parameters that guide the model’s behavior.
One of the main associated technologies is Retrieval-Augmented Generation (RAG). This method combines an LLM with an external knowledge base: when a user asks a question, an internal search engine retrieves relevant documents and injects them into the context before the model generates an answer. In this way, the model responds on the basis of up-to-date, verifiable information.
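The retrieve-then-inject flow can be sketched in a few lines. This is a minimal illustration under stated assumptions: the knowledge base, the document ids and the helper names are hypothetical, and a crude word-overlap score stands in for a real search engine. The resulting prompt would then be sent to whichever LLM the application uses.

```python
import re

# Hypothetical knowledge base; a real one would hold indexed internal documents.
KNOWLEDGE_BASE = {
    "leave-policy": "Employees are entitled to 25 days of paid leave per year.",
    "expense-policy": "Travel expenses are reimbursed within 30 days of submission.",
    "security-policy": "Passwords must be rotated every 90 days.",
}

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, knowledge_base, top_k=2):
    """Rank documents by the number of words they share with the question."""
    query_words = tokenize(question)
    scored = []
    for doc_id, text in knowledge_base.items():
        overlap = len(query_words & tokenize(text))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties are broken alphabetically by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]

def build_prompt(question, knowledge_base, top_k=2):
    """Inject the retrieved documents into the prompt ahead of the question."""
    doc_ids = retrieve(question, knowledge_base, top_k)
    sources = "\n".join(f"[{doc_id}] {knowledge_base[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite their ids.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )

if __name__ == "__main__":
    print(build_prompt("How many days of paid leave do employees get?", KNOWLEDGE_BASE))
```

Keeping the document id next to each injected passage is what makes the answer verifiable: the model can cite it, and a reader can check the source.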
Types of context to manage
Context is not a monolithic block. MarketLiftUp identifies several types:
| Context type | Description | Example |
|---|---|---|
| Persistent | Stable information about the user (identity, preferences), the domain or the task | Company credentials, internal compliance rules, history of past interactions |
| Current (up-to-date) | Real-time data from databases, APIs or recent documents | Exchange rates, stock availability, updated legal texts |
| Adaptive | Information adjusted to the conversation in progress | Summary of exchanges, clarifications made, user’s intention |
Context management is constrained by the model’s context window, i.e. the maximum number of tokens the model can process in a single call. For GPT-4, this window ranges from 8,192 tokens to 32,768 tokens in the 32k variant, and newer models go well beyond that, but it always remains limited. It is therefore necessary to choose which information is relevant and adapt it (summarize, prioritize) so that the most useful content fits.
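One simple way to respect the window is to give each piece of context a priority and keep the most important pieces that fit. The sketch below assumes a hypothetical priority score per passage and counts whitespace-separated words as a rough stand-in for a real tokenizer, which would give different numbers.

```python
def estimate_tokens(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())

def select_within_budget(passages, budget):
    """Greedily keep the highest-priority passages that fit in the token budget.

    `passages` is a list of (priority, text) pairs; a higher priority wins.
    Returns the kept texts and the number of estimated tokens used.
    """
    kept = []
    used = 0
    for _, text in sorted(passages, key=lambda p: p[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            kept.append(text)
            used += cost
    return kept, used

# Hypothetical passages mixing current, adaptive and persistent context.
PASSAGES = [
    (3, "Refunds are accepted within 14 days of delivery."),
    (1, "The company was founded in 1998 in Lyon and has grown steadily since then."),
    (2, "User prefers answers in English."),
]

if __name__ == "__main__":
    kept, used = select_within_budget(PASSAGES, budget=15)
    print(used, kept)
```

With a budget of 15, the refund rule and the user preference are kept and the company history is dropped. A passage that does not fit could also be summarized rather than discarded, which this sketch does not attempt.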
Differences between prompt engineering and context engineering
| Aspect | Prompt engineering | Context engineering |
|---|---|---|
| Nature | Instructions formulated to guide response | Structured environment including memory, data and metadata |
| Robustness | Sensitive to variations and randomness | More reliable thanks to continuity and external sources |
| Knowledge management | Relies solely on internal model knowledge | Integrates data updated via RAG or internal databases |
| Adaptability | Requires manual adjustments for each task | Can be parameterized and automated (context builders) |
| Use cases | Experimentation, simple queries, quick tests | Professional applications, specialized assistants, business chatbots |
Implementing context engineering
To deploy a solution based on context engineering, several components come into play:
- Building a knowledge base: aggregating internal documents (policies, FAQs), customer data and reliable public sources. It is important to index these documents and structure them (titles, sections, metadata) to facilitate retrieval.
- Internal search engine: a semantic search model or vector engine identifies the most relevant passages for a given query. Both the passages and the query are converted into vectors (embeddings), and the passages whose vectors are closest to the query vector are returned.
- Context selection and injection: depending on the length of the available context, select the most relevant documents and inject them into the context window. Add headers or tags to help the model distinguish between sources.
- Prompt orchestration: structure the prompt by combining conversation history, user information and retrieved documents. Use tags or markers to separate parts.
- Learning loop: analyze generated responses and gather user feedback to improve the knowledge base, search engine and context management.
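The search and orchestration steps above can be sketched as follows. This is an illustration under assumptions, not a reference implementation: the three-dimensional vectors are hand-made stand-ins for real embeddings, the document ids are hypothetical, and the tag names are one possible convention for separating the parts of the prompt.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_passages(query_vector, index, k):
    """Return the k index entries whose vectors are closest to the query."""
    ranked = sorted(
        index,
        key=lambda entry: cosine_similarity(query_vector, entry["vector"]),
        reverse=True,
    )
    return ranked[:k]

def assemble_prompt(user_profile, history, passages, question):
    """Combine profile, history and retrieved passages, separated by tags."""
    sources = "\n".join(
        f'<source id="{p["id"]}">{p["text"]}</source>' for p in passages
    )
    turns = "\n".join(history)
    return (
        f"<profile>\n{user_profile}\n</profile>\n"
        f"<history>\n{turns}\n</history>\n"
        f"<sources>\n{sources}\n</sources>\n"
        f"<question>\n{question}\n</question>"
    )

# Hand-made vectors standing in for real embeddings (hypothetical data).
INDEX = [
    {"id": "hr-001", "text": "Remote work is allowed two days per week.", "vector": [0.9, 0.1, 0.0]},
    {"id": "fin-007", "text": "Invoices are paid at 30 days.", "vector": [0.1, 0.9, 0.1]},
    {"id": "it-042", "text": "VPN access requires two-factor authentication.", "vector": [0.2, 0.1, 0.9]},
]

if __name__ == "__main__":
    query_vector = [1.0, 0.2, 0.1]  # would come from embedding the question
    best = top_passages(query_vector, INDEX, k=1)
    print(assemble_prompt("HR assistant user", ["User: Hello"], best,
                          "Can I work from home?"))
```

In a real deployment the vectors would come from an embedding model and the index from a vector store; the ranking and the tagged assembly keep the same shape.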
Benefits and examples
The shift from prompt engineering to context engineering offers several benefits:
- Greater reliability: by relying on verified documents, answers are more accurate and hallucinations are reduced.
- Personalization: the model takes into account the user’s identity and preferences, thanks to persistent context.
- Scalability: the same system can be deployed across teams and domains (legal, HR, marketing).
- Compliance: the use of internal databases and traceability of sources facilitates compliance with regulations (e.g. GDPR).
Organizations in sectors such as law and healthcare are already using this approach: start-up Harvey integrates case law databases to provide accurate advice to lawyers, while hospitals combine models with medical records to offer up-to-date care plans. These vertical LLMs (see next article) often rely on contextual architecture to ensure relevance and security.
Conclusion
Prompt engineering is a useful step in experimenting with LLMs, but it reaches its limits as soon as we tackle professional, complex or sensitive tasks. By providing the model with a rich, structured environment, context engineering makes it possible to overcome the fragility of prompts and integrate up-to-date data, thanks to approaches such as RAG. For companies, this is a prerequisite for developing reliable intelligent assistants, capable of reasoning about long contexts and providing real added value.