In our earlier articles, we focused first on adoption trends and then on best practices for prototyping. Those pieces highlighted how GenAI is changing the way we work day to day. This article shifts perspective. Instead of looking at how we use GenAI, we explore what changes when it becomes part of the architecture itself. What does it mean to design a system where an LLM is not just a helper but a core component? Four themes stand out.
Traditional software architectures are deterministic. The same input, under the same conditions, will always produce the same output. This predictability makes systems reliable, testable, and auditable.
GenAI introduces a different foundation: probabilistic reasoning. The same input may generate slightly different outputs, shaped by training data, context, and randomness. This flexibility is what makes LLMs powerful, but it also complicates design. Architects must now decide where degrees of freedom add value and where determinism remains essential. Creative generation, summarization, and translation benefit from variability; financial transactions and compliance-critical processes do not.
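One way this decision shows up in practice is in sampling configuration: variability-friendly tasks get latitude, compliance-critical paths are pinned down. A minimal sketch, where the task names and parameter values are illustrative assumptions rather than a standard:

```python
# Illustrative sampling profiles: tasks where variability adds value get a
# higher temperature; tasks that must behave deterministically are pinned
# to temperature 0. Names and values are assumptions, not a standard.
SAMPLING_PROFILES = {
    "creative_generation": {"temperature": 0.9, "top_p": 0.95},
    "summarization":       {"temperature": 0.7, "top_p": 0.9},
    "translation":         {"temperature": 0.3, "top_p": 0.9},
    "compliance_check":    {"temperature": 0.0, "top_p": 1.0},  # deterministic path
}

def sampling_params(task: str) -> dict:
    """Return sampling parameters for a task, defaulting to deterministic."""
    return SAMPLING_PROFILES.get(task, {"temperature": 0.0, "top_p": 1.0})
```

Defaulting unknown tasks to the deterministic profile reflects the principle above: freedom is granted explicitly, not assumed.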

The challenge is that testing shifts from binary checks to something closer to performance evaluation. The question is no longer “did the system return the correct output?” but “does the system perform consistently enough under varied conditions?”
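A test in this style measures a pass rate over repeated runs instead of asserting a single exact output. The sketch below uses a deterministic stand-in for the model (the `flaky_model` function is an assumption, simulating an answer that is right nine times out of ten):

```python
from itertools import cycle

def pass_rate(generate, check, prompt, runs=20):
    """Fraction of runs whose output satisfies `check`: a consistency
    score rather than a single exact-match assertion."""
    return sum(check(generate(prompt)) for _ in range(runs)) / runs

# Hypothetical stand-in for a real model call: it returns the right answer
# nine times out of ten, simulating probabilistic output.
_outputs = cycle(["4"] * 9 + ["five"])
def flaky_model(prompt):
    return next(_outputs)

score = pass_rate(flaky_model, lambda out: out == "4", "What is 2 + 2?")
# 18 of 20 runs pass, so score == 0.9: "consistent enough", not "always identical"
```

The acceptance criterion then becomes a threshold on `score`, chosen per use case, rather than a binary pass/fail.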
In traditional development, testing often happens after the code is written. With GenAI, testing begins much earlier. Teams often need to validate what the model can do before they commit to an architectural choice. This means testing is not a final gate but an integral part of design.
It also continues throughout the lifecycle. Outputs may change with different prompts, updated models, or evolving contexts. QA becomes less about compilation and execution, and more about measuring reliability across scenarios. In practice, this means adding evaluation loops during development and monitoring performance after deployment.
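Such an evaluation loop can be as simple as a scenario suite that is rerun on every prompt or model change. The scenario contents and the `run_model` interface below are illustrative assumptions:

```python
# Sketch of an evaluation loop meant to be rerun whenever the prompt, the
# model version, or the context pipeline changes. Scenarios and the
# `run_model` interface are illustrative assumptions.
SCENARIOS = [
    {"name": "summarize", "prompt": "Summarize: The meeting moved to Friday.",
     "check": lambda out: "Friday" in out},
    {"name": "translate", "prompt": "Translate to French: Hello",
     "check": lambda out: len(out.strip()) > 0},
]

def evaluate(run_model, scenarios, threshold=1.0):
    """Run every scenario and report whether the pass rate meets the bar."""
    results = {s["name"]: bool(s["check"](run_model(s["prompt"]))) for s in scenarios}
    rate = sum(results.values()) / len(results)
    return rate >= threshold, results
```

The same suite can run in CI during development and on sampled production traffic after deployment, which is what makes the testing continuous.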
Classical workflows are built as deterministic sequences of functions and API calls. In GenAI architectures, workflows often take the form of orchestrated LLM calls. The key is task decomposition: instead of one broad request, problems are split into smaller, structured steps.

This creates what can be called semi-agentic systems. At the macro-process level, structure and sequencing remain clear. At a lower level, freedom is granted to the model to interpret and generate. The design question shifts from “how do we implement every step?” to “where do we constrain the model, and where do we let it decide?”
Splitting tasks across multiple requests often yields better performance than relying on a single all-encompassing call. It also makes testing and debugging more manageable, since each step can be evaluated in isolation.
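The decomposition idea can be sketched as a small orchestration, here for a hypothetical release-notes task. The `llm` parameter stands in for any client function (prompt in, text out); the task and prompts are illustrative:

```python
# Sketch of task decomposition: three constrained LLM calls instead of one
# broad request. `llm` is a hypothetical client function (prompt in, text
# out), and the release-notes task is an illustrative example.

def group_by(items, labels):
    """Group items under their corresponding labels."""
    grouped = {}
    for item, label in zip(items, labels):
        grouped.setdefault(label, []).append(item)
    return grouped

def draft_release_notes(llm, commits):
    # Step 1: tightly constrained. Classify each commit into a fixed label.
    labels = [llm(f"Classify as feature, fix, or chore: {c}") for c in commits]
    # Step 2: semi-constrained. One focused summary per category.
    summaries = {cat: llm(f"Summarize these {cat} commits: {items}")
                 for cat, items in group_by(commits, labels).items()}
    # Step 3: free-form. The model gets latitude only in the final draft.
    return llm(f"Write release notes from: {summaries}")
```

Because each step is a separate call, each can be evaluated in isolation by stubbing `llm`, which is exactly what makes testing and debugging more manageable.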
Traditional architecture diagrams are drawn in layers: interface, business logic, and data. GenAI introduces a fourth: context.
An LLM’s output depends heavily on the information it is given at runtime. Context engineering — curating knowledge bases, designing embedding strategies, building retrieval pipelines — becomes as central as database design. In this sense, context is not an afterthought but a core architectural layer. It determines what the system “knows” at any given moment.
Architectures that incorporate GenAI now include context stores, retrieval mechanisms, and grounding strategies alongside the usual services and databases. The question of accuracy is no longer just about business logic or data integrity, but also about how effectively the context pipeline supplies the right information to the model.
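A minimal sketch of that context layer, assuming a toy keyword retriever in place of an embedding-based pipeline (the store contents and scoring are illustrative):

```python
# Minimal sketch of a context layer: a toy keyword retriever standing in
# for an embedding-based pipeline, plus prompt assembly that grounds the
# model in retrieved facts. Store contents and scoring are illustrative.

CONTEXT_STORE = [
    "Refund policy: purchases can be returned within 30 days.",
    "Shipping: standard delivery takes 3-5 business days.",
    "Support hours: weekdays 9:00-17:00 CET.",
]

def retrieve(query: str, store: list, k: int = 2) -> list:
    """Rank documents by crude keyword overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(store, key=lambda doc: -len(words & set(doc.lower().split())))
    return scored[:k]

def build_prompt(question: str) -> str:
    """Assemble what the system 'knows at this moment' into the prompt."""
    context = "\n".join(retrieve(question, CONTEXT_STORE))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a production system the keyword overlap would be replaced by embedding similarity and the list by a vector store, but the architectural point is the same: the context pipeline, not just the model, determines whether the right information reaches the prompt.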
GenAI architectures redefine the fundamentals of design. Determinism gives way to probabilistic reasoning. Testing shifts earlier and becomes continuous. Workflows are reimagined as orchestrations of LLM calls. And context emerges as a new architectural layer in its own right.
The opportunity is clear: systems that are faster, more adaptive, and more capable of handling complex, unstructured problems. The risk is equally real: fragility and unpredictability if flexibility is left unchecked. Just as we have started to develop best practices for prototyping, the next step is to establish emerging patterns for architectural design with GenAI at the core. The question is not whether these architectures will spread — they already have — but how we ensure they remain reliable, maintainable, and trustworthy as they do.