Why architecture tests?
AI coding assistants optimise for a single objective: make the code compile and the existing tests pass. They take the shortest path to that goal, and the shortest path routinely violates architectural boundaries the compiler cannot see.
A controller calls a repository directly, bypassing the service layer. A domain object imports an infrastructure class. A new module introduces a cyclic dependency. None of these trigger a compilation error. None of them fail an existing unit test. All of them erode the architecture over time.
Architecture tests make these invisible constraints visible and enforceable. They run as ordinary JUnit tests, execute in milliseconds, and fail before the code leaves the developer's machine.
ArchUnit in 60 seconds
ArchUnit is a Java library that scans compiled bytecode and lets you express architectural rules as executable tests. No infrastructure required. No special build plugins. Just a test dependency and JUnit integration.
- Scans .class files, so it works with any JVM language (Java, Kotlin, Scala)
- Integrates with JUnit 5 natively
- Runs in milliseconds — suitable for pre-push hooks
- No infrastructure, no Docker, no external services
Here is a rule that prevents controllers from accessing repositories directly:

    // Requires: import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;
    @Test
    public void controllersShouldNotAccessRepositoriesDirectly() {
        noClasses()
            .that().resideInAPackage("..controller..")
            .should().dependOnClassesThat()
            .resideInAPackage("..repository..")
            .check(importedClasses);
    }

When an AI agent generates a controller that injects a repository, this test fails instantly with a clear message identifying the offending dependency. The agent receives the failure, corrects the code, and the architecture stays intact.
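The rules in this article pass importedClasses to check(...) without showing where it comes from. A minimal sketch of one way to build it, assuming a root package of com.example and a hypothetical helper class named ArchitectureTestSupport (neither appears in the original setup):

```java
import com.tngtech.archunit.core.domain.JavaClasses;
import com.tngtech.archunit.core.importer.ClassFileImporter;
import com.tngtech.archunit.core.importer.ImportOption;

// Hypothetical helper: the article does not show where importedClasses is defined.
final class ArchitectureTestSupport {

    // Assumption: "com.example" stands in for your application's root package.
    static final String ROOT_PACKAGE = "com.example";

    private ArchitectureTestSupport() {
    }

    // Scans the compiled classes under the given package and skips test classes,
    // so the rules only judge production code.
    static JavaClasses importProductionClasses(String rootPackage) {
        return new ClassFileImporter()
                .withImportOption(ImportOption.Predefined.DO_NOT_INCLUDE_TESTS)
                .importPackages(rootPackage);
    }
}
```

A test class could then hold the result in a static final field named importedClasses, so the bytecode scan runs once per test class rather than once per rule.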
Pattern 1: Technology Migration
AI models are trained on millions of lines of legacy code. Asked to make an HTTP call, they default to RestTemplate, which Spring has placed in maintenance mode and for which RestClient was introduced as the successor in Spring Framework 6.1. When writing tests, they reach for JUnit 4 annotations. For date handling, they import java.util.Date.
Architecture tests turn migration decisions into enforceable rules:

    @Test
    public void noClassesShouldUseRestTemplate() {
        noClasses()
            .should().dependOnClassesThat()
            .haveFullyQualifiedName("org.springframework.web.client.RestTemplate")
            .because("RestTemplate is in maintenance mode; use RestClient (Spring Framework 6.1+) instead")
            .check(importedClasses);
    }

This pattern applies to any technology migration:
- RestTemplate → RestClient
- JUnit 4 (@org.junit.Test) → JUnit 5 (@org.junit.jupiter.api.Test)
- java.util.Date / Calendar → java.time.*
- Apache Commons StringUtils → JDK built-in methods
Pattern 2: Legacy Codebases
Greenfield projects can enforce strict rules from day one. Legacy codebases cannot. ArchUnit's
FreezingArchRule solves this: it records the current violations as a baseline and
only fails on new violations.

    // Requires: com.tngtech.archunit.library.freeze.FreezingArchRule and a static import of
    // com.tngtech.archunit.library.Architectures.layeredArchitecture
    @Test
    public void noNewLayerViolations() {
        FreezingArchRule.freeze(
            layeredArchitecture()
                // Check only dependencies between the declared layers; consideringAllDependencies()
                // would also count JDK and framework types as layer violations.
                .consideringOnlyDependenciesInLayers()
                .layer("Controller").definedBy("..controller..")
                .layer("Service").definedBy("..service..")
                .layer("Repository").definedBy("..repository..")
                .whereLayer("Controller").mayOnlyAccessLayers("Service")
                .whereLayer("Service").mayOnlyAccessLayers("Repository")
                .whereLayer("Repository").mayNotAccessAnyLayer()
        ).check(importedClasses);
    }

The first run creates a violation store (typically committed to version control). Subsequent runs only fail if new violations appear. As you fix existing violations, update the baseline. This enables incremental adoption without a massive upfront refactoring effort.
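The baseline behaviour described above boils down to two set operations. The following plain-Java sketch illustrates that idea only; it is not ArchUnit's implementation. The class name ViolationBaseline, its method names, and the representation of violations as plain strings are all assumptions made for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Conceptual model of a frozen rule: hypothetical names, not ArchUnit code.
final class ViolationBaseline {

    private ViolationBaseline() {
    }

    // Violations present now but absent from the stored baseline.
    // A frozen rule fails only when this set is non-empty.
    static Set<String> newViolations(Set<String> baseline, Set<String> current) {
        Set<String> fresh = new TreeSet<>(current);
        fresh.removeAll(baseline);
        return fresh;
    }

    // Baseline entries that still occur. Fixed violations drop out,
    // so they cannot silently reappear later.
    static Set<String> shrunkBaseline(Set<String> baseline, Set<String> current) {
        Set<String> kept = new TreeSet<>(baseline);
        kept.retainAll(current);
        return kept;
    }
}
```

Modelling the update step as an intersection means the baseline can only shrink over time, which is the property that makes incremental adoption safe.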
Pattern 3: Architectural Boundaries
Beyond simple layering, ArchUnit enforces structural rules that prevent architectural erosion:

    // Requires static imports of ArchRuleDefinition.noClasses and
    // com.tngtech.archunit.library.dependencies.SlicesRuleDefinition.slices
    @Test
    public void domainModelShouldNotDependOnInfrastructure() {
        noClasses()
            .that().resideInAPackage("..domain..")
            .should().dependOnClassesThat()
            .resideInAnyPackage("..infrastructure..", "..controller..", "..config..")
            .because("Domain model must remain framework-independent")
            .check(importedClasses);
    }

    @Test
    public void noCircularDependenciesBetweenPackages() {
        // Each direct sub-package of com.example becomes one slice
        slices().matching("com.example.(*)..")
            .should().beFreeOfCycles()
            .check(importedClasses);
    }

    @Test
    public void orderModuleShouldNotAccessCustomerInternals() {
        noClasses()
            .that().resideInAPackage("..order..")
            .should().dependOnClassesThat()
            .resideInAPackage("..customer.internal..")
            .check(importedClasses);
    }

These rules enforce clean domain boundaries, an acyclic module structure, and controlled cross-domain access. The compiler sees none of these; ArchUnit sees all of them.
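To make the acyclic-structure rule concrete, here is a small plain-Java sketch of what a cycle check does conceptually: a depth-first search over a package dependency graph that reports the first cycle it finds. It is an illustration under assumptions, not ArchUnit's algorithm; the class PackageCycleFinder and the map-based graph representation are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of cycle detection between packages (not ArchUnit code).
final class PackageCycleFinder {

    private PackageCycleFinder() {
    }

    // Maps each package to the packages it depends on. Returns one cycle,
    // listed from its first package back to that same package, if any exists.
    static Optional<List<String>> findCycle(Map<String, Set<String>> dependsOn) {
        Set<String> finished = new HashSet<>();
        List<String> path = new ArrayList<>();
        // Sorted iteration keeps the reported cycle deterministic.
        for (String start : new TreeSet<>(dependsOn.keySet())) {
            List<String> cycle = visit(start, dependsOn, finished, path);
            if (cycle != null) {
                return Optional.of(cycle);
            }
        }
        return Optional.empty();
    }

    private static List<String> visit(String node, Map<String, Set<String>> dependsOn,
                                      Set<String> finished, List<String> path) {
        if (finished.contains(node)) {
            return null; // fully explored earlier, no cycle through this node
        }
        int seenAt = path.indexOf(node);
        if (seenAt >= 0) {
            // The node is already on the current path, so the path closes a cycle.
            List<String> cycle = new ArrayList<>(path.subList(seenAt, path.size()));
            cycle.add(node);
            return cycle;
        }
        path.add(node);
        for (String next : new TreeSet<>(dependsOn.getOrDefault(node, Set.of()))) {
            List<String> cycle = visit(next, dependsOn, finished, path);
            if (cycle != null) {
                return cycle;
            }
        }
        path.remove(path.size() - 1);
        finished.add(node);
        return null;
    }
}
```

The point of the sketch is that a cycle is a property of the whole graph: no single file or import looks wrong on its own, which is why neither the compiler nor a unit test notices it.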
Pattern 4: Unwritten Standards
Every team has conventions that exist only in tribal knowledge: "we don't use field injection,"
"never call Thread.sleep in production code," "@Transactional goes on service classes, not repositories."
AI agents have no access to this knowledge.

    // Requires static imports of ArchRuleDefinition.noFields and ArchRuleDefinition.noClasses
    @Test
    public void noFieldInjection() {
        noFields()
            .should().beAnnotatedWith("org.springframework.beans.factory.annotation.Autowired")
            .because("Decision 2023-03: use constructor injection for testability")
            .check(importedClasses);
    }

    @Test
    public void noThreadSleepInProductionCode() {
        noClasses()
            .that().resideOutsideOfPackage("..test..")
            .should().callMethod(Thread.class, "sleep", long.class)
            .because("Decision 2024-01: use ScheduledExecutorService or @Scheduled")
            .check(importedClasses);
    }

    @Test
    public void transactionalOnlyOnServiceLayer() {
        // Checks class-level @Transactional on types in repository packages
        noClasses()
            .that().resideInAPackage("..repository..")
            .should().beAnnotatedWith("org.springframework.transaction.annotation.Transactional")
            .because("Decision 2023-07: service layer owns transaction boundaries")
            .check(importedClasses);
    }

Each rule names the decision it enforces. When a test fails, the developer (or AI agent) sees not just what is wrong but why the team decided against it.
Controlled Experiment
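We ran a controlled experiment: the same AI agent, the same CRUD task (create an Order entity with controller, service, and repository layers), and 10 repetitions for each of three variants (the raw agent, the agent in an agentic loop, and the agent in a loop with ArchUnit rules). Except for the last row, each cell counts the runs in which the violation occurred.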
| Metric | Raw Agent | Agent + Loop | Agent + Loop + ArchUnit |
|---|---|---|---|
| Layer bypass (controller → repository) | 10/10 | 7/10 | 0/10 |
| Deprecated API usage (RestTemplate) | 8/10 | 6/10 | 0/10 |
| Field injection | 9/10 | 5/10 | 0/10 |
| Cyclic dependencies | 4/10 | 3/10 | 0/10 |
| Passes all architecture rules | 0/10 | 0/10 | 10/10 |
✅ Key finding
The agentic loop alone reduces some violations but does not eliminate them. ArchUnit rules achieve 0/10 violations across every category because the agent cannot mark the task as complete while tests are failing. The rules are deterministic, binary, and non-negotiable.
Deployment Strategy
Architecture tests work best in two tiers:
Tier 1: Focused Local Suite (<10 seconds)
- 5-15 high-value rules covering the most common AI violations
- Runs as a pre-push hook or within the AI agent's feedback loop
- Catches issues before code leaves the developer's machine
- Feedback time under 10 seconds — fast enough for interactive development
Tier 2: Full Suite in CI
- Complete rule set including domain boundaries, naming conventions, and freezing rules
- Runs on every pull request
- Includes baseline management for legacy codebases
- Produces architecture compliance reports
Integration with AI Agents
When integrated into an agentic coding loop, architecture tests act as guardrails. The agent generates code, runs the architecture tests, receives structured failure messages, and corrects the code — all within a single iteration. The agent never sees a "green build" until the architecture is correct.
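The generate, check, correct cycle described above can be sketched in plain Java. This is a simplified model under stated assumptions: the agent is a function from the previous failure messages to new code, the architecture check is a function from code to failure messages, and the class GuardedAgentLoop with its attempt limit is hypothetical rather than part of any agent framework.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of an agentic loop gated by architecture checks.
final class GuardedAgentLoop {

    // Result of the loop: last generated code, attempts used, and whether the checks passed.
    static final class Outcome {
        final String code;
        final int attempts;
        final boolean passed;

        Outcome(String code, int attempts, boolean passed) {
            this.code = code;
            this.attempts = attempts;
            this.passed = passed;
        }
    }

    private GuardedAgentLoop() {
    }

    // Calls the agent with the failures from the previous attempt (empty on the first),
    // and stops as soon as the architecture check reports no failures.
    static Outcome run(Function<List<String>, String> agent,
                       Function<String, List<String>> architectureCheck,
                       int maxAttempts) {
        List<String> failures = List.of();
        String code = "";
        int used = 0;
        while (used < maxAttempts) {
            used++;
            code = agent.apply(failures);
            failures = architectureCheck.apply(code);
            if (failures.isEmpty()) {
                return new Outcome(code, used, true);
            }
        }
        // The attempt budget is exhausted: the task is reported as not done.
        return new Outcome(code, used, false);
    }
}
```

The design choice that matters is that passed can only become true through an empty failure list, so the agent has no path to a green result that skips the architecture rules.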
Cross-ecosystem equivalents
ArchUnit is the gold standard for the JVM, but every ecosystem has analogous tools:
| Ecosystem | Tool | Notes |
|---|---|---|
| .NET | NetArchTest / ArchUnitNET | Same fluent API style as ArchUnit, scans .NET assemblies |
| JavaScript / TypeScript | eslint-plugin-boundaries | ESLint rules for module boundary enforcement |
| Python | import-linter | Enforces import rules based on declared contracts |
| Go | go-cleanarch | Checks clean architecture dependency rules |
The principle is universal: if you have a convention the compiler cannot enforce, write a test that can.