
Before / After Scenarios

Concrete scenarios showing what changes when Visdom Testing is deployed.

Each scenario is drawn from real patterns observed across enterprise teams adopting AI-assisted development. The "before" reflects what happens with standard testing practices. The "after" shows what changes when Visdom Testing layers are in place.

Scenario 1: AI generates a pricing module

Pricing module with AI-generated tests

BEFORE

Copilot generates a pricing calculation module along with 10 unit tests.

  1. Tests achieve 90% line coverage.
  2. Code review looks clean — all tests pass, coverage is high.
  3. Code ships to production.
  4. Two rounding bugs discovered by customers weeks later.
AFTER

Property-based testing (PBT) finds 2 rounding bugs via oracle properties before the code leaves the PR.

  1. Oracle property compares production implementation against naive reference.
  2. jqwik generates 1000+ input combinations and finds a discrepancy.
  3. Automatic shrinking produces minimal counterexample: unitPrice=0.01, qty=48, discount=1.05%.
  4. Developer fixes rounding logic before merge.
Bugs found before merge: 0 → 2
False confidence: high → eliminated
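The oracle check in steps 1–3 can be sketched in plain Java. In practice jqwik's `@Property` would generate the inputs and shrink the counterexample; both pricing functions below, including the rounds-too-early bug, are hypothetical illustrations:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class PricingOracle {

    // Hypothetical production implementation with the bug: it rounds the
    // discounted unit price to cents BEFORE multiplying by quantity.
    static BigDecimal productionTotal(BigDecimal unitPrice, int qty, BigDecimal discountPct) {
        BigDecimal factor = BigDecimal.ONE.subtract(discountPct.movePointLeft(2));
        BigDecimal discountedUnit = unitPrice.multiply(factor)
                .setScale(2, RoundingMode.HALF_UP);
        return discountedUnit.multiply(BigDecimal.valueOf(qty));
    }

    // Naive reference (the oracle): compute exactly, round once at the end.
    static BigDecimal referenceTotal(BigDecimal unitPrice, int qty, BigDecimal discountPct) {
        BigDecimal factor = BigDecimal.ONE.subtract(discountPct.movePointLeft(2));
        return unitPrice.multiply(BigDecimal.valueOf(qty)).multiply(factor)
                .setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // The shrunken counterexample from the scenario.
        BigDecimal unitPrice = new BigDecimal("0.01");
        int qty = 48;
        BigDecimal discount = new BigDecimal("1.05");

        BigDecimal prod = productionTotal(unitPrice, qty, discount);
        BigDecimal ref = referenceTotal(unitPrice, qty, discount);
        System.out.println("production=" + prod + " reference=" + ref);
        // prints production=0.48 reference=0.47 — the two disagree
    }
}
```

The property jqwik would check is simply `productionTotal(...)` equals `referenceTotal(...)` for all generated inputs; the counterexample above is where that equality breaks.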

Scenario 2: AI generates a Spring service

Spring controller with layer violations

BEFORE

AI generates a Spring controller that calls the repository directly, bypassing the service layer.

  1. Code compiles successfully.
  2. AI-generated tests pass — everything is mocked.
  3. Code review may or may not catch the layer violation.
  4. Architecture erodes over time as more controllers bypass services.
AFTER

ArchUnit rule catches the layer violation at build time. AI fixes it on retry.

  1. ArchUnit rule: controllers must not access repositories directly.
  2. Build fails with a clear violation message.
  3. AI regenerates with proper service layer indirection.
  4. Result: 0/10 violations with ArchUnit vs 10/10 without.
Architecture violations: 10/10 → 0/10
Fix cost: paid in production → paid at build time
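The rule in step 1 might look roughly like this in ArchUnit (a sketch: the base package `com.example.shop` and the `..controller..` / `..repository..` package layout are assumptions about the project):

```java
import com.tngtech.archunit.junit.AnalyzeClasses;
import com.tngtech.archunit.junit.ArchTest;
import com.tngtech.archunit.lang.ArchRule;

import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

// Hypothetical base package; adjust to your project layout.
@AnalyzeClasses(packages = "com.example.shop")
class LayeringRules {

    // Controllers must go through the service layer, never straight to repositories.
    @ArchTest
    static final ArchRule controllersMustNotAccessRepositories =
            noClasses().that().resideInAPackage("..controller..")
                    .should().dependOnClassesThat().resideInAPackage("..repository..");
}
```

When an AI-generated controller injects a repository, this rule fails the build with a message naming the offending dependency, which is exactly the feedback the model needs to regenerate with a service in between.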

Scenario 3: Migration from RestTemplate to RestClient

API migration enforcement

BEFORE

Team decides to migrate from RestTemplate to RestClient. AI keeps introducing RestTemplate.

  1. AI has more RestTemplate examples in training data.
  2. Each PR re-fights the same battle: reviewer catches RestTemplate usage, sends it back.
  3. Some usages slip through review, especially in large PRs.
  4. Migration stalls. Both APIs coexist indefinitely.
AFTER

ArchUnit bans RestTemplate. FreezingArchRule allows legacy code. AI gets build failure and uses RestClient.

  1. ArchUnit rule bans RestTemplate imports in new code.
  2. FreezingArchRule captures existing RestTemplate usages as a baseline.
  3. AI generates code, gets a build failure, and uses RestClient on retry.
  4. Senior engineers stop reviewing for API migration; the build enforces it.
Regression rate: constant → zero
Senior review time: high → down 70%
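Steps 1–2 can be sketched with ArchUnit's `FreezingArchRule` (a sketch: the base package `com.example.shop` is an assumption; the `RestTemplate` class name is the real Spring type):

```java
import com.tngtech.archunit.core.domain.JavaClasses;
import com.tngtech.archunit.core.importer.ClassFileImporter;
import com.tngtech.archunit.lang.ArchRule;
import com.tngtech.archunit.library.freeze.FreezingArchRule;

import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

class RestClientMigrationTest {

    void restTemplateIsBanned() {
        // Hypothetical base package; adjust to your project.
        JavaClasses classes = new ClassFileImporter().importPackages("com.example.shop");

        ArchRule noRestTemplate = noClasses()
                .should().dependOnClassesThat()
                .haveFullyQualifiedName("org.springframework.web.client.RestTemplate")
                .because("we are migrating to RestClient");

        // freeze() records existing violations in a violation store, so only
        // NEW RestTemplate usages fail the build; legacy code is tolerated
        // and the baseline shrinks as usages are removed.
        FreezingArchRule.freeze(noRestTemplate).check(classes);
    }
}
```

This is what lets the migration ratchet in one direction: the frozen baseline can only shrink, never grow.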

Scenario 4: Flaky test plague

CI trust restoration

BEFORE

Test suite takes 45 minutes. 84% of failures are flaky. Developers lose trust in CI.

  1. Developers merge without waiting for CI results.
  2. Real failures get lost in the noise of flaky failures.
  3. QA team spends hours triaging failures that turn out to be flakes.
  4. Bugs escape to production because the safety net has holes.
AFTER

TORS identifies flaky tests. Quarantine + ownership. Predictive test selection restores speed.

  1. TORS (Test Observability and Reliability Score) identifies tests with <98% pass rate.
  2. Flaky tests are quarantined: they run but do not block the pipeline.
  3. Each quarantined test gets an owner and a fix-by date.
  4. Predictive Test Selection (PTS) reduces PR suite to relevant tests: 45 min → 8 min.
CI trust: low → high
Suite time: 45 min → 8 min (with TIA)
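The quarantine decision in steps 1–2 reduces to a pass-rate threshold over recent runs. A minimal sketch, assuming a TORS-style score is just the pass rate of the last N executions (the function name and data shape are hypothetical):

```java
import java.util.Collections;
import java.util.List;

public class TorsQuarantine {

    // Hypothetical TORS-style check: a test whose recent pass rate falls
    // below 98% is quarantined (it still runs, but no longer blocks the pipeline).
    static boolean shouldQuarantine(List<Boolean> recentRuns) {
        long passes = recentRuns.stream().filter(passed -> passed).count();
        return (double) passes / recentRuns.size() < 0.98;
    }

    public static void main(String[] args) {
        List<Boolean> flaky = new java.util.ArrayList<>(Collections.nCopies(96, true));
        flaky.addAll(Collections.nCopies(4, false)); // 96/100 = 96% pass rate

        System.out.println(shouldQuarantine(flaky));                          // true
        System.out.println(shouldQuarantine(Collections.nCopies(100, true))); // false
    }
}
```

In a real deployment the run history would come from CI telemetry, and quarantined tests would carry the owner and fix-by metadata described in step 3.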

Start with one scenario

You don't need to deploy all layers at once. Pick the scenario that matches your biggest pain point. Flaky tests? Start with TORS and quarantine. Architecture erosion? Start with ArchUnit. Computation bugs? Start with PBT on your critical calculations.