Testing Strategy

Principle: Test pyramid (many unit tests, few E2E tests), with contract tests at service boundaries.
Constraint: 5 engineers and no dedicated QA team → testing must be automated
AI Role: AI generates 60–70% of test code; humans validate edge cases and business rules
Source: Architect.md, Development Guide.md, Deployment.md, Deliverable 4.3 - Failure Modeling.md

1. Test Pyramid

                        ╱╲
                       ╱  ╲
                      ╱ E2E╲         5%   — Critical user journeys only
                     ╱______╲              Playwright, run nightly
                    ╱        ╲
                   ╱Integration╲    15%   — Service + DB + Service Bus
                  ╱____________╲          Testcontainers, run on PR
                 ╱              ╲
                ╱  Contract Tests ╲  20%  — Service boundary validation
               ╱__________________╲       Pact, run on PR (blocking)
              ╱                    ╲
             ╱     Unit Tests       ╲ 60% — Business logic, domain model
            ╱________________________╲    xUnit, run on every commit
           
  Target Coverage: ≥80% line coverage (unit + integration combined)
  Business logic:  ≥90% branch coverage
  Infrastructure:  ≥60% (Bicep what-if + deployment tests)

2. Test Types & Responsibilities

2.1 Unit Tests

Aspect	Detail
Framework	xUnit + FluentAssertions + Moq/NSubstitute
Scope	Domain entities, value objects, application services, validators, mappers
What to test	Business rules, edge cases, validation logic, state transitions
What NOT to test	EF Core queries (→ integration), HTTP endpoints (→ integration), 3rd party libs
Naming	`MethodName_Scenario_ExpectedResult`
AI role	AI generates happy path + 2–3 edge cases. Human adds business-specific cases
Run	Every commit (< 30 seconds per service)

// Example: Travel Service, Booking domain logic
// Note: traveler and flight are test fixtures whose setup is omitted here.
using System;
using FluentAssertions;
using Xunit;

public class BookingTests
{
    [Fact]
    public void Cancel_WhenWithin24Hours_ShouldRefundFull()
    {
        // Booked now, cancelled 12 hours later: inside the 24-hour window
        var booking = Booking.Create(traveler, flight, DateTime.UtcNow);
        booking.Cancel(DateTime.UtcNow.AddHours(12));
        
        booking.Status.Should().Be(BookingStatus.Cancelled);
        booking.RefundAmount.Should().Be(booking.TotalAmount);
    }

    [Fact]
    public void Cancel_WhenAfter24Hours_ShouldRefund50Percent()
    {
        // Booked two days ago, cancelled now: outside the 24-hour window
        var booking = Booking.Create(traveler, flight, DateTime.UtcNow.AddDays(-2));
        booking.Cancel(DateTime.UtcNow);
        
        booking.RefundAmount.Should().Be(booking.TotalAmount * 0.5m);
    }

    [Fact]
    public void Cancel_WhenAlreadyCancelled_ShouldThrowDomainException()
    {
        var booking = Booking.Create(traveler, flight, DateTime.UtcNow);
        booking.Cancel(DateTime.UtcNow);
        
        // A second cancellation is an invalid state transition
        var act = () => booking.Cancel(DateTime.UtcNow);
        act.Should().Throw<DomainException>()
           .WithMessage("Booking already cancelled");
    }
}
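
The rules those tests pin down can be sketched as a small domain class. This is a minimal illustration, not the Travel Service's actual entity: the `Create` signature is simplified (the real factory also takes a traveler and a flight), `BookingStatus.Confirmed` is a hypothetical initial state, and treating exactly 24 hours as still inside the full-refund window is an assumption the tests above do not settle.

```csharp
using System;

public enum BookingStatus { Confirmed, Cancelled }

public class DomainException : Exception
{
    public DomainException(string message) : base(message) { }
}

// Simplified stand-in for the Travel Service aggregate (hypothetical shape).
public class Booking
{
    private static readonly TimeSpan FullRefundWindow = TimeSpan.FromHours(24);

    public decimal TotalAmount { get; }
    public DateTime CreatedAtUtc { get; }
    public BookingStatus Status { get; private set; } = BookingStatus.Confirmed;
    public decimal RefundAmount { get; private set; }

    private Booking(decimal totalAmount, DateTime createdAtUtc)
    {
        TotalAmount = totalAmount;
        CreatedAtUtc = createdAtUtc;
    }

    // Assumption: the real factory also receives a traveler and a flight.
    public static Booking Create(decimal totalAmount, DateTime createdAtUtc)
        => new Booking(totalAmount, createdAtUtc);

    public void Cancel(DateTime cancelledAtUtc)
    {
        // Guard the state transition before touching any state.
        if (Status == BookingStatus.Cancelled)
            throw new DomainException("Booking already cancelled");

        // Assumption: exactly 24 hours still counts as inside the window.
        bool withinWindow = cancelledAtUtc - CreatedAtUtc <= FullRefundWindow;
        RefundAmount = withinWindow ? TotalAmount : TotalAmount * 0.5m;
        Status = BookingStatus.Cancelled;
    }
}
```

Passing the cancellation time in as a parameter, as the tests already do, keeps the rule testable without a clock abstraction.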

2.2 Integration Tests

Aspect	Detail
Framework	xUnit + Testcontainers (SQL Server, Service Bus emulator)
Scope	API endpoints, EF Core queries, message handlers, ACL calls
Pattern	`WebApplicationFactory<Program>` with real DB (containerized)
What to test	Full request pipeline, DB persistence, message publish/consume
What NOT to test	UI rendering (→ E2E), business logic already covered by unit tests
Run	Every PR (2–5 minutes per service)

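// Example: Travel API integration test
// TravelApiFactory is the project's WebApplicationFactory<Program> fixture;
// GetPublishedEvent<T>() is a helper on that fixture, not a framework API.
using System.Net;
using System.Net.Http.Json;
using System.Threading.Tasks;
using FluentAssertions;
using Xunit;

public class BookingApiTests : IClassFixture<TravelApiFactory>
{
    private readonly TravelApiFactory _factory;

    // xUnit injects the shared class fixture through the constructor
    public BookingApiTests(TravelApiFactory factory) => _factory = factory;

    [Fact]
    public async Task CreateBooking_WithValidData_Returns201AndPublishesEvent()
    {
        // Arrange
        var client = _factory.CreateClient();
        var request = new CreateBookingRequest { /* ... */ };

        // Act
        var response = await client.PostAsJsonAsync("/api/bookings", request);

        // Assert
        response.StatusCode.Should().Be(HttpStatusCode.Created);
        
        // Verify event published to Service Bus
        var publishedEvent = await _factory.GetPublishedEvent<BookingCreatedEvent>();
        publishedEvent.Should().NotBeNull();
        publishedEvent.BookingId.Should().NotBeEmpty();
    }
}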

2.3 Contract Tests (Pact)

Aspect	Detail
Framework	PactNet (consumer-driven)
Scope	Service-to-service API contracts, ACL contracts
Pattern	Consumer writes expectations → Provider verifies
Critical contracts	Travel → Payment ACL, Event → Comms, Workforce → Reporting
Broker	Pactflow (SaaS) for contract storage + verification
Run	Every PR (blocking — broken contract = blocked merge)

CONTRACT MAP:

Travel Service (Consumer) ←→ Payment ACL (Provider)
  Contract: POST /api/payments/authorize
    Request:  { bookingId, amount, currency }
    Response: { transactionId, status: "authorized"|"declined" }

Comms Service (Consumer) ←→ Event Service (Provider)
  Contract: Message "EventCreated" on Service Bus (Event Service publishes, Comms consumes)
    Schema:  { eventId, title, attendees[], notificationType }

Workforce (Consumer) ←→ Reporting (Provider)
  Contract: GET /api/reports/team/{teamId}/hours
    Response: { teamId, totalHours, members[] }
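
To make "Consumer writes expectations → Provider verifies" concrete, the sketch below hand-rolls the check for the Travel → Payment ACL response shape from the map above. It is an illustration only and does not use PactNet: in the real pipeline PactNet records the consumer's expectations and replays them against the provider. The class and method names are hypothetical, and treating `transactionId` as a JSON string is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hand-rolled illustration of a consumer expectation; not PactNet.
public static class PaymentAuthorizeContract
{
    private static readonly HashSet<string> AllowedStatuses =
        new HashSet<string> { "authorized", "declined" };

    // Returns the contract violations found in a provider response body.
    public static IReadOnlyList<string> Verify(string responseJson)
    {
        var violations = new List<string>();
        using var doc = JsonDocument.Parse(responseJson);
        var root = doc.RootElement;

        if (root.ValueKind != JsonValueKind.Object)
        {
            violations.Add("response must be a JSON object");
            return violations;
        }

        // Assumption: transactionId is carried as a string.
        if (!root.TryGetProperty("transactionId", out var tx)
            || tx.ValueKind != JsonValueKind.String)
            violations.Add("transactionId must be a string");

        if (!root.TryGetProperty("status", out var status)
            || status.ValueKind != JsonValueKind.String
            || !AllowedStatuses.Contains(status.GetString()!))
            violations.Add("status must be \"authorized\" or \"declined\"");

        return violations;
    }
}
```

The check names the field that drifted instead of failing somewhere downstream, which is the property the blocking PR gate relies on.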

Why Contract Tests > E2E for microservices?

E2E test for one user journey:
  Click "Book" → Travel → Payment ACL → Legacy → Comms → Email
  = 5 services must be running, healthy, and seeded with correct data
  = Flaky, slow (30s+), hard to debug
  = One failure anywhere → entire test fails

Contract tests for the same journey:
  Travel → Payment ACL: 2 tests (isolated, fast, deterministic)
  Travel → Comms: 2 tests (isolated, fast, deterministic)
  = Each test runs independently (< 1 second)
  = A failure pinpoints exactly which contract broke
  = No shared state between services, so far fewer sources of flakiness

2.4 E2E Tests

Aspect	Detail
Framework	Playwright (.NET)
Scope	Critical user journeys ONLY (5–10 scenarios max)
Environment	Staging (prod-like, real services)
Run	Nightly (not on PR — too slow, too flaky)
Owned by	D5 (Frontend + DevOps)

Critical E2E Scenarios:

#	Journey	Services Involved	Priority
1	User logs in → creates booking → sees confirmation	Auth, Travel, Comms	P1
2	Manager approves booking → payment authorized	Travel, Payment ACL	P1
3	Create event → attendees notified	Event, Comms	P2
4	View workforce report → export PDF	Workforce, Reporting	P2
5	Canary: new service version serves traffic correctly	Any (smoke test)	P1

2.5 Performance Tests

Aspect	Detail
Tool	k6 (open-source load testing)
Scope	API load test per service + system-wide load
Targets	P95 latency < 200ms, throughput ≥ 100 req/s sustained
Scenarios	Normal load (22 req/s), Peak (100 req/s), Burst (220 req/s, 5 min)
Run	Before each service go-live + monthly thereafter
Environment	Staging with prod-like data volume
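
The targets above are stated in requests per second, while k6 can also be driven by a number of virtual users (VUs). The two are related by Little's law. The worked numbers below are a rough sizing aid, assuming a response time R equal to the 200 ms latency target and a think time Z between requests; they are not measurements.

```latex
% N = concurrent virtual users, \lambda = throughput (req/s),
% R = response time (s), Z = think time between requests (s)
N = \lambda \,(R + Z)
\qquad\Longleftrightarrow\qquad
\lambda = \frac{N}{R + Z}

% VU-paced run: N = 100, R = 0.2 s, Z = 1 s
\lambda = \frac{100}{0.2 + 1} \approx 83\ \text{req/s}
\quad (\text{below the } 100\ \text{req/s target})

% Burst scenario with no think time: \lambda = 220 req/s, R = 0.2 s, Z = 0
N = 220 \times 0.2 = 44\ \text{virtual users}
```

In practice this means a run paced by 100 VUs with a one-second pause does not by itself demonstrate 100 req/s, whereas a rate-driven run needs only a few dozen VUs as long as latency stays near the target.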

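// k6 script example: Travel Service load test
// Uses an arrival-rate executor so load is expressed in requests per second,
// matching the Normal / Peak / Burst scenarios listed above. A VU-based ramp
// with a 1 s pause per iteration keeps each VU below 1 req/s.
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  scenarios: {
    travel_load: {
      executor: 'ramping-arrival-rate',
      startRate: 22,          // Normal load: 22 req/s
      timeUnit: '1s',
      preAllocatedVUs: 50,    // Assumption: sized for ~220 req/s at ~200ms latency
      maxVUs: 200,            // Assumption: headroom if latency degrades
      stages: [
        { duration: '2m', target: 22 },    // Hold normal load
        { duration: '1m', target: 100 },   // Ramp to peak
        { duration: '5m', target: 100 },   // Sustain peak (100 req/s)
        { duration: '30s', target: 220 },  // Ramp to burst
        { duration: '5m', target: 220 },   // Burst (220 req/s, 5 min)
        { duration: '1m', target: 0 },     // Ramp down
      ],
    },
  },
  thresholds: {
    http_req_duration: ['p(95)<200'],  // P95 < 200ms
    http_req_failed: ['rate<0.01'],    // Error rate < 1%
  },
};

export default function () {
  const res = http.get('https://staging.product.com/api/bookings');
  check(res, { 'status is 200': (r) => r.status === 200 });
}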

3. Test Automation in CI/CD

┌────────────────────────────────────────────────────────────────────┐
│  CI PIPELINE (GitHub Actions) — Per Service                        │
│                                                                    │
│  ┌──────────┐   ┌──────────────┐   ┌────────────┐   ┌──────────┐ │
│  │ Build    │──▶│ Unit Tests   │──▶│ Contract   │──▶│Integration│ │
│  │          │   │ (< 30s)      │   │ Tests      │   │ Tests     │ │
│  │          │   │ Coverage ≥80%│   │ (Pact)     │   │ (2–5min)  │ │
│  └──────────┘   └──────────────┘   └────────────┘   └──────────┘ │
│       │                                                    │       │
│       ▼              ALL MUST PASS                          ▼       │
│  ┌──────────┐                                        ┌──────────┐ │
│  │ Security │  gitleaks + CodeQL + Dependabot         │ Merge OK │ │
│  │ Scan     │────────────────────────────────────────▶│          │ │
│  └──────────┘                                        └──────────┘ │
│                                                                    │
│  ┌── NIGHTLY ──────────────────────────────────────────────────┐  │
│  │  E2E Tests (Playwright) on Staging                          │  │
│  │  Report → Slack channel #quality                            │  │
│  └─────────────────────────────────────────────────────────────┘  │
│                                                                    │
│  ┌── PRE-RELEASE ──────────────────────────────────────────────┐  │
│  │  Performance Tests (k6) on Staging                          │  │
│  │  Must pass thresholds before go-live approval               │  │
│  └─────────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────┘

4. Test Data Strategy

Concern	Approach
Unit tests	In-memory objects, builder pattern (e.g., `BookingBuilder.WithStatus(Cancelled).Build()`)
Integration tests	Testcontainers (fresh DB per test class). EF Core migrations applied on startup
Contract tests	Pact fixtures (JSON). Versioned with contract
E2E tests	Seeded test data in staging environment. Dedicated test tenant (isolated from real data)
Performance tests	Synthetic data generator (k6 + Faker). Scale: 10K bookings, 500 events, 5K users
PII in tests	NEVER use production data. Synthetic only. Faker library for realistic fake data
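
The builder pattern named in the unit-test row can be sketched as follows. This is a minimal illustration under stated assumptions: the `Booking` record here is a simplified stand-in for the real entity, and the default values and the `WithTotalAmount` / `CreatedAt` methods are hypothetical additions beyond the `WithStatus(...).Build()` call shown in the table.

```csharp
using System;

public enum BookingStatus { Confirmed, Cancelled }

// Simplified stand-in for the real Booking entity (hypothetical shape).
public record Booking(Guid Id, decimal TotalAmount, DateTime CreatedAtUtc, BookingStatus Status);

public class BookingBuilder
{
    // Defaults describe a valid booking, so each test states only what it cares about.
    private Guid _id = Guid.NewGuid();
    private decimal _totalAmount = 100m;
    private DateTime _createdAtUtc = new DateTime(2025, 1, 1, 0, 0, 0, DateTimeKind.Utc);
    private BookingStatus _status = BookingStatus.Confirmed;

    public BookingBuilder WithStatus(BookingStatus status) { _status = status; return this; }

    public BookingBuilder WithTotalAmount(decimal amount) { _totalAmount = amount; return this; }

    public BookingBuilder CreatedAt(DateTime createdAtUtc) { _createdAtUtc = createdAtUtc; return this; }

    // Produces an immutable object from the values collected so far.
    public Booking Build() => new Booking(_id, _totalAmount, _createdAtUtc, _status);
}
```

A test would then read `new BookingBuilder().WithStatus(BookingStatus.Cancelled).Build()`, and adding a field to the entity later means changing the builder once instead of every test.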

5. Quality Gates — Per Phase

Phase 0 (Month 1): Foundation

xUnit project structure per service
CI pipeline with unit tests
Testcontainers setup for integration tests
PactNet + Pactflow integration
Coverage reporting (Coverlet → PR comment)

Phase 1 (Month 2–4): First Services

Travel Service: ≥80% coverage, 10+ contract tests
Event Service: ≥80% coverage
ACL: Contract tests for Payment bridge (critical)
First E2E scenario running on staging

Phase 2 (Month 5–7): Full Suite

All 5 services: ≥80% coverage
All service-to-service contracts in Pactflow
5 E2E scenarios on nightly run
First k6 performance test (Travel Service)

Phase 3 (Month 8–9): Hardening

All services: performance tested, thresholds set
Chaos testing: kill one service container → verify graceful degradation
Full regression: all tests green (unit + integration + contract + E2E)
Test documentation: what's tested, what's not, known gaps

6. AI-Assisted Testing

Activity	AI Tool	Human Role
Generate unit test boilerplate	Cursor Pro (inline)	Review edge cases, add business-specific scenarios
Generate integration test setup	Claude API (agentic)	Verify DB assertions, check race conditions
Suggest missing test cases	CodeRabbit (PR review)	Accept/reject suggestions
Generate k6 scripts	Claude.ai	Validate thresholds, adjust load profile
Generate Pact contracts from OpenAPI	Claude API	Verify contract matches actual business rules

Anti-pattern to prevent:

❌ AI generates 50 unit tests, all pass, developer moves on
   → Problem: tests may only cover happy path
   → Fix: Enforce "mutation testing" spot-checks monthly
          (change business logic → test should fail. If not → gap)

❌ AI generates mock that always returns success
   → Problem: failure paths untested
   → Fix: PR review checklist: "Are error scenarios tested?"
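
The mutation spot-check in the first fix can be illustrated with a small sketch. The refund rule, the mutant, and both suites below are hypothetical examples written for this section (assuming exactly 24 hours still earns a full refund); they show why a happy-path suite can pass against deliberately broken logic.

```csharp
using System;

public static class RefundPolicy
{
    // Original rule (assumption: exactly 24 hours is still a full refund).
    public static decimal Refund(decimal total, TimeSpan sinceBooking)
        => sinceBooking <= TimeSpan.FromHours(24) ? total : total * 0.5m;

    // Mutant: "<=" changed to "<", as a mutation tool or manual spot-check would do.
    public static decimal RefundMutant(decimal total, TimeSpan sinceBooking)
        => sinceBooking < TimeSpan.FromHours(24) ? total : total * 0.5m;
}

public static class MutationSpotCheck
{
    // Happy-path checks at 12 h and 48 h. Both implementations pass, so the mutant survives.
    public static bool WeakSuitePasses(Func<decimal, TimeSpan, decimal> refund)
        => refund(200m, TimeSpan.FromHours(12)) == 200m
        && refund(200m, TimeSpan.FromHours(48)) == 100m;

    // Adds the boundary case at exactly 24 h. Only the original passes, so the mutant is killed.
    public static bool StrongSuitePasses(Func<decimal, TimeSpan, decimal> refund)
        => WeakSuitePasses(refund)
        && refund(200m, TimeSpan.FromHours(24)) == 200m;
}
```

A surviving mutant, here `WeakSuitePasses(RefundPolicy.RefundMutant)` returning true, is the signal that a business rule is not actually pinned down by the generated tests.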

7. Test Metrics Dashboard

Metric	Target	Alert If
Unit test coverage	≥80%	< 75% on any service
Contract test count	≥5 per service boundary	New endpoint without contract
E2E pass rate (nightly)	≥95%	< 90% for 3 consecutive nights
CI pipeline time	< 10 min	> 15 min (optimize or parallelize)
Performance P95 latency	< 200ms	> 300ms on staging
Flaky test rate	< 2%	> 5% (quarantine and fix)
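
Two of the alert rules in the table can be expressed as small predicates. This is a sketch under stated assumptions: the type and method names are hypothetical, rates are fractions between 0 and 1, and "3 consecutive nights" is read as the three most recent nightly runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TestHealthAlerts
{
    // E2E rule: alert when the three most recent nightly pass rates are all below 90%.
    public static bool E2EAlert(IReadOnlyList<double> nightlyPassRates)
        => nightlyPassRates.Count >= 3
        && nightlyPassRates.Skip(nightlyPassRates.Count - 3).All(rate => rate < 0.90);

    // Coverage rule: return every service whose line coverage is below 75%.
    public static IReadOnlyList<string> CoverageAlerts(
        IReadOnlyDictionary<string, double> coverageByService)
        => coverageByService
            .Where(entry => entry.Value < 0.75)
            .Select(entry => entry.Key)
            .ToList();
}
```

Requiring three bad nights in a row keeps a single flaky run from paging the team, while coverage below the 80% target but above 75% stays a warning instead of an alert.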

Weekly Test Health Report (Slack #quality):

  Service       | Coverage | Contracts | E2E  | Perf P95 | Status
  ─────────────┼──────────┼───────────┼──────┼──────────┼───────
  Travel        |    85%   |    12     | 3/3  |  145ms   |  ✅
  Event         |    82%   |     8     | 2/2  |  120ms   |  ✅
  Workforce     |    79%   |     6     | 1/1  |  160ms   |  ⚠️ (coverage)
  Comms         |    88%   |     5     | 2/2  |   95ms   |  ✅
  Reporting     |    81%   |     4     | 1/1  |  180ms   |  ✅
  ACL           |    90%   |    10     | 1/1  |  110ms   |  ✅
