Unit & Integration Testing (Mid-Depth)

What This Builds On

The surface layer got you started with basic testing: write a test, verify it catches bugs, run it before deploying. You understand the difference between unit and integration tests, and you’ve shipped code with test coverage.

This mid-depth layer is for building production-ready test suites. You’re solving different problems now:

Tests are slow. Your suite takes 15 minutes to run. Developers skip tests before committing. CI/CD blocks deploys for half an hour.
Tests are brittle. You refactor code and 30 tests break, even though behavior didn’t change. Tests are coupled to implementation details.
Tests don’t catch real bugs. You have 80% coverage but bugs still ship. Your tests verify your code runs, not that it’s correct.
Mocking confusion. You mock everything and tests pass while production breaks. You mock nothing and tests are slow and flaky.
Growing codebase complexity. Your simple test approach doesn’t scale. Test data is a mess. Tests depend on each other.

We’ll fix these problems with techniques from testing experts: the test double taxonomy popularized by Martin Fowler (the terms were coined by Gerard Meszaros), Kent C. Dodds’ Testing Trophy, and Maurício Aniche’s test effectiveness research.

The Problems You’re Solving

Problem 1: Slow Tests Kill Productivity

Your integration tests hit the database. Every test creates test data, runs queries, cleans up. The suite takes 20 minutes.

Developers stop running tests locally. They push code, wait for CI/CD, then fix failures. The feedback loop is 30 minutes instead of 30 seconds.

Solution approach:

Use test doubles at architectural boundaries (mock external APIs, not internal functions)
Use fast in-memory databases for integration tests
Parallelize test execution
Separate fast tests from slow tests

We’ll cover these in detail.
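As a starting point for separating fast tests from slow ones, here is a minimal Jest configuration sketch. It relies on Jest's `projects` option; the directory layout and project names are assumptions, so adjust them to your repository.

```javascript
// jest.config.js - a sketch; paths and project names are hypothetical
const config = {
  projects: [
    {
      // Fast tests: no database, no network
      displayName: 'unit',
      testMatch: ['<rootDir>/src/**/*.test.js'],
    },
    {
      // Slower tests that touch a database or other real dependencies
      displayName: 'integration',
      testMatch: ['<rootDir>/tests/integration/**/*.test.js'],
    },
  ],
};

module.exports = config;
```

With this layout, `npx jest --selectProjects unit` runs only the fast group locally, while CI can run both. Jest already runs test files in parallel worker processes; `--maxWorkers` controls how many.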

Problem 2: Brittle Tests Slow Down Refactoring

You rename a function. You change how a component renders internally. You refactor from classes to hooks. Tests break everywhere, even though behavior is identical.

This is testing implementation instead of behavior. Your tests are coupled to how code works, not what it does.

Solution approach:

Test behavior users see, not internal implementation
Avoid testing private functions directly
Use higher-level integration tests instead of unit tests for complex interactions
Follow the Testing Trophy model

Problem 3: High Coverage, Low Effectiveness

You have 90% code coverage. Tests are green. A user enters unexpected input and the app crashes. Coverage measured lines executed, not whether tests were meaningful.

Solution approach:

Property-based testing generates inputs you wouldn’t think to test
Mutation testing validates tests actually catch bugs
Focus on specification-based testing (what should happen) not just structural testing (what code exists)
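To make the gap between coverage and effectiveness concrete, here is a small sketch. The `average` function is a hypothetical example, and the checks use Node's built-in assert module instead of a test framework so the snippet runs on its own.

```javascript
const { strictEqual } = require('node:assert/strict');

// Hypothetical function under test: average of a list of numbers.
function average(numbers) {
  const sum = numbers.reduce((total, n) => total + n, 0);
  return sum / numbers.length;
}

// This single check executes every line of average(), so a coverage
// tool would report 100% line coverage for the function.
strictEqual(average([2, 4, 6]), 4);

// The specification question "what is the average of nothing?" was
// never asked. The function quietly returns NaN (0 / 0).
const resultForEmptyInput = average([]);
strictEqual(Number.isNaN(resultForEmptyInput), true);
```

The first check is structural: it proves the code runs. The second comes from thinking about the specification, and it exposes behavior that coverage numbers alone would never flag.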

Problem 4: Mock Abuse Creates False Confidence

You mock the database. You mock the email service. You mock internal functions. You mock everything. Tests pass. You deploy. Everything breaks because mocks don’t match reality.

Solution approach:

Understand test double types: mocks, stubs, fakes, spies
Mock at architectural boundaries, not everywhere
Use real implementations when practical
Keep mocks synchronized with real services (contract testing)

Test Doubles: When and How to Fake Dependencies

The test double taxonomy that Martin Fowler popularized (the terms come from Gerard Meszaros) clarifies the confusion around mocks, stubs, fakes, and spies. Each type serves a different purpose.

The Five Types of Test Doubles

1. Dummy Objects

Objects passed around but never actually used. They fill parameter lists but aren’t accessed.

test('processes order without using logger', () => {
  const dummyLogger = null; // Never called, just fills parameter
  const order = new Order(dummyLogger);

  order.addItem('product-123');

  expect(order.total()).toBe(29.99);
});

You rarely create these intentionally. They emerge when functions require parameters you don’t care about in a specific test.

2. Stubs

Return canned responses to calls. No logic, just predetermined data.

// Stub payment gateway - always succeeds
class StubPaymentGateway {
  async charge(amount, card) {
    return {
      id: 'ch_stub123',
      status: 'succeeded',
      amount: amount
    };
  }
}

test('creates order when payment succeeds', async () => {
  const paymentGateway = new StubPaymentGateway();
  const order = await checkout(cart, paymentGateway);

  expect(order.status).toBe('paid');
  expect(order.paymentId).toBe('ch_stub123');
});

Tests that use stubs verify state. Did the payment result in a paid order? The stub doesn’t care how it was used; it only supplies the response.

When to use stubs:

External services you don’t control (payment gateways, shipping APIs)
Tests that need specific responses (error conditions, edge cases)
Making slow operations fast (database reads become instant returns)
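For error conditions, a configurable stub keeps each scenario to a few lines. The sketch below is one possible shape; `makeStubGateway`, the `checkout` logic, and the field names are all hypothetical and only illustrate the idea.

```javascript
// Hypothetical stub factory: returns a gateway whose charge() always
// resolves with the canned result you pass in. The card is ignored.
function makeStubGateway(cannedResult) {
  return {
    async charge(amount, card) {
      return { amount, ...cannedResult };
    },
  };
}

// Hypothetical code under test: maps the gateway result to an order.
async function checkout(cart, paymentGateway) {
  const charge = await paymentGateway.charge(cart.total, cart.card);
  if (charge.status !== 'succeeded') {
    return { status: 'payment_failed', reason: charge.declineCode };
  }
  return { status: 'paid', paymentId: charge.id };
}

// One stub per scenario, no network involved.
const decliningGateway = makeStubGateway({
  id: 'ch_stub_declined',
  status: 'failed',
  declineCode: 'insufficient_funds',
});

checkout({ total: 5000, card: 'tok_stub' }, decliningGateway).then((order) => {
  console.log(order.status, order.reason);
});
```

Reproducing a declined card against a real gateway is slow and awkward. With a stub it is a single object literal.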

3. Spies

Record how they were called. A spy can stand in for a dependency, as in the example below, or wrap a real object to observe its behavior.

test('sends welcome email after signup', async () => {
  const emailSpy = {
    sent: [],
    async send(to, subject, body) {
      this.sent.push({ to, subject, body });
    }
  };

  await createUser('user@example.com', 'password', emailSpy);

  // Verify email was sent
  expect(emailSpy.sent).toHaveLength(1);
  expect(emailSpy.sent[0].to).toBe('user@example.com');
  expect(emailSpy.sent[0].subject).toContain('Welcome');
});

Spies verify behavior. Was the email service called with the right arguments?

Most testing frameworks include spy functionality:

// Jest spy
const emailService = {
  send: jest.fn()
};

await createUser('user@example.com', 'password', emailService);

expect(emailService.send).toHaveBeenCalledWith(
  'user@example.com',
  expect.stringContaining('Welcome'),
  expect.anything()
);

When to use spies:

Verify side effects happened (email sent, event logged, API called)
Check function was called with correct arguments
Track call count (function called exactly once, or three times)
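A spy that wraps a real object can be sketched in a few lines. The `spyOn` helper below is hand-rolled for illustration (Jest's own `jest.spyOn` also calls through to the original method by default); `priceFormatter` is a hypothetical object.

```javascript
// Hypothetical helper: wraps obj[methodName] so calls are recorded
// and then forwarded to the real implementation.
function spyOn(obj, methodName) {
  const original = obj[methodName];
  const calls = [];
  obj[methodName] = function (...args) {
    calls.push(args);
    return original.apply(this, args);
  };
  return {
    calls,
    // Puts the real method back so other tests are unaffected
    restore() {
      obj[methodName] = original;
    },
  };
}

const priceFormatter = {
  format(amount) {
    return `$${amount.toFixed(2)}`;
  },
};

const spy = spyOn(priceFormatter, 'format');
const label = priceFormatter.format(9.5); // the real method still runs
spy.restore();
```

After this runs, `label` holds the real formatted value and `spy.calls` holds the recorded arguments, so a test can assert on both the result and the interaction.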

4. Mocks

Pre-programmed with expectations. Tests fail if expectations aren’t met.

test('saves audit log on critical action', async () => {
  const auditLogger = {
    log: jest.fn()
  };

  await performCriticalAction(auditLogger);

  // Mock expectation: log must be called with specific arguments
  expect(auditLogger.log).toHaveBeenCalledWith({
    action: 'critical_action',
    timestamp: expect.any(Number),
    user: expect.any(String)
  });
});

Mocks and spies look similar, and in Jest both are built from jest.fn(). The difference is in intent. Spies observe what happened. Mocks define what should happen and fail if it doesn’t.
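The difference in intent is easier to see in a hand-rolled mock, where expectations are declared before the code runs. This is a sketch, not how Jest implements it; `createMockLogger` and its methods are hypothetical names.

```javascript
// Hypothetical hand-rolled mock: expectations are declared up front,
// and verify() throws if they were not met.
function createMockLogger() {
  const expected = [];
  const actual = [];
  return {
    expectLog(action) {
      expected.push(action);
    },
    log(entry) {
      actual.push(entry.action);
    },
    verify() {
      const missing = expected.filter((action) => !actual.includes(action));
      if (missing.length > 0) {
        throw new Error(`Expected log calls not received: ${missing.join(', ')}`);
      }
    },
  };
}

// Hypothetical code under test.
function performCriticalAction(auditLogger) {
  auditLogger.log({ action: 'critical_action', timestamp: Date.now() });
}

const auditLogger = createMockLogger();
auditLogger.expectLog('critical_action'); // 1. declare the expectation
performCriticalAction(auditLogger);       // 2. exercise the code
auditLogger.verify();                     // 3. fail if it was not met
```

A spy would only offer step 2 plus a record to inspect afterwards. The mock carries its own pass/fail criteria.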

When to use mocks:

Verify interactions with dependencies (logging, analytics, external APIs)
Ensure required side effects occur (audit logging, event publishing)
Test code paths that depend on specific call sequences

5. Fakes

Working implementations with shortcuts. Simpler than production but functionally equivalent for testing.

// Fake in-memory database
class FakeDatabase {
  constructor() {
    this.users = [];
  }

  async create(data) {
    const user = { id: this.users.length + 1, ...data };
    this.users.push(user);
    return user;
  }

  async findById(id) {
    return this.users.find(u => u.id === id);
  }

  async clear() {
    this.users = [];
  }
}

// Use in tests
let db;

beforeEach(() => {
  db = new FakeDatabase();
});

test('can create and retrieve user', async () => {
  const user = await db.create({ email: 'test@example.com' });
  const found = await db.findById(user.id);

  expect(found.email).toBe('test@example.com');
});

Fakes are real implementations, just simpler. An in-memory database instead of PostgreSQL. A fake file system instead of disk I/O.

When to use fakes:

Database (in-memory SQLite instead of PostgreSQL)
File system (in-memory instead of disk)
Time/clock (controllable time instead of real system time)
Random number generators (seeded RNG for reproducibility)
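Fakes for time and randomness are usually tiny. The sketch below shows one possible shape for each; `FakeClock`, `createSeededRandom`, and `isSessionExpired` are hypothetical names, and the generator is a simple linear congruential generator chosen for brevity.

```javascript
// Hypothetical fake clock: time only moves when the test says so.
class FakeClock {
  constructor(startMs = 0) {
    this.currentMs = startMs;
  }
  now() {
    return this.currentMs;
  }
  advance(ms) {
    this.currentMs += ms;
  }
}

// Hypothetical seeded generator. Same seed, same sequence.
// Not suitable for anything security related.
function createSeededRandom(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 2 ** 32; // value in [0, 1)
  };
}

// Hypothetical code under test: takes the clock as a dependency
// instead of calling Date.now() directly.
function isSessionExpired(session, clock, ttlMs = 30 * 60 * 1000) {
  return clock.now() - session.createdAt >= ttlMs;
}

const clock = new FakeClock(1000);
const session = { createdAt: clock.now() };
const expiredAtStart = isSessionExpired(session, clock);
clock.advance(30 * 60 * 1000);
const expiredAfterTtl = isSessionExpired(session, clock);
```

The design choice that makes this work is dependency injection: production code passes a real clock, tests pass the fake, and a 30-minute timeout is tested in microseconds.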

The Mock Abuse Anti-Pattern

The most common testing mistake is mocking everything. It creates tests that pass while production code is broken.

Example of over-mocking:

// ❌ Bad - testing implementation, not behavior
test('creates user', async () => {
  const mockValidator = jest.fn().mockReturnValue(true);
  const mockHasher = jest.fn().mockReturnValue('hashed_password');
  const mockDb = jest.fn().mockResolvedValue({ id: 1 });
  const mockEmailer = jest.fn().mockResolvedValue(true);

  await createUser(
    'test@example.com',
    'password',
    mockValidator,
    mockHasher,
    mockDb,
    mockEmailer
  );

  expect(mockValidator).toHaveBeenCalled();
  expect(mockHasher).toHaveBeenCalled();
  expect(mockDb).toHaveBeenCalled();
  expect(mockEmailer).toHaveBeenCalled();
});

This test passes if you call the mocked functions, even if every function has bugs. You’re testing that code calls functions, not that it works correctly.

Better approach:

// ✅ Good - testing behavior with minimal mocking
test('creates user and sends welcome email', async () => {
  // Use real validator
  // Use real password hasher
  // Use real test database
  // Only mock external email service
  const emailSpy = { sent: [] };

  const user = await createUser('test@example.com', 'SecurePass123!', {
    sendEmail: (to, subject, body) => emailSpy.sent.push({ to, subject, body })
  });

  // Verify user was created correctly
  expect(user.id).toBeTruthy();
  expect(user.email).toBe('test@example.com');

  // Verify password was hashed (not stored plaintext)
  expect(user.password).not.toBe('SecurePass123!');
  expect(user.password).toMatch(/^\$2[aby]\$/); // bcrypt hash

  // Verify email was sent
  expect(emailSpy.sent).toHaveLength(1);
  expect(emailSpy.sent[0].to).toBe('test@example.com');

  // Verify can log in with created user
  const session = await login('test@example.com', 'SecurePass123!');
  expect(session.userId).toBe(user.id);
});

This test uses real implementations for everything except the external email service. If validation fails, password hashing breaks, or database saving fails, the test catches it.

Decision Framework: Mock or Real?

Use real implementations when:

Internal functions in your codebase
Pure functions with no side effects
Fast operations (validation, calculation, formatting)
In-memory alternatives exist (SQLite instead of PostgreSQL)

Use test doubles when:

External services you don’t control (Stripe, Twilio, AWS)
Slow operations you can’t speed up (network calls, file I/O)
Unpredictable operations (current time, random values)
Testing error conditions (simulate API failure)

Specific recommendations:

Dependency	Recommendation	Reason
Database	Real engine (in-memory SQLite)	Fast enough for most suites and still executes real SQL
External API	Stub/Mock	You don’t control it, might be down, costs money
Email service	Stub/Spy	External service, slow, costs money
File system	Fake (in-memory)	Fake file systems are fast and deterministic
Time/dates	Fake (controllable)	Makes tests deterministic
Random	Fake (seeded)	Makes tests reproducible
Validation	Real	Fast, pure logic, your code
Calculations	Real	Fast, pure logic, your code

Test-Driven Development (TDD) in Practice

TDD is a discipline: write the test first, then the implementation. Popularized by Kent Beck, it remains controversial. Some swear by it. Others find it impractical.

The truth is context-dependent.

The Red-Green-Refactor Cycle

TDD follows a three-step loop:

Red: Write a failing test
Green: Write minimal code to make it pass
Refactor: Improve code while keeping tests green

Complete example - Building a shopping cart:

Step 1: Red - Write failing test

test('can add item to cart', () => {
  const cart = new ShoppingCart();
  cart.addItem({ id: '123', name: 'Widget', price: 9.99 });

  expect(cart.items).toHaveLength(1);
  expect(cart.items[0].name).toBe('Widget');
});

// Test fails - ShoppingCart doesn't exist

Step 2: Green - Minimal implementation

class ShoppingCart {
  constructor() {
    this.items = [];
  }

  addItem(item) {
    this.items.push(item);
  }
}

// Test passes

Step 3: Refactor - Improve (nothing to improve yet)

Continue the cycle.

Step 4: Red - Next test

test('calculates total', () => {
  const cart = new ShoppingCart();
  cart.addItem({ id: '123', price: 9.99 });
  cart.addItem({ id: '456', price: 15.00 });

  // Floating-point sums are not exact, so compare with a tolerance
  expect(cart.total()).toBeCloseTo(24.99, 2);
});

// Test fails - total() doesn't exist

Step 5: Green - Implement total()

class ShoppingCart {
  constructor() {
    this.items = [];
  }

  addItem(item) {
    this.items.push(item);
  }

  total() {
    return this.items.reduce((sum, item) => sum + item.price, 0);
  }
}

// Test passes

Step 6: Red - Add quantity support

The current design doesn’t support quantities. That is new behavior rather than a refactor, so the cycle starts again with a failing test.

test('supports item quantities', () => {
  const cart = new ShoppingCart();
  cart.addItem({ id: '123', price: 9.99 }, 2);

  expect(cart.total()).toBeCloseTo(19.98, 2);
});

// Test fails - quantity not supported

Step 7: Green - Support quantities while the earlier tests stay green:

class ShoppingCart {
  constructor() {
    this.items = [];
  }

  addItem(item, quantity = 1) {
    this.items.push({ ...item, quantity });
  }

  total() {
    return this.items.reduce((sum, item) => {
      return sum + (item.price * item.quantity);
    }, 0);
  }
}

// All tests pass

TDD forces you to think about the interface before implementation. What should this function do? What should it return? What errors should it handle?

When TDD Works Well

TDD shines in specific contexts:

1. Well-understood requirements

You know what the code should do. You just need to implement it.

// Requirement: Calculate tax based on amount and rate
// Requirement: Reject negative amounts
// Requirement: Round to 2 decimal places

test('calculates tax for positive amount', () => {
  expect(calculateTax(100, 0.08)).toBe(8.00);
});

test('rejects negative amount', () => {
  expect(() => calculateTax(-100, 0.08)).toThrow('Amount must be positive');
});

test('rounds to 2 decimal places', () => {
  // 99.99 * 0.085 = 8.49915, so this input actually exercises rounding
  expect(calculateTax(99.99, 0.085)).toBe(8.5);
});

Writing tests first clarifies edge cases and requirements.
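For reference, here is a sketch of an implementation those requirements could drive out. The rounding approach is an assumption made for illustration; production money code often works in integer cents instead.

```javascript
// A minimal sketch of calculateTax, written to satisfy the three
// requirements listed in the tests.
function calculateTax(amount, rate) {
  if (amount < 0) {
    throw new Error('Amount must be positive');
  }
  // Round to the nearest cent
  return Math.round(amount * rate * 100) / 100;
}
```

Each branch exists because a test demanded it: the guard clause for the negative-amount test, the rounding for the decimal-places test.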

2. Bug fixes

Write a failing test that reproduces the bug, then fix the bug.

// Bug report: User can submit form with empty email
test('rejects empty email', () => {
  expect(() => validateSignup({ email: '', password: 'pass' }))
    .toThrow('Email required');
});

// Test fails - bug confirmed
// Now fix the bug
// Test passes - bug fixed and won't regress

This guarantees the bug won’t come back. If it does, the test catches it.

3. Refactoring legacy code

Write characterization tests that pin down current behavior (even if it’s bad), then refactor.

// Legacy code with no tests
function processPayment(amount, card) {
  // 200 lines of spaghetti code
}

// First: Write tests for current behavior
test('processes valid payment', () => {
  const result = processPayment(100, '4242424242424242');
  expect(result.status).toBe('success');
});

// Now refactor with confidence
// If tests pass, behavior is preserved

4. Learning new APIs or libraries

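Small test-first experiments help you pin down how a specific API behaves. Run them against the provider’s sandbox or test mode, and keep them out of the regular suite so it never depends on the network.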

// Learning how Stripe API works
test('creates customer', async () => {
  const customer = await stripe.customers.create({
    email: 'test@example.com'
  });

  expect(customer.id).toMatch(/^cus_/);
  expect(customer.email).toBe('test@example.com');
});

Writing tests forces you to read documentation and experiment.

When TDD Struggles

TDD doesn’t work well in every context:

1. Exploratory work and spikes

You don’t know what you’re building yet. You’re experimenting with approaches.

// Bad fit for TDD
// You don't know what the UI should look like
// You're trying different layouts and interactions
// Write the UI first, add tests after you know what works

2. UI/UX development

Hard to write tests before you know what the UI should look like.

Better approach: Build UI, get feedback, then add tests for the settled design.

3. Unclear or changing requirements

Tests for requirements that change every day become wasted effort.

4. Learning unfamiliar technology

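When a whole framework, library, or domain is new to you, test-first is backwards. You don’t know what to test because you don’t understand the technology yet. This is different from probing a single documented API call with a test, as in the Stripe example earlier.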

Better approach: Spike without tests to learn, then delete spike code and rebuild with TDD.

Pragmatic TDD: A Hybrid Approach

Neither pure TDD nor test-after is universally better. Use what works for the context.

Recommended workflow:

Spike/explore - Build quick proof-of-concept without tests
Once approach is clear - Delete spike code
Rebuild with TDD - Write tests first for settled design
Or test-after - If approach is very clear, implement then test

Example: Building a new feature

// Phase 1: Spike (no tests)
// Try different approaches to see what works
// Get feedback on UI/UX
// This code will be deleted

// Phase 2: Rebuild with TDD
// You know what works now
// Write tests first
test('user can filter products by category', () => {
  // Test code
});

// Implement feature with confidence

Kent Beck (who popularized TDD) admits he doesn’t always use it. Use TDD when it helps, skip it when it doesn’t.

Test Organization and Maintainability

As your codebase grows, test organization matters. Poorly organized tests become technical debt.

Test Structure Patterns

Two popular patterns: AAA and Given-When-Then.

AAA (Arrange-Act-Assert)

test('user can update profile', async () => {
  // Arrange - set up test data and preconditions
  const user = await createUser({
    email: 'user@example.com',
    name: 'Original Name'
  });
  const updates = { name: 'New Name' };

  // Act - perform the action being tested
  const result = await updateProfile(user.id, updates);

  // Assert - verify the outcome
  expect(result.name).toBe('New Name');
  expect(result.email).toBe('user@example.com'); // Unchanged

  // Verify database was updated
  const saved = await db.users.findById(user.id);
  expect(saved.name).toBe('New Name');
});

Clear structure makes tests readable. You see setup, action, and verification.

Given-When-Then (BDD style)

describe('Shopping Cart Discounts', () => {
  it('applies 10% discount when total exceeds $100', async () => {
    // Given a cart with $120 worth of items
    const cart = new ShoppingCart();
    cart.addItem({ id: '1', price: 100, name: 'Item 1' });
    cart.addItem({ id: '2', price: 20, name: 'Item 2' });

    // When calculating total
    const total = cart.calculateTotal();

    // Then 10% discount is applied
    expect(total).toBe(108); // $120 - 10% = $108
  });

  it('does not apply discount when total is under $100', () => {
    // Given a cart with $80 worth of items
    const cart = new ShoppingCart();
    cart.addItem({ id: '1', price: 80, name: 'Item 1' });

    // When calculating total
    const total = cart.calculateTotal();

    // Then no discount is applied
    expect(total).toBe(80);
  });
});

Given-When-Then reads like specifications. Tests document requirements.

Both patterns work. Pick one and be consistent.

Test Data Builders

As tests grow, creating test data becomes repetitive. The Test Data Builder pattern solves this.

Without builder (repetitive):

test('admin can delete posts', async () => {
  const admin = await createUser({
    email: 'admin@example.com',
    name: 'Admin User',
    role: 'admin',
    verified: true,
    createdAt: new Date()
  });

  const post = await createPost({
    title: 'Test Post',
    content: 'Content',
    author: admin.id,
    published: true,
    createdAt: new Date()
  });

  await deletePost(admin, post.id);

  expect(await findPost(post.id)).toBeNull();
});

With builder (concise and readable):

class UserBuilder {
  constructor() {
    this.data = {
      email: 'test@example.com',
      name: 'Test User',
      role: 'user',
      verified: true
    };
  }

  withEmail(email) {
    this.data.email = email;
    return this;
  }

  asAdmin() {
    this.data.role = 'admin';
    return this;
  }

  unverified() {
    this.data.verified = false;
    return this;
  }

  async build() {
    return await createUser(this.data);
  }
}

// Usage
test('admin can delete posts', async () => {
  const admin = await new UserBuilder()
    .asAdmin()
    .build();

  const post = await new PostBuilder()
    .withAuthor(admin.id)
    .build();

  await deletePost(admin, post.id);

  expect(await findPost(post.id)).toBeNull();
});

Builders make tests readable. You see what matters (user is admin) and ignore what doesn’t (exact email address).
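The usage example relies on a PostBuilder that isn’t shown. A sketch following the same pattern might look like this; the default field values and the in-memory `createPost` stand-in are assumptions made so the snippet runs on its own.

```javascript
// Hypothetical persistence helper standing in for the real createPost().
let nextPostId = 1;
async function createPost(data) {
  return { id: nextPostId++, ...data };
}

class PostBuilder {
  constructor() {
    // Defaults cover every required field so tests only override
    // what they care about.
    this.data = {
      title: 'Test Post',
      content: 'Content',
      author: null,
      published: true,
    };
  }

  withAuthor(authorId) {
    this.data.author = authorId;
    return this;
  }

  draft() {
    this.data.published = false;
    return this;
  }

  async build() {
    // Copy so a reused builder never shares state between posts
    return createPost({ ...this.data });
  }
}
```

Every `with...` method returns `this`, which is what makes the chained style possible. Named states like `draft()` read better in a test than `published: false`.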

Another example with multiple variations:

class OrderBuilder {
  constructor() {
    this.data = {
      status: 'pending',
      total: 0,
      items: []
    };
  }

  withItem(product, quantity = 1) {
    this.data.items.push({ product, quantity });
    this.data.total += product.price * quantity;
    return this;
  }

  paid() {
    this.data.status = 'paid';
    return this;
  }

  shipped() {
    this.data.status = 'shipped';
    return this;
  }

  async build() {
    return await createOrder(this.data);
  }
}

// Tests become self-documenting
test('can refund paid orders', async () => {
  const order = await new OrderBuilder()
    .withItem({ id: '123', price: 50 })
    .paid()
    .build();

  const refund = await refundOrder(order.id);
  expect(refund.amount).toBe(50);
});

test('cannot refund shipped orders', async () => {
  const order = await new OrderBuilder()
    .withItem({ id: '123', price: 50 })
    .shipped()
    .build();

  await expect(refundOrder(order.id))
    .rejects.toThrow('Cannot refund shipped orders');
});

DAMP Tests, Not DRY Tests

DRY (Don’t Repeat Yourself) is good for production code. Tests should be DAMP (Descriptive And Meaningful Phrases).

Too DRY (bad for tests):

let user, product, cart;

beforeEach(async () => {
  user = await createStandardUser();
  product = await createStandardProduct();
  cart = await createStandardCart(user);
});

test('can checkout', async () => {
  // What's "standard"? Have to read beforeEach to understand
  await checkout(cart);
  expect(cart.status).toBe('complete');
});

test('cannot checkout empty cart', async () => {
  // Wait, this test uses empty cart but beforeEach creates non-empty cart?
  // Confusing!
  cart.items = [];
  await expect(checkout(cart)).rejects.toThrow();
});

DAMP (good for tests):

test('can checkout with items in cart', async () => {
  const user = await createUser('user@example.com');
  const cart = await createCart(user);
  await cart.addItem({ id: '123', price: 50 });

  await checkout(cart);

  expect(cart.status).toBe('complete');
});

test('cannot checkout empty cart', async () => {
  const user = await createUser('user@example.com');
  const cart = await createCart(user); // Empty cart, explicit

  await expect(checkout(cart)).rejects.toThrow('Cart is empty');
});

Each test is self-contained. You understand it without reading other code.

When to share setup:

Share setup when every test needs identical preconditions:

describe('User permissions', () => {
  let adminUser;

  beforeEach(async () => {
    // Every test needs admin user
    adminUser = await createUser({ role: 'admin' });
  });

  test('admin can delete users', async () => {
    // Uses adminUser
  });

  test('admin can ban users', async () => {
    // Uses adminUser
  });
});

But if tests need different data, make it explicit in each test.

Property-Based Testing

Most tests use examples. Property-based testing uses generated inputs and verifies properties hold.

Example-Based vs Property-Based

Example-based (traditional):

test('reversing a string twice returns original', () => {
  expect(reverse(reverse('hello'))).toBe('hello');
  expect(reverse(reverse('world'))).toBe('world');
  expect(reverse(reverse('a'))).toBe('a');
});

You test specific examples. But what about strings you didn’t think to test?

Property-based (generative):

const fc = require('fast-check');

test('reversing a string twice returns original', () => {
  fc.assert(
    fc.property(fc.string(), (str) => {
      expect(reverse(reverse(str))).toBe(str);
    })
  );
});

// This generates hundreds of random strings
// Including edge cases: '', 'a', very long strings, Unicode, etc.

The framework generates inputs and verifies the property (reversing twice returns original) holds for all of them.
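To demystify what the framework does, here is a deliberately small sketch of the core loop without fast-check. All names are hypothetical, and it omits shrinking (reducing a failing input to a minimal one), which real libraries such as fast-check provide.

```javascript
// Function under test.
function reverse(str) {
  return str.split('').reverse().join('');
}

// Deliberately buggy variant: silently truncates input longer than 10 characters.
function buggyReverse(str) {
  return reverse(str.slice(0, 10));
}

// Hypothetical seeded generator so every run produces the same inputs.
function createSeededRandom(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 2 ** 32;
  };
}

const ALPHABET = 'abcXYZ019 _-!';

// Builds a string of the requested length from random alphabet characters.
function randomString(random, length) {
  let result = '';
  for (let i = 0; i < length; i++) {
    result += ALPHABET[Math.floor(random() * ALPHABET.length)];
  }
  return result;
}

// Checks the property against `runs` generated inputs, cycling through
// lengths 0..maxLength. Returns the first failing input, or null.
function findCounterexample(property, { runs = 100, maxLength = 20, seed = 1 } = {}) {
  const random = createSeededRandom(seed);
  for (let i = 0; i < runs; i++) {
    const input = randomString(random, i % (maxLength + 1));
    if (!property(input)) {
      return input;
    }
  }
  return null;
}

const noFailure = findCounterexample((s) => reverse(reverse(s)) === s);
const failure = findCounterexample((s) => buggyReverse(buggyReverse(s)) === s);
```

The correct `reverse` survives every generated input, so `noFailure` is null. The buggy variant passes hand-picked examples like 'hello' but fails as soon as the generator produces an 11-character string, which is the kind of input people rarely think to write by hand.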

When to Use Property-Based Testing

Property-based testing excels for code with clear invariants (rules that always hold).

Good fits:

1. Mathematical properties

// Property: Sorting is idempotent (sorting twice = sorting once)
fc.assert(
  fc.property(fc.array(fc.integer()), (arr) => {
    const sorted1 = sort(arr);
    const sorted2 = sort(sorted1);
    expect(sorted1).toEqual(sorted2);
  })
);

// Property: Sorted array has same elements as original
fc.assert(
  fc.property(fc.array(fc.integer()), (arr) => {
    const sorted = sort(arr);
    expect(sorted.length).toBe(arr.length);
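    // Compare as multisets: sort copies numerically so neither array is mutated
    const byValue = (a, b) => a - b;
    expect([...sorted].sort(byValue)).toEqual([...arr].sort(byValue));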
  })
);

2. Encoders/decoders

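// Property: Decoding encoded data returns original
// Note: fc.anything() also produces values such as undefined, NaN, and -0.
// Narrow the arbitrary to the inputs your codec claims to support.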
fc.assert(
  fc.property(fc.anything(), (data) => {
    const encoded = encode(data);
    const decoded = decode(encoded);
    expect(decoded).toEqual(data);
  })
);

3. Business logic with invariants

// Property: Adding item increases cart total
fc.assert(
  fc.property(
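    // fc.double avoids fast-check's 32-bit bound checks on fc.float; noNaN keeps NaN out
    fc.double({ min: 0.01, max: 1000, noNaN: true }),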
    (price) => {
      const cart = new ShoppingCart();
      const before = cart.total();

      cart.addItem({ id: 'test', price });

      expect(cart.total()).toBeGreaterThan(before);
      expect(cart.total()).toBe(before + price);
    }
  )
);

// Property: Removing all items leaves cart empty
fc.assert(
  fc.property(
    fc.array(fc.record({
      id: fc.string(),
      // Bounded and NaN-free so totals stay finite
      price: fc.double({ min: 0.01, max: 1000, noNaN: true })
    })),
    (items) => {
      const cart = new ShoppingCart();
      items.forEach(item => cart.addItem(item));

      cart.clear();

      expect(cart.items).toHaveLength(0);
      expect(cart.total()).toBe(0);
    }
  )
);

4. Validation logic

// Property: Valid email format is accepted
// (fc.emailAddress() follows the RFCs, so expect unusual but valid addresses)
fc.assert(
  fc.property(
    fc.emailAddress(),
    (email) => {
      expect(() => validateEmail(email)).not.toThrow();
    }
  )
);

// Property: Invalid format is rejected
fc.assert(
  fc.property(
    fc.string().filter(s => !s.includes('@')),
    (invalid) => {
      expect(() => validateEmail(invalid)).toThrow();
    }
  )
);

Property-Based Testing Libraries

JavaScript: fast-check
Python: Hypothesis
Java: jqwik
Haskell: QuickCheck (the original)

Limitations

Property-based testing can’t replace all tests. Some behaviors don’t have clear properties.

// Hard to express as property
test('welcome email has correct subject line', async () => {
  const email = await sendWelcomeEmail('user@example.com');
  expect(email.subject).toBe('Welcome to our app!');
});

There’s no property here, just a specific expected value. Use example-based tests.

Use property-based testing for code with mathematical or logical invariants. Use example-based testing for everything else.

Integration Testing Strategies

Integration tests verify components work together. More realistic than unit tests, slower to run.

Testing with Real Databases

Don’t mock your database in integration tests. Use a real database (but not production).

Transaction rollback pattern:

describe('User repository', () => {
  let transaction;

  beforeEach(async () => {
    transaction = await db.beginTransaction();
  });

  afterEach(async () => {
    await transaction.rollback();
  });

  test('can create and find user', async () => {
    const user = await userRepo.create({
      email: 'test@example.com',
      name: 'Test User'
    }, { transaction });

    const found = await userRepo.findById(user.id, { transaction });

    expect(found.email).toBe('test@example.com');
    expect(found.name).toBe('Test User');
  });

  test('prevents duplicate email', async () => {
    await userRepo.create({
      email: 'user@example.com',
      name: 'User One'
    }, { transaction });

    await expect(userRepo.create({
      email: 'user@example.com',
      name: 'User Two'
    }, { transaction })).rejects.toThrow('Email already exists');
  });
});

Each test runs in a transaction that rolls back. No cleanup needed. Tests are isolated.

Isolated test database pattern:

// Each test gets fresh database
beforeAll(async () => {
  await db.migrate.latest();
});

beforeEach(async () => {
  await db.seed.run(); // Seed with test data
});

afterEach(async () => {
  await db.raw('TRUNCATE TABLE users CASCADE');
  await db.raw('TRUNCATE TABLE orders CASCADE');
});

Truncating tables between tests ensures isolation.

In-memory database for speed:

// Use SQLite in-memory for tests instead of PostgreSQL
const knex = require('knex');

const db = knex({
  client: 'sqlite3',
  connection: ':memory:',
  useNullAsDefault: true
});

// Runs much faster than PostgreSQL
// Good enough for most integration tests

SQLite in-memory is fast but not identical to PostgreSQL: types, constraint behavior, and SQL dialect differ (SQLite has no TRUNCATE statement, for example). Trade-off: speed vs production parity.

Testing External APIs

Don’t call real external APIs in tests. They’re slow, cost money, and might be down.

Mock Service Worker (MSW) for HTTP mocking:

// MSW 1.x syntax. MSW 2.x replaces `rest` and `res(ctx...)` with `http` and `HttpResponse`.
import { rest } from 'msw';
import { setupServer } from 'msw/node';

const server = setupServer(
  rest.get('https://api.stripe.com/v1/charges/:id', (req, res, ctx) => {
    return res(ctx.json({
      id: req.params.id,
      amount: 1000,
      status: 'succeeded'
    }));
  }),

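  rest.post('https://api.stripe.com/v1/charges', async (req, res, ctx) => {
    // Stripe requests are form-encoded, so parse the raw body text
    const params = new URLSearchParams(await req.text());
    return res(ctx.json({
      id: 'ch_test123',
      amount: Number(params.get('amount')),
      status: 'succeeded'
    }));
  })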
);

beforeAll(() => server.listen());
afterEach(() => server.resetHandlers());
afterAll(() => server.close());

test('can process payment', async () => {
  const result = await stripe.charges.create({
    amount: 1000,
    currency: 'usd',
    source: 'tok_visa'
  });

  expect(result.status).toBe('succeeded');
  expect(result.amount).toBe(1000);
});

test('handles payment failure', async () => {
  // Override handler for this test
  server.use(
    rest.post('https://api.stripe.com/v1/charges', (req, res, ctx) => {
      return res(ctx.status(402), ctx.json({
        error: { message: 'Card declined' }
      }));
    })
  );

  await expect(stripe.charges.create({
    amount: 1000,
    currency: 'usd',
    source: 'tok_visa'
  })).rejects.toThrow('Card declined');
});

MSW intercepts HTTP requests and returns mock responses. Your code makes real HTTP calls, but they never leave your machine.

Contract Testing for Microservices

When testing microservices, you have a problem: services depend on each other. How do you test service A without running service B?

Contract testing (using Pact) solves this:

Consumer test (Frontend expects specific API response):

const { Pact } = require('@pact-foundation/pact');

const provider = new Pact({
  consumer: 'Frontend',
  provider: 'UserAPI'
});

describe('User API', () => {
  beforeAll(() => provider.setup());
  afterEach(() => provider.verify()); // fails if an expected request was not made
  afterAll(() => provider.finalize());

  test('can get user by ID', async () => {
    await provider.addInteraction({
      state: 'user 123 exists',
      uponReceiving: 'a request for user 123',
      withRequest: {
        method: 'GET',
        path: '/users/123'
      },
      willRespondWith: {
        status: 200,
        headers: { 'Content-Type': 'application/json' },
        body: {
          id: 123,
          name: 'John Doe',
          email: 'john@example.com'
        }
      }
    });

    const user = await userAPI.getUser(123);

    expect(user.name).toBe('John Doe');
    expect(user.email).toBe('john@example.com');
  });
});

This creates a contract: “When I GET /users/123, I expect this response format.”

Provider verification (Backend must satisfy contract):

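The backend does not rerun the consumer test. Instead, Pact’s verifier replays the requests recorded in the generated pact file against the running backend and checks that the responses match what the frontend expects: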

const { Verifier } = require('@pact-foundation/pact');

it('validates the expectations of Frontend', () => {
  return new Verifier({
    provider: 'UserAPI',
    providerBaseUrl: 'http://localhost:3000',
    pactUrls: ['path/to/frontend-userapi.json']
  }).verifyProvider();
});

If the backend changes the response format, this test fails. You know the frontend will break before deploying.

Contract testing prevents integration bugs between services.
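
To make the idea concrete without Pact, the sketch below hand-rolls the core check behind provider verification: compare an actual response against the shape the consumer recorded. The helper `matchesContract` and the sample responses are hypothetical, and the check only compares field names and value types. Pact itself does considerably more (provider states, HTTP replay, flexible matchers).

```javascript
// Hypothetical helper: checks that `actual` has every field the consumer
// expects, with the same primitive type. Extra fields are allowed.
function matchesContract(expected, actual) {
  return Object.entries(expected).every(([key, value]) => {
    if (!(key in actual)) return false;
    if (value !== null && typeof value === 'object') {
      return actual[key] !== null &&
        typeof actual[key] === 'object' &&
        matchesContract(value, actual[key]);
    }
    return typeof actual[key] === typeof value;
  });
}

// What the consumer test recorded for GET /users/123
const expectedBody = { id: 123, name: 'John Doe', email: 'john@example.com' };

// Provider still satisfies the contract (an extra field is harmless)
const compatible = { id: 7, name: 'Ada', email: 'ada@example.com', plan: 'pro' };

// Provider renamed `name` to `fullName`: the consumer would break
const breaking = { id: 7, fullName: 'Ada', email: 'ada@example.com' };

console.log(matchesContract(expectedBody, compatible)); // true
console.log(matchesContract(expectedBody, breaking));   // false
```

The point of the sketch is the direction of the check: the consumer states what it needs, and the provider is tested against that statement, so a rename on the provider side is caught before deployment.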

Test Effectiveness and Quality

You have tests. But are they good tests?

Mutation Testing

Mutation testing answers: “If I introduce bugs, do my tests catch them?”

How it works:

Mutation testing tool changes your code (mutates it)
Runs tests against mutated code
If tests still pass, mutation survived (weak tests)
If tests fail, mutation killed (good tests)

Example with Stryker (JavaScript):

npm install --save-dev @stryker-mutator/core
npx stryker run

Original code:

function isAdult(age) {
  return age >= 18;
}

Test:

test('isAdult returns true for an adult', () => {
  expect(isAdult(25)).toBe(true); // far from the boundary at 18
});

Stryker mutates code:

// Mutation: >= becomes >
function isAdult(age) {
  return age > 18;
}

Test still passes! Mutation survived. Your test didn’t verify the boundary condition.

Add better test:

test('isAdult boundary cases', () => {
  expect(isAdult(18)).toBe(true);  // Exactly 18
  expect(isAdult(17)).toBe(false); // Just under
  expect(isAdult(19)).toBe(true);  // Just over
});

Now when Stryker mutates >= to >, test fails. Mutation killed.
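
The kill/survive logic can be imitated by hand to see why the boundary test matters. This is a minimal sketch, not how Stryker is implemented: the mutant is written manually, and each "suite" is a hypothetical function that returns true when all of its checks pass. A suite that only checks a value far from the boundary (age 30 here) lets the mutant survive.

```javascript
// Original implementation and a hand-written mutant (>= replaced by >)
const original = (age) => age >= 18;
const mutant = (age) => age > 18;

// Each "suite" takes an implementation and returns true if all checks pass
const weakSuite = (isAdult) => isAdult(30) === true;
const boundarySuite = (isAdult) =>
  isAdult(18) === true && isAdult(17) === false && isAdult(19) === true;

// A mutant is killed when the suite passes on the original but fails on the mutant
function mutantStatus(suite) {
  if (!suite(original)) return 'suite fails on original code';
  return suite(mutant) ? 'survived' : 'killed';
}

console.log(mutantStatus(weakSuite));     // survived
console.log(mutantStatus(boundarySuite)); // killed
```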

Mutation score (killed mutants divided by total mutants) indicates test quality. As a rough rule of thumb:

90%+ mutation score = excellent tests
70-90% = good tests
<70% = weak tests

Mutation testing is slow (runs your tests many times with different mutations). Run it occasionally, not on every commit.

Test Smells (Maurício Aniche)

Common problems that make tests worse than useless:

1. Flaky Test

Test passes/fails randomly.

Causes:

Timing issues (race conditions, hardcoded sleeps)
Shared state between tests
External dependencies (network, file system)
Randomness (Math.random() in production code)

Fix:

// ❌ Flaky - timing dependent
test('modal appears', async () => {
  clickButton();
  await sleep(100); // Maybe enough, maybe not
  expect(modal).toBeVisible();
});

// ✅ Fixed - wait for condition
test('modal appears', async () => {
  clickButton();
  await waitFor(() => expect(modal).toBeVisible());
});
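
Randomness, another cause listed above, can be tamed in a similar spirit by making the nondeterministic input explicit. The sketch below assumes a hypothetical `pickWinner` function. The random source is passed in as a parameter that defaults to `Math.random`, so production code is unchanged while a test can supply a fixed value.

```javascript
// Production code: the random source is a parameter with a sensible default
function pickWinner(entries, random = Math.random) {
  const index = Math.floor(random() * entries.length);
  return entries[index];
}

// In a test, pass a fixed source so the result is the same on every run
const entries = ['ana', 'ben', 'chloe', 'dev'];
const alwaysHalf = () => 0.5;

console.log(pickWinner(entries, alwaysHalf)); // chloe
```

The same idea applies to clocks and timers: inject the source of time rather than reading it globally inside the function under test.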

2. Slow Test

Test takes too long. Developers skip running it.

Fix:

Use in-memory database instead of PostgreSQL
Mock external APIs
Parallelize test execution
Move to integration/E2E suite (run less frequently)
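
One way to separate fast and slow suites in Jest is the `projects` option. The configuration below is a sketch: the directory layout (`src/` for unit tests, `tests/integration/` for integration tests) is an assumption, so adjust the globs to match your repository.

```javascript
// jest.config.js (sketch): directory names are assumptions, adjust to your layout
const config = {
  projects: [
    {
      displayName: 'unit',
      testMatch: ['<rootDir>/src/**/*.test.js'],
    },
    {
      displayName: 'integration',
      testMatch: ['<rootDir>/tests/integration/**/*.test.js'],
    },
  ],
};

module.exports = config;
```

With this in place, `npx jest --selectProjects unit` runs only the fast project locally, while CI can run both.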

3. Mystery Guest

Test depends on hidden external state.

// ❌ Mystery guest - where did test-data.json come from?
test('loads users', async () => {
  const users = await loadUsers('test-data.json');
  expect(users).toHaveLength(5);
});

// ✅ Fixed - explicit data
test('loads users', async () => {
  const data = [
    { id: 1, name: 'User 1' },
    { id: 2, name: 'User 2' }
  ];
  const users = await loadUsers(data);
  expect(users).toHaveLength(2);
});

4. Resource Optimism

Test assumes resources exist.

// ❌ Assumes file exists
test('reads config', () => {
  const config = readFile('/tmp/config.json');
  expect(config.setting).toBe('value');
});

// ✅ Creates resource in test and always cleans it up
test('reads config', () => {
  writeFile('/tmp/test-config.json', { setting: 'value' });
  try {
    const config = readFile('/tmp/test-config.json');
    expect(config.setting).toBe('value');
  } finally {
    // Runs even when the assertion fails, so no file leaks into later tests
    deleteFile('/tmp/test-config.json');
  }
});
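
A fixed path under /tmp can still collide when tests run in parallel. The sketch below uses Node's built-in `fs.mkdtempSync` to give each test its own directory. `readConfig` and `withTempDir` are hypothetical helpers written for this example, not part of any library.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical function under test: reads and parses a JSON config file
function readConfig(filePath) {
  return JSON.parse(fs.readFileSync(filePath, 'utf8'));
}

// Creates a unique temp directory, runs the callback, and always cleans up
function withTempDir(callback) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'config-test-'));
  try {
    return callback(dir);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

const setting = withTempDir((dir) => {
  const file = path.join(dir, 'config.json');
  fs.writeFileSync(file, JSON.stringify({ setting: 'value' }));
  return readConfig(file).setting;
});

console.log(setting); // value
```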

5. Assertion Roulette

Too many assertions. Which one failed?

// ❌ Which assertion failed?
test('user data', () => {
  expect(user.name).toBe('John');
  expect(user.email).toBe('john@example.com');
  expect(user.age).toBe(30);
  expect(user.role).toBe('admin');
  expect(user.verified).toBe(true);
});

// ✅ Separate tests or descriptive error
test('user has correct name', () => {
  expect(user.name).toBe('John');
});

test('user has correct email', () => {
  expect(user.email).toBe('john@example.com');
});
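
When the fields belong together, a single structural assertion is another option: one failure message then shows every field that differs. The sketch below uses Node's built-in `assert` module so it runs without a test framework. The objects are made-up sample data, and in Jest the closest equivalent is `expect(actual).toEqual(expected)`.

```javascript
const nodeAssert = require('node:assert');

const expected = { name: 'John', email: 'john@example.com', role: 'admin' };
const actual = { name: 'John', email: 'john@example.org', role: 'admin' };

let message = '';
try {
  // One structural assertion instead of one assertion per field
  nodeAssert.deepStrictEqual(actual, expected);
} catch (error) {
  message = error.message; // includes a diff of the two objects
}

console.log(message);
```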

6. Conditional Test Logic

If/else in tests means some paths never execute.

// ❌ Conditional logic in test
test('processes data', () => {
  if (data.type === 'A') {
    expect(process(data)).toBe('result A');
  } else {
    expect(process(data)).toBe('result B');
  }
});

// ✅ Separate tests
test('processes type A', () => {
  const data = { type: 'A' };
  expect(process(data)).toBe('result A');
});

test('processes type B', () => {
  const data = { type: 'B' };
  expect(process(data)).toBe('result B');
});
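
When there are many input/output pairs, a table of cases keeps each one explicit without writing a separate test body per case. This is a framework-free sketch with a hypothetical `processData` function. In Jest the same table could be passed to `test.each`.

```javascript
// Hypothetical function under test
function processData(data) {
  return data.type === 'A' ? 'result A' : 'result B';
}

// Each row is one explicit case: input and expected output
const cases = [
  { input: { type: 'A' }, expected: 'result A' },
  { input: { type: 'B' }, expected: 'result B' },
];

// Every row runs unconditionally, so no path is silently skipped
const failures = cases.filter(
  ({ input, expected }) => processData(input) !== expected
);

console.log(failures.length); // 0
```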

Code Coverage: What It Means (and Doesn’t)

Code coverage measures what lines executed, not whether tests are good.

Coverage types:

Line coverage: % of lines executed
Branch coverage: % of if/else branches taken
Function coverage: % of functions called
Statement coverage: % of statements executed
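
The difference between line and branch coverage is easiest to see on a small function. In the sketch below (`shippingCost` is a made-up example), a single call executes every line, yet only one side of the `if` is ever taken.

```javascript
function shippingCost(order) {
  let cost = 5;
  if (order.express) cost += 10;
  return cost;
}

// This single call executes every line of shippingCost (100% line coverage),
// but only the "true" side of the `if`, so branch coverage is 1 of 2.
console.log(shippingCost({ express: true }));  // 15

// Covering the other branch needs a second case
console.log(shippingCost({ express: false })); // 5
```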

The problem with coverage as a metric:

// 100% coverage, zero value
test('function runs', () => {
  myFunction();
  expect(true).toBe(true);
});

Every line executed. No verification of correctness.

Better metric: Mutation testing score

Mutation testing measures whether tests catch bugs, not just execute code.

How to use coverage:

Use coverage to find untested code
Don’t use coverage as quality metric
Don’t mandate 100% coverage (diminishing returns)
Focus on critical path coverage (not every getter/setter)

Good coverage target: 70-80% for critical code paths

100% coverage has costs:

Time writing tests for trivial code
Brittle tests for code that doesn’t need testing
False confidence from bad tests that achieve coverage

Practical CI/CD Integration

Tests only help if they run. Integrate testing into your deployment pipeline.

Test Running Strategy

Run fast tests first, slow tests later. Fast feedback on most bugs.

# GitHub Actions example
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Install dependencies
        run: npm install

      # Fast feedback: Unit tests first (seconds)
      - name: Unit Tests
        run: npm run test:unit

      # Medium speed: Integration tests (minutes)
      - name: Integration Tests
        run: npm run test:integration

      # Main branch only: this step is skipped on pull requests and other branches
      - name: E2E Tests
        if: github.ref == 'refs/heads/main'
        run: npm run test:e2e

      # Coverage report
      - name: Upload Coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info

Strategy:

Unit tests on every commit (fast, immediate feedback)
Integration tests on every commit (slower, but catches most bugs)
E2E tests on main branch only (slow, catches integration issues)

Test parallelization:

// Jest configuration
module.exports = {
  maxWorkers: '50%', // Use 50% of CPU cores
  testMatch: ['**/__tests__/**/*.test.js']
};

Or split across CI workers:

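# Run different test suites in parallel
# Each job starts on a fresh runner, so each one checks out and installs
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: npm install
      - run: npm run test:unit

  integration-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: npm install
      - run: npm run test:integration

  e2e-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: npm install
      - run: npm run test:e2e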

All three jobs run simultaneously. Total time = slowest job, not sum of all jobs.

Failing Tests Block Deployment

Configure CI/CD to prevent deployment if tests fail:

deploy:
  needs: [unit-tests, integration-tests]
  runs-on: ubuntu-latest
  steps:
    - name: Deploy to production
      run: ./deploy.sh

The deploy job only runs if every job listed under needs succeeds, so code that fails the unit or integration suite never reaches production.

What’s Next

This mid-depth layer covered production-ready testing practices:

Test doubles (mocks, stubs, fakes, spies) and when to use each
TDD workflow and when it helps vs hurts
Test organization with builders and DAMP principles
Property-based testing for invariants
Integration testing with real databases
Test effectiveness with mutation testing
Test smells and how to avoid them
CI/CD integration strategies

You can now build test suites that are fast, reliable, and catch real bugs.

When you’re ready for deep-water:

Contract testing for microservices at scale
Chaos engineering for distributed systems
Advanced test data management strategies
Testing in production with feature flags and canary releases
Test organization for monorepos and large teams

Related topics:

Security Testing - Test for vulnerabilities (SAST, DAST, penetration testing)
CI/CD Pipelines - Automate testing in deployment pipeline
Refactoring - Tests enable confident refactoring

The Bottom Line

Good tests prevent bugs, enable refactoring, and make deployments boring.

Mock at architectural boundaries. Test behavior, not implementation. Use property-based testing for invariants. Run fast tests frequently, slow tests less often.

Focus on test effectiveness (do they catch bugs?) not test coverage (what lines ran?).

Tests are production code. Treat them with the same care.
