Runtime Platform Patterns
Use this guide to classify client runtime platforms before choosing patterns. The goal is not to rank stacks but to match delivery, security, observability, and operations practices to the platform that actually runs production.
Platform families
Orchestrated containers
Examples: Kubernetes, OpenShift, and managed Kubernetes services.
Best fit:
- Many services with shared networking, policy, and deployment needs.
- Teams that need workload portability or custom platform primitives.
- Organizations with platform engineering capacity.
Best practices:
- Use namespaces, labels, and ownership metadata consistently.
- Standardize deployment through Helm, Kustomize, or a paved-road abstraction.
- Use GitOps with Argo CD or Flux for cluster state.
- Enforce admission policy with Kyverno, OPA Gatekeeper, or native validating policies.
- Require resource requests, probes, pod disruption budgets, and autoscaling defaults; set resource limits where appropriate.
- Use workload identity instead of static cloud credentials.
- Centralize ingress, certificates, DNS, secrets, metrics, logs, and traces as platform services.
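As a rough illustration of what an admission-style check for ownership metadata, resource requests, and probes might look like, here is a minimal Python sketch that inspects a Deployment-shaped dictionary. In practice this belongs in Kyverno, OPA Gatekeeper, or a validating policy; the function name `check_deployment` and the `team` label key are assumptions made for this example, not part of any standard.

```python
from typing import Any

# Assumed label keys: "app.kubernetes.io/name" is a commonly recommended label,
# "team" is a hypothetical ownership label chosen for this sketch.
REQUIRED_LABELS = ("app.kubernetes.io/name", "team")


def check_deployment(manifest: dict[str, Any]) -> list[str]:
    """Return a list of policy violations for a Deployment-shaped dict."""
    problems: list[str] = []

    # Ownership metadata lives under metadata.labels.
    labels = manifest.get("metadata", {}).get("labels", {})
    for key in REQUIRED_LABELS:
        if key not in labels:
            problems.append(f"missing label: {key}")

    # Containers live under spec.template.spec.containers.
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        requests = container.get("resources", {}).get("requests", {})
        for resource in ("cpu", "memory"):
            if resource not in requests:
                problems.append(f"{name}: missing {resource} request")
        for probe in ("readinessProbe", "livenessProbe"):
            if probe not in container:
                problems.append(f"{name}: missing {probe}")
    return problems
```

A check like this could run in CI against rendered Helm or Kustomize output, so that violations surface before the cluster's admission controller rejects them.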
Watchouts:
- Do not adopt Kubernetes just to run a small number of simple apps.
- Avoid per-team bespoke charts and cluster-specific snowflakes.
- Budget for upgrades, policy changes, capacity, and incident ownership.
Managed containers
Examples: Amazon ECS with Fargate, Google Cloud Run, and Azure Container Apps.
Best fit:
- Containerized services without a need to operate Kubernetes.
- Teams that want simpler deployment and fewer control-plane concerns.
- HTTP services, workers, and scheduled jobs with clear boundaries.
Best practices:
- Use immutable images and environment-specific task or service definitions.
- Keep networking, IAM, and secrets in infrastructure-as-code.
- Prefer service-to-service identity over shared credentials.
- Use blue-green, canary, or rolling deployments with automatic rollback signals.
- Emit structured logs, metrics, and traces using OpenTelemetry where possible.
- Define CPU, memory, concurrency, scaling, and health check defaults.
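One way to picture an automatic rollback signal for a canary deployment is a comparison of error rates between the baseline and the canary over the same time window. The sketch below assumes that request and error counts are already available from your metrics system; the names `WindowStats` and `should_roll_back` and both threshold defaults are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowStats:
    """Request counts observed for one deployment over the same time window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Treat an empty window as a zero error rate to avoid dividing by zero.
        return self.errors / self.requests if self.requests else 0.0


def should_roll_back(
    baseline: WindowStats,
    canary: WindowStats,
    min_requests: int = 100,          # assumed: below this the sample is too small to judge
    max_rate_increase: float = 0.02,  # assumed: tolerated absolute error-rate increase
) -> bool:
    """Return True when the canary's error rate exceeds the baseline's by more than the allowance."""
    if canary.requests < min_requests:
        return False
    return canary.error_rate - baseline.error_rate > max_rate_increase
```

Real rollback signals usually also consider latency and saturation. Most managed container platforms can trigger rollback from alarms or health checks, so a function like this would typically feed such an alarm rather than replace it.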
Watchouts:
- Avoid hiding platform drift in console-managed service definitions.
- Treat task roles and execution roles as separate security boundaries.
- Check cold start, startup time, and image pull behavior for bursty workloads.
Serverless and FaaS
Examples: AWS Lambda, Azure Functions, and Google Cloud Functions.
Best fit:
- Event-driven workflows, glue code, async jobs, and bursty workloads.
- Small services with low operational overhead requirements.
- Integrations that benefit from managed triggers and scaling.
Best practices:
- Model functions around events and business capabilities, not grab-bags of unrelated utility code.
- Keep handlers thin and move domain logic into testable modules.
- Use infrastructure-as-code for triggers, permissions, queues, and retry behavior.
- Define timeout, memory, concurrency, dead-letter, and idempotency policies explicitly.
- Use structured logs with correlation IDs across events.
- Prefer queues and event buses over synchronous function chains.
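The practices above (thin handlers, explicit idempotency, and structured logs with correlation IDs) can be sketched together in a few lines of Python. The event shape (`id`, `correlation_id`, `payload`) and the name `make_handler` are assumptions for this example, and the in-memory set stands in for a durable idempotency store.

```python
import json
import uuid
from typing import Any, Callable


def make_handler(
    process: Callable[[dict[str, Any]], Any],
    seen_ids: set[str],
) -> Callable[[dict[str, Any]], dict[str, Any]]:
    """Wrap domain logic in a thin handler with an idempotency check and structured logs.

    `seen_ids` stands in for a durable store (for example a table written with a
    conditional put); an in-memory set is only good enough for this sketch.
    """

    def handler(event: dict[str, Any]) -> dict[str, Any]:
        event_id = event["id"]
        # Reuse the caller's correlation ID if present so logs join up across events.
        correlation_id = event.get("correlation_id") or str(uuid.uuid4())
        log = {"event_id": event_id, "correlation_id": correlation_id}

        if event_id in seen_ids:
            print(json.dumps({**log, "status": "duplicate_skipped"}))
            return {"status": "duplicate", "correlation_id": correlation_id}

        result = process(event["payload"])  # domain logic lives outside the handler
        seen_ids.add(event_id)  # mark only after success so a failed attempt can retry
        print(json.dumps({**log, "status": "processed"}))
        return {"status": "ok", "result": result, "correlation_id": correlation_id}

    return handler
```

Because `process` is passed in, the domain logic can be unit tested without any function runtime, and the handler itself stays small enough to review at a glance.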
Watchouts:
- Watch package size, cold starts, regional limits, and concurrency spikes.
- Avoid unbounded retries that amplify incidents or duplicate writes.
- Keep local and integration testing realistic; mocks alone are not enough.
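To make the warning about unbounded retries concrete, here is a minimal sketch of a bounded retry loop with exponential backoff and a dead-letter hand-off. Managed platforms usually offer this as configuration, so treat the code as an illustration of the policy rather than something to deploy; `deliver_with_retries` and its parameters are hypothetical names.

```python
import time
from typing import Any, Callable


def deliver_with_retries(
    message: Any,
    handle: Callable[[Any], None],
    dead_letter: Callable[[Any, Exception], None],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try `handle` up to `max_attempts` times, then pass the message to `dead_letter`."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            handle(message)
            return True
        except Exception as exc:  # a real system would retry only errors known to be transient
            if attempt == max_attempts:
                dead_letter(message, exc)  # stop retrying and park the message for inspection
                return False
            sleep(base_delay_s * 2 ** (attempt - 1))  # exponential backoff between attempts
    return False  # unreachable, kept so every path returns a bool
```

Injecting `sleep` keeps the function testable without real delays. Pairing a bound like this with an idempotent handler is what keeps a retried message from producing duplicate writes.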
Application PaaS
Examples: Azure App Service, AWS Elastic Beanstalk, Heroku, and similar platforms.
Best fit:
- Standard web apps and APIs with conventional runtime needs.
- Teams that want simple deployment over platform flexibility.
- Internal apps where managed operations are more valuable than deep customization.
Best practices:
- Treat platform configuration as code where the provider allows it.
- Use deployment slots, health checks, and rollback-friendly releases.
- Externalize secrets into managed secret stores.
- Standardize runtime versions, buildpacks, container images, and environment variables.
- Capture logs, metrics, dependency health, and request traces.
Watchouts:
- Understand scaling limits, filesystem behavior, and networking constraints.
- Avoid manual console changes that are invisible to review and audit.
- Plan an exit path before platform-specific features become core architecture.
VM and legacy compute
Examples: autoscaling groups, long-lived Linux servers, IIS, Windows services, and vendor-managed appliances.
Best fit:
- Commercial software, legacy apps, stateful workloads, or specialized runtime constraints.
- Systems that cannot be containerized safely yet.
- Transitional platforms during modernization.
Best practices:
- Manage images with golden image pipelines or configuration management.
- Use immutable replacement where feasible instead of in-place mutation.
- Keep patching, backup, restore, endpoint protection, and access policies explicit.
- Centralize logs, metrics, traces, and host health checks.
- Put deployment, rollback, and break-glass steps in runbooks.
- Reduce snowflakes before attempting migration.
Watchouts:
- Long-lived hosts accumulate hidden state and security drift.
- SSH or RDP-driven operations are hard to audit and repeat.
- Start migration with observability and deployment safety, not with runtime replacement alone.
Data and event platforms
Examples: Kafka, Kinesis, managed queues, batch jobs, warehouses, and lakehouses.
Best fit:
- Event streaming, analytics pipelines, async integration, and scheduled processing.
- Systems where data contracts matter as much as service deployment.
- Workloads with replay, ordering, retention, or backpressure needs.
Best practices:
- Define ownership for topics, schemas, jobs, tables, and dashboards.
- Version event schemas and validate producers and consumers in CI.
- Monitor lag, throughput, error rates, dead-letter queues, and cost.
- Treat data retention, deletion, encryption, and access as design requirements.
- Separate orchestration concerns from business transformation logic.
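Validating schema versions in CI can be as simple as diffing the proposed schema against the published one and failing the build on breaking changes. The sketch below uses a deliberately simplified schema model (field name mapped to a type and a required flag); real formats such as Avro, Protobuf, or JSON Schema have richer compatibility rules, and the names `Field`, `Schema`, and `breaking_changes` are assumptions made for this example.

```python
from typing import TypedDict


class Field(TypedDict):
    type: str
    required: bool


# Simplified schema model assumed for this sketch: field name -> field description.
Schema = dict[str, Field]


def breaking_changes(old: Schema, new: Schema) -> list[str]:
    """List changes in `new` that could break producers or consumers built against `old`."""
    problems: list[str] = []
    for name, field in old.items():
        if name not in new:
            problems.append(f"removed field: {name}")
        elif new[name]["type"] != field["type"]:
            problems.append(f"type changed: {name} {field['type']} -> {new[name]['type']}")
    for name, field in new.items():
        # Adding an optional field is treated as safe; adding a required one is not.
        if name not in old and field["required"]:
            problems.append(f"new required field: {name}")
    return problems
```

A CI job could call this with the schema on the main branch as `old` and the schema in the pull request as `new`, failing when the returned list is not empty.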
Watchouts:
- Data platforms often fail through silent quality drift, not crashes.
- Poorly owned shared topics become integration junk drawers.
- Backfills and replays need production-grade controls.
Selection heuristics
Use the smallest platform that satisfies the workload's operational and security requirements.
- Choose Kubernetes when shared platform primitives, policy, and custom orchestration justify the operating cost.
- Choose managed containers when teams need containers without cluster operations.
- Choose serverless when events, bursty scale, and managed triggers are the natural shape of the workload.
- Choose PaaS when app teams need a paved road more than infrastructure control.
- Keep VMs when modernization risk exceeds the benefit, but improve deployment, patching, and observability immediately.
- Treat data platforms as product surfaces with contracts, ownership, and lifecycle management.
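The heuristics above can be read as an ordered decision procedure. The following Python sketch encodes one possible ordering. The `Workload` fields are hypothetical yes/no answers a team might record during classification, and the ordering itself is an assumption rather than a rule from this guide. Data and event platforms are left out because they cut across the compute choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    # Hypothetical yes/no answers recorded while classifying a workload.
    modernization_risk_too_high: bool = False
    needs_shared_platform_primitives: bool = False
    has_platform_team: bool = False
    event_driven_and_bursty: bool = False
    conventional_web_app: bool = False


def suggest_platform(w: Workload) -> str:
    """Walk the heuristics in order and return the first platform family that fits."""
    if w.modernization_risk_too_high:
        return "VM and legacy compute"
    # Kubernetes is suggested only when the need and the operating capacity both exist.
    if w.needs_shared_platform_primitives and w.has_platform_team:
        return "Orchestrated containers"
    if w.event_driven_and_bursty:
        return "Serverless and FaaS"
    if w.conventional_web_app:
        return "Application PaaS"
    # Default to containers without cluster operations.
    return "Managed containers"
```

For example, `suggest_platform(Workload(event_driven_and_bursty=True))` returns `"Serverless and FaaS"`. The output is a starting point for discussion, not a substitute for reviewing the workload's operational and security requirements.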