Ultimate DevOps & Automation Maturity Assessment: 24 Key Practices Checklist
As a seasoned platform engineering leader with StoneTusker Systems, I've refined this assessment over dozens of 90-day transformations in regulated industries such as healthcare and fintech. Drawing on DORA metrics and SRE principles, these 24 questions map directly to elite performance: elite teams deploy multiple times per day with change failure rates under 15%.
Score honestly: Not doing (0) to Visionary (5). Total /120. Use the breakdowns to build your roadmap. This is your mirror for delivery excellence.
Score Benchmarks (DORA-Aligned)
- 0-30: Low – lead time (LT) >6 months, change failure rate (CFR) >46%.
- 31-60: Medium – LT 1 week-1 month, CFR 16-30%.
- 61-90: High – LT 1 day, CFR 0-15%.
- 91-110: Elite – Multiple daily deploys, MTTR <1hr.
- 111+: Visionary – Platform-led, AI-optimized.
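The bands above can be applied mechanically once you total your 24 answers. A minimal Python sketch (band boundaries taken directly from the list above; the function name is illustrative):

```python
def maturity_band(total: int) -> str:
    """Map a total assessment score (0-120) to its DORA-aligned band."""
    if not 0 <= total <= 120:
        raise ValueError("total must be between 0 and 120")
    if total <= 30:
        return "Low"
    if total <= 60:
        return "Medium"
    if total <= 90:
        return "High"
    if total <= 110:
        return "Elite"
    return "Visionary"

# Example: 24 questions averaging Advanced (3) each -> 72, the High band.
print(maturity_band(24 * 3))  # prints "High"
```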
Integration Practices
Tool sprawl kills velocity. Elite teams standardize pipelines for end-to-end flow, tying CI/CD to IaC and security via APIs and events; this is core to DORA's high deployment frequency.
Q1: Are CI/CD tools standardized and integrated across teams to enable end-to-end automation?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Team-specific tools (Jenkins per repo). | >10 pipeline variants; manual orchestration. |
| Novice (1) | Single tool adopted, config varies. | Pipeline library exists; 40% standardization. |
| Intermediate (2) | Shared templates with overrides. | 80% repos use golden pipelines. |
| Advanced (3) | Platform-managed pipelines-as-code. | Self-service triggers; drift detection. |
| Expert (4) | Event-driven cross-tool orchestration. | Zero-touch end-to-end; SLAs met. |
| Visionary (5) | Adaptive pipelines via ML/agents. | New services onboard in <1hr. |
Benchmark: Elite teams deploy multiple times/day via unified pipelines.
Q2: Do application, infrastructure, and security tools integrate using well-defined APIs or events?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Email/manual handoffs. | No integrations. |
| Novice (1) | Basic webhooks. | 5+ one-way flows. |
| Intermediate (2) | API-driven (Terraform + Snyk). | 15+ bidirectional. |
| Advanced (3) | Event buses (EventBridge/Kafka). | OpenTelemetry unified. |
| Expert (4) | Contract-tested integrations. | 99% uptime on flows. |
| Visionary (5) | Abstraction layers/plugins. | Tool swaps without pipeline changes. |
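To make the event-driven pattern at Advanced and above concrete, here is a minimal Python sketch of a dispatcher that routes tool events to downstream actions. The event names, payload fields, and handlers are hypothetical, not any specific product's API:

```python
from typing import Callable

# Registry of hypothetical event types emitted by CI, IaC, and security tools.
HANDLERS: dict[str, Callable[[dict], str]] = {}

def on(event_type: str):
    """Register a handler for one event type (decorator)."""
    def register(fn):
        HANDLERS[event_type] = fn
        return fn
    return register

@on("ci.build.passed")
def trigger_security_scan(event: dict) -> str:
    return f"scan queued for {event['artifact']}"

@on("scan.finding.critical")
def block_deploy(event: dict) -> str:
    return f"deploy blocked: {event['cve']}"

def dispatch(event: dict) -> str:
    handler = HANDLERS.get(event["type"])
    if handler is None:
        return "ignored"  # unknown events are dropped, not errors
    return handler(event)

print(dispatch({"type": "ci.build.passed", "artifact": "api:1.4.2"}))
# prints "scan queued for api:1.4.2"
```

In practice the registry lives behind an event bus (EventBridge, Kafka); the point is that tools publish and subscribe without calling each other directly, which is what makes tool swaps possible without pipeline changes.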
Testing Practices
Comprehensive, continuous testing is non-negotiable for low change failure rates (CFR <15%). Shift-left testing prevents defects from reaching production.
Q3: Is automated testing implemented across unit, integration, performance, and security layers?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Manual QA only. | No automation. |
| Novice (1) | Unit tests (~30% coverage). | Basic frameworks. |
| Intermediate (2) | Full pyramid (E2E/load). | >80% coverage enforced. |
| Advanced (3) | Chaos and security tests. | Mutation coverage >90%. |
| Expert (4) | AI-generated/property-based. | Test flakiness <1%. |
| Visionary (5) | Self-healing suites. | Tests as canaries. |
Q4: Are tests executed early and continuously to prevent defects from reaching later environments?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | End-of-cycle testing. | Defects in prod. |
| Novice (1) | Pre-merge unit. | 10min feedback. |
| Intermediate (2) | Full suite per PR. | Gates block bad code. |
| Advanced (3) | Sharded suites, <5min total. | 0 staging escapes. |
| Expert (4) | Prod-like canaries early. | Prioritized dynamically. |
| Visionary (5) | ML-prioritized. | Defect prevention >95%. |
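Sub-five-minute feedback at Advanced level usually comes from sharding the suite across parallel CI runners. A minimal sketch of deterministic sharding by stable hash, so every runner computes the same partition without coordination (file names are illustrative):

```python
import zlib

def shard_for(test_file: str, num_shards: int) -> int:
    """Assign a test file to a shard deterministically via CRC32."""
    return zlib.crc32(test_file.encode()) % num_shards

def my_shard(test_files: list[str], shard: int, num_shards: int) -> list[str]:
    """Files this runner should execute."""
    return [f for f in test_files if shard_for(f, num_shards) == shard]

files = [f"test_{name}.py" for name in ("auth", "billing", "orders", "search")]
# The union of all shards always equals the full suite, with no overlap.
all_assigned = [f for s in range(3) for f in my_shard(files, s, 3)]
print(sorted(all_assigned) == sorted(files))  # prints True
```

Hash-based sharding is the simplest starting point; mature setups shard by measured test duration instead, so the slowest shard bounds total wall-clock time.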
Culture Practices
Collaboration drives DORA elite status; cross-functional teams own outcomes end-to-end.
Q5: Do development, operations, security, and business teams collaborate effectively?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Siloed handoffs. | Ticket ping-pong. |
| Novice (1) | Sync meetings. | Shared channels. |
| Intermediate (2) | Joint on-call/rituals. | Shared KPIs. |
| Advanced (3) | Embedded platform teams. | Blameless culture. |
| Expert (4) | Full-stack ownership. | Safety surveys >4/5. |
| Visionary (5) | Autonomous squads. | Culture in onboarding. |
Q6: Is ownership of reliability, security, and cost clearly defined within product or platform teams?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Centralized ops/sec. | Ad-hoc assignments. |
| Novice (1) | RACI docs. | Incident ownership. |
| Intermediate (2) | Team charters/SLOs. | Cost allocation. |
| Advanced (3) | You-build-you-run. | Champions embedded. |
| Expert (4) | Ownership-as-code. | Budgets enforce. |
| Visionary (5) | Dynamic delegation. | Platform guarantees it. |
Infrastructure Practices
IaC and reproducibility are table stakes for elite lead times under one day.
Q7: Is infrastructure provisioned and managed using Infrastructure as Code?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Console/CLI manual. | No versioning. |
| Novice (1) | Basic Terraform. | Prod-only. |
| Intermediate (2) | Full lifecycle IaC. | Policy scanning. |
| Advanced (3) | Modular/multi-cloud. | Auto-drift fix. |
| Expert (4) | GitOps platforms. | Composable APIs. |
| Visionary (5) | AI-generated IaC. | Self-service catalogs. |
Q8: Are environments consistent and reproducible without manual configuration drift?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Config drift rampant. | "Works locally." |
| Novice (1) | Shared Docker images. | Weekly scans. |
| Intermediate (2) | Immutable + values.yaml. | Promotion pipelines. |
| Advanced (3) | Golden paths enforced. | Chaos parity tests. |
| Expert (4) | Ephemeral everything. | Spin-up <5min. |
| Visionary (5) | No persistent envs. | Reproducible by hash. |
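Drift detection, called out at the Advanced level above, is at its core a diff between declared and observed state. A minimal sketch, assuming both are available as flat dicts (the keys below are illustrative):

```python
def detect_drift(desired: dict, actual: dict) -> dict:
    """Return per-key drift: missing, unexpected, or changed values."""
    return {
        "missing": sorted(desired.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - desired.keys()),
        "changed": sorted(k for k in desired.keys() & actual.keys()
                          if desired[k] != actual[k]),
    }

desired = {"instance_type": "m5.large", "min_nodes": 3, "logging": "on"}
actual = {"instance_type": "m5.xlarge", "min_nodes": 3, "debug": True}
print(detect_drift(desired, actual))
```

Real tools (e.g. `terraform plan`) do this against nested provider schemas, but the contract is the same: non-empty drift either blocks promotion or triggers auto-remediation.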
Leadership Practices
Leaders who tie compensation to DORA metrics improve roughly twice as fast.
Q9: Does leadership actively sponsor DevOps, SRE, and DevSecOps initiatives?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Buzzword only. | No budget. |
| Novice (1) | Training/POCs. | One-off funding. |
| Intermediate (2) | OKRs + headcount. | Quarterly reviews. |
| Advanced (3) | Comp incentives. | Ringfenced budget. |
| Expert (4) | Board dashboards. | Risk-based targets. |
| Visionary (5) | C-suite ownership. | Public benchmarks. |
Q10: Are delivery, reliability, and operational metrics used in leadership decision making?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Features only. | No eng metrics. |
| Novice (1) | Uptime reports. | Monthly shares. |
| Intermediate (2) | DORA tracked. | Team percentiles. |
| Advanced (3) | SLOs in biz reviews. | Budget gates. |
| Expert (4) | MLT benchmarks. | Cost/deploy optimized. |
| Visionary (5) | Predictive models. | Drives strategy. |
SRE Practices
SLOs and error budgets enable safe velocity, a hallmark of elite performers.
Q11: Are SLIs, SLOs, and error budgets clearly defined and actively used?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Vague 99.9%. | No enforcement. |
| Novice (1) | Basic SLOs. | Alerts only. |
| Intermediate (2) | Service SLOs/SLIs. | Quarterly budgets. |
| Advanced (3) | Cascading + burn. | Deploy gates. |
| Expert (4) | Multi-tenant SLOs. | Comp linked. |
| Visionary (5) | Dynamic/ML-tuned. | - |
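The error-budget math behind these levels is simple enough to sketch. A minimal example assuming an availability SLO over a 30-day window:

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime (minutes) for an availability SLO over the window."""
    return window_days * 24 * 60 * (1 - slo)

def budget_remaining(slo: float, downtime_min: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative = overspent)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_min) / budget

print(round(error_budget_minutes(0.999), 1))    # 43.2 minutes for 99.9%
print(round(budget_remaining(0.999, 21.6), 2))  # 0.5 -> half the budget left
```

Deploy gates at the Advanced level are just a policy on `budget_remaining`: freeze feature deploys when it drops below an agreed floor, and spend the rest on reliability work.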
Q12: Do incidents result in structured reviews that drive long-term improvements?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Blame games. | No follow-up. |
| Novice (1) | RCA templates. | Tracked actions. |
| Intermediate (2) | Blameless postmortems. | MTTR <4h. |
| Advanced (3) | IC role/process. | Auto-prioritize. |
| Expert (4) | Toil reduction SLA. | PMR to PRs. |
| Visionary (5) | AI-assisted reviews. | Incidents → features. |
Deployment Practices
Low-risk deploys mean high frequency and low CFR.
Q13: Are application and infrastructure deployments automated and low risk?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Change boards. | Weekly releases. |
| Novice (1) | Scripted staging. | Manual rollback. |
| Intermediate (2) | Auto prod deploys. | >95% success. |
| Advanced (3) | FF/progressive. | Infra in same pipe. |
| Expert (4) | Dark deploys. | <1min rollbacks. |
| Visionary (5) | Git-triggered only. | CFR <5%. |
Q14: Are advanced deployment strategies such as canary or blue green used consistently?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Big bangs. | Downtime common. |
| Novice (1) | Rolling updates. | Health gates. |
| Intermediate (2) | Canary 50% traffic. | Auto-promote. |
| Advanced (3) | B/G + shadow. | Multi-phase. |
| Expert (4) | Adaptive rollouts. | ML pacing. |
| Visionary (5) | Risk-isolated. | Zero-downtime SLA. |
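A canary decision ultimately reduces to comparing canary health against the baseline. A minimal promote-or-rollback sketch (the thresholds are illustrative defaults, not a recommendation):

```python
def canary_verdict(baseline_error_rate: float, canary_error_rate: float,
                   max_relative_regression: float = 0.10,
                   min_absolute_floor: float = 0.001) -> str:
    """Promote if the canary is no worse than baseline plus a tolerance.
    The absolute floor prevents flapping when both rates are near zero."""
    allowed = max(baseline_error_rate * (1 + max_relative_regression),
                  baseline_error_rate + min_absolute_floor)
    return "promote" if canary_error_rate <= allowed else "rollback"

print(canary_verdict(0.010, 0.0105))  # within 10% tolerance -> "promote"
print(canary_verdict(0.010, 0.030))   # 3x baseline -> "rollback"
```

Progressive delivery tools run this comparison per phase (latency and saturation as well as errors) and auto-promote or roll back without a human in the loop, which is what makes the Intermediate-level auto-promote indicator achievable.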
Innovation Practices
Safe experimentation fuels continuous improvement without stability tradeoffs.
Q15: Can teams safely experiment and innovate without risking stability or compliance?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Prod = lab. | Risks everywhere. |
| Novice (1) | Sandboxes. | Approval workflows. |
| Intermediate (2) | Preview envs. | Compliance gates. |
| Advanced (3) | Multi-tenant previews. | Chaos budgets. |
| Expert (4) | Opt-in alphas. | Isolation native. |
| Visionary (5) | Platform experiments. | AI validation. |
Q16: Are new tools and practices adopted through a structured and scalable process?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Shadow IT. | Sprawl. |
| Novice (1) | Request forms. | Sec review. |
| Intermediate (2) | Tech radar/RFCs. | POC cadence. |
| Advanced (3) | Curated catalogs. | Adoption metrics. |
| Expert (4) | Contribution policy. | Auto-deprecation. |
| Visionary (5) | Platform loop. | Tool-as-service. |
Observability Practices
Proactive detection keeps MTTR under 1 hour for elite teams.
Q17: Do logs, metrics, traces, and user signals provide clear system visibility?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Log grep. | Blind spots. |
| Novice (1) | Basic Prometheus. | 10 dashboards. |
| Intermediate (2) | 3-pillars OTEL. | Golden signals. |
| Advanced (3) | Auto-maps. | Full context. |
| Expert (4) | eBPF/observability mesh. | Custom SLIs. |
| Visionary (5) | AI insights. | Service-level views. |
Q18: Is observability data used proactively to detect issues before users are impacted?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Reactive paging. | Users complain first. |
| Novice (1) | Threshold alerts. | Fatigue high. |
| Intermediate (2) | Runbooks auto. | MTTD <5min. |
| Advanced (3) | Adaptive/noise-free. | Preemptive heals. |
| Expert (4) | Capacity prediction. | Self-healing. |
| Visionary (5) | Digital twins. | Zero user incidents. |
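Proactive detection at the Advanced level is typically implemented as multi-window burn-rate alerts on the error budget, an approach popularized by Google's SRE Workbook. A minimal sketch; the 14.4x threshold is the commonly cited fast-burn default for a 99.9% SLO, but tune it to your own budget policy:

```python
def burn_rate(error_rate: float, slo: float) -> float:
    """How fast the error budget is being spent relative to plan.
    1.0 means the budget is consumed exactly over the SLO window."""
    return error_rate / (1 - slo)

def should_page(long_window_rate: float, short_window_rate: float,
                slo: float, threshold: float = 14.4) -> bool:
    """Page only if BOTH windows burn fast: the long window proves the burn
    is sustained, the short window proves it is still happening now."""
    return (burn_rate(long_window_rate, slo) >= threshold
            and burn_rate(short_window_rate, slo) >= threshold)

# 99.9% SLO: a sustained 2% error rate burns budget about 20x too fast.
print(should_page(0.02, 0.02, slo=0.999))    # prints True
print(should_page(0.02, 0.0001, slo=0.999))  # recovered -> prints False
```

The two-window condition is what kills alert fatigue: transient spikes fail the long window, and already-recovered incidents fail the short one.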
Design Practices
Design-time non-functionals prevent runtime fires.
Q19: Are scalability, resilience, security, and cost considered during system design?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Afterthoughts. | Retrofits common. |
| Novice (1) | Checklists. | Basic modeling. |
| Intermediate (2) | ADRs/decision gates. | GameDays planned. |
| Advanced (3) | Threat/cost models. | NFRs scored. |
| Expert (4) | Chaos in design. | FinOps native. |
| Visionary (5) | Self-optimizing designs. | AI forecasting. |
Q20: Are architectural decisions documented, reviewed, and evolved regularly?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Verbal/tribal. | Knowledge loss. |
| Novice (1) | Post-hoc wikis. | Basic templates. |
| Intermediate (2) | Repo ADRs/RFCs. | Quarterly arch reviews. |
| Advanced (3) | Backstage catalogs. | Evolution tracked. |
| Expert (4) | Impact measurement. | Supersession policy. |
| Visionary (5) | Living diagrams. | AI-evolved arch. |
Security Practices
Automated security means velocity without vulnerability debt.
Q21: Are security controls automated across CI/CD pipelines and runtime environments?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Manual audits. | - |
| Novice (1) | Scan gates. | Secrets basic. |
| Intermediate (2) | SAST/DAST/IaC scanning. | OPA policies. |
| Advanced (3) | Runtime/mTLS mesh. | Zero trust. |
| Expert (4) | SBOM supply chain. | Auto-threat model. |
| Visionary (5) | Confidential compute. | AI vuln hunt. |
Q22: Are vulnerabilities detected, prioritized, and remediated through integrated workflows?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Ignored until breach. | - |
| Novice (1) | Weekly reports. | Criticals only. |
| Intermediate (2) | PR vuln blocks. | Sev SLAs. |
| Advanced (3) | Exploitability scores. | Virtual patches. |
| Expert (4) | Reachability analysis. | Contract SLAs. |
| Visionary (5) | Auto-fix PRs. | Vuln in budgets. |
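Severity SLAs become enforceable once every finding carries a severity and an exploitability signal. A minimal prioritization sketch; the SLA windows below are illustrative, not a compliance recommendation:

```python
from datetime import datetime, timedelta, timezone

# Illustrative remediation SLAs per severity (days to fix from detection).
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90, "low": 180}

def remediation_deadline(severity: str, detected: datetime,
                         exploit_available: bool = False) -> datetime:
    """Deadline by which the finding must be remediated; known-exploited
    vulnerabilities get half the window (minimum one day)."""
    days = SLA_DAYS[severity]
    if exploit_available:
        days = max(1, days // 2)
    return detected + timedelta(days=days)

found = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(remediation_deadline("critical", found, exploit_available=True).date())
# prints 2024-03-04
```

Wiring this into the workflow means the deadline lands in the owning team's backlog automatically, and overdue findings block the next release rather than accumulating in a report.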
Cost Optimization Practices
Embedded FinOps yields 30-50% savings without performance hits.
Q23: Is cloud cost visibility transparent and accessible to engineering teams?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Finance black box. | - |
| Novice (1) | Monthly reports. | Tagging policy. |
| Intermediate (2) | Team dashboards. | Showback allocation. |
| Advanced (3) | RT alerts/FinOps. | Monthly rituals. |
| Expert (4) | Auto RI/spot. | Cost SLOs. |
| Visionary (5) | Cost-aware schedulers. | ROI/feature. |
Q24: Are cost optimization and FinOps practices embedded into engineering workflows?
| Level | Characteristics | Key Indicators |
|---|---|---|
| Not doing (0) | Ops-only concern. | - |
| Novice (1) | Resource warnings. | Rightsizing basic. |
| Intermediate (2) | Pipeline gates. | Optimization team. |
| Advanced (3) | Intelligent scaling. | Cost SLOs. |
| Expert (4) | CoE/chargeback. | 30%+ savings. |
| Visionary (5) | Self-optimizing. | Cost = reliability. |
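Showback, the Intermediate-level indicator in Q23, starts with aggregating tagged spend per owning team. A minimal sketch; the tag keys and billing line items are hypothetical:

```python
from collections import defaultdict

def showback(line_items: list[dict],
             untagged_bucket: str = "unallocated") -> dict:
    """Aggregate cost per 'team' tag; untagged spend is surfaced in its own
    bucket rather than hidden, so teams have an incentive to fix tagging."""
    totals: dict[str, float] = defaultdict(float)
    for item in line_items:
        team = item.get("tags", {}).get("team", untagged_bucket)
        totals[team] += item["cost"]
    return dict(totals)

bill = [
    {"cost": 120.0, "tags": {"team": "payments"}},
    {"cost": 80.0, "tags": {"team": "search"}},
    {"cost": 45.5, "tags": {}},  # missing team tag
]
print(showback(bill))
# prints {'payments': 120.0, 'search': 80.0, 'unallocated': 45.5}
```

Cloud billing exports provide the real line items; the same aggregation, surfaced as a per-team dashboard with the unallocated bucket visible, is usually enough to drive tagging compliance above 90%.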