DevOps Beyond Automation: What Compounds in a 15-Year Engineering Career

DevOps is not a job title and never was. It is a thesis about how to build systems that can change without breaking. Fifteen years in, here is what actually compounded, what automation never solved, and what agent mode is now revealing about the practice.

DevOps was never a job title. It was a thesis about how to build systems that can change without breaking. I have spent the last fifteen years inside that thesis as a practitioner, then a lead, then someone running platform organizations. This page is the cornerstone for everything else on this site about DevOps, platform engineering, and the operational philosophy that runs underneath it.

The short version: the work I do today looks very little like the job description that was being written when the word "DevOps" entered the industry vocabulary around 2009. The automation we built then is now a commodity. The pipelines, the dashboards, the configuration management, the container orchestration: all of that became table stakes. What stayed valuable, and what is suddenly more valuable than it has ever been, is something the original DevOps movement gestured at but never quite named.

It is systems thinking. The ability to see how the parts move together, where the feedback loops live, what fails first under load, and what the operational tax of a design decision actually is over five years rather than five sprints.

That is the thing that compounds. Not the toolchain. Not the certifications. Not the buzzwords. The capacity to look at a running system and understand it as a system, not as a stack of tools.

The original DevOps thesis, and what survived

The original DevOps argument was simple and correct: developers and operators were optimizing against each other, the friction between them was the largest single source of incidents, and the cure was to put them in the same loop. Deploy small changes often. Automate the path from commit to production. Measure what matters. Share responsibility for the running system, not just the code that produced it.

Every word of that is still true. What changed is that almost every part of it was commoditized into tools you can buy off the shelf: GitHub Actions, GitLab CI, Argo, Terraform, Pulumi, Datadog, Grafana, the entire CNCF landscape. None of these existed in their current form when the original DevOps argument was made. All of them now exist, all of them are good, and most of them are interchangeable within their category.

What this means in practice: the part of the DevOps job that was about configuring tools has been steadily eaten by the tools getting better. The part that was about understanding the system the tools are operating on has gotten correspondingly more important. Engineers who built their identity around the tools are now interchangeable with each other and with the tools. Engineers who built their identity around the systems are now the bottleneck.

This is the first thing that compounded. Not the tool knowledge. The system literacy.

What automation never solved

Automation is a real thing and it does real work. Every line of Terraform I have written has saved someone a real hour. But there is a category of failure that automation does not touch, and that category turned out to be the one that matters at scale.

Three patterns I have watched repeatedly:

Operational debt that no script can pay down. A system that requires fifteen humans to interpret seven dashboards to decide whether to deploy on Friday is not a system that needs more automation. It is a system that needs a different shape. Automation built on top of a confused architecture amplifies the confusion. The runbook that takes nine pages is the symptom; the architecture that produced it is the disease. I have written the runbook many times. The work that actually moved the incident rate down was the work that made the runbook unnecessary.

Tribal knowledge as the real load-bearing structure. Every production system I have inherited had a small number of people who knew why a specific config setting existed. When those people left, the setting became a mystery, and the mystery became an incident six months later. Automation does not capture the why. Documentation captures it only when someone enforces the discipline of writing it. The compounding skill is not "writes automation"; it is "captures the reasoning behind decisions in a form the next person can use."
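One way to make that discipline enforceable, offered as a minimal sketch rather than a prescription: store each setting together with its rationale, and have a check report any setting whose reasoning was never written down. The `Setting` type, the `missing_rationale` helper, and the example settings below are all hypothetical, invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Setting:
    """A config value stored together with the reasoning behind it."""
    name: str
    value: str
    rationale: str = ""   # why this value exists, in plain language
    decided_by: str = ""  # who to ask while they are still around


def missing_rationale(settings: list[Setting]) -> list[str]:
    """Return the names of settings whose reasoning was never written down."""
    return [s.name for s in settings if not s.rationale.strip()]


# Hypothetical settings, invented for illustration only.
SETTINGS = [
    Setting(
        name="db_pool_max",
        value="40",
        rationale="Above 40 the database hit its connection limit in a failover test.",
        decided_by="platform-team",
    ),
    Setting(name="http_timeout_seconds", value="7"),
]

if __name__ == "__main__":
    for name in missing_rationale(SETTINGS):
        print(f"{name}: no recorded rationale")
```

Run in CI, a check like this would fail the build when a new setting arrives without a reason attached. It cannot judge whether the reason is a good one, only that someone wrote it down while they still remembered.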

The cost of change versus the cost of stasis. Most organizations are exquisitely good at calculating the cost of doing something. They are catastrophically bad at calculating the cost of not doing it. The five-year-old EC2 instance that nobody wants to touch because nobody knows what runs on it is a cost. The fragile deploy process that everyone is afraid to refactor is a cost. The legacy monolith that the team has spent four years not migrating is a cost. Automation does not surface these costs. Senior judgment does.

If your DevOps practice is mostly automating the steps in the runbook, you are doing the easy half of the job. The hard half is questioning whether the runbook needs to exist.

Platform engineering is what DevOps wanted to be

Around 2022 the industry started using the term "platform engineering" for what was happening in mature DevOps organizations. The renaming was useful. It separated the cultural claim of DevOps (developers and operators in one loop) from the technical practice that grew out of it (treating the internal toolchain as a product, with users, requirements, SLAs, and a roadmap).

The platform-engineering frame clarifies the work in a way that the DevOps frame never quite did. The platform is a product. The application teams are the customers. The platform team's job is to make the next line of application code easier and safer to ship than the previous one. That is the entire mandate.

This frame has consequences. It means:

  • The platform team owns the developer experience as a measurable thing
  • Internal tooling has a cost of ownership, the same as external tooling
  • Platform features that nobody uses are platform technical debt
  • A platform that does not get adopted is failing, regardless of how technically clean it is
  • The platform team's success is not measured by how much they ship; it is measured by how much the application teams ship because the platform exists
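The adoption points in that list can be made measurable. As a rough sketch, assuming the platform team can count how many application teams use each feature, the following flags features whose adoption falls below a chosen threshold. The function names, the 10% threshold, and the usage numbers are hypothetical.

```python
from __future__ import annotations


def adoption_rate(teams_using: int, teams_total: int) -> float:
    """Fraction of application teams that use a platform feature."""
    if teams_total <= 0:
        raise ValueError("teams_total must be positive")
    return teams_using / teams_total


def unused_features(
    usage: dict[str, int], teams_total: int, threshold: float = 0.1
) -> list[str]:
    """Names of features whose adoption is below the threshold, sorted."""
    return sorted(
        name
        for name, teams in usage.items()
        if adoption_rate(teams, teams_total) < threshold
    )


# Hypothetical usage counts: feature name -> number of teams using it.
USAGE = {"golden-path-deploy": 18, "self-service-db": 9, "custom-dsl": 1}

if __name__ == "__main__":
    print(unused_features(USAGE, teams_total=20))
```

A list like this is a starting point for a conversation, not a verdict. A feature with one user may be critical to that user, which is where the judgment comes back in.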

Engineers who grew up in DevOps already knew this implicitly. Articulating it as platform engineering made it teachable. It also made it staffable: companies could now write job descriptions for what mature DevOps practitioners had been doing for years, and pay accordingly.

This is the second thing that compounded: the ability to treat infrastructure as a product, not as a cost center.

What agent mode is now doing to the practice

The arrival of AI agents inside the engineering workflow, which I have written about elsewhere as the shift in software engineering since 2025, is doing two things to DevOps specifically.

The first is obvious: a lot of the routine platform work that used to take a junior engineer a week now takes a senior engineer a morning with agent assistance. Writing the Terraform module, scaffolding the new service, generating the CI workflow, drafting the runbook: all of that compresses dramatically. The economics of the platform team shift from "how many junior engineers do we need to do the rote work" to "how many senior engineers do we need to direct the rote work that the agents now do." The ratio of junior to senior engineers inverts.

The second is less obvious and more interesting: the agents are very good at the steps and very bad at whether the steps are the right steps. An agent will happily build you a beautifully clean CI pipeline that solves the wrong problem. It will happily write a Terraform module that provisions infrastructure your team will not be able to operate. It will happily refactor your deploy script in a way that breaks a tribal-knowledge invariant nobody remembered to write down.

This is not a flaw in the agents. It is a clarification of where the actual engineering work was always located. The agents have made it impossible to fake the systems-thinking part of the job. You either bring real judgment about how the pieces fit together, or you generate a lot of impressive-looking work that quietly increases the operational debt.

The DevOps engineers I see thriving in 2026 are the ones who already had strong systems instincts. They are now operating at multiples of their previous output. The ones who were getting by on tool fluency are exposed. The agents do tool fluency for free.

The skills I see compounding in a DevOps career right now

Fifteen years in, here is what I would tell someone starting today, in priority order:

  1. System literacy. Pick three running systems and learn them in depth. Not "what tools they use," but "how the request flows, where the state lives, what fails first when it fails, what the operational cost actually is." Most engineers never do this and it shows.

  2. Production instincts. Spend time on call. Read incident reports from other companies. Build the muscle that notices when a design decision is going to hurt at 3am. This is a pattern-recognition skill that takes years to develop and is not optional.

  3. Architecture judgment that operates at speed. With agents now generating code in minutes, the architecture decisions you make at the start are even more consequential. Bad architecture used to surface over weeks of implementation. Now it surfaces over hours of generation, and the cleanup cost is the same.

  4. Writing. Every senior platform engineer I respect writes well. Design docs, ADRs, postmortems, runbooks, internal proposals. The job is increasingly about persuading other engineers and stakeholders that a specific change is worth making. The ones who cannot write at length cannot lead at scale.

  5. Restraint. The most expensive failure mode in modern platform work is doing too much. Building features nobody asked for. Adopting tools nobody needed. Refactoring systems that were working. The agents make this failure mode easier to fall into. The compounding skill is knowing when not to do something.

  6. Cross-functional fluency. Security, finance, compliance, product. The platform sits at the intersection of all of these. A platform engineer who can speak to a finance partner about cloud spend, to a security partner about IAM posture, and to a product partner about deployment cadence is a different category of useful than one who can only speak to other engineers.

  7. Calibrated optimism. This work is hard. The systems are old, the constraints are real, the budget is finite, the team is tired. The engineers who compound are the ones who keep believing the next version can be better, while staying honest about why the current version is the way it is.

The operational philosophy that holds it all together

Everything above is downstream of one operating assumption: production systems are living things. They have a state, a history, a set of invariants you cannot see, and a future shape that the current decisions are bending toward.

You can treat them as a stack of tools to configure, or you can treat them as a system to understand. Both approaches will keep the site up most of the time. Only one of them compounds over a fifteen-year career.

I have written this site's other engineering essays from inside that operational philosophy. The connections are intentional:

  • The AI Development Revolution series documents the shift from typing code to directing agents inside this same practice.
  • The AgentSpek book is the long-form treatment of how to maintain engineering discipline when the agents are doing the typing.
  • What Is People of the Stars? extends the operational thinking into the timing of decisions themselves, using astronomical cycles as a non-human clock.
  • Decanal journaling is the personal-systems version of the same instinct: instrument the practice, observe the patterns, change the inputs.

These are not separate topics. They are the same operational worldview applied at different scales: production systems, engineering practice, personal practice, and the long-arc decisions about where to spend a career.

DevOps is the part of that worldview that I have been paid to do for the longest. The honest answer to "what does DevOps mean in 2026" is that the tool stack changed almost completely and the actual work hardly changed at all. The job was always about building systems that change without breaking. It still is. The leverage just got considerably higher.
