Living with Antifragility: How I Build Systems and a Life That Gain from Disorder
Antifragility is not resilience. Resilient things survive disorder; antifragile things gain from it. Here is how I apply Taleb's framework to fifteen years of engineering, the way I build films and books from code, and a ten-day journal cycle run as a stress rhythm.
There is a word missing from English, and once you notice the gap you cannot un-notice it. We have fragile for the things that break under stress. We have robust for the things that resist it. We have no clean word for the third category: the things that get better under stress. Taleb invented one. He called it antifragile, and the book by that name is the closest thing I have to a personal operating manual. I reviewed it in full here; this page is the other half of that review, the applied half, the part where I stop talking about the book and show you where the idea actually lives in how I work and how I live.
This is a cornerstone, so let me state the thesis up front and then spend the rest of the page earning it. I have spent fifteen years building distributed systems for a living. I write books and make films and music out of code. I keep a journal organized in ten-day cycles anchored to fixed stars. These look like unrelated activities. They are not. They are the same instinct applied at different scales, and the instinct is antifragility: build the thing so that the disorder you cannot prevent makes it stronger instead of breaking it.
Resilience is the floor, not the goal
The first thing antifragility did for me was kill a word I used to reach for constantly: resilience. For most of my career, resilience was the highest compliment you could pay a system. A resilient service stays up when a node dies. A resilient team absorbs a bad quarter and keeps going. A resilient person bounces back. All good. All necessary. All insufficient.
Resilience gets you back to where you were. That is the ceiling of the concept, and it is also its trap. A resilient system treats every stressor as damage to be absorbed and recovered from. It is playing defense against a world it has decided is hostile. The antifragile reframe is sharper and stranger: what if the stressor is information? What if the failure is the cheapest, most honest signal the system will ever get about its own weak points, and the only waste is failing to harvest it?
That distinction is not academic. It changes what you build. If you believe in resilience, you build to withstand. If you believe in antifragility, you build to learn, and then you go looking for the stress on purpose, in controlled doses, while the stakes are still small enough to be tuition rather than catastrophe.
Chaos engineering is antifragility with a runbook
The clearest place this shows up in my actual work is chaos engineering, which is, when you strip away the jargon, just antifragility operationalized for distributed systems.
The premise of chaos engineering offends most people the first time they hear it. You take a production system, or a near-production replica, and you deliberately break parts of it. You kill instances. You inject latency. You sever a dependency. You partition the network. You do this on a Tuesday afternoon, on purpose, with engineers watching, because the alternative is doing it by accident at three in the morning during a holiday traffic spike with nobody who understands the system awake.
What makes this antifragile rather than merely reckless is the asymmetry. The downside of a chaos experiment is bounded and known: a small, contained, observed failure that you can stop the instant it goes sideways. The upside is unbounded: you discover a failure mode that, left undiscovered, would eventually have taken the whole thing down. You are paying a small, voluntary stress to immunize against a large, involuntary one. That is the immune-system logic Taleb keeps returning to. You do not build a strong immune system by living in a bubble. You build it through controlled exposure. A system that has never failed is not a strong system. It is an untested one, and untested is just a synonym for fragile-but-you-do-not-know-it-yet.
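The bounded-downside half of that asymmetry can be sketched in a few lines. This is a toy, not a real chaos tool: `run_chaos_experiment`, `ToyService`, and the error-rate numbers are all hypothetical names and values invented for illustration. The point it tries to show is the guardrail: the fault is injected, a steady-state signal is watched, the experiment stops the moment the signal crosses a threshold, and the fault is always rolled back.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class ExperimentResult:
    aborted: bool                                   # True if the guardrail tripped
    observations: List[float] = field(default_factory=list)


def run_chaos_experiment(
    inject: Callable[[], None],     # hypothetical hook: start the fault
    restore: Callable[[], None],    # hypothetical hook: remove the fault
    probe: Callable[[], float],     # hypothetical hook: current error rate, 0.0 to 1.0
    abort_threshold: float,         # the known, agreed bound on the downside
    max_checks: int,                # the experiment also ends on its own
) -> ExperimentResult:
    result = ExperimentResult(aborted=False)
    inject()
    try:
        for _ in range(max_checks):
            error_rate = probe()
            result.observations.append(error_rate)
            if error_rate > abort_threshold:
                result.aborted = True   # stop the instant it goes sideways
                break
    finally:
        restore()                       # roll back even if probing raised
    return result


class ToyService:
    """In-memory stand-in for a real system; the error rates are made up."""

    def __init__(self) -> None:
        self.fault_active = False
        self._rates: Iterator[float] = iter([0.01, 0.02, 0.20, 0.50])

    def inject(self) -> None:
        self.fault_active = True

    def restore(self) -> None:
        self.fault_active = False

    def probe(self) -> float:
        return next(self._rates)


svc = ToyService()
outcome = run_chaos_experiment(
    svc.inject, svc.restore, svc.probe, abort_threshold=0.05, max_checks=4
)
print(outcome.aborted, outcome.observations, svc.fault_active)
```

In this toy run the third reading crosses the threshold, so the experiment aborts before the fourth and the fault is already removed by the time the result is returned. What the sketch cannot show is the unbounded half: the value of whatever the aborted run revealed.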
I wrote at length in DevOps Beyond Automation about the difference between configuring tools and understanding systems. Chaos engineering is where that difference becomes visceral. You cannot run a useful chaos experiment if you do not understand where the request flows, where the state lives, and what fails first when it fails. The tools that inject the failure are commodity now. The judgment about which failure to inject, how large the blast radius should be, and whether the result is telling you something real or just noise does not commoditize. That is the systems literacy, and chaos engineering is the practice that builds it fastest because it forces the system to teach you the truth instead of letting you keep believing your own architecture diagram.
Redundancy looks wasteful right up until it does not
There is a kind of engineer, and I have been this engineer, who looks at redundancy and sees waste. Two of something when one would carry the load. Spare capacity sitting idle. A standby that costs money every month and earns nothing most months. An optimizer's instinct screams to trim it.
Antifragility taught me to read that idle capacity differently. Redundancy is not waste. It is purchased optionality against a future you cannot predict. The standby that earns nothing for eleven months is not a cost center; it is an insurance policy that happens to also be the thing that keeps you in business the one month the primary fails. The slack in the system, the capacity you are not using, the second supplier you do not strictly need today, is exactly what lets the system absorb a shock and come out the other side intact, sometimes stronger because the shock revealed which path actually mattered.
Taleb's word for the failure here is naive optimization: tuning a system so tightly to known conditions that it has no give left for the unknown ones. An overoptimized system is a fragile system wearing the costume of an efficient one. It looks lean and beautiful on the spreadsheet and it shatters the first time reality serves it a condition the spreadsheet did not anticipate. I have watched this happen to teams, to architectures, and to careers. The lean ones break. The ones with slack bend and recover. The expensive lesson, learned and re-learned, is that a little inefficiency is the price of survival, and survival is the precondition for everything else.
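The trade between idle capacity and survival can be put in rough numbers. The sketch below assumes replicas fail independently, which is a strong assumption (shared power, shared deploys, and shared bugs all break it), and the 99% availability figure is illustrative rather than measured. The function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(single: float, copies: int) -> float:
    """Chance that at least one of `copies` independent replicas is up."""
    return 1.0 - (1.0 - single) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Expected hours per year with nothing serving traffic."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Illustrative only: one node at 99% versus the same node plus an idle standby.
for copies in (1, 2):
    a = combined_availability(0.99, copies)
    print(copies, round(a, 6), round(expected_downtime_hours(a), 3))
```

Under those assumptions a single node is down for roughly 87.6 hours a year, and adding one standby brings that under an hour. The standby doubles the bill and looks idle on the spreadsheet; the arithmetic is what the idle capacity is buying.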
Optionality and the barbell, applied to a life of building
Here is where the framework stops being about servers and starts being about how I have arranged my actual life.
Taleb's barbell strategy is the practical engine of antifragility. Instead of taking moderate, middle-of-the-road risk across the board, you split: put the overwhelming majority of your resources somewhere boring and safe, and a small, capped slice somewhere wildly speculative. You make your downside small and known, and you leave your upside open and unbounded. The middle, the place where most people sit, is the worst of both worlds. Moderate risk feels prudent and is in fact the most exposed position there is, because it carries real downside without the asymmetric upside that would justify the exposure.
My version of the barbell is a fifteen-year engineering career on one end and everything else on the other. The career is the safe, boring, load-bearing weight. It pays for the house and the time and the margin of safety. On the other end of the bar is the speculative cluster: the films I make from code, the books I write, the music, the small software products. Any one of those experiments can fail completely and it costs me almost nothing, because the safe end of the barbell is carrying my life. But any one of them could also become something, and the cost of finding out is small enough that I can afford to keep taking the swing, over and over, for years.
This is optionality, and optionality is the part of the framework I understand most viscerally because of how I build. When I make a film out of code, or generate a book, or ship a small product, I am not making a bet I need to win. I am buying an option. Most options expire worthless. That is fine; that is expected; the entire structure assumes it. The structure only works because each individual swing is cheap and the payoffs, when they land, are disproportionate to what they cost. You do not need to be right often. You need to be wrong cheaply and right hugely, and you need to take enough swings that the math has room to work. The small bet that costs me a weekend and a few API calls, and might turn into nothing, and might turn into a thing people actually use, is antifragility as a way of life. The volatility is not a bug in this arrangement. The volatility is the whole point. The disorder is where the upside comes from.
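The shape of "wrong cheaply, right hugely" can be written down as simple arithmetic. Everything in this sketch is an assumption made up for illustration: the `Barbell` class, the units, the 5% hit rate, and especially the fixed payoff, since the whole argument is that real payoffs are open-ended and not known in advance. Treat it as a picture of the structure, not a forecast.

```python
from dataclasses import dataclass


@dataclass
class Barbell:
    safe_income: float       # the boring, load-bearing end
    bet_cost: float          # what one small swing costs
    n_bets: int              # how many swings are taken
    hit_probability: float   # assumed chance that a single swing lands
    hit_payoff: float        # assumed payoff when it does (a simplification)

    def worst_case(self) -> float:
        """Every bet expires worthless; the loss is capped at the total stake."""
        return self.safe_income - self.n_bets * self.bet_cost

    def expected_value(self) -> float:
        per_bet = self.hit_probability * self.hit_payoff - self.bet_cost
        return self.safe_income + self.n_bets * per_bet

    def chance_of_at_least_one_hit(self) -> float:
        return 1.0 - (1.0 - self.hit_probability) ** self.n_bets


plan = Barbell(safe_income=100.0, bet_cost=1.0, n_bets=20,
               hit_probability=0.05, hit_payoff=50.0)
print(plan.worst_case(), plan.expected_value(),
      round(plan.chance_of_at_least_one_hit(), 3))
```

With these made-up numbers, nineteen out of twenty swings are expected to fail, the worst case still leaves 80 of the 100 units intact, and the chance that at least one swing lands is about 64%. Taking more swings raises that chance without raising the cost of any single failure.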
Via negativa: the engineering of subtraction
The concept from Antifragile that took me longest to internalize, and that I now reach for the most, is via negativa: the idea that you usually gain more by removing the harmful thing than by adding the beneficial one.
Engineers are trained to add. A problem appears and our reflex is to build something: a new service, a new layer of caching, a new monitoring dashboard, a new abstraction, a new tool. The instinct to add is so strong that we rarely even consider the other direction. But the most durable improvements I have ever made to a system came from subtraction. Deleting the service nobody could explain. Removing the cache that was hiding a real performance problem instead of solving it. Killing the dashboard that fifteen people consulted to make a decision the architecture should have made for them. Retiring the clever abstraction that saved three lines of code and cost every new engineer two weeks of confusion.
I made this argument from the operational side in the DevOps cornerstone, where I said the runbook that takes nine pages is the symptom and the architecture that produced it is the disease. Via negativa is the philosophical name for that move. You do not pay down that kind of debt by automating the runbook. You pay it down by changing the system so the runbook does not need to exist. Subtraction is harder than addition because it requires you to understand the system well enough to know what is actually load-bearing and what is just accumulated sediment that everyone is afraid to touch. But subtraction is where the antifragility hides, because every component you remove is a component that can no longer fail, a dependency that can no longer break, a piece of fragility you have permanently deleted from the system instead of merely guarding against.
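The claim that a removed component can no longer fail has a simple numerical reading. If a request only succeeds when every component on its path is up, and failures are assumed independent, the path's availability is the product of the parts, so deleting a part can only raise it. The component names and availability figures below are hypothetical.

```python
from math import prod
from typing import Dict, Mapping


def chain_availability(components: Mapping[str, float]) -> float:
    """Availability of a path that needs every listed component to be up."""
    return prod(components.values())


def without(components: Mapping[str, float], name: str) -> Dict[str, float]:
    """The same path with one component deleted outright."""
    return {k: v for k, v in components.items() if k != name}


# Illustrative figures for a request path; none of these are measurements.
path = {"api": 0.999, "cache": 0.995, "legacy_sync": 0.99, "db": 0.999}

before = chain_availability(path)
after = chain_availability(without(path, "legacy_sync"))
print(round(before, 4), round(after, 4))
```

The sketch leaves out the hard part, which is knowing that `legacy_sync` really is sediment and not load-bearing. The arithmetic only says what the subtraction is worth once that judgment has been made.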
The same logic runs through my life off the keyboard. The biggest gains in health, in focus, in clarity, came not from adding regimens but from removing the things that were quietly harming me. Remove the fragility first. Then, and only then, worry about adding the antifragility. Taleb is right that the medical establishment hates this idea, and he is right about why: subtraction does not sell. There is no product in stop doing the harmful thing. But it is, reliably, where the leverage is.
Skin in the game keeps the loop honest
There is a failure mode that antifragility is allergic to, and it is the one where the person making the decision does not bear the consequences of the decision. Taleb calls it the absence of skin in the game, and it is the thing that quietly corrupts otherwise good systems.
In engineering, the version I have lived is the on-call rotation. The single most clarifying force I know for building systems that do not break is the knowledge that the person who designed the thing is the person whose phone goes off at three in the morning when it fails. The moment the people who build a system are insulated from the pain of operating it, the system rots, because the feedback loop that would have told them about the fragility has been severed. Skin in the game is what keeps that loop closed. It is why I have always believed that the engineer who writes the code should carry the pager for it, not as punishment but as information. The pain is the signal. Removing yourself from the pain removes you from the truth.
This is also why I publish under my own name, why the films and books and products go out into the world with my name attached and my judgment exposed. The exposure is not vanity. It is the mechanism. When I am wrong in public, the wrongness comes back to me, and that returning wrongness is the cheapest and most honest tuition I will ever pay. A career, like a distributed system, only stays antifragile if the feedback from its failures actually reaches the person who can change the inputs.
The decanal cycle as a stress rhythm
This is the part that ties the engineering to the rest of my life, and it is the part I think about most.
I keep a journal organized in ten-day cycles instead of seven-day weeks. Each cycle is anchored to one of the thirty-six fixed stars of the ancient Egyptian calendar. I explained the mechanics of this in What Is Decanal Journaling, and the full framework lives in The Decan Log. What I want to add here is the antifragile reading of it, because the more I run the practice, the more it looks to me like a stress rhythm rather than a calendar.
A decan has a shape: initiate, flow, reflect. You set a question, you work it without re-litigating the choice, and then you close the cycle by writing down what happened and deciding what carries forward. The reflection phase is the part that matters for antifragility, because it is structured harvesting of stress. Whatever broke during those ten days, whatever frustrated me, whatever failed, gets pulled into the journal at the close of the cycle and converted into something the next cycle can use. The disorder of a hard ten days does not just get survived. It gets metabolized. The friction becomes input. I wrote one entry, during a stretch of genuine chaos, that was entirely about transmuting small daily frustrations into strength, and I did not realize until later that I had been describing hormesis: the way a controlled, repeated, moderate stress makes the organism stronger, the same logic as the chaos experiment and the immune system and the muscle under load.
There is also a deeper structural reason the ten-day cycle is antifragile in a way the seven-day week is not. Seven days is too short for a theme to develop, integrate, and clear; the weekend resets you before the stress has finished teaching you anything. Ten days has room for a beginning, a middle, and an end, which means it has room for something to go wrong in the middle and be made sense of by the end. The shorter the cycle, the more often you reset before the lesson lands. The decanal rhythm is long enough to let the disorder run its course and short enough that no single bad cycle can do lasting damage. That is the barbell again, expressed in time: small, frequent, bounded exposures to the disorder of a real life, each one closed out and harvested before the next begins. Thirty-six of them a year, plus five days outside the count. A year of controlled stress, metabolized in ten-day doses.
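The count itself is easy to check in code. This sketch maps a day of the year onto a decan and a day within it. It assumes the five extra days sit at the end of the year and that days are numbered from whatever date the year is taken to start; neither detail is specified above, leap days are ignored, and the names `DecanDay` and `locate` are hypothetical.

```python
from typing import NamedTuple, Optional

DECANS_PER_YEAR = 36
DAYS_PER_DECAN = 10
EXTRA_DAYS = 5   # the days outside the count; placed at year end by assumption


class DecanDay(NamedTuple):
    decan: Optional[int]   # 1 to 36, or None for the days outside the count
    day: int               # 1 to 10 inside a decan, 1 to 5 for the extra days


def locate(day_of_year: int) -> DecanDay:
    """Map day 1..365 of the year onto its decan and position within it."""
    counted = DECANS_PER_YEAR * DAYS_PER_DECAN
    if not 1 <= day_of_year <= counted + EXTRA_DAYS:
        raise ValueError("day_of_year must be between 1 and 365")
    index = day_of_year - 1
    if index >= counted:
        return DecanDay(None, index - counted + 1)
    return DecanDay(index // DAYS_PER_DECAN + 1, index % DAYS_PER_DECAN + 1)


print(locate(1), locate(11), locate(360), locate(361))
```

Day 11 opens the second decan, day 360 closes the thirty-sixth, and days 361 through 365 fall outside the count, which is where the 36 × 10 + 5 = 365 arithmetic comes from.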
Why all of this is one idea
I said at the top that the engineering and the building and the journaling are the same instinct at different scales, and now I can name the instinct precisely. In every one of these domains I am trying to arrange things so that the disorder I cannot prevent makes me stronger instead of breaking me.
I cannot prevent my distributed systems from failing, so I make them fail on purpose, in small doses, and I harvest what the failures teach. I cannot prevent most of my creative bets from going nowhere, so I make each one cheap and keep the upside open, and I take enough swings that the asymmetry has room to pay. I cannot prevent ten days of my life from going sideways, so I run them in a cycle that closes by converting whatever went wrong into something the next cycle can use. Chaos engineering, the barbell, via negativa, skin in the game, the decanal rhythm: these are not five techniques. They are one stance, held toward five different kinds of disorder.
The world is getting more volatile, not less, and the technological systems we live inside are getting more complex, not simpler. The fragile response is to try to predict and control all of it, which is a bet you will lose, because the whole nature of a complex system is that it produces conditions your model did not anticipate. The resilient response is to build to withstand, which is better, but which still treats every shock as damage. The antifragile response, the one I am trying to live, is to stop fighting the disorder and start feeding on it. Build the system, and the life, so that the stress is fuel. That is the entire project. Everything on this site is some version of it.
Continue reading
- Antifragile: Things That Gain from Disorder, my full review of the book this essay is built on
- DevOps Beyond Automation, the engineering cornerstone where systems thinking does the same work
- What Is Decanal Journaling, the ten-day practice read here as a stress rhythm
- The Decan Log, the full framework for the cycle
- Start Here, the orientation page for everything on this site