AI Development Revolution Part 4: Content Pipeline
Turning years of handwriting and a tangled WordPress blog into something I could maintain, without letting the machine flatten the voice out of it. OCR for cursive, a WordPress-to-Astro migration, and where automation actually ends.
The Content Pipeline
Part 4 of the AI Development Revolution. This was the long one, and most of what I learned later started here.
The problem was my own handwriting
Daily freewriting is my morning practice. Two to five pages of cursive before the day starts. The writing was never the problem. Getting it off the page was.
Generic transcription models could not read me. My handwriting is its own dialect, and the early models guessed at it badly enough that fixing their output took longer than just typing the pages myself. But I write a lot, and retyping all of it by hand was not going to last. So I needed something in between: a pipeline that could take a photo of a page and hand back a clean draft that still sounded like me.
I had written before, in "Taming the Paper Tiger", about how thin the thread is between thought and text. Speeding it up with a machine was the fastest way to snap it.
Five tries before it worked
The first version used basic OCR. It read the words and lost the voice completely, flattening everything into the same gray tone. The second used a better model, and somehow that was worse. It cleaned the writing up into corporate copy, the kind of prose that says nothing in a confident way. For the third I trained a custom model, and it came out too rigid, locking onto patterns and forcing every page through them. The fourth was a context-aware pass that was supposed to read the meaning and keep the style. It fell apart on stream-of-consciousness, which is most of what freewriting is.
From my notes at the time:
The handwritten pipeline finally works, but it took complete rewrites and weeks of debugging. The AI does not just transcribe, it reads context and keeps the voice, but only after a lot of training and a lot of throwing things out.
None of the working version came from a clever idea. It came from doing it badly four times and paying attention to how each one failed. The cost was real. Weeks of work scrapped, the same pages retyped to check the machine against the truth, the stubbornness to start over with a different approach when the last one had felt close.
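Checking the machine against hand-typed pages can be made mechanical. The sketch below is an illustration, not the pipeline described here: it assumes two plain-text versions of the same page and uses Python's standard `difflib` to score word-level agreement and list the places where they differ. The function names (`word_agreement`, `disagreements`) are hypothetical.

```python
import difflib
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_agreement(reference: str, transcript: str) -> float:
    """Return difflib's similarity ratio (0.0 to 1.0) over the word sequences."""
    ref, hyp = tokenize(reference), tokenize(transcript)
    if not ref and not hyp:
        return 1.0  # two empty pages agree trivially
    return difflib.SequenceMatcher(None, ref, hyp, autojunk=False).ratio()


def disagreements(reference: str, transcript: str) -> list[tuple[str, str]]:
    """List (typed, machine) word runs wherever the two versions differ."""
    ref, hyp = tokenize(reference), tokenize(transcript)
    matcher = difflib.SequenceMatcher(None, ref, hyp, autojunk=False)
    return [
        (" ".join(ref[i1:i2]), " ".join(hyp[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

A score like this only says how many words survived. It says nothing about whether the voice did, which is why it can support the reading but not replace it.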
I wrote in "Handwriting as Meditation" and "The Flow of Thoughts" that the link between the hand and the thought is the point, not a side effect. Teaching a model to leave that link alone was the whole job.
The WordPress exodus
The other half was years of old posts trapped in a WordPress database. I wanted them out, into plain markdown files I could read, move, and back up without a plugin and a server in the way.
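For a sense of what getting posts out can look like, here is a minimal sketch that assumes the posts were exported as a WordPress WXR (XML) file, which the post does not actually specify. It reads published posts and builds one file per post with a small front matter block. The function name `posts_to_markdown` is hypothetical, the `wp` namespace version differs between exports (1.2 is assumed here), and the body is kept as the exported HTML instead of being converted to markdown.

```python
import json
import xml.etree.ElementTree as ET

# Namespaces used by WXR exports. The "wp" version is an assumption;
# check the xmlns:wp attribute at the top of your own export file.
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "wp": "http://wordpress.org/export/1.2/",
}


def posts_to_markdown(wxr_xml: str) -> dict[str, str]:
    """Map '<slug>.md' to file contents for every published post."""
    root = ET.fromstring(wxr_xml)
    files: dict[str, str] = {}
    for item in root.iterfind("./channel/item"):
        # Skip pages, attachments, drafts, and anything else not live.
        if item.findtext("wp:post_type", "", NS) != "post":
            continue
        if item.findtext("wp:status", "", NS) != "publish":
            continue
        slug = item.findtext("wp:post_name", "", NS)
        title = item.findtext("title", "")
        date = item.findtext("wp:post_date", "", NS)
        body = item.findtext("content:encoded", "", NS)
        # json.dumps yields a double-quoted string that is also valid YAML.
        front_matter = (
            "---\n"
            f"title: {json.dumps(title)}\n"
            f"date: {json.dumps(date)}\n"
            "---\n"
        )
        files[f"{slug}.md"] = front_matter + "\n" + body.strip() + "\n"
    return files
```

Returning a dictionary instead of writing to disk keeps the sketch easy to inspect before anything lands in a content folder.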
The export was easy. The cleanup was not. I ran the first hundred articles through an enhancement pass and they all came back sounding like a brochure. Smoother, shorter, and with my voice gone. So I read all hundred with a red pen, rebuilt the rules, and read them again. "Automation," it turned out, did not remove the work. It moved it. Instead of writing the articles I was reading every one of them, slowly, asking whether the machine had quietly swapped a sentence of mine for a sentence of its own.
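The question of whether a sentence was quietly swapped can be narrowed down before the slow read. The following is a rough sketch under stated assumptions, not the rules used here: it splits both versions into sentences with a naive regex, then flags any sentence in the edited text that has no close match in the original. The name `flag_rewrites` and the 0.9 threshold are illustrative choices.

```python
import difflib
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def flag_rewrites(
    original: str, edited: str, threshold: float = 0.9
) -> list[tuple[str, str, float]]:
    """Return (edited_sentence, closest_original, score) for changed sentences."""
    originals = split_sentences(original)
    flagged = []
    for sentence in split_sentences(edited):
        best, best_score = "", 0.0
        # Find the original sentence that looks most like this one.
        for candidate in originals:
            score = difflib.SequenceMatcher(None, candidate, sentence).ratio()
            if score > best_score:
                best, best_score = candidate, score
        # Anything without a near-identical source goes to the red pen.
        if best_score < threshold:
            flagged.append((sentence, best, round(best_score, 2)))
    return flagged
```

This catches sentences that were added or rewritten. It does not report sentences that were dropped, so a shorter draft still needs a side-by-side read.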
A few things it was genuinely good at, and I let it have them. Alt text for images. Formatting code blocks. Checking links. Writing a meta description that summarized a post without trying to improve it. Anywhere it had to be descriptive instead of creative, it earned its keep. Everywhere it tried to be me, I took the work back.
What I gave up, and what I kept
Every step was the same trade. Make it clearer and you risk sanding off the part that was interesting. Optimize the title for search and you lose the title that had a joke in it. Most of the time I chose the joke. I left original titles when they had character and accepted that they would rank worse for it. I kept transitions that were a little rough, because rough was mine and smooth was anyone's.
The pipeline only started working when I stopped trusting it. Scanned page in, draft out, and then a person reading the draft against the page before any of it went near the blog. That last step never went away. At some point I stopped trying to make it.
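That shape (page in, draft out, a person before publish) can be enforced in code so the review step cannot be skipped by accident. A minimal sketch, assuming a draft object that carries its own review status. Every name here (`Draft`, `approve`, `publish`) is hypothetical and not taken from the actual pipeline.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "draft"        # machine output, not yet read by a person
    APPROVED = "approved"  # a person has read it against the source page


@dataclass
class Draft:
    source_image: str                # path to the photo of the handwritten page
    text: str                        # machine transcription
    status: Status = Status.DRAFT
    reviewer: Optional[str] = None   # who read it against the page


def approve(draft: Draft, reviewer: str, corrected_text: str) -> Draft:
    """Record that a person read the draft and store their corrected text."""
    if not reviewer:
        raise ValueError("a named reviewer is required")
    draft.text = corrected_text
    draft.reviewer = reviewer
    draft.status = Status.APPROVED
    return draft


def publish(draft: Draft) -> str:
    """Return the text for publishing, refusing anything unreviewed."""
    if draft.status is not Status.APPROVED:
        raise PermissionError("draft has not been read against the page")
    return draft.text
```

The design choice is that `publish` has no override flag. The only way to reach it is through `approve`, which requires a name.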
What it is for
What I have now is quiet. A photo of a morning's pages becomes a draft by the time I have had coffee, and the draft still sounds like the person who wrote it longhand. The old posts are files I own instead of rows in a database I rent. None of it is hands-off, and I have made peace with that.
I have written elsewhere, in "Vibe Coding with AI", about working with the machine as a partner instead of a tool. This was where I learned where the partnership ends: at the page, with me reading.
Next I took the same habit (build it badly, watch how it breaks, keep a human in the loop) and pointed it at whole applications. That is Part 5.
Next: Part 5: The Business Transformation →
Series Navigation:
- Part 1: The Awakening
- Part 2: The Methodology
- Part 3: Enterprise Infrastructure
- Part 4: The Content Pipeline (Current)
- Part 5: Business Transformation
- Part 6: Future Implications
- Part 7: Advanced Patterns
Part 4 of 7 in The AI Development Revolution