Plan 9 Rap Battle: A Bunny, a Beat, and 40 Bars of Machine Spit
A 128-second rap music video: the Plan 9 bunny steps into a dark alley at 3AM with a 40oz, a bandana, and 40 bars of autotuned machine spit over a harpsichord trap beat. Built in one session with PIL animation, ElevenLabs TTS, rubberband autotune, and two layered ChipForge scores.
Plan 9 Rap Battle: A Bunny, a Beat, and 40 Bars of Machine Spit
128 seconds. One bunny. 40 bars. Harpsichord trap.
The Plan 9 bunny, Glenda from Bell Labs, steps into a dark alley at 3AM. Red bandana. 40oz in one hand, mini Uzi in the other. The brick wall behind scrolls with terminal commands and 80s Easter eggs. A Tesla coil pulses on the left. Paint splats explode on the hard consonants. The beat is a harpsichord playing sparse Am-to-F stabs over a half-time 808 sub at 150 BPM. The bunny raps about code, the grind, and the machine frontier.
This is not a good rap. It is a real attempt at one, and the distance between those two things is where the interesting engineering lives.
(Honest disclaimer: TTS cannot rap. What it can do, with enough autotune and compression, is deliver short punchy lines with rhythmic aggression over a beat. The result sounds like a very confident robot who wandered into a cypher and refused to leave. If you listen expecting Kendrick, you will be concerned. But if you listen for the engineering underneath, there is a lot here.)
The Film
Three verses, three hooks, a bridge, a golden drop, and an outro. 1536 frames at 12fps. 128 seconds.
INTRO (0-12s). Dark alley fades in. Cricket ambient, synthesized inline via ffmpeg (high sine tones with rapid tremolo plus a low frog drone; Stranger Things vibe). The bunny steps out of the shadows. Beat drops at frame 96.
PRE-VERSE. Four quick opener bars: "Yeah. Plan Nine on the mic." / "3AM crew. Coming out the pen." / "Ears up. Eyes red. Check the stack." / "Bunny back. Tell the whole block." Establishes the voice, the cadence, the character.
VERSE 1. Eight bars, 26-frame spacing (Ice Cube-style short pockets). Bell Labs blood, Glenda in the veins, Tesla coil buzz, Feynman flow, vi and grep. The bars reference the hacker-scientist lineage without naming anyone real. "Plan Nine from Bell Labs. Take my lane."
HOOK 1. "Plan Nine. Built it from the block. / Plan Nine. Keyboard tick tick tock. / Plan Nine. Terminal loaded up. / Plan Nine. Whole world shut up." Wider spacing (28 frames) for emphasis. Camera shake on each punch.
VERSE 2. The origin story verse. Wall came down, Berlin brown, moms worked doubles, stole a BASIC book. "Eight bit scars. I still got the pain." / "They said get a job. I said I made mine." The autobiographical material is abstracted through the bunny character; universal themes, no real names.
HOOK 2 + GOLDEN DROP. The beat's golden ratio moment (pattern 7 of 10 in the ChipForge score, at 0.618 of total runtime) hits with a heavy sub drop. Four punch bars ride it: "Plan Nine in the drop. / Bunny on the top. / Hand on the keyboard. / Watch the whole system stop."
OUTRO. "Don't sleep. / Bunny up." Then two yelled deliberates at full intensity: "PLAN NINE!" and "CERTIFIED!" Credits roll over the beat's fade, the title card lingers.
The Autotune Pipeline
This is the third Napkin Films production to use autotune_voice.py, and the first to push the spit-rap mode to its limit. The pipeline:
Raw TTS. ElevenLabs
villain_shadowpersona (Josh voice, menacing emotion). Each bar is a 6-9 word line ending on a hard consonant. The persona delivers with weight but zero rhythm.Autotune.
rubberband(pitch-shift library) chops the voice WAV on silence gaps, assigns each chunk to a note from an Em pentatonic scale (E3, G3, A3, B3, D3, E4, G4, B2), and pitch-corrects with--spit 0.80(heavy T-Pain depth) and--blend 0.70(mostly autotuned).--max-shift 8semitones locks each chunk hard to its target note.--spit-gap-ms 8catches very tight word boundaries.Vocal polish. The finalmix chain: loudnorm to -14 LUFS (louder target for rap vocals), acompressor at 6:1 ratio with -18dB threshold and slow attack, parametric EQ cutting 250Hz mud and boosting 3kHz presence.
The rule from ADR-011 holds: 6-12 words per bar, hard consonant clusters (crack, stack, drop, form, drive, step), hard-stop punctuation every 2-4 beats. Lines that "smear" under speed-up (serene, aura, flowing) are replaced with lines that "hold" (Bell Labs blood, vi and grep, Tesla coil buzz).
The Score: Harpsichord Trap
Two ChipForge WAVs layered:
Primary: harpsichord_trap_score.wav. A minor, 150 BPM (half-time feel = 75 effective for the vocal pocket). Sparse 4-note piano motif (E5-D5-C5-A4, pentatonic descent) over deep 808 sub bass. Am | F chord vamp (i | bVI). Half-time snare on beat 3 only. Trap hi-hats: 16ths with velocity dynamics plus occasional rolls. 10 patterns, 128 seconds. The golden ratio lands at pattern 7 (the heavy drop). The harpsichord replaces the usual piano stabs with something weirder and more aggressive.
Secondary: circuit_glitch_score.wav. Dark electro-glitch texture, mixed underneath the primary at lower volume. Adds atmosphere and fills the spectral gaps the sparse trap beat leaves open.
Cricket ambient. Synthesized inline by the finalmix script via ffmpeg: two sine tones (4800Hz and 6000Hz) with rapid tremolo plus a 90Hz frog drone. Appears during the intro (0-10s) and outro (116-128s) only. The Stranger Things vibe.
Music sits at 0.55 volume during verses (voice forward), ducks to 0.35 during the golden drop so the spit cuts through, returns to 0.72 at the outro, and fades out over the last 8 seconds.
The Visual Stack
Every frame drawn in PIL, 854x480 at 12fps.
The bunny. draw_plan9() from characters/plan9_bunny.py, with three custom prop drawers: draw_40oz() (brown-green bottle with amber liquid visible in the neck), draw_bandana() (Tupac-style forehead wrap with paisley dots, side knot, and two hanging tails), and draw_uzi() (stylized micro-Uzi with muzzle flash).
The wall. Brick texture with 24 scrolling Easter-egg text lines referencing Bell Labs, Tesla, WarGames, Tron, Commodore 64, Run-DMC, Grandmaster Flash, Eric B & Rakim, Neuromancer, Dennis Ritchie, Ken Thompson, and the Glenda bunny. Each line drifts left-to-right over 150 frames, fading in and out. Terminal green and amber against dark brick.
Lip sync. Phoneme-based draw_mouth() with key_beats per line: explicit mouth shapes (closed, small, mid, open, wide) at specific frame offsets so the strongest syllables snap open even without audio-driven sync. 40 lines, each with 4-6 key-beat anchors.
Camera work. cinematography.py provides crop_to_shot() and ease_shot() for animated pushes (wide to close-up), plus dutch_angle() for tilted aggression shots during the hooks. camera_shake fires on every hook and drop impact.
Paint splats. Particle system that fires colored paint explosions on hook impacts and bar transitions. Red, blue, green, yellow against the dark brick.
The Easter Eggs
The scrolling wall text is half the fun. A selection:
$ awk -f beats.awk | grep -E '^\$'tesla coil ok. 3.6.9 ok.wargames: shall we play a game?neuromancer — case still dreamstron → end of line.COMMODORE BASIC V2followed byREADY.paid in full — eric b & rakimwardenclyffe.sh --broadcast all// plan9 bunny — certified// signal --> noise --> truth
Each line color-coded: green for Bell Labs/UNIX, amber for Tesla, blue for 80s movies, warm gold for hip-hop.
The Lyrics
40 bars total. Every line 6-9 words, ending on a hard consonant. The privacy contract from the scene file: "lyrics draw on universal inner-life themes (identity, the grind, reinvention, the human-machine frontier). No real names, places, family, or biographical specifics."
The voice is the bunny's voice. The autobiographical material is abstracted through the character. "Came up hard. Wall came down." could be anyone. "Stole a BASIC book. Never looked back." could be any 80s kid. That is the point. The bunny is an archetype: the builder in the void, the 3AM crew, the one who coded through the night and came back changed.
What This Proves
Stars Press Start proved TTS could be pushed toward musical performance over 21 passes. Deduct Yourself proved double-time rap delivery worked with 10 passes. Plan 9 Rap Battle tests whether the autotune pipeline, applied to short aggressive bars over a trap beat, can produce something that sounds like it belongs over the music.
The answer is: almost. The autotune locks each word-chunk to a pitch in Em pentatonic, which gives the vocal a melodic contour it does not earn through performance. The compression and EQ chain gives it presence. The key_beats lip sync gives it visual rhythm. Together, with a beat that is deliberately sparse (two chords, half-time snare, lots of space), the voice occupies the pocket without fighting the music. It is not a rap. It is an engineer's approximation of one, and at 128 seconds it does not overstay.
The Stack
Same as every Napkin Films production:
- Animation: Python + PIL, Plan 9 bunny drawn frame by frame, 854x480 at 12fps
- Music: ChipForge, two layered scores (harpsichord trap + circuit glitch), plus ffmpeg-synthesized cricket ambient
- Voice: ElevenLabs v3, villain_shadow persona, rubberband autotune (Em pentatonic, spit 0.80, blend 0.70)
- Assembly: FFmpeg, stem-preserving finalmix with vocal compression, EQ, and per-section ducking
- Direction: Claude Code, Opus 4.6, agent mode
No GPU. No subscriptions beyond ElevenLabs. No animation software.
Watch Plan 9 Rap Battle on YouTube. The engine is open source (GPL-3.0). The film is Creative Commons (CC BY-NC 4.0). The ChipForge scores are CC BY-SA 4.0.
License. This film is licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). Share and adapt with attribution to "Organic Arts LLC" and a link to the original, non-commercial use only. Engine code is GPL-3.0-or-later. ElevenLabs voice audio is licensed content and is not redistributed. Contact: j@organicartsllc.com
If you want the full origin story of the Napkin Films stack, Four Films From Code covers how the architecture works and why constraint is the feature. For the autotune pipeline's first outing, Stars Press Start walks through 21 passes of TTS voice engineering. The ten-minute theological tribunal: The Reckoning. The Agentic Development post explains the agent mode workflow underneath all of it. The next film takes the same engine somewhere quieter: I Want to Be Martian is a Bradbury tribute with three voices, a Holst-Mars ostinato, and a rover on Mars in 2035.
Produced with Napkin Films and ChipForge, open source tools built by Joshua Ayson and AI agents at Organic Arts LLC.