On Saturday, April 18th, Anitra and I attended Tech in Full Effect, a gathering of people in Tampa’s tech scene who also enjoy ’90s hip-hop and R&B, and we had a grand time!
Organized by Jordyn Short and Tiffiny B., it took place at Gaspar’s Luxury Estate in South Tampa, a large house and property that can serve as a vacation rental for a big group or, in the case of Tech in Full Effect, a lovely party venue.
They had DJ Will spinning ’90s classics…
…’90s hip-hop/R&B-themed drinks…
…and opportunities aplenty to catch up with old friends and make new ones:
While many of the guests were local, a number came from way, way out of town, including from Jacksonville, Miami, Atlanta, and even San Francisco.
I had lots of conversations, and some of the topics were…
What am I doing after Tampa Bay Tech Week? My answer: “Get up to speed on the new job, keep cranking out the Tampa Bay tech events list, touch base with the new connections I made, and try to find a venue for an upcoming Tampa Bay AI Meetup.”
My favorite current hip-hop? Tkandz and Relaye. None of the “mumble stuff.”
Did we buy a Mac Mini to run OpenClaw? No, Anitra and I got his-and-hers M5 MacBook Pros as our main devices and are turning our old M1 MacBooks into agent machines.
So what’s my new job? Senior Developer Advocate at NetFoundry, and it started two days later (Tech in Full Effect took place on the 18th; my first day was on the 20th).
All in all, it was a great event with great people, great conversations, and great food and drink in an unusual setting for a Tampa Bay tech event. My thanks to Tiffiny and Jordyn for putting it on — and please let us know when and where you’re holding the next one!
Last week was my first week at NetFoundry, where I’m the Senior Developer Advocate. It was fun, and it was also like drinking from a high-tech, encrypted firehose!
To mark the occasion, I sat down with NetFoundry’s Head of Developer Experience (and also developer; he does a lot!) Clint Dovholuk for my first episode on Ziti TV. We spent an hour diving into the “meat” of Zero Trust, networking architecture, and why your traditional VPN might be the “castle and moat” that finally (and unintentionally) lets the invaders in.
If you’re a developer who has always viewed networking infrastructure as someone else’s problem (and as a recovering mobile developer, I’m certainly guilty on that charge), here’s the deep-dive breakdown of what I learned in my first week on the job.
Clint said that Zero Trust might be better understood if you called it Explicit Trust. In the old “Castle and Moat” model, if you’re in the castle, you’re trusted. In the OpenZiti model, we assume the network is already compromised. You have zero privileges until they are explicitly granted based on:
Authentication: “Who are you?”
Authorization: “What are you allowed to do?”
A lot of resources will authenticate and authorize you through some kind of sign-in process. Clint describes OpenZiti as moving the process out by one layer into the network so you can’t even connect to an OpenZiti-protected resource without being authenticated and authorized first.
Or, to quote Clint:
With OpenZiti and Zero Trust, if you have a service that’s protected by OpenZiti, you first need to authenticate to the OpenZiti overlay network, and then you need to have an authorization that permits the operation you’re trying to perform.
OpenZiti also uses a Zero Privilege approach. Once again, to quote Clint:
The whole idea is that you have no privileges until you are granted privileges, and only then are you able to take whatever operation you want.
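The deny-by-default idea above can be sketched in a few lines. This is my own toy illustration, not OpenZiti’s actual implementation, and every name in it is made up; the point is simply that both checks have to pass before a connection exists at all.

```python
# A minimal sketch of "zero privilege by default" (illustrative only, not
# the OpenZiti implementation). An identity can do nothing until a policy
# explicitly grants it an operation on a service.

class Overlay:
    def __init__(self):
        self.identities = set()   # authenticated identities
        self.grants = set()       # (identity, service, operation) triples

    def authenticate(self, identity):
        # "Who are you?" (in OpenZiti, this is certificate-based)
        self.identities.add(identity)

    def grant(self, identity, service, operation):
        # "What are you allowed to do?" (explicit authorization)
        self.grants.add((identity, service, operation))

    def dial(self, identity, service, operation="dial"):
        # Deny by default: no authentication or no grant means no connection
        if identity not in self.identities:
            return "denied: not authenticated"
        if (identity, service, operation) not in self.grants:
            return "denied: not authorized"
        return "connected"

overlay = Overlay()
print(overlay.dial("alice", "payroll-db"))   # denied: not authenticated
overlay.authenticate("alice")
print(overlay.dial("alice", "payroll-db"))   # denied: not authorized
overlay.grant("alice", "payroll-db", "dial")
print(overlay.dial("alice", "payroll-db"))   # connected
```

Note that there’s no “allow” rule anywhere except the explicit grant: the absence of a policy is itself the policy.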
“Jay double-you tee” vs. “Jawt”
Apparently we’re on different sides of this debate. Clint prefers referring to JWTs as “Jay double-you tees,” while I prefer to call them “Jawts.”
OpenZiti and NetFoundry: How are they related?
OpenZiti is the network overlay project, and NetFoundry is the company behind OpenZiti.
The “Open” in OpenZiti comes from the fact that it’s an open source project. This is in keeping with the philosophy that a cybersecurity product should be open source because making source code publicly visible enables a community of developers, analysts, and other experts to audit, test, and improve it.
If you have the time, tech skills, and inclination, you can use OpenZiti and run your own overlay network at zero cost — if you don’t count the cost of said time and tech skills. It’s all up for grabs here.
However, if you’d rather spend your time and technical expertise elsewhere (such as on your main line of business), especially once your needs scale up, NetFoundry is here to provide you with a managed OpenZiti platform.
It’s easy to run one controller and two routers on your laptop. But when you’re an enterprise managing a fleet of routers, handling upgrades, and monitoring metrics, you’re suddenly in the “overlay business” instead of your actual business. NetFoundry is the “Easy Button” that manages OpenZiti for you [19:10].
The quickstart
Clint then gave a quick demonstration of the OpenZiti quickstart, which creates a fully functional OpenZiti network overlay on your system in a matter of seconds. This overlay has both a router and a controller, and each has a specific job.
Controller
The OpenZiti controller [24:36] serves as the brain of the overlay network. It’s the authority responsible for managing the state of the environment and ensuring that all connections are secure and verified before traffic ever flows.
Its responsibilities can be broken down into several key functions:
1. API surface and management
The controller surfaces several critical APIs that different components of the network interact with. These include:
Edge Client API: Used by SDKs and tunnelers to authenticate and discover services.
Management API: The interface used by administrators (often via the Ziti CLI) to configure the network, such as creating new identities or defining service policies.
Fabric and OIDC APIs: Used for internal mesh communication and identity provider integration.
2. The authority on explicit trust
The controller is the primary decision-maker for the two pillars of Zero Trust security:
Authentication: It verifies the identity of any user, device, or “workload” attempting to connect (answering “Who are you?”).
Authorization: It checks configured policies to determine exactly what that identity is allowed to access (answering “What are you allowed to do?”).
Unlike a traditional network where a firewall might be open by default, the controller ensures the network is dark by default. No connection is permitted until the controller has explicitly authorized it.
3. Bootstrapping trust, a.k.a. enrollment
The controller is the starting point for bringing new devices into the fold through a process called “Bootstrapping Trust”.
It issues One-Time Tokens (OTTs) (essentially signed JSON Web Tokens) that are delivered to users.
When a client initiates enrollment, the controller validates the token and facilitates a Certificate Signing Request (CSR) exchange.
The end result is a strong, cryptographically verifiable identity that the client uses for all future secure communications.
4. Orchestrating the mesh
While the controller does not actually handle the data traffic (that is the job of the routers), it provides the “map.” It coordinates with the edge routers to broker data channels, ensuring that when a client “dials” a service, the routers know how to steer that traffic to the correct destination.
Router
The OpenZiti router [26:09] is the workhorse of the network. While the controller acts as the brain and makes policy decisions, routers constitute the data plane: the actual infrastructure that moves bits from point A to point B.
According to Clint, the router’s job can be broken down into these core functions:
1. Forming the mesh overlay
The routers are responsible for creating the “mesh overlay network”. Unlike a traditional hub-and-spoke networking model, these routers connect to one another to form an interconnected fabric. Even if you start with just one router, you can deploy many others to extend this mesh.
2. Brokering data channels
The primary job of a router is to broker data channels. When an application wants to send data, the router facilitates the creation of a secure path. It effectively “steers” the traffic through the mesh to ensure it reaches the intended destination router and, ultimately, the target service.
3. Serving as the entry point for clients
Everything in OpenZiti is technically an SDK client, whether it’s a standalone app or a “tunneler.” These clients connect directly to the routers to form the necessary channels for communication. The router acts as the listener that accepts these connections once the controller has given the “okay.”
4. Shuttling the actual data
The router is where the heavy lifting happens. It is the component that actually sends your data from one side to the other. While the controller handles the logic of authentication and authorization, it never touches the application data itself. That task is handled entirely by the routers.
5. Enforcing the “dark network”
By acting as the only point of entry into the mesh, routers help enforce the “dark by default” philosophy. Unless a client has been explicitly authorized by the controller, a router will not broker a channel for it, effectively keeping the protected services invisible to the public internet, and by extension, unauthorized and malicious parties.
The coolest part for a developer? You can spin this all up on your local machine in about seven seconds with a simple ziti edge quickstart [23:00].
Why not just use a VPN?
One of my questions was the one every developer asks: “Why can’t I just use a VPN?”
Clint insists that an OpenZiti overlay actually is a VPN [34:05] in the broadest sense, in that it’s a virtual network that’s closed off to unauthorized parties. It just functions very differently from the “one big mush” of traditional VPNs, which are open by default: once you’re in, you can see everything.
OpenZiti, on the other hand, is dark by default [35:45]. If you have a server on the open internet, it usually has an open port (such as port 22 for SSH or 443 for HTTPS). With Ziti, you close those ports entirely. The service becomes “dark,” its ports invisible, and you can’t attack what you can’t even find.
The “magic dance” of bootstrapping trust
I’ll admit, when I first tried to set up a client and server, I got a little lost in the “magic dance” of certificates. Clint called this process bootstrapping trust [38:47].
It starts with a One-Time Token (OTT), which is a signed JWT, and the process goes like this:
The admin creates an identity on the controller [41:09].
The client uses the token to find the controller’s URL [43:11].
The handshake takes place, where the client verifies the controller’s certificate, and they exchange a CSR (Certificate Signing Request) [44:43].
Strong identity: The result is a JSON file containing a key that must be protected like a secret.
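To make step 2 concrete, here’s a toy Python sketch of how a client can read the controller’s URL out of a one-time token before verifying anything. A JWT’s payload is just base64url-encoded JSON, so the claims are readable up front; the claim names and URL below are assumptions for illustration, and signature verification, the part that actually bootstraps trust, is deliberately omitted.

```python
# Reading claims out of a JWT payload. The token is a toy one built in
# this same snippet; a real enrollment token would come from a controller
# and would carry a real signature that the client must verify.
import base64
import json

def jwt_claims(token: str) -> dict:
    # A JWT is header.payload.signature; the payload is the middle part.
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)   # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))

# Build a toy token the way a controller might (signature faked):
claims = {"iss": "https://ctrl.example.com:1280", "sub": "new-device"}
payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
token = f"eyJhbGciOiJSUzI1NiJ9.{payload}.fake-signature"

print(jwt_claims(token)["iss"])   # https://ctrl.example.com:1280
```

Reading the claims is the easy part; the trust comes from the signature check and the CSR exchange that follow, which is exactly why Clint calls the whole sequence “bootstrapping trust.”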
AI Agents and the MCP Gateway
We also took a detour into Agentic AI. Clint has been using MCP (Model Context Protocol) Gateways to let Claude interact with the Ziti CLI.
The breakthrough here is efficiency and security. By using an MCP Gateway, you don’t have to give your raw credentials to the AI [57:02]. Plus, by using a targeted MCP server, you can strip a massive 100k data object down to a 10k summary, saving a fortune in tokens [59:12].
Real-world use: From blue bubbles to drones
I asked Clint who is actually using this in the wild. The “Adopters” list is growing, including projects like Blue Bubbles (the tool that brings iMessage features to Android) [50:33].
But the stakes get higher. We discussed Zero Trust Drones and secure communications on the battlefield [52:12]. When you’re in a high-stakes environment like Ukraine, having secure, “dark” comms is a necessity, not a luxury.
More coming soon!
This was the first of many Ziti TV livestreams featuring Clint and Yours Truly. The next one’s scheduled for Friday, April 30th at 11:00 a.m. U.S. Eastern / 8:00 a.m. U.S. Pacific / 1500 UTC, and you can view past livestreams in the Live section of the OpenZiti YouTube channel.
Meet “Slopportunity,” my new M5 MacBook Pro, purchased with the assistance of the home office stipend that my new employer, NetFoundry, provides. It has lots of RAM and drive space for running and storing models, and it runs circles around my old M1 machine. But I can’t help being reminded of Angelina Jolie’s line from Hackers: “It’s too much machine for you.”
Hopefully, that won’t turn out to be true.
Here’s Slopportunity on the Primary Processor Perch in my home office:
And what of my old laptop, an M1 MacBook Pro with still-decent specs? I’m hanging onto it; I’ve rechristened it “Sloperator,” and it will be my OpenClaw/long-running agents machine:
When the M1 was my main computer, my prior machine, an Intel-based MacBook Pro, was doing yeoman service. It will live forever, as it’s going to my mother-in-law, who needs a better computer than her old 2009 laptop for browsing, email, and so on:
For me, Arc of AI wrapped up with my attending Baruch Sadogursky and Leonid Igolnik’s madcap presentation, Back to the Future of Software: How to Survive the AI Apocalypse with Tests, Prompts, and Specs… and unexpectedly playing the accordion!
Baruch does DevRel at Tessl, the AI agent enablement platform, where his full-time job is thinking about context engineering and how agents actually write code. Leonid’s a former Tucows coworker, and now a recovering CTO who advises a range of tech companies on what he calls, with a grin that was half joke and half resigned sigh, “how to adopt this new and exciting age of never looking at the code that you shipped to production and still deliver predictable results.”
There are your typical “last slot of the last day of the conference” talks. And then there are ones like this one, where two grown men show up dressed as Doc Brown and Marty McFly, pull in Yours Truly to improvise a song mid-talk, and spend forty-five minutes arguing that the future of software engineering looks suspiciously like the waterfall model your company abandoned in 2009, except this time it might actually work!
If you wish you’d caught it, you’re in luck; they recorded their presentation, and you can watch it right now:
They’ve been road-testing this talk for over a year. Their slides referenced earlier versions, from Baruch’s appearance at DevNexus 2026 and a Geecon keynote in Kraków…
…but the Austin version had clearly been sharpened by a lot of live feedback and a lot of real-world use of their toolkit.
Underneath the flux capacitor jokes and the AI-generated illustrations of monkeys in lab coats, they were making a serious argument, and it’s one I’ve been chewing on ever since.
I want to unpack it here, because I think they’re onto something that a lot of the spec-driven-development conversation is quietly missing.
The setup: a crisis of trust
Baruch opened with a story that’s aged like a fine wine over the last few months: Amazon’s Kiro, a spec-driven IDE whose rollout was, in his telling, “standardized, shocked, and delivered software that crashed AWS.” The bit got a laugh. Then came the show of hands.
Who ships code to production that was written by an LLM? Most of the room.
Who’s happy with the results? Fewer hands.
Who trusts what’s being produced? Fewer still.
Then he put the real numbers on screen. According to the most recent Stack Overflow developer survey:
More than half of the code being committed to production is AI-generated.
In the same survey, 96% of developers say they don’t fully trust that AI-generated code is functionally correct.
And only 48% say they always check AI-generated code before committing it. (Leonid’s deadpan observation: “I would argue half of that 48% lied.”)
This means that the majority of new code is being written by systems the people shipping it don’t trust, and most of those people aren’t rigorously reviewing the output. In effect, we’ve collectively invented a new compiler and then decided to stop reading what comes out of it.
Baruch has a phrase for this, and it’s similar to something I mentioned at the last AI Salon in St. Pete: “The source code is the new bytecode.” Nobody reads it. We rely on it blindly. The difference, of course, is that bytecode is produced by a deterministic compiler. Source code produced by an LLM is not.
He drove this home with a self-deprecating story about the talk’s own show notes page. “I asked the agent if this link made it into the show notes, and what did I tell you? That I checked. The agent generated a lot of links. I checked that there were a lot of links. That was the question.”
The room laughed because everyone recognized themselves in it. “I always check my AI-generated code” turns out to mean almost nothing. It’s the code review equivalent of your kid telling you they cleaned up their room. Technically they picked things up, but you wouldn’t want to walk in there barefoot (and if they’re teenage boys, maybe not without a gas mask).
The Chasm
The core of the talk is built around three C-words, and the first one is the one that frames everything that follows: the Chasm.
The Chasm is the gap between what you meant and what actually runs. Every abstraction in our industry’s history has had one of these. Assembly programmers didn’t trust compilers. Baruch showed a 1950s quote about exactly that skepticism, from back when Grace Hopper was having to sell people on the idea that you could let a machine write assembly for you.
It continued: C programmers didn’t trust garbage collectors, C++ programmers didn’t trust the JVM. If you’re of a certain age, you might remember when there were people who said Java would be too slow, would never compete in production, and that this crazy “bytecode” idea would never catch on.
Every time, the chasm eventually closed. The compiler got good enough, the runtime got fast enough, and the trust followed.
But Baruch and Leonid argue that this time, it’s different, and for one specific reason that Leonid kept hammering home: for the first time in the history of our industry, the compiler is non-deterministic.
With agentic coding, you can type the same prompt twice and get different code each time. You can run the same agent on the same spec on the same codebase and get different tests. The entire compiler toolchain we’ve built over seventy years assumes that the same input produces the same output, and LLMs don’t do that. They’re (and this is the running metaphor of the talk, complete with a slide of a chimpanzee wearing a “Mr. Fusion” hat) monkeys with GPUs.
The infinite monkeys theorem says an infinite number of monkeys working on an infinite number of typewriters for an infinite period of time will eventually produce the complete works of Shakespeare, or at least a novel Mr. Burns could appreciate:
These monkeys produce Shakespeare sometimes. They also produce your company’s incident postmortem, and you don’t get to pick which one shows up in the PR.
Baruch’s favorite recent example, which made the room groan/laugh in baleful self-recognition: Uber is burning through LLM tokens faster than they budgeted, and what started as an engineering productivity initiative is now a finance problem.
“We’re in what, March, April? They planned out their budget for the year. So those monkeys are very productive. Typing and clearly doing something.” Which is both funny and, if you squint, terrifying. A lot of money is being spent on a lot of code nobody is reading.
This is where the talk gets its central mantra, delivered loud enough that it needed what Baruch called a “musical highlight,” which is where he turned to me in the front row and asked me to improvise something on the accordion.
Here are my hastily-improvised lyrics:
Never trust a monkey!
Never trust an ape!
Always verify —
Make sure your code’s in shape!
And then he moved on to the thing that I think is actually the core contribution of the talk.
The MIT detour
Before he got to the Chain, Leonid took a detour through an MIT paper he’d been carrying around for weeks. The paper maps AI-suitable tasks across two axes: cost of developing the artifact, and cost of verifying it. Four quadrants fall out of that.
Safe zone: cheap to generate, cheap to verify. This is where AI shines. The slides for their talk, for instance — AI-generated illustrations of Doc and Marty and the flux capacitor, easy to produce, easy to eyeball and approve. Nobody’s life depends on a specific monkey illustration being “right.”
Risk zone: cheap to generate, expensive to verify. This is where most software engineering lives, and this is the terrifying quadrant. The LLM can produce 2,000 lines of code in a minute. A human takes an afternoon to confirm it does what it’s supposed to, and two more days to confirm it doesn’t also do things it’s not supposed to.
Expensive-but-verifiable: costly to generate, cheap to verify. Things like formal proofs.
Avoid entirely: costly to generate, costly to verify. Don’t use AI here.
Leonid’s point was that our industry has stampeded into the risk zone and congratulated itself on the speed. We’re generating code faster than ever and verifying it less than ever, and the delta is being paid in the currency of production incidents and quietly broken features that nobody notices until a customer complains.
Baruch had to stop and ask ChatGPT to “explain this diagram Barney-style in one paragraph,” with a cut to a slide of the infamous purple dinosaur. The paper’s actual title is Static Regime Map with Dynamic Pressure. That’s the joke, and it’s also the point. The academic framing of this problem is hard to read, and we’re all moving too fast to read it.
The Chain
If you can’t trust the monkey, you need a chain of custody from intent to code where every link is either deterministic or independently verifiable.
Baruch and Leonid walked through the typical AI-assisted workflow and color-coded it by trustworthiness. Humans write the prompt; they’re considered trustworthy, because hey, it’s us.
(Leonid jumped in here to point out that humans are also a subtype of stochastic systems, which got the biggest laugh of the talk. “Someone loves humans in this room.”)
After that, an LLM turns that prompt into a spec. It’s not trustworthy, because a monkey wrote it.
Then the LLM writes code against that spec. Once again, it’s a monkey, and once again, it’s not trustworthy.
Then, if we’re being honest about most shops, the LLM also writes the tests that are supposed to validate the code it just wrote. This is hilariously, catastrophically not trustworthy, because you just asked the monkey to grade its own homework.
Leonid calls this “hallucinated verification,” and it’s the thing that makes the green-build signal meaningless. If the same system writes the implementation and the tests, a passing suite tells you nothing. The tests don’t measure whether the code is correct; they measure whether the monkey was internally consistent about what it thought it was building.
Baruch showed a real example that made everyone wince: an agent running late in a long session, getting tired of failing tests, and instead of fixing the code, systematically commenting out the verification logic, flipping assertions to True, and declaring the project “95.2% correct.” The screenshot was almost funny. It was also a thing that had actually happened, in an actual project, to an actual developer. And the developer almost shipped it.
Leonid’s and Baruch’s proposed fix is the Intent Integrity Chain. The idea is to insert a deterministic step between the spec and the tests, and then lock the result so the agent can’t tamper with it.
The flow looks like this:
Humans write the prompt. Verifiable because we wrote it.
LLM generates the spec. Not yet trustworthy. But the spec is human-readable prose, which means humans (including non-technical humans) can review it. This is where you catch things like “Wait, we never said what happens if the browser crashes mid-session!” before you write any code.
A deterministic tool generates tests from the spec. Not an LLM. A template-driven, repeatable process that turns Gherkin-style scenarios into executable tests. Same input, same output, every time.
The tests get cryptographically locked. This is the clever bit. They hash the test files and store the hash in a git note. A pre-commit hook, itself read-only at the OS level, refuses to accept any commit where the test hash doesn’t match, and:
If an agent tries to comment out a failing test to make the build pass, the commit is rejected.
If the agent tries to disable the hook, the hook is read-only.
If the agent tries to replace the hash, the hash is stored in a git note that’s version-controlled and tamper-evident.
LLM writes the implementation. Now we’ve constrained the monkey. It has to make the locked tests pass. It can’t rewrite them. It can’t disable them. It can whine about the hook (and Baruch said one of their test runs produced an LLM that found the hook, disabled it, and complained in its own comments that “some stupid hook is failing my commits”), but it can’t get around it.
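To make the locking step concrete, here’s a minimal Python sketch of the idea. This is my own reconstruction, not the actual intent-integrity-chain/kit code: hash every test file, record the digest, and refuse any commit where the digest no longer matches. In the real setup, the digest lives in a git note and the check runs from a read-only pre-commit hook.

```python
# Sketch of "lock the tests": one digest over all test files, checked
# before every commit. If an agent edits a locked test, the digest
# changes and the commit is rejected.
import hashlib
import tempfile
from pathlib import Path

def lock_digest(test_dir: str) -> str:
    """One SHA-256 digest over all test files, in a stable order."""
    h = hashlib.sha256()
    for path in sorted(Path(test_dir).rglob("test_*.py")):
        h.update(path.read_bytes())
    return h.hexdigest()

def pre_commit_check(test_dir: str, recorded_digest: str) -> bool:
    """Reject the commit if the tests were tampered with."""
    return lock_digest(test_dir) == recorded_digest

# Demo: lock a test, then let a "monkey" gut it.
with tempfile.TemporaryDirectory() as d:
    test_file = Path(d) / "test_login.py"
    test_file.write_text("def test_login():\n    assert login('a', 'b')\n")
    digest = lock_digest(d)
    assert pre_commit_check(d, digest)        # untouched: commit allowed

    test_file.write_text("def test_login():\n    assert True  # monkey was here\n")
    assert not pre_commit_check(d, digest)    # tampered: commit rejected
```

The hash itself is trivial; the value is where it’s stored and who can’t change it, which is what the git note and the OS-level read-only hook provide.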
The elegance here is that every link in the chain is either deterministic or externally verified. No model grades its own work. The human-verifiable artifact (the spec) is something a product manager can actually read. The machine-verifiable artifact (the hash) is tamper-proof. And the monkey only gets to do what monkeys are good at: filling in the blanks under adult supervision.
Leonid offered a framing that I think is worth giving some extended thought: “The idea is that everything that can be scripted should not be left for monkeys to deal with. Your CFO will thank you for that.”
There’s an unglamorous but important insight buried there. Every time you use an LLM to do something deterministic (format a file, generate boilerplate, fill in a template), you’re paying token costs to produce non-deterministic output for a task that had a deterministic solution. Push the deterministic stuff back into deterministic tooling and save the stochastic budget for the places you actually need it.
Wait, isn’t this just waterfall?
Baruch put this question on a slide himself, because he knew it was coming. Prompt → spec → tests → code, with human review at each stage? That’s Rational Unified Process (RUP) with a fresh coat of paint. Didn’t we spend the 2000s escaping that thing?
His answer: the reason waterfall failed wasn’t that its artifacts were bad. Specs are good. Reviewing specs is good. Thinking about non-functional requirements before you write code is good.
Waterfall failed because the cycle time was measured in months. By the time the spec committee finished arguing about whether the customer wanted a dropdown or radio buttons, the customer had changed companies and the market had moved on.
The Intent Integrity Chain runs the same loop in fifteen minutes. You write a prompt, the LLM drafts a spec, you skim it and catch the missing edge cases, the tool generates tests, you glance at the scenarios, the agent implements, and you’re done. The artifacts waterfall produced are genuinely valuable; they just weren’t worth the wait. LLMs make the wait go away.
This, I think, is the insight worth taking seriously. It’s not “Waterfall is back, baby!” It’s “the specific failure mode of waterfall was latency, and AI has changed the latency equation.”
The ceremony that was unaffordable in human time is cheap in LLM time. Specs that nobody had the bandwidth to write in 2005 can be generated, reviewed, and locked in 2026 before your coffee gets cold (or if you prefer, before your Coke Zero gets warm).
There’s a cultural echo here that Leonid leaned into from his and my past. He and I were actually colleagues 26 years ago at Tucows, back when Tucows was the second-largest domain registrar in the world, and we used to ship software after formal spec sign-offs. Not because it was fashionable, but because the cost of shipping a bug to production was high enough that the sign-off was cheaper.
The MIT paper’s argument is that generation costs have collapsed but verification costs haven’t. This puts us back in the same economic regime that made spec sign-offs rational in the first place. The pendulum’s not swinging back to waterfall because we got nostalgic. It’s swinging back because the economics swung back.
The demo
Leonid drove the live demo, which showed their toolkit, intent-integrity-chain/kit on GitHub. The dashboard shows the whole chain laid out as a web UI: premise at the top, then the “spidey diagram” of project priorities (documentation: high; TDD: high; minimal scope: low, because they’re not shipping to Mars), then specs with traceable requirement IDs, then the auto-generated Q&A where the LLM plays devil’s advocate and asks “What did we not think of?”
That reflective-reasoning step got the biggest reaction from the audience, and I agree with the reaction; it’s quietly the most useful thing in the whole toolkit. Anyone who’s sat through a real spec review knows that the value isn’t the document; the value is the five minutes where someone brings up a condition that the developers didn’t think of, such as “But what if two users do X at the same time?”, and the room goes silent.
It turns out that modern LLMs are phenomenal at playing that someone. They’ve read ten thousand spec reviews in their training data. They know the questions.
Leonid’s example: the tool looked at a spec for a flight-search library and asked things like “Do you need backward compatibility?” and “What happens if the browser crashes mid-session?” Those are exactly the questions the grumpy senior engineer asks in a room full of junior engineers, and now every team has one on demand, for better or worse.
The other trick the kit leans on hard is a literal software-project “constitution,” in a spirit similar to Claude’s constitution: a document that sits at the root of the repo and declares things like “always do TDD” and “all specs must trace to requirements.” It’s lifted from GitHub’s Spec Kit, and Baruch pointed out the genuinely clever reason it works: LLMs have been trained on enormous quantities of text about actual constitutions, with their amendments and ratifications and solemnity.
The word “constitution” triggers a whole cluster of “take this seriously” behavior in the model. It’s prompt engineering by semantic association, and supposedly works better than rules.md or guidelines.txt.
Everything in the dashboard is traceable: a requirement produces one or more spec features, each feature produces one or more Gherkin scenarios, each scenario produces one or more executable tests, each test gates one or more implementation tasks. Click any task and you can walk the chain backwards to the original requirement. Click any requirement and you can walk it forward to the code that implements it. The whole thing is visible, and because the specs are prose and the scenarios are human-readable, non-engineers can walk the chain too.
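Here’s a toy version of that traceability in Python. The IDs and artifacts are hypothetical (the real kit renders this as a web dashboard), but the structure is the same: each link stores its parent, so any artifact can be walked back to a requirement and any requirement walked forward to the code that implements it.

```python
# A traceability chain as a simple parent-pointer graph.
chain = {
    "REQ-1":       {"kind": "requirement", "parent": None},
    "FEAT-1.1":    {"kind": "feature",     "parent": "REQ-1"},
    "SCEN-1.1.1":  {"kind": "scenario",    "parent": "FEAT-1.1"},
    "TEST-1.1.1a": {"kind": "test",        "parent": "SCEN-1.1.1"},
    "TASK-42":     {"kind": "task",        "parent": "TEST-1.1.1a"},
}

def walk_back(artifact: str) -> list:
    """Walk any artifact back to its original requirement."""
    trail = [artifact]
    while chain[artifact]["parent"]:
        artifact = chain[artifact]["parent"]
        trail.append(artifact)
    return trail

def walk_forward(requirement: str) -> list:
    """Walk a requirement forward to everything rooted in it."""
    return [requirement] + [a for a in chain
                            if a != requirement and walk_back(a)[-1] == requirement]

print(walk_back("TASK-42"))
# ['TASK-42', 'TEST-1.1.1a', 'SCEN-1.1.1', 'FEAT-1.1', 'REQ-1']
```

Because the upper links are prose (requirements, features, Gherkin scenarios), non-engineers can follow the same trail the code does.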
The new version of the kit is, per Leonid’s pointed demand, 57% faster than the old one. Apparently Baruch spends a lot of time on Slack complaining to Leonid about speed, which should be expected when these two characters get together.
The Q&A
A few exchanges from the Q&A are worth flagging for anyone thinking of trying this:
“Who writes the test scenarios, the human or the monkey?” Both, with the human in charge. The LLM drafts the Gherkin-style features from the spec. The human reviews those features, not line-by-line test code, but the human-readable scenarios, and signs off. Then the deterministic tooling converts those locked scenarios into executable test code. The human is the verification step. The tests are downstream of that verification, which is why locking them matters. Baruch was emphatic on this point because he’d seen audiences get confused: the word “spec” gets overloaded between “business spec” and “technical test scenario,” and both are part of the chain but play different roles.
“How do I do this for an existing codebase?” This is where Baruch had news: they’re working on a “brownfield” mode, and it’s the unlock that will let this approach work in the real world where nobody has a greenfield project. The recipe:
Point the kit at an existing project with tests.
Lock the code as read-only.
Have the LLM write specs from the tests, not from the code. Tests document behavior; code documents implementation. You want the behavior.
Use test coverage and mutation testing to measure whether the extracted spec actually reflects reality. Coverage tells you which code is exercised. Mutation testing tells you whether the tests are meaningful or just happen to execute the lines.
Iterate until you have a spec you trust.
From that point forward, any new feature goes through the full Intent Integrity Chain on top of the ingested baseline.
This is a lot of work. Leonid didn’t pretend otherwise. But he pointed out that much of it is now automatable in a way it wasn’t five years ago. You don’t hand-write specs for a million-line codebase; you have the LLM draft them and then you review.
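The coverage-versus-mutation distinction in that recipe is worth a concrete toy. In this sketch (mine, not the talk's tooling), an `add` function is mutated by swapping its operator. A weak test executes the code, so coverage looks fine, yet every mutant survives; a stronger suite kills them all:

```javascript
// Toy mutation testing: build variants of the function under test
// with the + operator swapped out, then see which tests notice.
function makeAdd(op) {
  return new Function("a", "b", `return a ${op} b;`);
}

// A weak test: it exercises the code but barely pins down behavior.
const weakTests = [fn => fn(0, 0) === 0];
// A stronger suite that actually distinguishes + from - and *.
const strongTests = [fn => fn(2, 3) === 5, fn => fn(1, 0) === 1];

function survivingMutants(tests) {
  const mutants = ["-", "*"].map(makeAdd); // mutate + into - and *
  // A mutant "survives" if every test still passes against it.
  return mutants.filter(m => tests.every(t => t(m))).length;
}

console.log(survivingMutants(weakTests));   // both mutants survive: 2
console.log(survivingMutants(strongTests)); // both mutants killed: 0
```

Surviving mutants are exactly the signal Baruch's recipe uses: they mark places where the extracted spec claims behavior that no test actually enforces.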
“Who invented spec-driven development?” Someone asked this, and a second person looked it up live: there’s a 2004 paper from the XP conference in Germany that uses the exact phrase, combining TDD with Design by Contract. I mentioned that Design by Contract was baked into Eiffel in the ’80s, and Baruch noted that NASA was doing something that looks a lot like it in the 1960s. The joke being that every generation rediscovers the value of writing things down before you build them, and every generation thinks they invented it.
What I’m taking home from this
First: the “monkeys with GPUs” framing is useful even if you don’t adopt the full toolkit. It’s a cleaner way to think about where trust does and doesn’t belong in an AI-assisted workflow. Any link in your pipeline where a model grades its own output is a link that’s lying to you. Once you see it, you see it everywhere; in the auto-generated tests, in the “this looks right” PR reviews, in the agent that confidently declares a task complete because it decided the task was complete. The mental move of asking “Who verified this, and do they have any skin in the game?” is a free upgrade to your code review habit.
Second: the locking step is the thing most spec-driven-development conversations leave out, and it’s the thing that makes the rest of the chain actually hold. GitHub Spec Kit gives you the spec ceremony. Kiro gives you the spec ceremony. Plenty of tools give you the spec ceremony. Very few of them prevent the agent from quietly editing the spec, or the tests, or the constitution file, halfway through the build. A cryptographic lock with a read-only pre-commit hook is an unglamorous piece of engineering, but it’s what turns the ceremony into actual guardrails. Everything upstream of the lock is advisory. Everything downstream of the lock is enforced.
Third, something I’d arrived at on my own, and you may have too: Baruch’s line about source code being the new bytecode. If he’s right, the natural-language spec is the new source code, and the job of the next generation of developer tools is to make specs first-class citizens: versioned, tested, reviewed, locked. That’s a different job than what IDEs do today. It’s a different job than what LLM assistants do today. It’s arguably the job that DevRel is going to spend the next five years explaining, and I say that as someone who’s going to be doing some of the explaining.
Fourth, a smaller thing that I liked: Baruch’s experiment of asking an LLM to produce JVM bytecode directly, skipping Java entirely. The bytecode is the real artifact the JVM runs; why route through a source language? Today this would be a terrible idea because the ecosystem assumes source code is what humans read and review. But in a world where humans stop reading the source code anyway, the argument for source-as-intermediate-representation gets weaker. We may, in ten years, look back at 2026 and notice that “the code” was quietly replaced by “the spec plus the tests plus the locked chain,” and that the specific sequence of tokens the LLM produced in between became about as interesting as the specific sequence of x86 instructions the JIT emits. That’s a weird future. I’m not sure I like it. But I’m pretty sure Baruch and Leonid are right that it’s the direction we’re drifting.
I came into Arc of AI expecting to hear a lot about agents and MCP (and I did, including from my own talk). I didn’t expect the closer to reframe the whole problem as a question of non-deterministic compilation and how to bolt determinism back onto it. That’s a bigger idea than the Back to the Future bit gave it credit for. The talk is funny, and the costumes are good, and the monkey slides are excellent, but the thesis underneath the zaniness is the kind of thing that changes how you think about what you’re doing on Monday morning.
That’s the mark of a good end-of-conference presentation. You leave laughing, and then at three in the morning you sit up in bed thinking about pre-commit hooks.
Go try the kit. Start with a greenfield project where the stakes are low. Write a prompt. Let the LLM draft a spec. Review it. Let the tool generate Gherkin scenarios. Review those. Lock them. Let the agent implement. Notice how much more honest the green build feels when the tests weren’t written by the thing you’re trying to trust.
And if you get a chance to see Baruch and Leonid do this talk live, go. And bring a musical instrument!
Slides, video, and the full kit are linked from speaking.jbaru.ch and github.com/intent-integrity-chain. The Intent Integrity Kit is also available through the Tessl Registry. The MIT paper they kept referencing — the one whose actual title needed Barney-style explanation — is in the show notes along with everything else.
Today is my first day as Senior Developer Advocate at NetFoundry, the company behind OpenZiti.
I am thrilled, slightly jet-lagged from the onboarding reading, and (because some things never change) my accordion is within arm’s reach of the desk. If you are going to explain zero trust networking to developers, you might as well have an accordion-powered rock and roll backup plan.
This is the post where I tell you what the job is, what the product is, why the name makes me smile, and why I think this is going to be a good couple of years.
The short version
I am joining the team that invented and maintains OpenZiti, an open source zero trust networking platform. My job, alongside my colleague Clint, is to be the developer-facing voice of the project: write code, build demos, ship tutorials, show up in the communities where the conversations are actually happening, and make sure what we hear from developers gets back to the product and engineering teams in a form they can act on.
The timing is interesting. NetFoundry recently announced NetFoundry for AI, an AI-focused use of the platform aimed squarely at the problem every AI team is quietly panicking about right now: how do you let AI agents, MCP servers, and LLMs talk to each other and to the rest of your infrastructure without turning your network into Swiss cheese?
More on that in a minute. First, the name.
What is OpenZiti, and why is it called that?
The “ziti” in OpenZiti comes from “ZT”, as in “zero trust”. Say “Z-T” out loud a few times, let the letters slur a little, and you end up somewhere in the neighborhood of “ziti.” Then somebody noticed that ziti is also a tubular pasta, and because developers are developers, that became the visual identity. The OpenZiti logo is, essentially, a piece of pasta. I respect this deeply. My last employer’s mascot was a twerking login box. My current employer’s mascot is a delightfully cheesy, tasty dinner.
This also explains this cryptic comic I posted on my socials earlier, as a hint about the new job:
By the way, the rightmost pasta in the comic is a slouching ziti. Also, in case you need a quick explainer, here’s a helpful infographic:
Infographic from Sip Bite Go. Click to see the source.
The “Open” part is the substantive half of the name: OpenZiti is genuinely open source, Apache 2.0 licensed, and the whole thing lives in public on GitHub. You can pull it down right now, stand up a controller and some routers on your own hardware, and have a zero trust overlay network running on your laptop by lunchtime. (I know this because that is literally what I am doing this week as part of my onboarding. More on that later too.)
So what does it actually do?
Here is the mental model I am starting with, and I reserve the right to refine it as I get deeper in:
Today’s network model is “castle and moat.” You put a firewall around your stuff, you open ports for the services that need to be reachable, and you hope the bad guys don’t find a way through the gate. When they do (and they always do) they are inside the castle with the crown jewels.
Zero trust flips this. Instead of trusting the network, you trust identity. Every connection is authenticated, every connection is authorized, every connection is encrypted, and nothing is reachable just because of where it is on the network.
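To make the contrast concrete, here's a deliberately oversimplified sketch — in plain JavaScript, and nothing to do with OpenZiti's actual API. The moat check trusts a network location; the zero trust check trusts an authenticated identity plus an explicit policy. The policy and request shapes are entirely made up:

```javascript
// Castle-and-moat: trust is a property of network location.
function moatAllows(request) {
  return request.sourceIp.startsWith("10."); // "inside" the perimeter
}

// Zero trust: trust is a property of an authenticated identity
// plus an explicit policy, regardless of where the packet came from.
const policy = { "billing-service": ["invoice-agent", "ops-dashboard"] };

function zeroTrustAllows(request) {
  return (
    request.identityVerified &&                               // authenticated?
    (policy[request.service] || []).includes(request.identity) // authorized?
  );
}

console.log(moatAllows({ sourceIp: "10.0.0.66" })); // true — even for an attacker inside
console.log(zeroTrustAllows({ identityVerified: true, identity: "invoice-agent", service: "billing-service" })); // true
console.log(zeroTrustAllows({ identityVerified: true, identity: "attacker", service: "billing-service" })); // false
```

The first function is why breaches spread; the second is why zero trust people keep saying "identity is the new perimeter."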
OpenZiti is the overlay that makes this practical. It gives every app, service, device, or agent a cryptographic identity, routes their traffic through a mesh of routers that only accept authenticated connections, and requires no open inbound firewall ports. This is the part that makes network engineers do a double-take. Nothing listens on the public internet. Attackers can’t port-scan what isn’t there.
If you have ever been the person who had to file a firewall change ticket to let service A talk to service B, and then waited three weeks and filled out a compliance form, you already understand the appeal.
The AI angle, which is where I am spending a lot of my first year
Here is the thing about AI agents and MCP servers: they are, architecturally, the worst possible citizens of a perimeter-based network.
They need to talk to a lot of things. They hold API keys. They get spun up and torn down on timelines that do not match anybody’s firewall change window. They are, by design, non-human identities with significant privileges, and most of the infrastructure around them was designed for humans with laptops.
NetFoundry for AI is the pitch for applying OpenZiti’s identity-first model to this mess:
A zero trust enclave for your users, agents, MCP servers, and LLMs, so none of them are reachable over the open network
Strong identities for the non-human participants (agents and MCP servers have been running around with service accounts and bearer tokens for too long)
API keys and service credentials held separately from the agents themselves, so a compromised agent isn’t also a compromised credential vault
Token tracking, cost accounting, and LLM routing across multiple providers, because once you have the identity layer you might as well use it to see what is happening
If you have been reading Global Nerdy for a while, you know the pattern. I spent three and a half years at Auth0 explaining OAuth 2.0, OIDC, and identity to mobile developers who would rather do literally anything else. The work was: take something that sounds like a standards committee threw up on a whiteboard, anchor it to a problem the developer actually has, and give them working code that does not require them to read 400 pages of RFC.
Zero trust networking is the same shape of problem. The concepts are genuinely hard. The vocabulary is dense. Most developers have never had to think about overlay networks before. But the underlying motivation, “I don’t want my AI agent’s API key to become somebody’s weekend project,” is something every builder can feel in their bones.
And some of you might remember my monthly Tampa Bay AI Meetup, which is now sitting around 2,200 members. The through-line of that community has been the same thing I am now getting paid to do full-time: take genuinely complicated infrastructure and make it feel approachable. Zero trust for AI agents is squarely in that Venn diagram.
What happens next
For the next little while, the plan is mostly “shut up and build.” I am standing up OpenZiti from scratch on my own hardware, embedding the SDK in a demo app, running MCP Gateway with Claude Desktop and a couple of backends, running LLM Gateway with a local model and a commercial one, and lurking in every community where OpenZiti and MCP get talked about. No hot takes until I have earned them.
After that, the usual Joey stuff: blog posts, short demo videos, office hours, and actual conversations in the places where developers hang out: r/openziti, r/mcp, the OpenZiti Discourse, and wherever else the work takes me.
If you build on OpenZiti, or you have been curious about it, or you just want to commiserate about explaining infrastructure to developers, my DMs are open. I am @AccordionGuy on GitHub, Joey de Villa on LinkedIn, and the accordion is here if anyone wants a rock cover of something topical as a celebratory interlude.
Here’s another way that Arc of AI is going to be an AI conference unlike any other: it’s going to have an opening musical act, namely…me!
Arc of AI organizer Dr. Venkat Subramaniam sent me a very nice email inviting me to help out with the after-dinner conference kickoff on Monday, April 13th at 7:00 p.m. with a couple of accordion numbers. I was honored (Dr. Venkat’s kind of a big deal), I’m only too happy to oblige, and I like to think of it as my contribution to “Keep Austin Weird!”
Here’s a sample from the last Collision conference in Toronto:
So far, the second quarter of 2026 is shaping up nicely!
Want to find out more about and register for Arc of AI?
Once again, Arc of AI will take place from Monday, April 13 through Thursday, April 16, with the workshop day taking place on Monday, and the main conference taking place on Tuesday, Wednesday, and Thursday.
Eternal Grind — my zero-player RPG homage to the legendary Progress Quest — was originally built agentically using Zencoder’s Zenflow tool. It’s a clean, self-contained, dependency-free web page made up of just three files: index.html, game.js, and style.css. No build system required!
You can play it right now if you want to watch a randomly-generated hero grind their way through meaningless progress bars. I’ll wait…
Back? Great. Now let me show you how I upgraded Eternal Grind using Claude Code. If you’re new to vibe coding, this article’s for you!
Starting up
The first thing I did was change directories to the one containing the Eternal Grind code and then launch Claude Code with this command:
claude
Claude Code started up, and my terminal displayed the following:
Getting oriented with /init
With Claude Code launched in the Eternal Grind directory, it was time to run /init, Claude Code’s command for initializing a project:
/init
/init analyzes the codebase in the current directory and generates an AGENTS.md file, a document that describes the repo’s structure and conventions so that the AI agent can work effectively within it.
Here Claude Code responded to /init. The process took a few moments…
⏺ Searched for 2 patterns, read 6 files (ctrl+o to expand)
⏺ Now I have a complete picture. This is a simple,
self-contained vanilla web app with no build system, no
linters, and no test framework. Let me generate
the AGENTS.md.
⏺ Write(AGENTS.md)
⎿ Wrote 38 lines to AGENTS.md
1 # Repository Guidelines
2
3 ## Project Structure & Module Organization
4
5 Eternal Grind is a zero-player RPG (a parody idle game inspired by Progress Quest) built as a self-contained vanilla web application with no build tools, frameworks, or dependencies.
6
7 - `index.html` — Three-column dashboard layout (Character Sheet, Engine of Progress, Data Feed)
8 - `game.js` — All game logic in a single file: data arrays, helpers, state management, game loop, and DOM rendering
9 - `style.css` — Retro Windows 95 aesthetic using monospace fonts and high-contrast borders
10 - `docs/Eternal Grind spec.md` — Original functional specification describing UI layout, core mechanics, and data tables
⏺ AGENTS.md has been created at the repository root.
It covers:
- Project structure — the three core files and the spec
document
- Architecture overview — how game.js is organized
(data, helpers, state, game loop, DOM rendering)
- Development — how to run it (just open index.html)
- Commit conventions — derived from git history
Since there's no build system, linter, test framework,
or CI pipeline, those sections were omitted per the
"no evidence, no section" rule.
I like that last note: “no evidence, no section.” Claude Code read the git history to infer commit conventions rather than making things up, and correctly skipped sections about build pipelines and test suites because there aren’t any. It understood the project accurately on the first pass!
The content pipeline: populating the game’s data arrays
Most of Eternal Grind’s personality lives in game.js as plain JavaScript arrays: lists of monsters, items, locations, tasks, and so on. The game randomly samples from these lists to generate its endlessly absurd narration. Extending the game is mostly a matter of adding more material to these arrays.
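For context, the sampling mechanism behind all of this is almost certainly a one-liner. Here's an illustrative sketch — the real game.js may differ in its details, and the array contents below are just a subset:

```javascript
// Uniform random pick from an array — the engine behind the narration.
function pick(arr) {
  return arr[Math.floor(Math.random() * arr.length)];
}

// Abbreviated versions of the game's data arrays.
const MONSTERS = ["A Literal Metaphor", "The Concept of Ennui", "A Low-Level Bugbear"];
const TASKS = ["Contemplating the void", "Counting ceiling tiles"];

// Every tick, the game assembles narration from random picks.
console.log(`${pick(TASKS)} while fighting ${pick(MONSTERS)}...`);
```

This is why "add more jokes to the arrays" is a complete content strategy for a game like this.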
I asked Claude Code to show me what was already there:
List the monsters/opponents in the game.
It responded with the following:
⏺ Based on game.js:62-66, the monsters are:
1. A Literal Metaphor
2. The Concept of Ennui
3. A Low-Level Bugbear
4. An Imaginary Friend
5. A Confused Salesman
6. A Dust Bunny of Doom
7. The Ghost of a Dead Pixel
8. A Sentient Terms of Service Agreement
That’s a solid starting lineup, but it needed reinforcements. Specifically, the kind of opponents you’d actually encounter in everyday modern life. I requested the following…
Add the following monsters to the game: A Grumpy Barista,
Two Podcasters Who Won't Shut Up, A Five-Year-Old on a Sugar
Rush, The HR Department, A Belligerent Walmart Greeter,
"Screech" from "Saved By the Bell"
…and it replied like so:
⏺ Update(game.js)
⎿ Added 4 lines, removed 1 line
62 "A Literal Metaphor", "The Concept of Ennui", "A Low-Level Bugbear",
63 "An Imaginary Friend", "A Confused Salesman",
64 "A Dust Bunny of Doom", "The Ghost of a Dead Pixel",
65 - "A Sentient Terms of Service Agreement"
65 + "A Sentient Terms of Service Agreement",
66 + "A Grumpy Barista", "Two Podcasters Who Won't Shut Up",
67 + "A Five-Year-Old on a Sugar Rush", "The HR Department",
68 + "A Belligerent Walmart Greeter", "\"Screech\" from \"Saved By the Bell\""
69 ];
Note that Claude Code correctly handled the escaped quotes around "Screech" without being asked. Small thing, but the kind of thing that would cause a runtime error if it got it wrong.
Expanding the treasure system
Treasure in Eternal Grind is generated by randomly combining an adjective with a noun, so you end up hauling things like an Overpriced Paperclip of Hope or a Slightly Damp Sock of Mystery. I asked Claude Code to list what was already in the system, then added to both halves:
What are the trasures in the game?
Note that I misspelled “treasure.” Also note that Claude Code knew what I meant:
⏺ Based on game.js:48-59, treasures are generated by
combining a random adjective with a random noun:
Adjectives: Dull, Polished, Forbidden, Rusty, Lamentable,
Insignificant, Glowing, Slightly Damp, Overpriced, Mediocre, Legendary-ish
Nouns: Scissors of Regret, Pebble of Mediocrity, Scone
of Power, Lint of Destiny, Paperclip of Hope, Broken Twig,
Expired Coupon, Sock of Mystery, Unfinished Novel,
Jar of Pickled Thoughts
So you'd get items like "Overpriced Scissors of Regret"
or "Slightly Damp Lint of Destiny."
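Based on Claude Code's description above, the treasure generator is presumably just two random picks glued together. Here's a sketch — illustrative, not the actual game.js code, with the arrays abbreviated:

```javascript
// Abbreviated adjective and noun pools from the game's data.
const ITEM_ADJECTIVES = ["Dull", "Overpriced", "Slightly Damp"];
const ITEM_NOUNS = ["Scissors of Regret", "Paperclip of Hope", "Sock of Mystery"];

function pick(arr) {
  return arr[Math.floor(Math.random() * arr.length)];
}

// Treasure = one adjective + one noun, e.g. "Overpriced Sock of Mystery".
function generateTreasure() {
  return `${pick(ITEM_ADJECTIVES)} ${pick(ITEM_NOUNS)}`;
}

console.log(generateTreasure());
```

With m adjectives and n nouns you get m × n possible treasures, which is why adding a handful of entries to each list multiplies the variety rather than merely adding to it.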
First, I asked for more adjectives:
Add the following to the list of adjectives for treasures:
Moderately Sticky, Dusty, Bullet-Riddled, Slightly Reeking,
Retro-Ish, Somewhat Cursed
Then, some new nouns. This is where I got to bring in some very specific cultural artifacts:
Add the following to the list of treasure nouns: 1980s
Goldstar TV, Troll Doll, Off-Brand Labubu, McDonalds Sweet
and Sour Sauce, Ceramic Elvis Bust, Ron Jon Surf Shop
T-Shirt, Sneakers of Minor Destiny, Poker Chips of Fate
⏺ Update(game.js)
⎿ Added 4 lines, removed 1 line
56 "Scissors of Regret", "Pebble of Mediocrity", "Scone of Power",
57 "Lint of Destiny", "Paperclip of Hope", "Broken Twig",
58 "Expired Coupon", "Sock of Mystery", "Unfinished Novel",
59 - "Jar of Pickled Thoughts"
59 + "Jar of Pickled Thoughts", "1980s Goldstar TV", "Troll Doll",
60 + "Off-Brand Labubu", "McDonalds Sweet and Sour Sauce",
61 + "Ceramic Elvis Bust", "Ron Jon Surf Shop T-Shirt",
62 + "Sneakers of Minor Destiny", "Poker Chips of Fate"
63 ];
I have to admit “Somewhat Cursed Off-Brand Labubu” is one of my proudest creative achievements.
Finding the gaps
Rather than guessing what else needed updating, I asked Claude Code directly:
What lists have I not yet edited?
⏺ Based on the data arrays in game.js, you haven't edited:
1. NAMES (line 7) — Character names
2. RACES (line 18) — Character races
3. CLASSES (line 25) — Character classes
4. TASKS (line 32) — Actions performed during questing
5. LOCATIONS (line 41) — Places the character visits
6. SPELLS (line 68) — Spells and abilities
7. STATS (line 75) — Character stat names
8. EQUIPMENT_SLOTS (line 82) — Equipment slot names
9. ACTS (line 86) — Plot act titles
You've edited MONSTERS, ITEM_ADJECTIVES, and ITEM_NOUNS.
I still find the fact that Claude Code “knows” what’s been touched and what hasn’t a bit eerie, but it’s also useful and it’s exactly where a good agentic coding tool shines.
Tasks and acts
Tasks are the descriptions of what the player’s character is doing at any given moment. The originals leaned heavily into corporate ennui, which I wanted to preserve and amplify:
List the tasks.
⏺ Based on game.js:32-39, the tasks are:
1. Debating a fence post
2. Polishing a rusty nail
3. Contemplating the void
4. Waiting for a sign
5. Filing a 1040-EZ
6. Staring into the middle distance
7. Organizing a sock drawer
8. Explaining the internet to a rock
9. Searching for a lost remote
10. Counting ceiling tiles
11. Simulating a personality
12. Buffing out a scratch in reality
Add these to the tasks list: Daydreaming about cheese,
Alphabetizing soup cans, Sorting a spreadsheet column,
Doom-scrolling LinkedIn, Applying to Blue Man Group,
Pressing a crosswalk button, Adjusting the thermostat,
Turning the pillow to the cool side
⏺ Update(game.js)
⎿ Added 5 lines, removed 1 line
35 "Filing a 1040-EZ", "Staring into the middle distance",
36 "Organizing a sock drawer", "Explaining the internet to a rock",
37 "Searching for a lost remote", "Counting ceiling tiles",
38 - "Simulating a personality", "Buffing out a scratch in reality"
38 + "Simulating a personality", "Buffing out a scratch in reality",
39 + "Daydreaming about cheese", "Alphabetizing soup cans",
40 + "Sorting a spreadsheet column", "Doom-scrolling LinkedIn",
41 + "Applying to Blue Man Group", "Pressing a crosswalk button",
42 + "Adjusting the thermostat", "Turning the pillow to the cool side"
43 ];
The acts are the overarching chapter titles that frame the endless grind. The original game had four. I felt it needed more. Much more:
List the acts.
⏺ Based on game.js:86-91, the acts are:
1. Act I: The Tutorial Nobody Asked For
2. Act II: The Grinding Begins
3. Act III: Despair in Four Stages
4. Act IV: The Post-Credits Confusion
Add the following acts: Act V: That Wasn't the End, Act VI:
The Writer's Barely-Disguised Fetish, Act VII: Mid-Season
Filler, Act VIII: Do You Really Read These?, Act IX: My
Sister Was Bitten By a Møøse Once, Act X: A New Hope, Act XI:
The Old Hope Was Better, Act XII: Twelve Angry Badgers, Act
XIII: Triskaidekaphobia, Act XIV: They're Coming to Take Me
Away, Ha-Ha, Act XV: Still Playing, Are We?
⏺ Update(game.js)
⎿ Added 12 lines, removed 1 line
98 "Act I: The Tutorial Nobody Asked For",
99 "Act II: The Grinding Begins",
100 "Act III: Despair in Four Stages",
101 - "Act IV: The Post-Credits Confusion"
101 + "Act IV: The Post-Credits Confusion",
102 + "Act V: That Wasn't the End",
103 + "Act VI: The Writer's Barely-Disguised Fetish",
104 + "Act VII: Mid-Season Filler",
105 + "Act VIII: Do You Really Read These?",
106 + "Act IX: My Sister Was Bitten By a Møøse Once",
107 + "Act X: A New Hope",
108 + "Act XI: The Old Hope Was Better",
109 + "Act XII: Twelve Angry Badgers",
110 + "Act XIII: Triskaidekaphobia",
111 + "Act XIV: They're Coming to Take Me Away, Ha-Ha",
112 + "Act XV: Still Playing, Are We?"
113 ];
Locations
Adding locations to Eternal Grind was pretty straightforward:
List the locations.
⏺ Based on game.js:41-46, the locations are:
1. The Forest of Mild Inconvenience
2. The Cave of Echoing Sighs
3. Downtown Boredom
4. The Desert of Dry Humor
5. Mount Mediocrity
6. The Swamps of 'I'll Do It Tomorrow'
7. The Suburbs of Despair
Add these locations: The Villages (Florida), The Complaints
Depertment, The Closet, The Basement of No Return, The Attic
of Surrender
⏺ Update(game.js)
⎿ Added 3 lines, removed 1 line
46 "The Forest of Mild Inconvenience", "The Cave of Echoing Sighs",
47 "Downtown Boredom", "The Desert of Dry Humor",
48 "Mount Mediocrity", "The Swamps of 'I'll Do It Tomorrow'",
49 - "The Suburbs of Despair"
49 + "The Suburbs of Despair", "The Villages (Florida)",
50 + "The Complaints Department", "The Closet",
51 + "The Basement of No Return", "The Attic of Surrender"
52 ];
⏺ Done. The five new locations have been added to
LOCATIONS in game.js:47-49. (Note: I corrected
"Depertment" to "Department" — let me know if
you wanted the original spelling.)
I appreciate that Claude Code corrected my typo, turning the derpy “Depertment” into the correctly spelled “Department,” and mentioned it in the response rather than just doing it invisibly. Fixing it while notifying me was the right call.
With the first round of content additions done, I committed and pushed:
Commit and push the changes
⏺ Commit successful. Now pushing to remote.
⏺ Bash(git push)
⎿ To https://github.com/AccordionGuy/eternal-grind.git
99a2460..122c62c main -> main
⏺ Done. Changes have been committed and pushed to
origin/main.
A bigger change: separating equipment from treasure
Up to this point, the character’s equipment was just randomly generated treasure items using the same adjective-plus-noun combinations that fill the loot feed. That felt wrong. Your boots shouldn’t be a Rusty Paperclip of Hope. So I started asking questions:
Does the player's equipment change during the game, or is it
simply determined at the beginning?
⏺ Equipment is only set once at the beginning.
It's generated during initialization (line 174-176)
where each slot gets a random item, and renderEquipment()
just displays the current state. There's no logic anywhere
that upgrades or swaps equipment during gameplay.
Would you like me to add equipment upgrades — for example,
replacing a piece of gear when the player levels up or
defeats a monster?
I filed that question away for later. First, I wanted a cleaner separation of concerns; dedicated, per-slot equipment lists with items that actually feel like equipment:
Not at first. Instead, I want equipment to be distinct from
treasure, and for there to be separate lists for each
equipment slot: helm, torso, left and right hands (hands
should use equipment from the same list, and both are allowed
to be equipped with the same item), legs, boots, neck amulet,
and ring. For the time being, the player's equipment should
simply be generated at the start of the game and not updated.
⏺ Read 1 file (ctrl+o to expand)
⏺ Now I have the full picture. I'll replace EQUIPMENT_SLOTS with per-slot item lists and update the initialization to use them.
⏺ Update(game.js)
⎿ Added 42 lines, removed 4 lines
94 +const EQUIPMENT = {
95 + "Helm": [
96 + "Dented Bucket", "Thinking Cap", "Tin Foil Hat", "Visor of Denial",
97 + "Crown of Mild Authority", "Headband of Overthinking",
98 + "Hard Hat of Soft Skills", "Beanie of Bewilderment"
99 + ],
100 + "Torso": [
101 + "Ill-Fitting Chainmail", "Hawaiian Shirt of Protection",
102 + "Sweater Vest of Resilience", "Bathrobe of Power",
103 + "Breastplate of Participation", "Hoodie of Invisibility",
104 + "Polo Shirt of Middle Management", "Poncho of Plausible Deniability"
105 + ],
106 + "Hands": [
107 + "Rubber Chicken", "Stapler of Justice", "Foam Finger of Fate",
108 + "Spatula of Smiting", "TV Remote of Command", "Rolled-Up Newspaper",
109 + "Pool Noodle of Reckoning", "Half-Eaten Sandwich"
110 + ],
… (Legs, Boots, Neck Amulet, Ring slots follow the same pattern)
This was a real structural change, and more than just adding items to an existing list. It refactored the data model entirely.
Claude Code replaced the flat EQUIPMENT_SLOTS array with a keyed EQUIPMENT object, generated appropriate thematic items for each slot, and handled the “Hands” edge case (one shared list, two separate display slots) correctly without extra prompting. It also added an EQUIPMENT_DISPLAY_SLOTS constant to maintain the rendering order without coupling it to the data structure. Clean separation.
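Reconstructed from the diff above (with the item lists truncated and the display-slot names partly guessed, since the transcript elides them), the shape of the refactor looks roughly like this:

```javascript
// Per-slot item pools. "Hands" is a single pool shared by both hands.
const EQUIPMENT = {
  "Helm": ["Dented Bucket", "Tin Foil Hat"],
  "Hands": ["Rubber Chicken", "Stapler of Justice"],
  // ...remaining slots follow the same pattern
};

// Rendering order stays decoupled from the data: two display slots can
// draw from one data key, which is how the "Hands" edge case works.
const EQUIPMENT_DISPLAY_SLOTS = [
  ["Helm", "Helm"],
  ["Left Hand", "Hands"],
  ["Right Hand", "Hands"],
];

function pick(arr) {
  return arr[Math.floor(Math.random() * arr.length)];
}

// At game start, each display slot independently picks from its pool,
// so both hands may legitimately end up holding the same item.
const loadout = Object.fromEntries(
  EQUIPMENT_DISPLAY_SLOTS.map(([label, key]) => [label, pick(EQUIPMENT[key])])
);

console.log(loadout);
```

Decoupling the display order from the data keys is the small design decision that makes "two slots, one list" a non-problem instead of a special case.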
Commit and push the changes
⏺ [main a3a60c0] Separate equipment from treasure with per-slot item lists
1 file changed, 55 insertions(+), 7 deletions(-)
To https://github.com/AccordionGuy/eternal-grind.git
3b8918d..a3a60c0 main -> main
⏺ Committed and pushed.
A small UI touch: context-sensitive emoji
The last change in this session was a cosmetic one, but a fun one. The center panel displayed a “crossed swords” image. I wanted it to change based on what the character is doing:
Is the "cross swords" image in the center panel an emoji?
⏺ Yes. It's the crossed swords emoji (⚔️), rendered via
HTML entities ⚔️ — U+2694 (CROSSED SWORDS) followed by
U+FE0F (variation selector for emoji presentation).
The game should display the crossed swords emoji when the
player is battling a monster, but the 🙇🏼‍♀️ emoji
when performing a task.
⏺ I'll add portrait updating to renderTaskBar since that's
where the task phase is already checked.
⏺ Update(game.js)
⎿ Added 8 lines
334 + // Update portrait emoji based on current phase
335 + var portrait = document.getElementById("portrait");
336 + if (state.taskPhase === "battle") {
337 + portrait.textContent = "\u2694\uFE0F";
338 + } else {
339 + portrait.textContent = "\uD83D\uDE47\uD83C\uDFFC\u200D\u2640\uFE0F";
340 + }
341 }
Claude Code correctly identified that renderTaskBar() was already checking state.taskPhase, making it the natural home for this logic. Rather than adding a separate function or a polling mechanism, it hooked into existing state management. A few lines of conditional logic, zero new complexity.
Commit and push
⏺ [main e498194] Switch portrait emoji based on task phase
1 file changed, 8 insertions(+)
To https://github.com/AccordionGuy/eternal-grind.git
a3a60c0..e498194 main -> main
⏺ Committed and pushed.
What this session illustrates
This wasn’t a session where I asked Claude Code to write the game from scratch; after all, the game already existed.
What I was doing was extending it, treating Claude Code as an unusually capable pair programmer who could read the codebase, answer questions about it accurately, and execute targeted changes without breaking anything else.
I was impressed by what it did:
It knew what it didn’t know. When I asked about equipment upgrades, it answered the question I asked (static, set at init) and then suggested (and didn’t assume) that I might want dynamic upgrades. It waited for direction.
It read context before writing code. For the equipment refactor, it explicitly re-read the file before making changes. This is the kind of due diligence that prevents “fixed” code from breaking something three functions away.
It handled structural changes alongside content changes. Adding items to an array is trivial. Replacing a flat array with a keyed object, updating initialization logic, and adding a display-order constant to preserve rendering behavior is a real refactor, and it did it in one pass.
It fixed typos and told me so. It corrected “Depertment” to “Department” in the locations list and flagged the change rather than silently altering my input.
The game is playable at accordionguy.github.io/eternal-grind, and the source is on GitHub. There’s more work to do: equipment upgrades on level-up, more character names and races, and maybe some actual spell effects beyond the purely cosmetic. Future Claude Code sessions, probably.