Your Agent's Knowledge Has a Shelf Life
Your Agent's Knowledge Has a Shelf Life, and every day you ignore it, your agents and skills drift further into errors or, worse, silent failures. But we've seen this play out before, and I propose a fix for it. Meet skill-versions.com
I work at Vercel, and for the last few months I've been building a collection of 21 specialist agents and 57 skills for Claude Code, Codex, and Cursor. Each agent and skill encodes deep knowledge about a specific Vercel product: the AI SDK, Next.js, Workflow, Sandbox, Turborepo, and many more we use daily. Not surface-level "here's how to install it" knowledge, but the real deep stuff, so that I can help my team members and our customers more effectively: the API patterns that actually work in production, the migration paths to a better, more confident existence, and the battle scars won from real-world execution in customer environments, from startup to enterprise. It's the kind of thing you write once, test against real projects, and feel good about.
And then Next.js ships a new major version.
Or the AI SDK jumps from v4 to v6. Or Payload CMS deprecates an access-control pattern you documented as the recommended approach, and they ship new releases almost daily. Suddenly your carefully crafted agent instructions are confidently telling developers to use npm packages or APIs that no longer exist.
This isn't a hypothetical. This is the reality of maintaining any collection of agent skills at scale, and I think it's a problem nobody's really talking about yet.
The problem nobody warned me about
If you've worked with AI coding agents, whether Claude Code, Codex, Cursor, or whatever your tool of choice is, you've probably written some form of instruction files. Maybe it's a CLAUDE.md, maybe it's .cursorrules, maybe it's a full skill file following the Agent Skills Specification. These files tell the agent how to do things correctly for your stack.
Here's what I've learned the hard way: those files are snapshots of knowledge at a point in time. They don't update themselves. They don't know when the upstream product ships a breaking change. They just sit there, each day getting a little more stale, slowly becoming wrong.
Across this skill collection, I'm tracking 22 different Vercel products. Each product has its own release cadence. Some ship weekly. Some ship daily. The AI SDK went from version 4.x to 6.x in what felt like a blink. At a company like Vercel, with a proud culture of "You Can Just Ship Things", the one hard truth you can count on is that things are constantly changing.
The combinatorial problem is real: 22 products, each with their own velocity, mapped across 57 skill files. There's no way to keep all of that fresh through manual review. I tried. It doesn't work.
What "stale" actually looks like
Let me be specific about what goes wrong, because "stale knowledge" sounds abstract until it bites you.
A developer asks their agent to set up Vercel Workflow. The agent reads the skill file, which says npm install @vercel/workflow. But the package was renamed to just workflow months ago. The install fails. The developer doesn't know why. They lose trust in the agent. They go back to reading docs manually.
Or worse: the API mostly works, but there's a new parameter that enables a critical feature, such as automatic retry configuration, and the skill doesn't mention it because it didn't exist when the skill was written. The developer gets a working but suboptimal implementation, and they never know what they're missing.
This is the insidious part. Stale knowledge doesn't always fail loudly. Sometimes it just quietly produces worse outcomes.
Agents and skills need dependency management
We've solved this problem for code. npm outdated tells you when your packages are behind. Dependabot opens PRs when new versions ship. We have entire ecosystems built around keeping dependencies fresh.
But for agent knowledge? Nothing. Your skill files are flying blind.
So I built something. It's not complicated, but it's been genuinely useful, and I think the pattern generalizes well beyond my project.
Step 1: A product registry
At the root of the repo of my agents and skills, there's now a product-registry.json that maps every product to its npm package, GitHub repo, documentation URL, current verified version, and the skill files that encode knowledge about it. It looks like this:
{
  "ai-sdk": {
    "displayName": "Vercel AI SDK",
    "npm": { "packages": ["ai"], "primaryPackage": "ai" },
    "verifiedVersion": "6.0.105",
    "verifiedAt": "2026-02-28T00:00:00Z",
    "skills": ["ai-sdk-core", "ai-sdk-tools", "ai-sdk-react", "ai-sdk-multimodal"],
    "agents": ["ai-sdk-engineer"]
  }
}
Twenty-two entries. One source of truth.
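For what it's worth, the registry's shape can be captured in a few TypeScript types. This is a sketch: the field names mirror the JSON above, but the type names and comments are my own assumptions, not part of the actual repo.

```typescript
// Sketch of the registry shape shown above; field names follow the JSON example.
interface NpmInfo {
  packages: string[];      // all npm packages published for this product
  primaryPackage: string;  // the one whose version is tracked
}

interface ProductEntry {
  displayName: string;
  npm: NpmInfo;
  verifiedVersion: string; // last version the skills were verified against
  verifiedAt: string;      // ISO timestamp of that verification
  skills: string[];        // skill files that encode knowledge of this product
  agents: string[];        // specialist agents built on those skills
}

type ProductRegistry = Record<string, ProductEntry>;

// Example entry, matching the JSON above:
const registry: ProductRegistry = {
  "ai-sdk": {
    displayName: "Vercel AI SDK",
    npm: { packages: ["ai"], primaryPackage: "ai" },
    verifiedVersion: "6.0.105",
    verifiedAt: "2026-02-28T00:00:00Z",
    skills: ["ai-sdk-core", "ai-sdk-tools", "ai-sdk-react", "ai-sdk-multimodal"],
    agents: ["ai-sdk-engineer"],
  },
};
```

Typing the registry means the checker and any CI tooling can share one definition of what an entry looks like.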
Step 2: A version checker
A TypeScript script reads the registry, hits the npm API in parallel for all 22 products, and compares what's published against what we've verified. It takes about three seconds. No AI involved; this is plain, deterministic code.
$ npm run check-versions
Skills Knowledge Check
==================================================
STALE (2):
ai-sdk 6.0.105 -> 6.1.0 (minor)
payload 3.78.0 -> 3.80.0 (minor)
CURRENT (15): upstash-redis, next, turbo, ...
That's it. Now I know exactly which skills need attention, and I know before any developer hits a problem.
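The core of such a checker fits in a page. The sketch below is not the actual script; the only external assumptions are the public npm registry's `/{package}/latest` endpoint and the registry shape shown earlier.

```typescript
import { readFileSync } from "node:fs";

// Classify how far a verified version lags behind the latest published one.
function bumpKind(
  verified: string,
  latest: string,
): "current" | "major" | "minor" | "patch" {
  if (verified === latest) return "current";
  const [vMaj, vMin] = verified.split(".").map(Number);
  const [lMaj, lMin] = latest.split(".").map(Number);
  if (lMaj !== vMaj) return "major";
  if (lMin !== vMin) return "minor";
  return "patch";
}

// Fetch the latest published version of a package from the npm registry.
// (fetch is global in Node 18+.)
async function latestVersion(pkg: string): Promise<string> {
  const res = await fetch(`https://registry.npmjs.org/${pkg}/latest`);
  const data = (await res.json()) as { version: string };
  return data.version;
}

async function checkAll(registryPath: string): Promise<void> {
  const registry = JSON.parse(readFileSync(registryPath, "utf8"));
  // Hit the npm API for every product in parallel.
  const results = await Promise.all(
    Object.entries(registry).map(async ([id, entry]: [string, any]) => {
      const latest = await latestVersion(entry.npm.primaryPackage);
      return {
        id,
        verified: entry.verifiedVersion,
        latest,
        kind: bumpKind(entry.verifiedVersion, latest),
      };
    }),
  );
  for (const r of results.filter((r) => r.kind !== "current")) {
    console.log(`STALE: ${r.id} ${r.verified} -> ${r.latest} (${r.kind})`);
  }
}
```

The parallel fetch is why the whole run stays around a few seconds regardless of how many products you track.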
Step 3: A frontmatter convention
In my agents/skills repos, every skill file related to a product now carries a product-version field in its YAML frontmatter:
---
name: ai-sdk-core
description: "Generate text and structured output with Vercel AI SDK."
product-version: "6.0.105"
---
This makes staleness visible at the file level. You can grep for it. You can cross-reference it against the registry. You can build CI checks around it. It's simple, but simple is the point.
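As one illustration, cross-referencing that frontmatter against the registry needs nothing more than a regex. A sketch, assuming only the product-version convention above; the function names are mine.

```typescript
// Extract the product-version field from a skill file's YAML frontmatter.
function productVersion(skillFileText: string): string | null {
  const match = skillFileText.match(/^product-version:\s*"?([\d.]+)"?\s*$/m);
  return match ? match[1] : null;
}

// A skill is stale if it declares no version, or one that differs
// from the registry's verified version for that product.
function isStale(skillFileText: string, verifiedVersion: string): boolean {
  const declared = productVersion(skillFileText);
  return declared === null || declared !== verifiedVersion;
}

// Sample skill file, matching the frontmatter example above:
const sample = `---
name: ai-sdk-core
description: "Generate text and structured output with Vercel AI SDK."
product-version: "6.0.105"
---`;
```

Run that over every skill file listed in a registry entry and you have a CI check that fails the build the moment knowledge and reality diverge.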
Step 4: AI-assisted refresh
This is where it gets interesting. When the checker identifies stale products, I can spawn the relevant specialist agent, the one that already has deep expertise in that product, and ask it to research the changelog, identify what changed, and apply targeted edits to the affected skill files. The agent updates the patterns, bumps the version in frontmatter, and updates the registry.
The human stays in the loop for review, but the tedious work of reading changelogs and mapping changes to skill files is handled by the agents themselves.
This isn't just my problem
I want to be clear: there's nothing in this pattern that's specific to Vercel products or to my project's architecture. If you're maintaining agent skills, for any product, in any framework, you have this problem and it will be a perpetual one. You just might not have noticed yet.
The Agent Skills Specification already provides a standard for how skills are structured. The product-version convention is a natural extension. I think it should be part of the spec. A skill that doesn't declare which version of the product it was written for is a skill that can't tell you when it's wrong.
Here's what I'd love to see happen:
- product-version becomes a standard frontmatter field in the Agent Skills Spec. Any skill that references a versioned product should declare which version it targets.
- A generic version-checking package that works with any registry file. You shouldn't need to build this from scratch for your project. npx skill-versions check should just work.
- CI integration that flags stale skills the same way we flag outdated dependencies. A GitHub Action that runs weekly and opens issues when skills drift.
The meta-observation
There's something worth sitting with here. We're building agents that write code, review code, and architect systems. But the knowledge that makes those agents effective is itself a maintenance burden. The agents are only as good as their instructions, and instructions rot.
The software engineering discipline has spent decades building tools to manage change in code: version control, dependency management, continuous integration, automated testing. Agent knowledge needs the same rigor. Not because it's glamorous work, but because the alternative is agents that slowly get worse at their jobs while everyone assumes they're fine.
I've been maintaining open source projects long enough to know that the unsexy maintenance work is what separates tools people trust from tools people try once and abandon. This is that work for agents.
If you're building agent skills at any scale, even a handful of instruction files for your team, I cannot recommend strongly enough that you start tracking versions now. Future you will be grateful.
Bearing gifts
To help solve this for myself and others, I spent some time building an npm package that identifies (check) and resolves, through AI-assisted upkeep (refresh), this staleness problem. Have a look at skill-versions and use it locally or include it in your CI/CD pipelines. Hopefully it can save you the stress and gnashing of teeth I have experienced.
The Agent Skills Specification lives at agentskills.io. I've proposed adding product-version as a standard field - feedback welcome.