Summary

This is an internal guide from Anthropic on how the company uses Claude Code Skills at scale, with hundreds of skills in active use. The article categorizes skill types, shares best practices for writing them, and explains how to distribute and measure them within an organization. It treats skills not merely as markdown files but as structured folders with scripts, assets, hooks, and data.
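
To make the "structured folder" idea concrete, here is a minimal sketch of scaffolding one skill. It assumes the publicly documented convention of a SKILL.md file with name and description frontmatter; the subfolder names, the scaffold_skill helper, and the example skill name are hypothetical, not taken from the article.

```python
import tempfile
from pathlib import Path

# Illustrative subfolder names; the article does not prescribe a fixed layout.
SUBFOLDERS = ("scripts", "assets", "references")


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create <root>/.claude/skills/<name>/ with a SKILL.md and empty subfolders."""
    skill_dir = root / ".claude" / "skills" / name
    for sub in SUBFOLDERS:
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)

    # Assumes the description needs no YAML quoting (no colons, quotes, etc.).
    skill_md = "\n".join([
        "---",
        f"name: {name}",
        f"description: {description}",
        "---",
        "",
        "## Gotchas",
        "",
        "- (record failure modes Claude has actually hit)",
        "",
    ])
    (skill_dir / "SKILL.md").write_text(skill_md, encoding="utf-8")
    return skill_dir


# Demo in a throwaway directory so nothing in a real repo is touched.
demo_root = Path(tempfile.mkdtemp())
demo_skill = scaffold_skill(
    demo_root, "deploy-checks", "Use when verifying a deploy before it ships"
)
for path in sorted(demo_skill.rglob("*")):
    print(path.relative_to(demo_root))
```

The description in the demo is phrased as a trigger condition ("Use when ...") rather than a summary, which matches the article's advice on that field.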

Key Points

  • Skills fall into 9 categories: Library/API Reference, Product Verification, Data Fetching, Business Process Automation, Code Scaffolding, Code Quality/Review, CI/CD, Runbooks, and Infrastructure Operations
  • The most valuable content in any skill is the Gotchas section — failure modes Claude has actually hit
  • Skills are folders, not just markdown — use the file system for progressive disclosure (scripts, templates, reference docs)
  • The description field is a trigger condition for the model, not a human-facing summary
  • On-demand hooks (activated only when a skill is called) are useful for opinionated, situational behaviors like blocking destructive commands
  • Skills can store memory in log files or JSON; use ${CLAUDE_PLUGIN_DATA} for stable storage across upgrades
  • Distribution options: check into repo (.claude/skills) or use an internal plugin marketplace
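
The on-demand hook point above can be sketched as a small guard script. This is an illustration under stated assumptions, not the article's implementation: it assumes the hook receives a JSON payload with tool_name and tool_input.command for Bash calls, and that exit code 2 signals a block, as I understand the Claude Code hooks documentation. The patterns and function names are hypothetical.

```python
import json
import re
import sys
from typing import Optional, Tuple

# Illustrative patterns only; a real skill would tune these to its own risks.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-(?=\w*r)(?=\w*f)\w+",              # rm with both -r and -f flags
    r"\bgit\s+push\b.*\s--force(?![-\w])",       # plain --force, not --force-with-lease
    r"\bdrop\s+(table|database)\b",              # SQL drops
]


def find_destructive(command: str) -> Optional[str]:
    """Return the first pattern the command matches, or None if it looks safe."""
    for pattern in DESTRUCTIVE_PATTERNS:
        if re.search(pattern, command, flags=re.IGNORECASE):
            return pattern
    return None


def decide(payload: dict) -> Tuple[int, str]:
    """Map a hook payload to (exit_code, message).

    ASSUMPTION: payload has "tool_name" and, for Bash, "tool_input.command".
    """
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = (payload.get("tool_input") or {}).get("command", "")
    matched = find_destructive(command)
    if matched is None:
        return 0, ""
    # ASSUMPTION: exit code 2 blocks the tool call and stderr goes back to Claude.
    return 2, f"Blocked by skill hook: command matches {matched!r}"


def main() -> None:
    """Entry point for a hook script: read the payload from stdin and exit."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    sys.exit(code)

# A hook script would end with a call to main(); it is not invoked here so the
# functions can be imported and exercised without piping in a payload.
```

Because the guard is attached to the skill rather than configured globally, it only applies while that skill is active, which fits the "opinionated, situational" use the article describes.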

Insights

  • The framing of skills as “context engineering” is notable: rather than prompting Claude at runtime, you strategically pre-load relevant knowledge and tools into its context
  • “Avoid railroading Claude” is a key tension: give enough structure to be useful, but leave flexibility to adapt to the situation
  • The recommendation to have engineers spend a week just making verification skills excellent signals how much leverage reliable output verification has in agentic workflows
  • Skills compose only by name reference, since there is no native dependency management yet; this works but relies on the target skill being installed
  • The PreToolUse hook for logging skill usage at the company level is a simple but effective way to measure adoption and undertriggering
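
The usage-logging idea in the last point could look roughly like the sketch below. Everything specific here is an assumption to verify against real hook payloads: that a skill invocation appears as a tool call named "Skill", that the skill name sits under tool_input.skill, and that a session_id field is present. The log location and function names are hypothetical.

```python
import json
import tempfile
import time
from collections import Counter
from pathlib import Path
from typing import Optional

# ASSUMPTION: skill invocations appear as a tool call with this name, and the
# skill name is carried under this input field. Inspect real payloads first.
SKILL_TOOL_NAME = "Skill"
SKILL_NAME_FIELD = "skill"


def record_skill_use(payload: dict, log_path: Path,
                     now: Optional[float] = None) -> bool:
    """Append one JSON line per skill invocation; ignore other tool calls."""
    if payload.get("tool_name") != SKILL_TOOL_NAME:
        return False
    tool_input = payload.get("tool_input") or {}
    entry = {
        "ts": time.time() if now is None else now,
        "skill": tool_input.get(SKILL_NAME_FIELD, "unknown"),
        "session": payload.get("session_id"),
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return True


def usage_counts(log_path: Path) -> Counter:
    """Count invocations per skill from the JSON-lines log."""
    counts: Counter = Counter()
    if not log_path.exists():
        return counts
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                counts[json.loads(line)["skill"]] += 1
    return counts


# Demo with made-up payloads written to a throwaway log file.
demo_log = Path(tempfile.mkdtemp()) / "skill-usage.jsonl"
demo_payloads = [
    {"tool_name": "Skill", "tool_input": {"skill": "deploy-checks"}, "session_id": "s1"},
    {"tool_name": "Bash", "tool_input": {"command": "ls"}, "session_id": "s1"},
    {"tool_name": "Skill", "tool_input": {"skill": "deploy-checks"}, "session_id": "s2"},
    {"tool_name": "Skill", "tool_input": {"skill": "runbook-oncall"}, "session_id": "s2"},
]
demo_recorded = [record_skill_use(p, demo_log) for p in demo_payloads]
demo_counts = usage_counts(demo_log)
print(demo_counts.most_common())
```

Comparing these counts against how often a skill should have applied is one rough way to spot undertriggering; the log alone only shows adoption.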

Connections

Raw Excerpt

The highest-signal content in any skill is the Gotchas section. These sections should be built up from common failure points that Claude runs into when using your skill. Ideally, you will update your skill over time to capture these gotchas.