I Built a Skill Reviewer. Then I Ran It on Itself.
Source: DEV Community
I built a tool that reviews Claude Code skills for quality issues. Then I pointed it at its own source files. It found real problems. The irony wasn't lost on me. But the more interesting question is: why did this happen, and what does it tell us about how LLM-based quality tools actually work?

## The Setup

I maintain rashomon, a Claude Code plugin for prompt and skill optimization. It includes a skill reviewer agent that evaluates skill files against 8 research-backed patterns (BP-001 through BP-008) and 9 editing principles.

One of those patterns, BP-001, says: don't write instructions in negative form. Research shows LLMs often fail to follow "don't do X" instructions; negated prompts actually cause inverse scaling, where larger models perform worse. The fix is to rewrite them positively: instead of "don't skip P1 issues," write "evaluate all P1 issues in every review mode."

Simple enough. Except both my agent definition files had a section called `## Prohibited Actions` full of "don't" instructions.
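To make the BP-001 pattern concrete, here is a minimal sketch of what a check like this might look like. This is not rashomon's actual implementation; the function name, the keyword list, and the return shape are all assumptions for illustration:

```python
import re

# Hypothetical BP-001 check: flag negatively phrased instructions
# so they can be rewritten in positive form.
# The keyword list is an illustrative assumption, not rashomon's actual rule set.
NEGATION = re.compile(r"\b(don't|do not|never|avoid|must not)\b", re.IGNORECASE)

def check_bp001(skill_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that use negative phrasing."""
    findings = []
    for i, line in enumerate(skill_text.splitlines(), start=1):
        if NEGATION.search(line):
            findings.append((i, line.strip()))
    return findings

sample = "don't skip P1 issues\nevaluate all P1 issues in every review mode"
print(check_bp001(sample))  # flags line 1; the positive rewrite on line 2 passes
```

A keyword scan like this only surfaces candidates; deciding whether a flagged line genuinely needs a positive rewrite still takes judgment, which is presumably why the real reviewer is an LLM agent rather than a regex.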