Your coding agent already knows how to test your AI agent (we just turned it into a Skill)

Source: DEV Community
We’re adding something new at LangWatch: Skills. And the idea is pretty simple: your coding agent already knows how to do a lot of the work you’re still doing manually. You just haven’t packaged it properly yet.

The frustrating part of building AI agents

If you’ve built an LLM agent recently, you probably recognize this loop:

- you tweak something
- you run a few test conversations
- it seems better
- you ship it
- something breaks in production

Then you repeat. We’ve been there too. It’s not that you don’t know you need evals, testing, or simulations. It’s that doing all of that properly is… a lot.

The real work isn’t building, it’s validating

When we started LangWatch, we thought the main challenge was: getting agents to behave correctly. But in practice, the bigger challenge was: proving that they behave correctly.

That means:

- setting up eval datasets
- writing tests
- simulating real user behavior
- instrumenting pipelines
- understanding failures

And most of this ends up being:

- manual
- repetitive
- easy
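To make the "eval datasets + tests" part concrete, here is a minimal sketch of what that manual validation loop often looks like in practice. Everything in it is hypothetical: `run_agent` stands in for your actual LLM agent, and the dataset entries and the keyword check are illustrative placeholders, not a real eval metric or the LangWatch API.

```python
# A minimal sketch of the manual eval loop described above.
# `run_agent` is a hypothetical stand-in for your agent; the dataset
# and the "must_mention" check are illustrative only.

EVAL_DATASET = [
    {"input": "How do I reset my password?", "must_mention": "reset link"},
    {"input": "Cancel my subscription", "must_mention": "cancel"},
]

def run_agent(user_input: str) -> str:
    # Placeholder: in a real project this would call your LLM agent.
    return f"Sure, I can help with '{user_input}'. I've sent a reset link; reply 'cancel' to stop."

def evaluate(dataset):
    """Run each case through the agent and check the required phrase appears."""
    results = []
    for case in dataset:
        reply = run_agent(case["input"])
        results.append(case["must_mention"].lower() in reply.lower())
    return results

passed = evaluate(EVAL_DATASET)
print(f"{sum(passed)}/{len(passed)} cases passed")
```

Even this toy version shows why the work piles up: every new failure mode means another dataset entry, another check, and another run, done by hand.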