Thoughts on Hermes?

damnthefilibuster@lemmy.world · 1 month ago

Thoughts on Hermes?

unfortunate_ferret@piefed.ca · 1 month ago

I’ve been using it with Opencode Go, Ollama, and Claude Code (it can delegate tasks to models through all those, so you can have Claude plan and Deepseek Flash build); I really like it.

I ran into the problem where the agent reports that subagents succeeded, or that work was done, when it wasn’t (“I said I tested that, but I didn’t. That’s on me. Won’t happen again”), so I built a self-check enforcement system for it. You or your agent can set up the system by reading this: https://github.com/obelisk-complex/hermes-agent/blob/main/self-check-enforcement-system-v15.md

It includes the source patch, which adds an on_output hook. The hook lets you intercept text sent directly from the LLM to the user, which is unblockable in vanilla Hermes. So if something remains unfinished, the LLM can’t say it’s done; it has to acknowledge what it didn’t do before it can send you a message to close the conversation loop. I’ve set up the fork to automatically merge upstream changes around this patch daily at 0400 Pacific time, so I should stay up to date (ish).
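To make the idea concrete, here is a minimal sketch of what an output gate like this could look like. It is not the actual patch from the fork and not the Hermes API: TaskLedger, COMPLETION_CLAIM, and the on_output signature shown here are hypothetical names made up for illustration. The gate blocks a message that claims completion while tasks are still unverified, unless the message names each unfinished task.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical pattern for "I'm done"-style claims; a real system would need more than a regex.
COMPLETION_CLAIM = re.compile(
    r"\b(done|finished|completed|all tests pass(ed)?)\b", re.IGNORECASE
)


@dataclass
class TaskLedger:
    """Hypothetical record of tasks that have not been verified yet."""

    pending: set[str] = field(default_factory=set)

    def add(self, task: str) -> None:
        self.pending.add(task)

    def verify(self, task: str) -> None:
        self.pending.discard(task)


def on_output(text: str, ledger: TaskLedger) -> tuple[bool, str]:
    """Return (allowed, message) for text the model wants to send to the user.

    Illustrative only: the signature is an assumption, not the fork's hook.
    """
    if not ledger.pending:
        return True, text

    claims_done = bool(COMPLETION_CLAIM.search(text))
    # The message counts as an acknowledgement only if it names every pending task.
    acknowledged = all(task.lower() in text.lower() for task in ledger.pending)

    if claims_done and not acknowledged:
        missing = ", ".join(sorted(ledger.pending))
        return False, f"Blocked: acknowledge unfinished work first ({missing})."
    return True, text


ledger = TaskLedger()
ledger.add("run unit tests")

blocked, _ = on_output("All done!", ledger)
honest, _ = on_output("Done with the patch, but I did not run unit tests.", ledger)
ledger.verify("run unit tests")
after_verify, _ = on_output("All done!", ledger)
```

In this sketch the first message is rejected, the second passes because it names the unfinished task, and the third passes once the task is verified. Matching task names as substrings is deliberately crude; the point is only that the check runs on the outgoing text, not on the model's own report of what it did.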

I also put in a feature request to get this added upstream. Feature request here: https://github.com/NousResearch/hermes-agent/issues/45881

damnthefilibuster@lemmy.world · 1 month ago

what’s opencode go and how is it different than opencode?

I’ll check out the subagent reporting issue. I did run into it with Gemma-4 but Qwen3.5 and 3.6 both work well in completing tasks. Local models aren’t perfect, but they’re damn close!

unfortunate_ferret@piefed.ca · edit-2 1 month ago

The harness helps a lot even with local models. In fact, I just found this this morning and cherrypicked it: https://github.com/DietrichGebert/ponytail

Recommend doing the same, and for superpowers if you don’t have 'em already: https://github.com/obra/superpowers

Opencode Go is the $10/month cloud model subscription from the same group maintaining the OpenCode software. Opencode Zen is a pay-as-you-go version which gives you access to Claude models as well. Keeping pay-as-you-go to subagents only (e.g. telling your agent to launch an opus subagent via your opencode zen key) is actually surprisingly economical - when you’re not going turn after turn with hundreds of thousands of tokens of context, claude is pretty reasonably priced.
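A rough sketch of why short-lived subagents stay cheap: if every turn re-sends the whole accumulated context, total input tokens grow much faster than the turn count. All numbers below are made up for illustration, the price is a placeholder and not any provider's real rate, and the model ignores output tokens and prompt caching.

```python
def total_input_tokens(turns: int, start_context: int, growth_per_turn: int) -> int:
    """Sum input tokens when each turn re-sends the full context so far."""
    return sum(start_context + i * growth_per_turn for i in range(turns))


def cost_usd(tokens: int, usd_per_million: float) -> float:
    """Convert a token count to dollars at a given per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


# Hypothetical long main-agent session: 50 turns, starting at 100k tokens of context.
main_session = total_input_tokens(turns=50, start_context=100_000, growth_per_turn=5_000)

# Hypothetical subagent: fresh 10k-token context, 5 turns, then it exits.
subagent = total_input_tokens(turns=5, start_context=10_000, growth_per_turn=2_000)

PLACEHOLDER_PRICE = 10.0  # USD per million input tokens; an assumption, not a real rate

print(main_session, subagent)  # 11125000 70000
print(cost_usd(main_session, PLACEHOLDER_PRICE), cost_usd(subagent, PLACEHOLDER_PRICE))
```

With these invented numbers the long session sends roughly 160 times as many input tokens as the subagent, which is the gap that makes pay-as-you-go tolerable when it is limited to subagents.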

What I’m doing is spreading my usage over multiple cheap subscriptions and augmenting with the occasional pay-as-you-go frontier agent. That gets me quality in line with what you get out of Claude, at a usage level that would otherwise require the $200/month tier, for a lot less money than that.

damnthefilibuster@lemmy.world · 1 month ago

I’m a little surprised to hear you say PAYG for Opus sub agents is economical. Maybe the superpowers and ponytail really do have a massive impact on things. I’ll send these to three people I know building heavy production apps right now. And integrate them into my own Hermes setup.

Thank you for the recommendations!

unfortunate_ferret@piefed.ca · edit-2 1 month ago

Any time, I hope they’re helpful! (☞ﾟヮﾟ)☞

> I’m a little surprised to hear you say PAYG for Opus sub agents is economical

I did say it was surprising! 😂 To give you an idea what I mean by “economical”, it’s never more than a few bucks a day, even on days of heavy use and development with “loop until clean” instructions on QA (for which I use Opus). I accidentally blew through my opencode go quota really early in the first month, so I ended up on PAYG; here’s the usage graph:

And here’s the numbers breakdown for the highest day (I was evaluating GLM5.1 for general tasks - don’t use it for that, it’s really token hungry)

That includes a lot of experimentation too, while I figured out which models were best for what. I hid Fable because it crushed the rest of the table: it’s really expensive, but I found it worth it for one-shotting very long tasks on the Anthropic subscription.

damnthefilibuster@lemmy.world · 1 month ago

Damn those are good numbers!