Folks,

I’m setting up Hermes Agent on my Mac with Ollama hosting a local model, but I’m on the fence about whether to go with Hermes or OpenClaw. Hermes makes some pretty bold claims about “growing with you” and “self improvement”.

Anyone have any insight into whether it’s as good as promised?

  • obelisk_complex@piefed.ca
    5 days ago

    I’ve been using it with Opencode Go, Ollama, and Claude Code (it can delegate tasks to models through all those, so you can have Claude plan and Deepseek Flash build); I really like it.

    I ran into the problem where the agent reports that subagents succeeded, or that work had been done, when it hadn’t (“I said I tested that, but I didn’t. That’s on me. Won’t happen again”), so I built a self-check enforcement system for it. You or your agent can set up the system by reading this: https://github.com/obelisk-complex/hermes-agent/blob/main/self-check-enforcement-system-v15.md

    It includes the source patch, which adds an on_output hook; this lets you intercept text sent directly from the LLM to the user, which in vanilla is unblockable. So this system ensures that if something remains unfinished, the LLM can’t say it’s done; it has to acknowledge what it didn’t do before it can send you a message to close the conversation loop. I’ve built the fork to automatically merge upstream changes around this patch daily at 0400 Pacific time, so I should stay up to date (ish).
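
For anyone who wants a feel for what an output hook like this does, here is a minimal Python sketch. It is not the code from the patch: the names `Task` and `on_output` as written here, and the return convention, are all hypothetical, and the real hook's signature may look nothing like this. The idea is just that the outgoing message gets checked against the list of unfinished work and is held back unless it mentions every open item.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A unit of work the agent said it would do (hypothetical structure)."""
    name: str
    done: bool = False


def unfinished(tasks):
    """Return the names of tasks that are not marked done."""
    return [t.name for t in tasks if not t.done]


def on_output(message, tasks):
    """Hypothetical output hook: decide whether a message may reach the user.

    Returns (allowed, text). If any unfinished task is not mentioned in the
    message, the message is blocked and the text lists what is missing.
    """
    missing = [name for name in unfinished(tasks)
               if name.lower() not in message.lower()]
    if missing:
        return False, "Blocked: acknowledge unfinished work first: " + ", ".join(missing)
    return True, message


if __name__ == "__main__":
    tasks = [Task("write parser", done=True), Task("run tests", done=False)]
    print(on_output("All done!", tasks))
    print(on_output("Parser written. I did not run tests yet.", tasks))
```

A plain substring match is only a stand-in here; a real check would need something sturdier than looking for the task name in the message.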

    I also put in a feature request to get this added upstream. Feature request here: https://github.com/NousResearch/hermes-agent/issues/45881

    • damnthefilibuster@lemmy.worldOP
      5 days ago

      What’s Opencode Go and how is it different from Opencode?

      I’ll check out the subagent reporting issue. I did run into it with Gemma-4, but Qwen3.5 and 3.6 both work well at completing tasks. Local models aren’t perfect, but they’re damn close!

      • obelisk_complex@piefed.ca
        4 days ago

        The harness helps a lot even with local models. In fact, I found this just this morning and cherry-picked it: https://github.com/DietrichGebert/ponytail

        Recommend doing the same, and for superpowers if you don’t have 'em already: https://github.com/obra/superpowers

        Opencode Go is the $10/month cloud model subscription from the same group maintaining the OpenCode software. Opencode Zen is a pay-as-you-go version which gives you access to Claude models as well. Keeping pay-as-you-go to subagents only (e.g. telling your agent to launch an Opus subagent via your Opencode Zen key) is actually surprisingly economical - when you’re not going turn after turn with hundreds of thousands of tokens of context, Claude is pretty reasonably priced.
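
To see why fresh-context subagents stay cheap, here is a rough Python sketch of the arithmetic. It assumes the whole conversation history is re-sent as input on every turn, and it ignores prompt caching and output tokens. The price and token counts are made-up round numbers for illustration, not real pricing for any model.

```python
def chat_input_tokens(turns, tokens_added_per_turn, starting_context=0):
    """Total input tokens billed when each turn re-sends the full history."""
    total = 0
    context = starting_context
    for _ in range(turns):
        context += tokens_added_per_turn  # history grows every turn
        total += context                  # whole history is sent as input
    return total


def cost_usd(input_tokens, usd_per_million_input):
    """Input-side cost only; output tokens and caching are ignored."""
    return input_tokens / 1_000_000 * usd_per_million_input


if __name__ == "__main__":
    PRICE = 10.0  # hypothetical USD per million input tokens

    # Long interactive session: 50 turns, 5k new tokens each, on top of 100k.
    long_chat = chat_input_tokens(50, 5_000, starting_context=100_000)
    # One-shot subagent: a single call with a 20k-token brief.
    subagent = chat_input_tokens(1, 20_000)

    print(f"long chat: {long_chat:,} tokens, ${cost_usd(long_chat, PRICE):.2f}")
    print(f"subagent:  {subagent:,} tokens, ${cost_usd(subagent, PRICE):.2f}")
```

The long session's input grows roughly with the square of the number of turns, while a subagent call pays for its brief once.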

        What I’m doing is spreading my usage across multiple cheap subscriptions and augmenting with the occasional pay-as-you-go frontier agent. That gets me quality in line with what you get out of Claude, at a usage level that would otherwise require the $200/month tier, for a lot less money than that.

        • damnthefilibuster@lemmy.worldOP
          3 days ago

          I’m a little surprised to hear you say PAYG for Opus sub agents is economical. Maybe the superpowers and ponytail really do have a massive impact on things. I’ll send these to three people I know building heavy production apps right now. And integrate them into my own Hermes setup.

          Thank you for the recommendations!

          • obelisk_complex@piefed.ca
            3 days ago

            Any time, I hope they’re helpful! (☞゚ヮ゚)☞

            “I’m a little surprised to hear you say PAYG for Opus sub agents is economical”

            I did say it was surprising! 😂 To give you an idea what I mean by “economical”, it’s never more than a few bucks a day, even on days of heavy use and development with “loop until clean” instructions on QA (for which I use Opus). I accidentally blew through my Opencode Go quota really early in the first month, so I ended up on PAYG; here’s the usage graph:
            [image: usage graph]

            And here’s the numbers breakdown for the highest day (I was evaluating GLM5.1 for general tasks - don’t use it for that, it’s really token hungry):
            [image: numbers breakdown for the highest day]

            That includes a lot of experimentation too, while I figured out which models were best for what. I hid Fable because it crushed the rest of the table - really expensive, but what I found is that it’s worth it for one-shotting very long tasks on the Anthropic subscription.
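
A “loop until clean” QA instruction is easier to keep cheap if the loop has a hard cap. Here is a minimal Python sketch of that shape, where `review` and `fix` are hypothetical stand-ins for whatever calls your QA model and your builder model. Nothing here is a Hermes or OpenCode API.

```python
def loop_until_clean(work, review, fix, max_rounds=5):
    """Review `work`, fix reported issues, repeat until clean or capped.

    review(work) -> list of issue strings (empty list means clean)
    fix(work, issues) -> revised work
    Returns (work, rounds_used, clean).
    """
    for round_no in range(1, max_rounds + 1):
        issues = review(work)
        if not issues:
            return work, round_no, True
        work = fix(work, issues)
    # Cap reached: report honestly whether the final state is clean.
    return work, max_rounds, not review(work)


if __name__ == "__main__":
    # Toy stand-ins: the "work" is a string, an issue is any "TODO" left in it.
    def review(text):
        return ["TODO left in text"] if "TODO" in text else []

    def fix(text, issues):
        return text.replace("TODO", "done", 1)  # fix one issue per round

    print(loop_until_clean("TODO a, TODO b", review, fix))
```

The cap bounds how many paid review calls a single task can trigger, and the final flag means the caller never has to assume the work ended up clean.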