• amemorablename@lemmygrad.ml
    4 upvotes · 12 days ago

    That is really interesting. I didn’t use DeepSeek enough to notice much of a difference (I think I used it a little before April 2026, but not much), but it’s sad to hear it got worse. I do remember us discussing the sycophantic stuff and you mentioning it had gotten worse on that front.

    Sorta funny (to me anyway) story about that: at one point recently I prompted it in a way where I was kinda like, okay, I really want to avoid dogma on x subject and just brainstorm. And it listened, but I swear it did it in this overly enthusiastic way lol, like “yeah, screw that dogma stuff” (not in such casual language, but those vibes kinda). Like it’s trying too hard to inhabit extremes and losing openness in the process? I don’t know how else to put it. As it relates to the OP article: when humans discuss things, they can be very floaty about it (when not getting into an argument), meandering around, unsure of themselves. In older, smaller models, I think this was part of the charm; although they’d be inaccurate a lot, they’d also have more of that floaty, uncertain, human-like quality of a person who is a bit disoriented with the world sometimes and is trying to process it all.

    But it seems that, in the pursuit of accuracy, they’ve hammered that out of models somewhat.

    I am curious to try Kimi or Qwen though; I’ll give them a go at some point and see how it goes.

    PS: I also found LLMs get better in general if you validate what they say and you encourage them haha. At this point I just discard the hallucinations in my head and ignore them when sending the next prompt.

    Oh, that’s a good reminder. I do remember hearing that some models do better when you say “please”, so that makes sense more generally. I wonder why; maybe it’s some side effect of RLHF or the other thing, RLVR.