• Binette@lemmy.ml
    link
    fedilink
    arrow-up
    7
    ·
    6 hours ago

    Kinda why i like reinforcement learning. You end up with silly stuff like this.

    • ☆ Yσɠƚԋσʂ ☆@lemmy.mlOP
      link
      fedilink
      arrow-up
      8
      ·
      6 hours ago

      The funniest thing for me is that humans end up doing the exact same thing. This is why it’s so notoriously difficult to create organizational policies that actually produce desired results. What happens in practice is that people find ways to comply with the letter of the policy that require the least energy expenditure on their part.

    • Ohmmy@lemmy.dbzer0.com
      link
      fedilink
      English
      arrow-up
      34
      ·
      19 hours ago

      Honestly, it’s fucking relatable. A place I worked used to round the time clock to the nearest quarter hour so I would dick around a minute or two so it rolled up instead of down.

      • comfy@lemmy.ml
        link
        fedilink
        arrow-up
        32
        ·
        19 hours ago

        A friend of mine has their large corporate company telling everyone they have to show up to one of their offices on at least two days each week. Now a few people just walk there at 2355, clock out at 0005, and spend the rest of the week at home.

        Silly conditions -> silly behaviors

      • winkerjadams@lemmy.dbzer0.com
        link
        fedilink
        English
        arrow-up
        13
        ·
        17 hours ago

        The place I work at now rounds by quarter hours so if you punch in early at 8:53 it’s the same pay as punching in late at 9:07. Guess who has never been early to punch in but has been late quite a few…

  • Carl [he/him]@hexbear.net
    link
    fedilink
    English
    arrow-up
    39
    ·
    edit-2
    18 hours ago

    lmao that’s great.

    One time I asked GLM to run a test on a piece of code, and it wrote a python script that printed “Text Successful!” to the terminal but didn’t actually do anything. These things are so incredibly bad at times.

    • tias@discuss.tchncs.de
      link
      fedilink
      arrow-up
      9
      ·
      13 hours ago

      In some ways yes, but this effect would appear with any kind of reinforcement learning whether it’s neural networks or just fuzzy logic. The goal is to promote certain behaviors and if it performs the behaviors that you promoted then the method works.

      The problem is that, just like with KPI:s, promoting specific indicators too hard leads to suboptimal results.

    • Jayjader@jlai.lu
      link
      fedilink
      arrow-up
      2
      ·
      7 hours ago

      I think this part references it, though it’s kinda solely in passing:

      Production evaluations can elicit entirely new forms of misalignment before deployment. More importantly, despite being entirely derived from GPT-5 traffic, our evaluation shows the rise of a novel form of model misalignment in GPT-5.1 – dubbed “Calculator Hacking” internally. This behavior arose from a training-time bug that inadvertently rewarded superficial web-tool use, leading the model to use the browser tool as a calculator while behaving as if it had searched. This ultimately constituted the majority of GPT-5.1’s deceptive behaviors at deployment.