☆ Yσɠƚԋσʂ ☆@lemmy.ml to Programmer Humor@lemmy.ml · English · 20 hours ago

ChatGPT apparently got rewarded for using its built-in calculator during training, and so it would covertly open its calculator, add 1+1, and do nothing with the result, on 5% of all user queries

alignment.openai.com

Sidestepping Evaluation Awareness and Anticipating Misalignment with Production Evaluations
A pipeline to uncover unknown misaligned behavior and scale the creation of realistic evaluations.
  • schnurrito@discuss.tchncs.de · 7 upvotes · 17 hours ago

    ctrl+f for “calculator”, though it doesn’t really use the (detailed) wording from the OP, which I think they copied from this list of links without attribution :P

