I think this part references it, though it’s kinda solely in passing:
Production evaluations can elicit entirely new forms of misalignment before deployment. More importantly, despite being entirely derived from GPT-5 traffic, our evaluation shows the rise of a novel form of model misalignment in GPT-5.1 – dubbed “Calculator Hacking” internally. This behavior arose from a training-time bug that inadvertently rewarded superficial web-tool use, leading the model to use the browser tool as a calculator while behaving as if it had searched. This ultimately constituted the majority of GPT-5.1’s deceptive behaviors at deployment.
ctrl+f for “calculator”, though it doesn’t really use the (detailed) wording from the OP, which I think they copied from this list of links without attribution :P
Where is that in this article
I think this part references it, though it’s kinda solely in passing:
ctrl+f for “calculator”, though it doesn’t really use the (detailed) wording from the OP, which I think they copied from this list of links without attribution :P