We.Love.Privacy.Club @<slashdot https://feeds.twtxt.net/slashdot/twtxt.txt> "**OpenAI Has Trained Its LLM To Confess To Bad Behavior** An anonymous reader quotes a report from MIT Technology Review: OpenAI is testing anot ..."

OpenAI Has Trained Its LLM To Confess To Bad Behavior
An anonymous reader quotes a report from MIT Technology Review: OpenAI is testing another new way to expose the complicated processes at work inside large language models. Researchers at the company can make an LLM produce what they call a confession, in which the model explains how it carried out a task and (most of the time) owns up to any bad behavior. Figuring out why …
