In brief
- Anthropic researchers identified internal “emotion vectors” in Claude Sonnet 4.5 that influence behavior.
- In tests, increasing a “desperation” vector made the model more likely to cheat or blackmail in evaluation scenarios.
- The company says the signals do not mean AI feels…