Emotion concepts and their function in a large language model | Anthropic

2026-04-11 12:33

Question description:

anthropic.com

Emotion concepts and their function in a large language model

Interpretability research from Anthropic on emotion concepts

[!quote]+
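In a new paper from our Interpretability team, we analyzed the internal mechanisms of Claude Sonnet 4.5 and found emotion-related representations that shape its behavior. These correspond to specific patterns of artificial "neurons" that activate in situations, and promote behaviors, that the model has learned to associate with the concept of a particular emotion (e.g., "happy" or "afraid"). The patterns themselves are organized in a way that echoes human psychology, with more similar emotions corresponding to more similar representations. In contexts where you might expect a certain emotion to arise for a human, the corresponding representations are active. Note that none of this tells us whether language models actually feel anything or have subjective experiences. But our key finding is that these representations are functional, in that they influence the model's behavior in ways that matter.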

transformer-circuits.pub

Emotion Concepts and their Function in a Large Language Model

Community answers:

Tags: Artificial Intelligence, Repost, Anthropic
