
Weber–Fechner law in temporal difference learning derived from control as inference

This study investigates a novel nonlinear update rule for value and policy functions based on temporal difference (TD) errors in reinforcement learning (RL). In standard RL, the update rule makes the degree of update linearly proportional to the TD error, treating all rewards equally without...
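For context, here is a minimal sketch of the standard linear TD(0) update that the abstract refers to, alongside a hypothetical Weber–Fechner-style (logarithmically compressed) variant. The function names, the log1p compression, and the constants are illustrative assumptions for this sketch only, not the update rule the paper derives from control as inference.

```python
import numpy as np

# Standard tabular TD(0) update: the value change is linearly
# proportional to the TD error (all errors are weighted equally).
def td_update_linear(V, s, r, s_next, alpha=0.1, gamma=0.99):
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += alpha * td_error
    return td_error

# Hypothetical Weber-Fechner-style variant (illustrative assumption,
# not the paper's derived rule): the update magnitude grows
# logarithmically with |TD error|, compressing large errors while
# preserving the sign.
def td_update_log(V, s, r, s_next, alpha=0.1, gamma=0.99):
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += alpha * np.sign(td_error) * np.log1p(abs(td_error))
    return td_error

V = np.zeros(5)  # toy value table over 5 states
td_update_linear(V, s=0, r=1.0, s_next=1)
td_update_log(V, s=2, r=1.0, s_next=3)
```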


Bibliographic details
Main authors: Keiichiro Takahashi, Taisuke Kobayashi, Tomoya Yamanokuchi, Takamitsu Matsubara
Format: Article
Language: English
Published: Frontiers Media S.A., 2025-09-01
Collection: Frontiers in Robotics and AI
Online access: https://www.frontiersin.org/articles/10.3389/frobt.2025.1649154/full