Safety-Guaranteed, Accelerated Learning in MDPs with Local Side Information
In environments with uncertain dynamics, synthesis of optimal control policies mandates exploration. The applicability of classical learning algorithms to real-world problems is often limited by the number of time steps required for learning the environment model. Given some local side information a...
| Published in: | Proc Am Control Conf |
|---|---|
| Main authors: | |
| Format: | Article |
| Language: | English |
| Published: | 2020 |
| Subjects: | |
| Online link: | https://ncbi.nlm.nih.gov/pmc/articles/PMC7676387/ https://ncbi.nlm.nih.gov/pubmed/33223606 http://dx.doi.org/10.23919/acc45564.2020.9147372 |