The inverse variance–flatness relation in stochastic gradient descent is critical for finding flat minima

Despite tremendous success of the stochastic gradient descent (SGD) algorithm in deep learning, little is known about how SGD finds generalizable solutions at flat minima of the loss function in high-dimensional weight space. Here, we investigate the connection between SGD learning dynamics and the...

Bibliographic Details
Published in: Proc Natl Acad Sci U S A
Main Authors: Feng, Yu; Tu, Yuhai
Format: Article
Language: English
Published: National Academy of Sciences, 2021
Online Access: https://ncbi.nlm.nih.gov/pmc/articles/PMC7936325/
https://ncbi.nlm.nih.gov/pubmed/33619091
http://dx.doi.org/10.1073/pnas.2015617118