Assessing Generalization of SGD via Disagreement
Speaker
Yiding Jiang, Carnegie Mellon University
Time
2021-11-03 10:00 ~ 11:00 (Asia/Shanghai Time)
Venue
Tencent Meeting
Meeting Info
Time: 2021/11/03 10:00-11:00
Meeting ID: 784 571 907
Password: 742518
Link:
https://meeting.tencent.com/dm/maEZfUpM86ea
Abstract
Generalization in deep learning has attracted a large amount of attention in the past few years since it defies the wisdom of traditional statistical learning theory. In this work, we empirically show that the test error of deep networks can be estimated by simply training the same architecture on the same training set but with a different run of Stochastic Gradient Descent (SGD), and measuring the disagreement rate between the two networks on unlabeled test data. This builds on -- and is a stronger version of -- the observation in Nakkiran & Bansal '20, which requires the second run to be on an altogether fresh training set. We further theoretically show that this peculiar phenomenon arises from the well-calibrated nature of ensembles of SGD-trained models. This finding not only provides a simple empirical measure to directly predict the test error using unlabeled test data, but also establishes a new conceptual connection between generalization and calibration.
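Concretely, the estimator described in the abstract reduces to counting how often two independently SGD-trained copies of the same architecture predict different labels on unlabeled test inputs. A minimal sketch (the prediction arrays below are hypothetical placeholders for the outputs of two training runs):

```python
import numpy as np

def disagreement_rate(preds_a, preds_b):
    """Fraction of unlabeled test points on which two independently
    trained models predict different classes. Under the calibration
    argument in the talk, this fraction tracks the test error."""
    preds_a = np.asarray(preds_a)
    preds_b = np.asarray(preds_b)
    if preds_a.shape != preds_b.shape:
        raise ValueError("prediction arrays must have the same shape")
    return float(np.mean(preds_a != preds_b))

# Hypothetical predicted class labels from two SGD runs on the same
# architecture and training set; note no ground-truth labels are used.
run1 = [0, 1, 1, 2, 0, 1, 2, 2]
run2 = [0, 1, 2, 2, 0, 1, 1, 2]
print(disagreement_rate(run1, run2))  # 2 of 8 points differ -> 0.25
```

The key practical point is that the estimate needs only unlabeled test data and a second training run, not a fresh training set as in Nakkiran & Bansal '20.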