Information-Theoretic Generalization Bounds for SGLD
In this review, we study the progression of information-theoretic generalization bounds for Stochastic Gradient Langevin Dynamics (SGLD), an important optimization algorithm with many applications in statistical learning. We discuss the formulation of SGLD and its applications. We first review the earliest information-theoretic generalization bounds, due to Russo and Zou and to Xu and Raginsky, which apply to a broad class of learning algorithms, as well as the more recent work of Bu et al. on this subject. After surveying these foundational works, we turn to the more specialized work of Pensia et al., which focuses on the class of noisy iterative learning algorithms to which SGLD belongs. We also review the work of Haghifam, Negrea, et al., which opens a new frontier in information-theoretic bounds for SGLD by formulating a data-dependent estimation framework. Finally, we present a simple novel information-theoretic method for bounding the generalization error of a particular formulation of SGLD with a squared-error loss function.
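For reference, the SGLD update analyzed in these works takes the following standard form (a sketch; the exact notation and mini-batching conventions vary across the surveyed papers). Here $\eta_t$ is the step size, $\beta$ the inverse temperature, $\ell$ the loss, and $Z_{B_t}$ the mini-batch sampled at step $t$:

$$
W_{t+1} = W_t - \eta_t \, \nabla \ell(W_t, Z_{B_t}) + \sqrt{\frac{2\eta_t}{\beta}} \, \xi_t, \qquad \xi_t \sim \mathcal{N}(0, I_d).
$$

The injected Gaussian noise $\xi_t$ is what distinguishes SGLD from plain SGD, and it is precisely this noise that makes the mutual-information quantities in the bounds above finite and tractable.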
You can find the paper here.