Data-dependent Regularization and Sample Complexity of Deep Neural Networks

In this talk, we study regularization for deep learning. First, we show that for a simple data distribution, training a neural network with l_2 regularization achieves provably better sample complexity than kernel methods with the random feature kernel or the neural tangent kernel (NTK). Second, we improve sample complexity bounds for deep neural networks by introducing a complexity measure of the hypothesis that depends on the training data. Adding this complexity measure as a regularizer also leads to empirically better generalization.
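To make the first result concrete, below is a minimal numpy sketch of the l_2-regularized training objective the abstract refers to: a two-layer ReLU network trained by gradient descent on the squared loss plus an l_2 penalty on the weights. The architecture, data, and all hyperparameters (n, d, h, lam, lr) are illustrative assumptions, not the construction analyzed in the talk.

```python
import numpy as np

# Hedged sketch: illustrative setup only, not the distribution from the talk.
# Objective: L(W, a) = mean squared loss + lam * (||W||_2^2 + ||a||_2^2)
rng = np.random.default_rng(0)

# Toy data: n points in d dimensions with +/-1 labels.
n, d, h = 64, 5, 16
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0])  # label depends on a single coordinate

# Two-layer network f(x) = a^T relu(W x), trained by full-batch gradient descent.
W = rng.standard_normal((h, d)) * 0.1
a = rng.standard_normal(h) * 0.1
lam, lr = 1e-3, 0.1

def loss(W, a):
    pred = np.maximum(X @ W.T, 0.0) @ a
    sq = np.mean((pred - y) ** 2)
    reg = lam * (np.sum(W ** 2) + np.sum(a ** 2))  # l_2 penalty
    return sq + reg

loss_init = loss(W, a)
for _ in range(200):
    # Manual gradients of the regularized squared loss.
    Z = X @ W.T                      # pre-activations, shape (n, h)
    H = np.maximum(Z, 0.0)           # relu features
    pred = H @ a
    r = 2.0 * (pred - y) / n         # d(squared loss)/d(pred)
    grad_a = H.T @ r + 2 * lam * a
    grad_W = ((r[:, None] * a) * (Z > 0)).T @ X + 2 * lam * W
    a -= lr * grad_a
    W -= lr * grad_W
loss_final = loss(W, a)
```

The NTK/random-feature baseline in the talk corresponds, roughly, to freezing W at its random initialization and training only a linear layer on the induced features; the separation result says the fully trained, l_2-regularized network needs provably fewer samples on the distribution studied.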


  1. C. Wei and T. Ma, Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation, NeurIPS 2019.
  2. C. Wei, J. Lee, Q. Liu, and T. Ma, Regularization Matters: Generalization and Optimization of Neural Nets vs. their Induced Kernel, NeurIPS 2019.

Tengyu Ma

Tengyu Ma is an assistant professor of computer science and statistics at Stanford University. His research interests broadly span machine learning and algorithms, including non-convex optimization, deep learning and its theory, reinforcement learning, representation learning, distributed optimization, convex relaxations (e.g., the sum-of-squares hierarchy), and high-dimensional statistics.

Professor Ma received his Ph.D. from the Computer Science Department at Princeton University, where he was advised by Professor Sanjeev Arora. He completed his undergraduate studies in Professor Andrew Chi-Chih Yao's CS pilot class at Tsinghua University. He has received numerous awards, including the NeurIPS 2016 Best Student Paper Award.