
DATA8003 - Theoretical Foundation of Deep Learning (Computation)
Course Instructor

Professor Yingyu LIANG
Associate Professor
HKU Musketeers Foundation Institute of Data Science and
Department of Computer Science, School of Computing and Data Science, HKU
Professor Yingyu Liang is an Associate Professor in the Musketeers Foundation Institute of Data Science and the Department of Computer Science at The University of Hong Kong. He is also an Associate Professor in the Department of Computer Sciences at the University of Wisconsin-Madison. Before that, he was a postdoctoral researcher at Princeton University. He received his Ph.D. in 2014 from Georgia Tech, and his M.S. (2010) and B.S. (2008) from Tsinghua University. He is a recipient of the NSF CAREER award.
His research group aims to provide theoretical foundations for modern machine learning models and to design efficient algorithms for real-world applications. Recent focuses include optimization and generalization in deep learning, robust machine learning, and their applications.

Professor Difan ZOU
Assistant Professor
HKU Musketeers Foundation Institute of Data Science and
Department of Computer Science, School of Computing and Data Science, HKU
Professor Difan Zou is an Assistant Professor in the HKU Musketeers Foundation Institute of Data Science and the Department of Computer Science, School of Computing and Data Science, at The University of Hong Kong. He received his Ph.D. in Computer Science from the University of California, Los Angeles (UCLA). He received a B.S. degree in Applied Physics from the School of Gifted Young, USTC, and an M.S. degree in Electrical Engineering from USTC. He has published multiple papers at top-tier machine learning conferences, including ICML, NeurIPS, ICLR, and COLT. He is a recipient of the Bloomberg Data Science Ph.D. Fellowship. His research interests are broadly in machine learning, optimization, and learning structured data (e.g., time-series or graph data), with a focus on the theoretical understanding of optimization and generalization in deep learning.
Course Description
Deep learning has achieved great success in many real-world applications. However, the reason why deep learning is so powerful remains elusive. The goal of this course is to introduce theoretical tools and methods developed to understand and explain the success of deep learning. In particular, the course covers multiple aspects of machine learning, including landscape analysis, optimization, generalization, and algorithm design. We will start with an introduction to the basic setup of machine learning problems, including loss functions, training algorithms, and the evaluation of generalization performance. We will then introduce conventional optimization theory and statistical learning theory, and discuss their limitations in studying over-parameterized deep neural network models. We will also introduce neural tangent kernel (NTK) theory, a modern theoretical framework that can handle the over-parameterization and nonconvexity issues in deep learning. Finally, we will discuss representation learning and benign overfitting in over-parameterized learning models, and their connections to optimization and generalization in deep learning. The instructor will give lectures on the selected topics. Students will need to complete homework (including programming and mathematical derivations) and a course project.
This course will be similar to STATS214 / CS229M: Machine Learning Theory at Stanford, and CS269: Foundations of Deep Learning at UCLA.
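To give a concrete flavor of the NTK material mentioned above, here is a minimal illustrative sketch (not course material, and the network sizes are arbitrary assumptions) of the empirical neural tangent kernel of a small two-layer ReLU network: the Gram matrix whose (i, j) entry is the inner product of the parameter gradients of the network output at inputs x_i and x_j.

```python
import numpy as np

# Empirical NTK of a two-layer ReLU network f(x) = a^T relu(W x).
# K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)>, theta = (W, a).
# Sizes below are illustrative choices, not from the course.
rng = np.random.default_rng(0)
d, m, n = 3, 64, 5                        # input dim, hidden width, samples

W = rng.normal(size=(m, d)) / np.sqrt(d)  # first-layer weights
a = rng.normal(size=m) / np.sqrt(m)       # second-layer weights
X = rng.normal(size=(n, d))               # n input points

def jacobian(X, W, a):
    """Per-sample gradient of f(x) w.r.t. all parameters, flattened."""
    H = X @ W.T                           # pre-activations, shape (n, m)
    act = np.maximum(H, 0.0)              # relu outputs
    mask = (H > 0).astype(float)          # relu derivative
    # d f / d W_k = a_k * 1[h_k > 0] * x, stacked over hidden units k
    dW = (mask * a)[:, :, None] * X[:, None, :]   # shape (n, m, d)
    da = act                              # d f / d a = relu(W x), shape (n, m)
    return np.concatenate([dW.reshape(len(X), -1), da], axis=1)

J = jacobian(X, W, a)
K = J @ J.T                               # empirical NTK Gram matrix, (n, n)

print(K.shape)                            # (5, 5)
print(np.allclose(K, K.T))               # symmetric by construction
```

As a Gram matrix of gradients, K is always symmetric positive semidefinite; NTK theory studies the regime of large width m, where this kernel stays nearly constant during training and gradient descent on the network behaves like kernel regression with K.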
Prerequisites
We require students to have prior knowledge of undergraduate linear algebra, statistics, probability, and calculus. Background in optimization and machine learning is preferred but not required.