Research Postgraduate Programme

DATA8003 - Theoretical Foundation of Deep Learning (Computation)

Course Instructor

Dr Difan ZOU

Course Description 

Deep learning has achieved great success in many real-world applications. However, the reason why deep learning is so powerful remains elusive. The goal of this course is to introduce theoretical tools and methods developed to understand and explain the success of deep learning. In particular, this course will cover multiple aspects of machine learning, including landscape analysis, optimization, generalization, and algorithm design. We will start by introducing the basic setup of machine learning problems, including loss functions, training algorithms, and the evaluation of generalization performance. We will then introduce conventional optimization theory and statistical learning theory, and discuss their limitations in studying over-parameterized deep neural network models. We will also introduce neural tangent kernel (NTK) theory, a modern theoretical method that can handle the over-parameterization and nonconvexity issues in deep learning. Finally, we will discuss representation learning and benign overfitting of over-parameterized learning models, and their connections to optimization and generalization in deep learning. The instructor will give lectures on the selected topics. Students will need to complete homework (including programming and mathematical derivations) and a course project.

This course is similar to STATS214 / CS229M: Machine Learning Theory at Stanford and CS269: Foundations of Deep Learning at UCLA.


Students are required to have prior knowledge of undergraduate linear algebra, statistics, probability, and calculus. A background in optimization and machine learning is preferred but not required.