HKU IDS Scholar

Professor Jinhong DU

Assistant Professor
HKU Musketeers Foundation Institute of Data Science and
Department of Statistics and Actuarial Science, School of Computing and Data Science, HKU

jinhongd@hku.hk

🌐 https://jaydu1.github.io/dujinhong/

Key expertise

Statistics and Biostatistics

About me

Prof. Jinhong Du is an HKU-100 Assistant Professor at the University of Hong Kong, beginning in Fall 2025. He holds joint appointments in the HKU Musketeers Foundation Institute of Data Science and the Department of Statistics and Actuarial Science, School of Computing and Data Science.
 
His research bridges statistical theory and high-impact applications, focusing on causal inference, interpretable machine learning, high-dimensional statistics, and statistical genomics. Prof. Du's work has been published in leading venues across multiple disciplines, including premier statistics journals such as the Journal of the American Statistical Association and the Journal of the Royal Statistical Society, Series B, top machine learning conferences such as NeurIPS and ICML, and prominent scientific journals such as the Proceedings of the National Academy of Sciences.
 
He earned his Ph.D. in Statistics and Machine Learning from Carnegie Mellon University, an M.S. in Statistics from the University of Chicago, and a B.S. in Statistics from Sun Yat-sen University.

Current Research Project

Feature importance quantification faces a fundamental challenge: when predictors are correlated, standard methods systematically underestimate their contributions. We prove that major existing approaches target identical population functionals under squared-error loss, revealing why they share this correlation-induced bias. To address this limitation, we introduce Disentangled Feature Importance (DFI), a nonparametric generalization of the classical R-squared decomposition via optimal transport. DFI transforms correlated features into independent latent variables using a transport map, eliminating correlation distortion. Importance is computed in this disentangled space and attributed back through the transport map’s sensitivity. DFI provides a principled decomposition of importance scores that sum to the total predictive variability for latent additive models and to interaction-weighted functional ANOVA variances more generally, under arbitrary feature dependencies.
We develop a comprehensive semiparametric theory for DFI. For general transport maps, we establish root-n consistency and asymptotic normality of importance estimators in the latent space, which extend to the original feature space for the Bures-Wasserstein map. Notably, our estimators achieve second-order estimation error. By design, DFI avoids the computational burden of repeated submodel refitting and the challenge of estimating conditional covariate distributions, making it computationally efficient.
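
To make the three-step pipeline concrete, here is a minimal sketch in Python. It is an illustration, not the paper's estimator: it assumes Gaussian features (so the Bures-Wasserstein transport map reduces to whitening), fits a simple linear model in the latent space, uses permutation importance as the latent importance measure, and applies a hypothetical squared-sensitivity rule to attribute latent scores back to the original features.

```python
# Illustrative sketch of the DFI pipeline described above -- NOT the paper's
# estimator. Simplifying assumptions: Gaussian features, a whitening
# (Bures-Wasserstein) transport map, a linear latent-space model, permutation
# importance, and a hypothetical squared-sensitivity attribution rule.
import numpy as np

rng = np.random.default_rng(0)

# Correlated features: X1 and X2 are strongly correlated, X3 is independent.
n, d = 5000, 3
Sigma = np.array([[1.0, 0.8, 0.0],
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
X = rng.standard_normal((n, d)) @ np.linalg.cholesky(Sigma).T
y = 2.0 * X[:, 0] + X[:, 2] + 0.1 * rng.standard_normal(n)

# Step 1: transport correlated X to (near-)independent latent Z.
# For Gaussian X, the Bures-Wasserstein map to a standard Gaussian is
# whitening: Z = (X - mu) @ Sigma^{-1/2}.
mu = X.mean(axis=0)
vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
Sigma_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
Z = (X - mu) @ Sigma_inv_sqrt
J = vecs @ np.diag(vals ** 0.5) @ vecs.T  # Jacobian of inverse map: X = mu + Z @ J

# Step 2: measure importance in the disentangled space. Because the latent
# coordinates are independent, permuting one of them "removes" it without
# distorting the joint distribution of the remaining coordinates.
beta, *_ = np.linalg.lstsq(Z, y - y.mean(), rcond=None)

def predict(Znew):
    return Znew @ beta + y.mean()

base_mse = np.mean((y - predict(Z)) ** 2)
imp_Z = np.empty(d)
for k in range(d):
    Zp = Z.copy()
    Zp[:, k] = rng.permutation(Zp[:, k])
    imp_Z[k] = np.mean((y - predict(Zp)) ** 2) - base_mse

# Step 3: attribute latent importance back to the original features through
# the transport map's sensitivity. Hypothetical rule: share imp_Z[k] among
# the features in proportion to the squared Jacobian entries.
share = J ** 2
share /= share.sum(axis=0, keepdims=True)  # each column sums to 1
imp_X = share @ imp_Z                      # total importance is preserved

print("latent importance: ", np.round(imp_Z, 3))
print("feature importance:", np.round(imp_X, 3))
```

In this toy example, permuting a latent coordinate is a valid removal operation precisely because the latent coordinates are independent; applying the same permutation directly to the correlated X1 or X2 would distort their joint distribution, which is the correlation-induced bias DFI is designed to eliminate.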

Research Interests

Causal inference, interpretable machine learning, statistical network analysis, high-dimensional statistics, statistical genomics