
IDS Research Seed Funds 2023: Four Transformative Projects Conclude in Triumph

Outcome Presentation Session Recap

On October 15, 2025, our teams presented two years of outcomes supported by the HK$300,000 seed money awarded under the HKU IDS Research Seed Funds launched in 2023 (“IDS-RSF2023”). The Principal Investigators of the four awarded projects, IDS scholars, and IDS management members came together to celebrate the impact of two years of hard work on interdisciplinary research directions benefitting sectors territory-wide.

This event was more than a showcase of good work: it was a tribute to collaboration and real-world application, built around the themes of:

  • Foundation of Data Science 
  • Machine Learning; and  
  • Application of Data Science

 

The four teams funded under IDS-RSF2023 presented the following projects:

Project #1 Individualised Bleeding and Stroke Risk Prediction in Patients with Atrial Fibrillation: A Machine Learning Approach Using Multimodal Healthcare Big Data
PI
Prof. Esther Chan (Pharmacology & Pharmacy)
Co-PI
Prof. Qingpeng Zhang (HKU IDS / Pharmacology & Pharmacy)
Co-Investigators include
  • Prof. Celine Chui, School of Nursing / School of Public Health 
  • Prof. Eric Wan, Family Medicine and Primary Care / Pharmacology & Pharmacy 
  • Prof. Gary Lau, School of Clinical Medicine 
  • Prof. Reynold Cheng, School of Computing & Data Science 
Research Areas Concerned
Precision Medicine, Multimodal Healthcare AI
Executive Summary
Atrial Fibrillation (AF) is a global health concern affecting over 30 million individuals, and is characterised by an irregular and abnormally fast heart rate, with symptoms including palpitations, dizziness, fatigue and shortness of breath. Patients with AF have a fivefold increased risk of stroke; therefore, oral anticoagulation therapy is often prescribed to prevent stroke occurrence. However, oral anticoagulants are associated with an increased risk of bleeding, which can sometimes be fatal. Thus, it is crucial to balance the risk of stroke and bleeding, and quantify the risk-benefit of various interventions to optimise treatment for each individual. Although current risk scores exist, they are limited in that they predict risk at the population level, do not consider temporal variables, and are derived from Caucasian populations, which may not be generalisable to the Chinese population. Therefore, we aimed to use machine learning methods and healthcare data to develop a risk prediction tool that can provide an individualised bleeding and stroke risk score for patients with AF. Ultimately, this would assist clinicians’ prescribing decisions, and ease patients’ concerns regarding potential side effects, improving medication adherence and patient outcomes.
 
We obtained anonymised electronic health records from the Hong Kong Hospital Authority, and identified patients who were diagnosed with AF and prescribed anticoagulants. We incorporated static variables, including baseline age and sex; dynamic variables, including comorbidities, laboratory test readings, types and patterns of anticoagulant use, and the use of other concurrent medications; and temporal variables, including a cosine-transformed month variable to capture cyclicality. The outcome was whether each patient experienced a stroke event in the subsequent month. We first used traditional logistic regression and Cox regression models to develop benchmark models for performance evaluation, and subsequently used deep learning methods to improve on the performance of these benchmarks.
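The cyclical encoding mentioned above can be illustrated with a minimal sketch. The function name `cosine_month` and the 12-month period scaling are assumptions made for illustration only; the summary does not state the exact transform the team used.

```python
import math


def cosine_month(month: int) -> float:
    """Map a calendar month (1-12) onto a cosine wave with a 12-month period.

    Hypothetical helper for illustration: January maps to the peak (1.0)
    and July to the trough (-1.0), so December and January end up close
    together instead of 11 units apart.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return math.cos(2 * math.pi * (month - 1) / 12)


# Print the encoded value for a few months.
for m in (1, 2, 7, 12):
    print(m, round(cosine_month(m), 3))
```

Note that a cosine term alone is symmetric around January, so February and December receive the same value. A common way to tell them apart is to add a matching sine term, although whether the project did so is not stated.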
 
Model performance was assessed using the area under the curve (AUC); the benchmark models achieved an AUC of 37.7% and 48.4%, whereas the machine learning models achieved AUCs of 94% or above, marking a significant improvement in performance. Our models successfully identified the majority of stroke cases in the test set, but also had a significant number of false positives, demonstrating high sensitivity but relatively low precision. Our model also identified variables most strongly associated with the risk of stroke, and revealed potentially important contributors such as a history of stroke, the use of certain drugs including oral anticoagulants, mucolytics, and anti-arrhythmic drugs, and laboratory variables such as renal function and international normalized ratio.
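For readers less familiar with these metrics, the sketch below shows how AUC, sensitivity, and precision can be computed, and how a model can combine high sensitivity with low precision. All scores and counts are hypothetical and chosen only for illustration; they are not results from this project, and the function names are our own.

```python
from itertools import product
from typing import Sequence


def auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as half)."""
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity(tp: int, fn: int) -> float:
    """Share of true stroke cases that the model flags."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of flagged patients who actually had a stroke."""
    return tp / (tp + fp)


# Hypothetical counts (NOT project data): most strokes are caught,
# but many flagged patients are false positives.
tp, fn, fp = 90, 10, 400
print("sensitivity:", round(sensitivity(tp, fn), 3))
print("precision:  ", round(precision(tp, fp), 3))

# Hypothetical risk scores for stroke and non-stroke patients.
print("AUC:        ", auc([0.9, 0.8], [0.1, 0.85]))
```

Because stroke events within a given month are rare, even a small false-positive rate among the many non-stroke patients can outnumber the true positives, which is one way high sensitivity and low precision can coexist.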
 
In this project, we developed a machine learning model for stroke prediction in patients with AF, accounting for dynamic and temporal variables that are not considered in current clinical risk scores. This model is readily extendable to predict the risk of bleeding as well. This model provides clinical interpretability, offering insights into factors that contribute most to stroke risk, potentially uncovering novel clinical associations. Most importantly, the model achieved a very high sensitivity, successfully identifying most stroke cases. From a clinical standpoint, this sensitivity is valuable, as missing a high-risk patient carries serious consequences, and identifying high-risk patients will allow for timely intervention.
Project #2 Split Learning over 5G+ Edge Computing: Enabling Deep Learning on Resource-constrained Devices
PI
Prof. Xianhao Chen (Electrical and Electronic Engineering)
Co-PI
Prof. Xihui Liu (HKU IDS / Electrical and Electronic Engineering)
Research Areas Concerned
Edge AI, Privacy-Preserving Learning
Executive Summary
With funding support from IDS, the PI and co-PI explored resource-constrained split federated learning (SFL) and its potential to democratize machine learning in edge computing systems. The work produced a series of papers, including the first convergence analysis (showing the impact of model splitting) and optimization framework for SFL [R4] and hierarchical SFL [R3], communication-efficient split learning (SL) frameworks [R2, R5], and SFL frameworks under device/data heterogeneity [R1]. These findings characterize the important theoretical trade-offs in resource-constrained SL and provide algorithm designs that improve its effectiveness and efficiency.
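To make the idea of model splitting concrete, here is a minimal sketch of plain split learning with a single client and a single server. It is a toy under stated assumptions, not the SFL, hierarchical SFL, or AdaptSFL frameworks from the cited papers: the model is a two-weight scalar chain, labels are assumed to live on the server, and the `Client`, `Server`, and `train` names are hypothetical.

```python
from typing import List, Tuple


class Client:
    """Resource-constrained device: keeps raw inputs and the first weight."""

    def __init__(self, w1: float) -> None:
        self.w1 = w1
        self._x = 0.0

    def forward(self, x: float) -> float:
        # Only this activation (the "smashed data") leaves the device.
        self._x = x
        return self.w1 * x

    def backward(self, grad_activation: float, lr: float) -> None:
        # Chain rule: d(loss)/d(w1) = d(loss)/d(activation) * x.
        self.w1 -= lr * grad_activation * self._x


class Server:
    """Edge server: keeps the second weight and computes the loss."""

    def __init__(self, w2: float) -> None:
        self.w2 = w2

    def step(self, activation: float, y: float, lr: float) -> Tuple[float, float]:
        err = self.w2 * activation - y
        grad_activation = err * self.w2  # uses w2 before it is updated
        self.w2 -= lr * err * activation
        return 0.5 * err * err, grad_activation


def train(client: Client, server: Server,
          data: List[Tuple[float, float]], epochs: int, lr: float) -> List[float]:
    """Run split training and return the mean loss of each epoch."""
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            activation = client.forward(x)                 # device -> server
            loss, grad = server.step(activation, y, lr)    # server-side update
            client.backward(grad, lr)                      # server -> device
            total += loss
        history.append(total / len(data))
    return history


# Toy data following y = 2x, so the product w1 * w2 should approach 2.
losses = train(Client(0.5), Server(0.5),
               [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0)], epochs=50, lr=0.1)
print("first epoch loss:", losses[0])
print("last epoch loss: ", losses[-1])
```

Where the model is cut determines how much computation stays on the device and how much data crosses the network in each step, which is the kind of trade-off the papers analyse. SFL additionally aggregates client-side models across many devices, a step this sketch omits.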
 

[R1] Wei Wei, Zheng Lin, Xihui Liu, Hongyang Du, Dusit Niyato, and Xianhao Chen, “Optimizing Split Federated Learning with Unstable Client Participation”, IEEE Transactions on Mobile Computing (under review).
[R2] Zheng Lin, Guangyu Zhu, Yiqin Deng, Xianhao Chen, Yue Gao, Kaibin Huang, and Yuguang Fang, “Efficient Parallel Split Learning Over Resource-Constrained Wireless Edge Networks”, IEEE Transactions on Mobile Computing, 2024.
[R3] Zheng Lin, Wei Wei, Zhe Chen, Chan-Tong Lam, Xianhao Chen, and Yue Gao, “Hierarchical Split Federated Learning: Convergence Analysis and System Optimization”, IEEE Transactions on Mobile Computing, 2025.
[R4] Zheng Lin, Guanqiao Qu, Wei Wei, Xianhao Chen, and Kin K. Leung, “AdaptSFL: Adaptive Split Federated Learning in Resource-Constrained Edge Networks”, IEEE Transactions on Networking, 2025.
[R5] Tao Li, Yiyang Song, Yulin Tang, Cong Wu, Xihui Liu, and Xianhao Chen, “Communication-efficient Split Federated Fine-tuning of Large Language Models with Temporal Compression”, WWW 2026 (under review).

Project #3 Intelligent Tutoring System for Collaborative Learning: A Hypergraph Approach to Analyzing Asynchronous Learning Process Data
PI
Prof. Shihui Feng (Education)
Co-PI
Prof. Alec Kirkley (HKU IDS / Urban Planning & Development)
Research Areas Concerned
AI in Education, Hypergraph Analytics
Project #4 CREC – An LLM-based Conversational Public Legal Knowledge Recommendation System
PI
Prof. Benjamin Kao (School of Computing & Data Science)
Co-PI
Prof. Chao Huang (HKU IDS / School of Computing & Data Science)
Research Areas Concerned
LegalTech, Explainable LLMs
Executive Summary
With advancements in legal technology, access to legal information, such as judgments and legislation, has become widely available online. The Free Access to Law Movement (FALM), a global association of over 60 member organizations, promotes free access to legal data. As a FALM member, the Hong Kong Legal Information Institute (HKLII), operated by The University of Hong Kong (HKU), provides a platform for accessing Hong Kong ordinances, regulations, and court judgments. Recent integration of AI technologies has further enhanced legal research on the HKLII platform, in line with the growing trend of AI adoption in legal information systems.
 
Despite improved accessibility, non-legal professionals face challenges in comprehending and navigating primary legal sources due to their formal language and complexity. Additionally, users often struggle to identify applicable legal principles for their specific situations. To address comprehensibility, HKU’s Law and Technology Centre developed the Community Legal Information Centre (CLIC), which simplifies legal concepts into layperson-friendly articles. As of 2025, CLIC hosts 2,386 articles, yet navigability remains an issue as users must manually locate relevant content.
 
In this project, we improve both comprehensibility and navigability by constructing a Legal Question Bank (LQB), comprising approximately 50,000 model legal questions linked to CLIC articles. The LQB allows users to find pre-formulated questions matching their inquiries, directing them to the corresponding CLIC content without extensive manual searching. The LQB was developed using two methods: (1) machine-generated questions (MGQs) produced by large language models (LLMs) and (2) human-composed questions (HCQs) crafted by legal experts.
 
To further enhance usability, we designed the CLIC Recommender (CRec), a conversational tool that assists users in identifying relevant legal questions. CRec interprets users’ verbal or colloquial descriptions of their situations, disambiguates their issues, and recommends a shortlist of pertinent questions from the LQB. Users can then select the most relevant question and access the corresponding CLIC article for detailed explanations. This approach significantly streamlines legal research for non-experts, reducing the need for manual browsing.
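The shortlist step can be pictured with a deliberately simple sketch. CRec itself is LLM-based and conversational; the code below swaps in plain word-overlap (Jaccard) scoring as a stand-in, and the questions, article labels, and function names are all hypothetical examples rather than real LQB or CLIC content.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lower-case a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def shortlist(description: str,
              question_bank: List[Tuple[str, str]],
              k: int = 3) -> List[Tuple[str, str]]:
    """Return up to k (question, article) pairs ranked by Jaccard overlap
    between the user's description and each model question."""
    query = tokens(description)
    scored = []
    for question, article in question_bank:
        words = tokens(question)
        union = query | words
        score = len(query & words) / len(union) if union else 0.0
        scored.append((score, question, article))
    # Highest overlap first; drop questions with no shared words.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(q, a) for score, q, a in scored[:k] if score > 0]


# Hypothetical question bank: (model question, linked article label).
bank = [
    ("Can my landlord increase the rent during the tenancy?", "article-A"),
    ("What can I do if my employer does not pay my wages?", "article-B"),
    ("How do I make a will?", "article-C"),
]

for question, article in shortlist("my landlord wants to raise the rent", bank):
    print(article, "|", question)
```

A lexical match like this misses paraphrases (for example "raise" versus "increase"), which is part of the motivation for using an LLM to interpret colloquial descriptions and disambiguate the user's issue.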
 
Since its deployment on the CLIC platform in January 2025, CRec has supported public legal research, with the platform recording over 7 million article views in the past year. The tool handles an average of 600 user conversations monthly, demonstrating its practical utility. Additionally, the Faculty of Law has adopted CRec as an educational resource, integrating it into clinical legal training to enhance law students’ research and problem-solving skills.

The IDS Research Seed Funds (IDS-RSF) 2023 represents the inaugural effort by IDS to advance interdisciplinary research in data science and artificial intelligence, supported by dedicated IDS funding. This initiative has successfully cultivated research partnerships between IDS scholars and members from diverse departments and faculties. We are pleased to report the promising research outcomes resulting from these collaborations, and we look forward with great anticipation to the contributions of IDS-RSF 2025.