Skip to content

Deadline: October 25, 2025 (Sat)

Organized by

 

Co-host

Co-organized by

Project Organized by

Supported by

Shanghai AI Laboratory and The University of Hong Kong will co-host the third Shanghai–Hong Kong AI Forum on March 1–2, 2026, in Zhangjiang, Pudong, Shanghai. Building on the success of previous editions—which brought together over 100 outstanding scientists and scholars from world-renowned institutions and attracted more than 100,000 participants online and onsite—the Forum offers researchers and academics from Hong Kong, Mainland China, and leading overseas universities an excellent opportunity to share insights and explore the latest advances in AI research and applications. This year, the Forum centres on three key themes: AI for Science, Agents, and AI for Education. The programme features keynote speeches, thematic talks across these tracks, and panel discussions to spark cross-cutting dialogue among participants. Join us for vibrant exchanges of ideas and innovations in AI and emerging data science fields, fostering collaboration at the frontier of these disciplines. Register today!

Date

Day 1:  1 March, 2026 (Sun)   0900 – 1720

Day 2:  2 March, 2026 (Mon)   0900 – 1630

Mode

Hybrid. Seats for on-site participation has been full. You are welcome to register for the online participation (Quotas are limited!)

Venue

Auditorium, HKU-CDS Teaching and Research Site,

Building 6, No. 500 Fangchun Road, Pudong New Area, Shanghai, China

News & Announcement

Schedule for Student Lightning Talks Confirmed!

We are pleased to announce that 7 outstanding PhD students have been selected to present at the “Student Lightning Talks” sessions! Please refer to the schedule below and the presenters will be contacted shortly.

Invited Speakers

Director

Theory Lab, Central Research Institute

Huawei Technologies Co., Ltd.

Professor

Teaching and Learning Innovation Centre (TALIC) and 
Faculty of Education

The University of Hong Kong

Professor

College of Computer Science and Technology

Zhejiang University

Research Scientist & Head of Large Model Center 

Shanghai Artificial Intelligence Laboratory

Professor

School of Mathematical Sciences

Shanghai Jiao Tong University

Associate Professor

Shanghai Jiao Tong University

Director

Computational Systems Biology Centre

Fudan University

Young Associate Researcher 

Institute of Modern Languages and School of Computer Science and Technology

Fudan University

Young Leading Scientist

Shanghai Artificial Intelligence Laboratory

Assistant Professor

College of AI

Tsinghua University

President 

Olin College of Engineering 

Professor

State Key Laboratory of Multimodal Artificial Intelligence Systems

Institute of Automation, Chinese Academy of Sciences

Distinguished
Professor 

Fudan University

Associate Professor

Department of Electronic Engineering

Tsinghua University

Professor

School of Psychological and Cognitive Sciences

Peking University

Senior Principal Research Manager

Microsoft Research Asia – Tokyo lab 

Assistant Professor

Shanghai Innovation Academy

Rachleff University Professor

University of Pennsylvania

Professor

Division of AI & Data Science, 
School of Computing and Data Science

The University of Hong Kong

Research Scientist

Shanghai Artificial Intelligence Laboratory 

Professor

School of Computer Science

Shanghai Jiao Tong University

Ms Qunying Zhang

Editor-in-Chief

Chaspark Technology Website 

Professor

Division of AI & Data Science, School of Computing and Data Science

The University of Hong Kong

Founder

Machine Heart

Professor

Shanghai Artificial Intelligence Laboratory 

Bosch AI Professor 

Computer Science Department

Tsinghua University

Bosch AI Professor 

Computer Science Department

Tsinghua University

Bosch AI Professor 

Computer Science Department

Tsinghua University

Bosch AI Professor 

Computer Science Department

Tsinghua University

Bosch AI Professor 

Computer Science Department

Tsinghua University

HKU Musketeers Foundation Institute of Data Science

Programme

1 March, 2026 (Sun)

Opening Ceremony & Plenary Session
0830 – 0900
Registration
0855 – 0900
Group Photo
0900 – 0910
Welcome Address
0910 – 0940
Quest of AI towards Specializable Generalist: From Reasoning to Scientific Discovery

AbstractThe pursuit of high-efficiency Artificial General Intelligence (AGI) requires more than brute-force scaling of model size and data. While scaling remains a key driver of capability, equally important are scalable architectural and principles—designs that continue to work, improve, and remain controllable as we vary model scale, domains, and modalities. Central to our approach is the concept of the “Specialized Generalist” – a pathway that achieves deep expertise across multiple domains without sacrificing broad generalization capabilities. In this talk, we introduce the “Specialized Generalist” paradigm and our implementation of it, SAGE (Synergistic Architecture for Generalized Expertise), a three-layer architecture designed to balance specialization and generalization in a systematic way. We will describe how SAGE’s Base Model, Synergy Fusion, and Exploration-Evolution layers interact in practice, focusing on concrete mechanisms for coordinating domain-specific expertise with broad general reasoning. We will share empirical results and recent advances in large reasoning models, embodied AI, and scientific applications to further illustrate the approach. A central motivation is to support “AGI for Science” by building a stable plateau of capabilities that can reliably assist with complex scientific workflows rather than isolated demos. Finally, we will outline the safety and governance questions that arise when deploying Specialized Generalist systems in high-impact settings, and discuss what we have learned so far about monitoring, alignment, and operational safeguards.
0940 – 1010
Engineering Faithful and Interpretable AI Systems

AbstractLarge Language Models (LLMs) and Vision Language Models (VLMs) have achieved remarkable performance across a wide range of tasks. However, their growing deployment has exposed fundamental limitations in faithfulness, safety, and transparency. In this talk, Prof. Vidal will present a unified perspective on addressing these challenges through principled model interventions and interpretable decision-making frameworks. He first introduces Parsimonious Concept Engineering (PaCE), an approach that improves faithfulness and alignment by selectively removing undesirable internal activations, mitigating hallucinations and biased language while preserving linguistic competence. Prof. Vidal then present Information Pursuit (IP), an interpretable-by-design prediction framework that replaces opaque reasoning with a sequence of informative, user-interpretable queries, yielding concise explanations alongside accurate predictions. Results across text, vision, and medical tasks illustrate how these ideas advance transparency without sacrificing performance. Together, these contributions point toward a broader direction for building AI systems that are powerful, faithful, and aligned with human values.
1010 – 1040
TBC
1040 – 1110
From Sensing to Acting: Challenges and Opportunities in Embodied AI

AbstractRecent progress in AI has been driven by large-scale data and models for vision and language. Compared to vision and language AIs, Embodied AI – AI systems that interact with the physical world – remains constrained due to the sensing capability and data scarcity. This talk examines Embodied AI through the lens of “from sensing to acting,” highlighting the current chief bottlenecks in sensing and action learning and discusses our path for achieving the Embodied AI that can cooperate with us in the real world.
1110 – 1140
Research poster presentation by students from the HKU IDS and the Shanghai AI Laboratory
1140 – 1300
Lunch
Lunch Break
Workshop 1: Science for AI
1300 – 1330
Registration
1330 – 1350
Dynamics-based AI4Science and Science4AI

AbstractThe rapid development of high-throughput omics technologies has provided unprecedented big data support for life science research. Biomedical data from multiple sources, dimensions, and scales constitute typical multi-source heterogeneous big data, exhibiting significant spatiotemporal dynamic characteristics. In response to this feature, there is an urgent need to develop a system of dynamic theories and AI methods that can accurately characterize the spatiotemporal evolution rules of data, including tipping point detection and early warning prediction based on dynamic systems and AI, time series prediction based on the low-dimensional characteristics of attractors, causal inference based on embedding theory, and AI-enabled nonlinear multimodal data fusion based on deep learning. These new data science theories and AI methods centered on dynamics can not only help understand and predict the dynamic behaviors of complex systems and analyze their intrinsic processes and mechanisms but also provide a more physically interpretable modeling paradigm for artificial intelligence. Thus, they form a mutually promoting research paradigm of AI for Science (AI4Science) and Science-driven AI (Science4AI). The relevant theories and algorithms can be widely applied to key scenarios such as early warning of tumor invasion, metastasis and recurrence, real-time monitoring of public health, sub-health risk assessment, time series prediction, and trusted AI construction, which is of great significance for promoting the interdisciplinary integration of dynamics, systems science, data science, and artificial intelligence.
1350 – 1410
From AI4S to AI Scientist

AbstractIn the era of Large Language Models and AI Agents, the paradigm of AI for Science (AI4S) is evolving toward a new stage of “AI Scientists.” This talk introduces OmniScientist, a general-purpose intelligent research platform designed for deep human-AI collaboration. By integrating knowledge and workflows across three dimensions—science, research, and scholars—through scientific large models and multi-agent workflows, the platform aims to achieve breakthroughs in theoretical hypothesis generation and closed-loop automated experimentation. This marks a fundamental shift in the research paradigm from “AI-assisted” to “AI-driven,” aiming to reshape the future path of scientific discovery.
1410 – 1430
Linked: New Challenges of Complexity Interacted with Intelligence

AbstractThe past decades have witnessed the enormous efforts to explore the network complexity rooted in all of science, from neurobiology to statistical physics. This talk will briefly review the complex network science in the past years starting from the seminar work of small-world/scale-free models to human dynamics and social intelligence as well. With facing the era of intelligence in natural and artificial versions, I will report several examples of our recent findings covered diffusion graph transformer, brain structural-functional mapping, and collective oscillator dynamics, which offer some snapshots of toy models to the new challenges of connectionism as the complex intelligence.
1430 – 1450
“LLM + KG ”: The New Equation for AI-Driven Scientific Discovery

AbstractThe advancement of scientific discovery increasingly depends on the integration of knowledge bases with generative intelligence. Knowledge graphs provide explicit representations of entities, relations, and domain knowledge, while large language models offer powerful capabilities in reasoning, abstraction, and summary. Their synergistic integration enables knowledge-grounded, interpretable, and adaptive solutions to complex scientific problems.As such, this report begins by examining the respective strengths and limitations of these two paradigms, along with their interaction and integration methods. Building on this understanding, it then highlights recent case studies where the fusion of large language models and knowledge graphs drives scientific discovery.
1450 – 1510
Intern Scientific Multimodal Foundation Model

AbstractShanghai AI Laboratory has released and open-sourced the Intern scientific multimodal large models, Intern-S1 and Intern-S1-Pro. Their general capabilities rank among the top tier of open-source models, while the scientific capabilities have reached a world-leading level. Enriched with multidisciplinary domain knowledge, the models focus on significantly enhancing scientific capabilities. They outperform state-of-the-art closed-source models such as GPT-5.2 and Gemini-3-Pro on professional task benchmarks across various disciplines, including chemistry, materials science, and life sciences. Furthermore, Intern-S1 pioneers a new multi-task paradigm that integrates general and specialized capabilities. It supports simultaneous large-scale multi-task reinforcement learning, achieving deep domain expertise while maintaining comprehensive general abilities.
1510 – 1530
Coffee Break
1530 – 1550
ScienceOne: Empowering Scientific Discovery

AbstractThe ScienceOne Scientific Foundation Model is developed by the Institute of Automation, Chinese Academy of Sciences (CAS), in collaboration with the Computer Network Information Center, the National Science Library, and ten other CAS-affiliated institutes. Designed to support the entire scientific research lifecycle, ScienceOne enables an integrated and collaborative workflow spanning problem formulation, research planning, numerical simulation and computation, experimental validation, and knowledge discovery. It promotes the transformation of “AI for Science” from fragmented, experimental applications toward a unified, platform-based paradigm. Across disciplines, engineering domains, and diverse research scenarios, ScienceOne delivers reusable and scalable intelligent research capabilities, supported by a comprehensive technical ecosystem encompassing models, data, tools, and applications. This presentation outlines the development background, technical framework, and representative application outcomes of ScienceOne.
1550 – 1610
Towards Foundation Models for Relational Databases

AbstractWhile foundation models for language and visual modalities are now commonplace, vast amounts of structured information, encompassing both scientific and enterprise domains alike, lie well outside of the current scope.  This shortcoming is particularly acute if we wish to enact predicting modeling over relational databases (RDBs), or analogous computational primitives that underlie intelligent decision making across structured/relational data.  Drawing on scientific principles of strategic parsimony, we explore a simple recipe for constructing RDB foundation models via in-context learning (ICL), whereby an RDB-specific encoder paired with a pre-trained Transformer enable predictions via a single forward pass.  From a practical standpoint, we release RDBLearn, an open-source RDB foundation model capable of robust performance on unseen RDBs out of the box with no additional training required.
1610 – 1630
Structural Scaffolds in Human Cognition

AbstractRather than passively absorbing sensory inputs, humans actively construct mental models of the world through experience and inference. This constructive process enables us to overcome cognitive limitations while supporting generalization and compositional reasoning. In this talk, I will present recent work from my lab combining behavioral experiments, EEG/MEG, eye-tracking, and computational modeling to investigate how the human brain compresses and organizes information through structured representations. Our findings suggest that hierarchical folding and structural inhomogeneity constitute core organizing principles of mental models, allowing the brain to efficiently cope with limited cognitive resources. Together, the human brain does not simply encode experiences—it builds structured scaffolds that actively compress, organize, and integrate new information into coherent internal models.
1630 – 1650
Improving Sampling for Masked Diffusion Models via Information Gain

AbstractTransformers replace recurrence with a memory that grows with sequence length and self-attention that enables ad-hoc look ups over past tokens. Consequently, they lack an inherent incentive to compress history into compact latent states with consistent transition rules. This often leads to learning solutions that generalize poorly. We introduce Next-Latent Prediction (NextLat), which extends standard next-token training with self-supervised predictions in the latent space. Specifically, NextLat trains a transformer to learn latent representations that are predictive of its next latent state given the next output token. Theoretically, we show that these latents provably converge to belief states, compressed information of the history necessary to predict the future. This simple auxiliary objective also injects a recurrent inductive bias into transformers, while leaving their architecture, parallel training, and inference unchanged. NextLat effectively encourages the transformer to form compact internal world models with its own belief states and transition dynamics — a crucial property absent in standard next-token prediction transformers. Empirically, across benchmarks targeting core sequence modeling competencies — world modeling, reasoning, planning, and language modeling — NextLat demonstrates significant gains over standard next-token training in downstream accuracy, representation compression, and lookahead planning. NextLat stands as a simple and efficient paradigm for shaping transformer representations toward stronger generalization.
1650 – 1750
Panel Discussion:

1. How can “Science for AI” and “AI for Science” nourish each other and form a closed loop?

2. Rethinking the “First Principles” of AI: How much new science do we still need?

3. Carbon vs. Silicon: How can human memory mechanisms optimize long-term learning of agents?

4. From “Black Box” to “White Box”: Can statistical physics explain the emergence of large models?

5. Defining “Scientific Discovery” in the AI Era: If AI discovers a new law of physics, who is the author?
1800
Banquet

2 March, 2026 (Mon)

Workshop 2: AI Agent
0830 – 0900
Registration
0855 – 0900
Group Photo
0900 – 0920
TBC
0920 – 0940
From Biological Brains to Digital Brains

AbstractWith the currently available and unprecedented multi-scale data, we are able to examine information processing in the human brain in a truly brain-wide manner. Using depression as an example, we illustrate how we can reveal different subtypes with distinct disease progression trajectories, different symptoms, different genetics, and different proteomics, leading to different brain-wide information processing patterns. With the acquired knowledge, we then move further to develop the world’s first digital twin brain (DTB) of the human brain, with 86B spiking neurons and 100T estimated parameters (synaptic weights). Various applications of the DTB are introduced in brain–machine interfaces and embodied intelligence. Finally, we introduce a novel mathematical tool, the moment neuronal network approach, to tackle the complex dynamics in the DTB.
0940 – 1000
Holos: A Web-scale LLM-based Multi-agent System

AbstractIn 2026, the core capabilities of LLM agents have achieved substantial breakthroughs, making it a mainstream development trend in the artificial intelligence field to network and collaborate agents to solve a wider range of complex tasks. Accordingly, the technical system and ecological prototype of the Agentic Web have gradually taken shape. In this talk, we will first elaborate on the key core technologies of Agentic AI multi-agent systems, and then focus on Holos, a novel web-scale LLM-based multi-agent system and a prototype platform for the Agentic Web. This system enables efficient orchestration and collaborative operation of millions of agents, relying on the self-built new AWCP communication protocol at the underlying layer to ensure deep-engagement collaboration among agents; at the level of agent incentive mechanism, the system is equipped with an economic profit-sharing model, allowing agents to obtain corresponding benefits through knowledge and capability sharing. Finally, we will analyze the limitations and challenges faced by the current development of the Agentic Web, and prospect its future development directions.
1000 – 1020
Forget BIT, It is All about TOKEN: Towards the First Principle for Understanding LLMs

AbstractWith the currently available and unprecedented multi-scale data, we are able to examine information processing in the human brain in a truly brain-wide manner. Using depression as an example, we illustrate how we can reveal different subtypes with distinct disease progression trajectories, different symptoms, different genetics, and different proteomics, leading to different brain-wide information processing patterns. With the acquired knowledge, we then move further to develop the world’s first digital twin brain (DTB) of the human brain, with 86B spiking neurons and 100T estimated parameters (synaptic weights). Various applications of the DTB are introduced in brain–machine interfaces and embodied intelligence. Finally, we introduce a novel mathematical tool, the moment neuronal network approach, to tackle the complex dynamics in the DTB.
1020 – 1040
AI4AI: Scaling and Accelerating AI Research

AbstractAI is currently transitioning from the era of Agents to that of Innovators. As large language models continue to improve in coding capability and long-horizon execution, the high-value scenario of fully autonomous scientific research is gradually being unlocked. Previous work on AI Scientist–style systems has demonstrated the potential of automated research. However, due to limitations in operational scale and the degree of required human intervention, these systems have not yet produced a significant impact on the structure or efficiency of the existing academic ecosystem. In this talk, I will introduce FARS (Fully Automated Research System), the first large-scale, publicly deployed fully autonomous research system. FARS is designed to accelerate AI research itself through AI-driven automation, with the ambition of transforming scientific discovery from a craft dependent on individual expertise into an infrastructure-powered industrial process.
1040 – 1100
AI-Scientist Empowering Scientific Discovery

Abstract General-purpose agents are accelerating the transformation of scientific research paradigms, significantly improving research efficiency while driving innovation in various scientific fields. This talk will focus on AGI4S, providing a systematic overview of representative achievements by leading domestic and international enterprises in enabling end-to-end scientific discovery through general-purpose agents. Building on this, this talk will introduce our team’s 2025 advancements in general-purpose agent research, including scientific reasoning, general retrieval, tool utilization agents. It will showcase the potential of general-purpose agents as a foundational enabler of innovation across different scientific domains.
1100 – 1150
Panel Discussion:

1. From Copilot to Autonomous Agent: Where is the boundary of autonomy?

2. Embodiment of Large Models: How do agents perceive and reshape the physical world?

3. Low-code and No-code: Do agents make “everyone a programmer” a reality?

4. Outsourcing Human Cognition: Are agents intellectual crutches or intellectual traps?
Panelists:
All speakers of Workshop 2

Moderator:
Prof. Yunfeng Zhao
1150 – 1300
Lunch
Lunch Break
Workshop 3: AI for Education
1300 – 1330
Registration
1330 – 1350
Are You Cheating, or Are You Just Using Generative AI? An AI-Resilient Assessment Solution

AbstractThe rapid rise of generative AI has intensified “AI-giarism” concerns, where AI-generated content blurs lines between originality and plagiarism, challenging academic integrity. Students now position themselves on a spectrum: from using AI as cognitive support for feedback and clarification, to cognitive substitution that risks eroding independent critical thinking. This presentation introduces an AI-resilient assessment approach through the 9 AI Assessment Integration Framework—nine complementary modes (e.g., performance tasks, iterative portfolios, reflective evaluations) designed to ensure authentic learning while enabling ethical and productive AI use. The presenter will introduce her SuperTA application, developed from her research on AI-resilient assessment. Featuring an integrated assessment analyzer, SuperTA empowers educators with dynamic design, pattern detection in submissions, and personalized feedback—fostering student agency, equity, and future-ready competencies
1350 – 1410
Edu@AI: Perspectives and Challenges

AbstractThe modern education pipeline—from general schooling to specialized careers – is only 80 years old, optimized for producing standardized specialists at scale. It creates two problems: the majority lack critical thinking skills and are defenseless against misinformation; the experts who emerge develop blindspots they mistake for common sense. The pipeline narrows us. Could AI widen us again? This talk explores three goals for education in the AI era: push beyond current limits with AI, think like a Renaissance scholar with AI, and grow stronger without AI.
1410 – 1430
The “New University” Experiment in the AI Era: SJTU’s Integrated Agent Matrix for Teaching, Learning, and Research

AbstractAs AI continues to evolve, the core mission of higher education is shifting from simply using AI as a tool to working alongside it in a true “intelligent partnership.” The School of Artificial Intelligence at Shanghai Jiao Tong University (SJTU) has built a dedicated team of AI agents to support the university’s three pillars: TeachMaster, which helps faculty innovate their teaching; LearnMaster, which creates personalized learning paths for students; and SciMaster, which speeds up the research process from idea to publication. In this talk, we will share the logic behind these “Triple-Master” agents, show how they work in real scenarios, and discuss how AI is reshaping the campus ecosystem to define what a smart university looks like in the future.
1430 – 1450
MinerU: Practices and Insights in Educational Settings

AbstractMinerU is dedicated to transforming massive amounts of unstructured documents—such as textbooks, research papers, and reports—into high-quality, AI-ready structured knowledge. With a community consensus backed by 55k GitHub Stars, MinerU has emerged as a fundamental infrastructure for building multi-domain knowledge bases. This presentation will deconstruct the core logic of MinerU and share three practical use cases: intelligent grading for middle school English essays, AI teaching assistants for high schools, and scientific research innovation in universities. Beyond technical instrumentalism, we will explore the firsthand observations and reflections of teachers and students on education in the AI era.
1450 – 1510
Coffee Break
1510-1530
From “Problem-Solvers” to “Contextual Learners”: Reassessing Large Models’ Genuine Learning Abilities

AbstractCurrent large models for education often demonstrate impressive problem-solving abilities, but to what extent does this stem from memorizing training data rather than true understanding? Based on our latest benchmark, CL-bench, this report reveals an overlooked truth: when faced with entirely new, unseen “contexts,” such as newly defined rules or novel scientific experimental data, top-tier models have a success rate of less than 20% in in-context learning. By examining the tension between parametric knowledge and in-context learning (ICL), this talk explores why AI that relies on pre-trained memorization struggles to keep pace with dynamic educational and research demands. CL-bench data indicates that AI performs most poorly in inductive reasoning—the ability to discover new patterns from experimental data—which carries significant implications for AI for Science education. Ultimately, the discussion focuses on how to shift from cultivating “omniscient” static AI toward developing “Contextual Learner” AI assistants capable of digesting new materials and rules in real-time to reshape the future of teaching and research.
1530-1550
Agentic AI for Research & Education

AbstractThis talk examines the emerging paradigm of Agentic AI and its transformative impact across research and education. We explore how autonomous AI agents are evolving from reactive tools to proactive collaborators that can generate breakthrough research hypotheses, accelerate scientific discovery, and adapt to complex problem-solving scenarios. These intelligent systems demonstrate remarkable capabilities in collaborative coding, algorithm optimization, and providing personalized, context-aware assistance that adapts to individual research and learning needs. As we witness this fundamental shift toward autonomous AI partnerships, agentic systems are redefining how we conduct research and develop innovative solutions. They create personalized educational experiences that amplify human creativity and problem-solving capabilities in unprecedented ways.
1550-1640
Panel Discussion:

1. From “Class-based Teaching” to “AI Mentorship”: The ultimate form of personalized learning.

2. High-quality Data: A prerequisite for personalized “AI Mentorship”. How AI education tools reshape research and reading paradigms for teachers and students.

3. Cultivating top talents in the context of AI for Science: How to train interdisciplinary scientists with “AI native intuition”?

4. Practice notes of an “AI homeroom teacher”: When large models are deeply embedded in frontline teaching, are we facing an efficiency revolution or a logical collapse?

5. Talent cultivation for the AGI era: Synergy among universities, research institutions, and industry.
Panelists:
All speakers of Workshop 3

Moderator:
Ms. Qunying Zhang
1640 – 1650
Closing Remark
Forum Organisers

Registration

Registration via Shanghai AI Laboratory.
Applicants will be notified of the result via email or SMS.

Registration Deadline: 

on or before 12:00pm HKT on Februrary 26, 2026 (Thu)

For enquiry, please contact us at duanchenyang@pjlab.org.cn.