Biography
Hi! This is Shenghao Xie. I am currently a first-year Ph.D. student at the Academy for Advanced Interdisciplinary Studies, Peking University, where I am fortunate to work with Prof. Lei Ma and Prof. Tiejun Huang. I am also a visiting student in the Tsinghua Statistical Artificial Intelligence & Learning (TSAIL) Group, Tsinghua University, advised by Prof. Hang Su and Prof. Jun Zhu. Previously, I received my B.E. degree from the School of Cyber Science and Engineering, Wuhan University, in 2024, supervised by Prof. Shanghang Zhang.
My long-term research goal is to pursue vision AGI and use it to create social good. My recent interests focus primarily on vision foundation models and their applications in AI for healthcare:
· Vision Foundation Model. First, I attempt to unlock scaling laws and zero-shot generalization in vision foundation models by integrating diverse data (both spatial and temporal, e.g., 2D, 3D, videos, and 4D) and tasks (both perception and generation, e.g., segmentation, captioning, translation, and editing). I then seek to equip these models with reasoning and embodied interaction capabilities.
· AI for Healthcare. I am committed to addressing valuable medical problems and building AI systems that effectively assist doctors. Specifically, I develop data-driven discriminative models based on large-scale medical images (e.g., early cancer detection). I also explore how to leverage generative models with clinically meaningful evaluation metrics in data-scarce scenarios (e.g., rare diseases).
I am open to both collaborations and discussions. Please feel free to send me an email.
News
- 2024.10: Our paper "Embedded Visual Prompt Tuning" has been accepted by the MIA 2024 AIFM Special Issue.
Selected Publications
(* denotes co-first author. † denotes corresponding author. View the full publication list on my Google Scholar.)

Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective
Shenghao Xie, Wenqiang Zu, Mingyang Zhao, Duo Su, Shilong Liu, Ruohua Shi, Guoqi Li, Shanghang Zhang, Lei Ma
- The first comprehensive survey to dive deep into the trend of unifying understanding and generation in vision foundation models from the autoregression perspective.

Honors and Awards
- 2024.06 Outstanding Bachelor's Degree Thesis at Wuhan University.
- 2023.11 Lei Jun Computer Science Undergraduate Scholarship.
- 2023.08 First Prize at the National College Student Information Security Contest.
Internships
- 2023.09 - 2024.09, Research Intern, Beijing Academy of Artificial Intelligence (BAAI), Beijing, China. Mentors: Lei Ma, Tiejun Huang.