Xinyang Liu (刘昕洋)

Incoming PhD Student,
Statistics and Data Sciences (SDS)
The University of Texas at Austin

Email: xinyangATK [AT] gmail [dot] com
Google Scholar | GitHub | Twitter

Bio

Howdy! I am an incoming PhD student at The University of Texas at Austin, advised by Prof. Mingyuan Zhou. I also work closely with Dr. Ruqi Zhang, an Assistant Professor at Purdue University. I received my M.S. degree from Xidian University in 2024, advised by Prof. Bo Chen. Previously, I obtained my B.S. degree from Xidian University in 2021.

My research interests lie in the general area of probabilistic modeling, particularly in solving real-world problems through advanced Generative AI systems. My recent research focuses on Generative Modeling, including its theory and its applications in data generation and multimodal learning. I am also highly interested in 2D & 3D generation, robot learning, planning, and agent learning built on Generative AI.

If you share my research interests, feel free to reach out or add me on WeChat.

news

Jun, 2025	Our paper “Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation” has been selected for an Oral Presentation at ICML 2025! Big congratulations to Tiansheng!
Mar, 2025	I’m thrilled to accept the offer from UT-Austin and can’t wait to enjoy the legendary barbecue 🔥🍖🤠 in Austin!
Jan, 2025	Happy Birthday🍰🕯️👑 The best gifts come from two accepted papers! Much appreciation to all of my collaborators and advisors! In “Optimal Stochastic Trace Estimation in Generative Modeling” (AISTATS 2025), we leverage the Hutch++ estimator in generative modeling and propose a practical algorithm that amortizes decompositions to reduce costs, while also providing theoretical guarantees specific to the generative modeling context. In “Advancing Graph Generation through Beta Diffusion” (ICLR 2025), we further explore the potential of Beta Diffusion in graph modeling and propose a novel graph-driven generative process with a concentration modulation technique, which makes Beta Diffusion unique again!
Jun, 2024	I graduated with a master’s degree from Xidian University!
Apr, 2024	Our paper “Patch-Prompt Aligned Bayesian Prompt Tuning for Vision-Language Models” was accepted to UAI 2024.
Sep, 2023	Two papers were accepted to NeurIPS 2023!
Apr, 2023	Our paper “Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process” was accepted to ICML 2023.
Feb, 2023	We have open-sourced a new version of PyDPM, and you are welcome to join this open-source library of deep probabilistic models! PyDPM is a Python library focused on constructing Deep Probabilistic Models (DPMs). It not only provides efficient distribution sampling functions on GPU but also includes implementations of popular existing DPMs.

selected publications

(*) denotes equal contribution

Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation

Tiansheng Wen*, Yifei Wang*, Zequn Zeng, Zhong Peng, Yudi Su, Xinyang Liu, Bo Chen, Hongwei Liu, Stefanie Jegelka, and Chenyu You

The 42nd International Conference on Machine Learning (ICML), 2025

Oral Presentation [Top 1%]

Many large-scale systems rely on high-quality deep representations (embeddings) to facilitate tasks like retrieval, search, and generative modeling. Matryoshka Representation Learning (MRL) recently emerged as a solution for adaptive embedding lengths, but it requires full model retraining and suffers from noticeable performance degradations at short lengths. In this paper, we show that sparse coding offers a compelling alternative for achieving adaptive representation with minimal overhead and higher fidelity. We propose Contrastive Sparse Representation (CSR), a method that sparsifies pre-trained embeddings into a high-dimensional but selectively activated feature space. By leveraging lightweight autoencoding and task-aware contrastive objectives, CSR preserves semantic quality while allowing flexible, cost-effective inference at different sparsity levels. Extensive experiments on image, text, and multimodal benchmarks demonstrate that CSR consistently outperforms MRL in terms of both accuracy and retrieval speed, often by large margins, while also cutting training time to a fraction of that required by MRL. Our results establish sparse coding as a powerful paradigm for adaptive representation learning in real-world applications where efficiency and fidelity are both paramount.

Optimal Stochastic Trace Estimation in Generative Modeling

Xinyang Liu*, Hengrong Du*, Wei Deng, and Ruqi Zhang

The 28th International Conference on Artificial Intelligence and Statistics (AISTATS), 2025

Hutchinson estimators are widely employed in training divergence-based likelihoods for diffusion models to ensure optimal transport (OT) properties. However, these estimators often suffer from high variance and scalability concerns. To address these challenges, we investigate Hutch++, an optimal stochastic trace estimator for generative models, designed to minimize training variance while maintaining transport optimality. Hutch++ is particularly effective for handling ill-conditioned matrices with large condition numbers, which commonly arise when high-dimensional data exhibits a low-dimensional structure. To mitigate the need for frequent and costly QR decompositions, we propose practical schemes that balance frequency and accuracy, backed by theoretical guarantees. Our analysis demonstrates that Hutch++ leads to generations of higher quality. Furthermore, this method exhibits effective variance reduction in various applications, including simulations, conditional time series forecasts, and image generation.
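
As a rough illustration of the two estimators named in this abstract, the sketch below compares the plain Hutchinson estimator with the generic Hutch++ estimator in NumPy. It covers only the textbook estimators, not the paper's amortized training scheme. The function names `hutchinson_trace`, `hutchpp_trace`, and `demo`, the `matvec` callback, and the `num_queries` budget are assumptions introduced here for illustration.

```python
import numpy as np


def hutchinson_trace(matvec, dim, num_queries, rng):
    """Plain Hutchinson estimate of tr(A) using random sign vectors."""
    G = rng.choice([-1.0, 1.0], size=(dim, num_queries))
    # sum(G * (A @ G)) equals tr(G^T A G); averaging over queries estimates tr(A).
    return np.sum(G * matvec(G)) / num_queries


def hutchpp_trace(matvec, dim, num_queries, rng):
    """Generic Hutch++ estimate of tr(A): exact trace on a sketched subspace
    plus a Hutchinson estimate on the remainder."""
    k = max(num_queries // 3, 1)  # split the budget into three equal parts
    S = rng.choice([-1.0, 1.0], size=(dim, k))
    G = rng.choice([-1.0, 1.0], size=(dim, k))

    # Orthonormal basis Q for the sketched range of A (the QR step).
    Q, _ = np.linalg.qr(matvec(S))

    # Project the Hutchinson probes onto the complement of span(Q).
    G_perp = G - Q @ (Q.T @ G)

    low_rank_part = np.sum(Q * matvec(Q))                 # tr(Q^T A Q)
    residual_part = np.sum(G_perp * matvec(G_perp)) / k   # Hutchinson on the rest
    return low_rank_part + residual_part


def demo():
    """Compare both estimators on a low-rank symmetric PSD matrix (hypothetical test case)."""
    rng = np.random.default_rng(0)
    dim, rank = 50, 5
    U = rng.standard_normal((dim, rank))
    A = U @ U.T

    def matvec(X):
        return A @ X

    exact = np.trace(A)
    plain = hutchinson_trace(matvec, dim, 30, rng)
    hpp = hutchpp_trace(matvec, dim, 30, rng)
    return exact, plain, hpp
```

In `demo()`, the matrix rank is no larger than a third of the query budget, so the Hutch++ estimate matches the exact trace up to floating-point error, while the plain Hutchinson estimate still varies with the random draws.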

Advancing Graph Generation through Beta Diffusion

Xinyang Liu*, Yilin He*, Bo Chen, and Mingyuan Zhou

The Thirteenth International Conference on Learning Representations (ICLR), 2025

Diffusion models have demonstrated effectiveness for generating natural images and have since been adapted to generate diverse types of data, including graphs. While this emerging family of diffusion-based graph generative models has shown remarkable performance gains over predecessors that rely on variational autoencoders or generative adversarial networks, it is important to note that the majority of these models utilize Gaussian or categorical-based diffusion processes, which may encounter difficulties when modeling sparse and long-tailed data distributions. In our work, we introduce Graph Beta Diffusion (GBD), a diffusion-based generative model adept at modeling diverse graph structures. Focusing on the sparse and range-bounded characteristics of graph adjacency matrices, GBD employs a beta diffusion process to ensure that the initial distribution aligns with the beta distribution, which is well-suited for modeling such data types. To enhance the realism of generated graphs further, we introduce a modulation technique that stabilizes the generation of important graph structures while maintaining flexibility for the rest. The superior performance of GBD in generating graphs, as demonstrated across three generic graph benchmarks and two biochemical graph benchmarks, underscores its effectiveness in capturing the complexities of real-world graph data.

Patch-Prompt Aligned Bayesian Prompt Tuning for Vision-Language Models

Xinyang Liu*, Dongsheng Wang*, Bowei Fang, Miaoge Li, Zhibin Duan, Yishi Xu, Bo Chen, and Mingyuan Zhou

Proceedings of the 40th Conference on Uncertainty in Artificial Intelligence (UAI), 2024

For downstream applications of vision-language pre-trained models, there has been significant interest in constructing effective prompts. Existing works on prompt engineering, which either require laborious manual designs or optimize the prompt tuning as a point estimation problem, may fail to describe diverse characteristics of categories and limit their applications. We introduce a Bayesian probabilistic resolution to prompt tuning, where the label-specific stochastic prompts are generated hierarchically by first sampling a latent vector from an underlying distribution and then employing a lightweight generative model. Importantly, we semantically regularize the tuning process by minimizing the statistical distance between the visual patches and linguistic prompts, which pushes the stochastic label representations to faithfully capture diverse visual concepts, instead of overfitting the training categories. We evaluate the effectiveness of our approach on four tasks: few-shot image recognition, base-to-new generalization, dataset transfer learning, and domain shifts. Extensive results over 15 datasets show promising transferability and generalization performance of our proposed model, both quantitatively and qualitatively.

Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process

Zhibin Duan*, Xinyang Liu*, Yudi Su, Yishi Xu, Bo Chen, and Mingyuan Zhou

The 40th International Conference on Machine Learning (ICML), 2023

Deep topic models have shown an impressive ability to extract multi-layer document latent representations and discover hierarchical semantically meaningful topics. However, most deep topic models are limited to the single-step generative process, despite the fact that the progressive generative process has achieved impressive performance in modeling image data. To this end, in this paper, we propose a novel progressive deep topic model that consists of a knowledge-informed textual data coarsening process and a corresponding progressive generative model. The former is used to build multi-level observations ranging from concrete to abstract, while the latter is used to generate more concrete observations gradually. Additionally, we incorporate a graph-enhanced decoder to capture the semantic relationships among words at different levels of observation. Furthermore, we perform a theoretical analysis of the proposed model based on the principle of information theory and show how it can alleviate the well-known “latent variable collapse” problem. Finally, extensive experiments demonstrate that our proposed model effectively improves the ability of deep topic models, resulting in higher-quality latent document representations and topics.