LLaVi Lab - Publications

2026

Label-Wise Uncertainty Decomposition for Multi-label Classification by Maximizing Type II Likelihood

M. Li, J. Qiu, and W. Shi
Forty-Second Annual Conference on Uncertainty in Artificial Intelligence (UAI), 2026.
Oral presentation
Paper

Mixing Expertise with Confidence: A Mixture of Experts Framework for Robust Multi-Modal Continual Learning

M. A. A. Forhad, Y. Zhu, A. Acharya, X. Liu, Q. Yu, and W. Shi
Forty-Third International Conference on Machine Learning (ICML), 2026.
Paper Project Page

Anchoring Aleatoric Uncertainty: A Four-Term Decomposition of Predictive Risk at the Bayes-Optimal Predictor

M. Li, J. Qiu, and W. Shi
ICML 2026 Workshop on Structured Probabilistic Inference & Generative Modeling (SPIGM), 2026.
Paper

Evi-BALD: Bayesian Active Learning by Disagreement via Evidential Deep Learning

M. Li and W. Shi
ICLR 2026 Workshop on Catch, Adapt, and Operate: Monitoring ML Models Under Drift (CAO), 2026.
Paper

ProMCP: Profiling Token Flows and Latency Costs in Model Context Protocol–Based LLM Agents

S. Anjum, W. Zheng, R. Kettimuthu, H. Fan, and Y. Feng.
Findings of The 64th Annual Meeting of the Association for Computational Linguistics (ACL), 2026.
Paper Code

DMTrack: Spatio-Temporal Multimodal Tracking via Dual-Adapter

W. Li, S. Dong, H. Lu, Y. Zhang, H. Fan†, and L. Zhang†.
IEEE International Conference on Robotics and Automation (ICRA), 2026.
Paper Code

OmniSTVG: Toward Spatio-Temporal Omni-Object Video Grounding

J. Yao*, X. Gu*, X. Deng, M. Dai, B. Fan, Z. Zhang, Y. Huang, H. Fan†, and L. Zhang†.
International Conference on Learning Representations (ICLR), 2026.
Paper Code-Data

IRDFusion: Iterative Relation-Map Difference guided Feature Fusion for Multispectral Object Detection

J. Shen, H. Zhan, X. Zuo, H. Fan, X. Yuan, J. Li, and W. Yang.
Pattern Recognition (PR), 2026, in press.
Paper Code

Digital Skin, Digital Bias: Uncovering Tone-Based Biases in LLMs and Emoji Embeddings

M. Li, W. Aljedaani, Y. Liu, N. Meka, X. Ye, J. Ding, and Y. Feng.
The Web Conference (WWW), 2026.
Paper Code

Harmful Factuality: LLMs Correcting What They Shouldn't

M. Li, H. Zhang, H. Fan, J. Ding, and Y. Feng
Findings of European Chapter of the Association for Computational Linguistics (EACL Findings), 2026.
Paper Code

OrgaCast: A Trustworthy Spatiotemporal Diffusion Model for Fluorescence Organoid Forecasting

D. Gao, A. Gomez, M. Li, M. El-mokahal, H. Yang, and Y. Feng
Annual AAAI Conference on Artificial Intelligence (AAAI), 2026.
Paper Code

Structured Context Learning for Generic Event Boundary Detection

X. Gu*, C. Li*, X. Wang, D. Hong, L. Zhang, T. Luo, L. Wen, and H. Fan
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026.
Paper

2025

Deep Active Re-Labeling: Toward Noise-Resilient Annotation Efficiency

M. A. A. Forhad and W. Shi
2025 IEEE International Conference on Big Data (IEEE BigData), 886-895, 2025.
Paper Project Page

PlanarTrack: A High-quality and Challenging Benchmark for Large-scale Planar Object Tracking

Y. Jiao, X. Liu, X. Liu, X. Yuan, H. Fan, and L. Zhang
Computer Vision and Image Understanding (CVIU), 306: 130848, 2025.
Paper Data

Robust Ego-Exo Correspondence with Long-Term Memory

Y. Hu*, B. Fan*, X. Gu, H. Ren, D. Liu, H. Fan†, and L. Zhang†
Advances in Neural Information Processing Systems (NeurIPS), 2025.
Paper Code

LoRATv2: Enabling Low-Cost Temporal Modeling in One-Stream Trackers

L. Lin, H. Fan, Z. Zhang, Y. Huang, Y. Wang, Y. Xu, and H. Ling
Advances in Neural Information Processing Systems (NeurIPS), 2025.
Spotlight paper
Paper Code

All You Need is One: Capsule Prompt Tuning with a Single Vector

Y. Liu, J. Liang, H. Fan, W. Yang, Y. Cui, X. Han, L. Huang, D. Liu, Q. Wang, and C. Han
Advances in Neural Information Processing Systems (NeurIPS), 2025.
Paper

DP-GTR: Differentially Private Prompt Protection via Group Text Rewriting

M. Li, H. Fan, S. Fu, J. Ding, and Y. Feng
Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), 2025.
Paper Code

ViT-RefineNet for Directional Signal Radio Map Reconstruction from Sparse Samples

L. Ye, T. Le, and Y. Huang
ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL), 2025. (Long)
Paper

FUSE-Traffic: Fusion of Unstructured and Structured data for Event-aware Traffic forecasting

C. Yu, X. Xie, Y. Huang, and C. Qiu
ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL), 2025. (Long)
Paper

EfficientLocNet: High-Performance and Lightweight Radio Source Localization with Multi-Scale Attention

T. Le, S. Yadav, X. Xie, C. Qiu, Y. Huang and X. Li
ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL), 2025. (Short)
Paper

PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization

B. Fan, Y. Feng, Y. Tian, J. Liang, Y. Lin, Y. Huang, and H. Fan
IEEE/CVF International Conference on Computer Vision (ICCV), 2025.
Paper Code

GSOT3D: Towards Generic 3D Single Object Tracking in the Wild

Y. Jiao, Y. Li, J. Ding, Q. Yang, S. Fu, H. Fan†, and L. Zhang†
IEEE/CVF International Conference on Computer Vision (ICCV), 2025.
Paper Code-Data

Attention to Trajectory: Trajectory-Aware Open-Vocabulary Tracking

Y. Li, Y. Jiao, D. Meng, H. Fan†, and L. Zhang†
IEEE/CVF International Conference on Computer Vision (ICCV), 2025.
Paper Code

Edge-Aware Token Halting for Efficient and Accurate Medical Image Segmentation

Y. Guo, B. Song, H. Fan, and E. Cheng
International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2025.
Paper Code

Efficient and Accurate Low-Resolution Transformer Tracking

S. Dong, Y. Feng, J. Liang, Q. Yang, Y. Lin, and H. Fan
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025.
Oral presentation
Paper Code

G3CN: Gaussian Topology Refinement Gated Graph Convolutional Network for Skeleton-Based Action Recognition

H. Ren, Z. Luo, H. Fan, X. Yuan, G. Wang, and L. Zhang
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025.
Oral presentation
Paper Code

High-Fidelity Image Inpainting with Multimodal Guided GAN Inversion

L. Zhang, Y. Yu, J. Yao, and H. Fan
International Journal of Computer Vision (IJCV), 133: 5788-5805, 2025.
Paper Code

DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration

H. Zhang, H. Fan, K. Sha, Y. Huang, and Y. Feng
Annual Meeting of the Association for Computational Linguistics (ACL Findings), 2025.
Paper Code

RoLocMe: A Robust Multi-agent Source Localization System with Learning-based Map Estimation

T. Le, L. Ye, and Y. Huang
International Joint Conference on Artificial Intelligence (IJCAI), 2025.
Paper

CorrBEV: Multi-View 3D Object Detection by Correlation Learning with Multi-modal Prototypes

Z. Xue, M. Guo, H. Fan, S. Zhang, and Z. Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
Paper

LaMOT: Language-Guided Multi-Object Tracking

Y. Li*, X. Liu*, L. Liu, H. Fan†, and L. Zhang†
IEEE International Conference on Robotics and Automation (ICRA), 2025.
Paper Code-Data

CGTrack: Cascade Gating Network with Hierarchical Feature Aggregation for UAV Tracking

W. Li, X. Liu, H. Fan†, and L. Zhang†
IEEE International Conference on Robotics and Automation (ICRA), 2025.
Paper Code

The Devil is in the Quality: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection

Z. Zhang, Z. Li, H. Wang, H. Yuan, K. Wang, and H. Fan
IEEE International Conference on Robotics and Automation (ICRA), 2025.
Paper

Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding

X. Gu, Y. Shen, C. Luo, T. Luo, Y. Huang, Y. Lin, H. Fan†, and L. Zhang†
International Conference on Learning Representations (ICLR), 2025.
Oral presentation
Paper Code

AttMOT: Improving Multiple-Object Tracking by Introducing Auxiliary Pedestrian Attributes

Y. Li, Z. Xiao, L. Yang, D. Meng, X. Zhou, H. Fan, and L. Zhang
IEEE Transactions on Neural Networks and Learning Systems (T-NNLS), 36(3): 5454-5468, 2025.
Paper Code

2024

DAAP: Privacy-Preserving Model Accuracy Estimation on Unlabeled Datasets Through Distribution-Aware Adversarial Perturbation

G. Cao, Z. Wang, Y. Feng, and X. Dong.
The 33rd USENIX Security Symposium (USENIX Security), 2024.
Paper

VastTrack: Vast Category Visual Object Tracking

L. Peng*, J. Gao*, X. Liu*, W. Li*, S. Dong*, Z. Zhang, H. Fan†, and L. Zhang†
Advances in Neural Information Processing Systems (NeurIPS), 2024.
Paper Poster Code-Data

Optical Flow as Spatial-Temporal Attention Learners

Y. Lu, C. Han, Q. Wang, H. Fan, Z. Kong, D. Liu, and Y. Chen
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 46(12): 11491-11506, 2024.
Paper

Cyclic Refiner: Object-Aware Temporal Representation Learning for Multi-View 3D Detection and Tracking

M. Guo, Z. Zhang, L. Jing, Y. He, K. Wang, and H. Fan
International Journal of Computer Vision (IJCV), 132: 6184-6206, 2024.
Paper

Beyond MOT: Semantic Multi-Object Tracking

Y. Li, Q. Li, H. Wang, X. Ma, J. Yao, S. Dong, H. Fan†, and L. Zhang†
European Conference on Computer Vision (ECCV), 2024.
Paper Code-Data

Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance

L. Lin, H. Fan, Z. Zhang, Y. Wang, Y. Xu, and H. Ling
European Conference on Computer Vision (ECCV), 2024.
Paper Code

Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning

S. Dong, Y. Feng, Q. Yang, Y. Huang, D. Liu, and H. Fan
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024.
Oral presentation
Paper Code

SiCP: Simultaneous Individual and Cooperative Perception for 3D Object Detection in Connected and Automated Vehicles

D. Qu, Q. Chen, T. Bai, A. Qin, H. Lu, H. Fan, S. Fu, and Q. Yang
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024.
Oral presentation
Paper Code

Robust Domain Adaptive Object Detection with Unified Multi-Granularity Alignment

L. Zhang, W. Zhou, H. Fan‡, T. Luo, and H. Ling (‡corresponding author)
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 46(12): 9161-9178, 2024.
Paper Code

Divert More Attention to Vision-Language Object Tracking

M. Guo, Z. Zhang, L. Jing, H. Ling, and H. Fan
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 46(12): 8600-8618, 2024.
Paper Code

Context-Guided Spatio-Temporal Video Grounding

X. Gu*, H. Fan*, Y. Huang, T. Luo, and L. Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Paper Poster Code

ProMotion: Prototypes As Motion Learners

Y. Lu, D. Liu, Q. Wang, C. Han, Y. Cui, Z. Cao, X. Zhang, Y. Chen, and H. Fan
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Paper

Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction

J. Zheng, H. Fan, and L. Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Paper

CereSZ: Enabling and Scaling Error-bounded Lossy Compression on Cerebras CS-2

S. Song, Y. Huang, P. Jiang, X. Yu, W. Zheng, S. Di, Q. Cao, Y. Feng, Z. Xie, and F. Cappello
ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2024.
Paper

MaGIC: Multi-modality Guided Image Completion

H. Wang*, Y. Yu*, T. Luo, H. Fan, and L. Zhang
International Conference on Learning Representations (ICLR), 2024.
Paper Code

Local Compressed Video Stream Learning for Generic Event Boundary Detection

L. Zhang, X. Gu, C. Li, T. Luo, and H. Fan
International Journal of Computer Vision (IJCV), 132: 1187-1204, 2024.
Paper Code

PreciseDebias: An Automatic Prompt Engineering Approach for Generative AI to Mitigate Image Demographic Biases

C. Clemmer, J. Ding, and Y. Feng.
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024.
Paper Code

SSPNet: Scale and Spatial Priors Guided Generalizable and Interpretable Pedestrian Attribute Recognition

J. Shen, T. Guo, X. Zuo, H. Fan, and W. Yang
Pattern Recognition (PR), 148: 110194, 2024.
Paper

ICAFusion: Iterative Cross-Attention Guided Feature Fusion for Multispectral Object Detection

J. Shen, Y. Chen, Y. Liu, X. Zuo, H. Fan, and W. Yang
Pattern Recognition (PR), 145: 109913, 2024.
Paper Code

Task-Free Fairness-Aware Bias Mitigation for Black-Box Deployed Models

G. Cao, Z. Wang, Y. Feng, X. Dong, Z. Zhang, Z. Qin, and K. Ren
IEEE Transactions on Dependable and Secure Computing (TDSC), 21: 3390-3405, 2024.
Paper

Evidential Mixture Machines: Deciphering Multi-Label Correlations for Active Learning Sensitivity

D. Yu, M. Li, W. Shi, and Q. Yu
Advances in Neural Information Processing Systems (NeurIPS), 37: 112217-112236, 2024.
Paper

Balancing Explanations and Adaptation in Offline Continual Learning Systems Using Active Augmented Reply

M. A. A. Forhad and W. Shi
IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR), 484-490, 2024.
Paper

Unveiling statistical significance of online regression over multiple datasets

M. Abu-Shaira and W. Shi
IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR), 274-279, 2024.
Paper

Macro-AUC-Driven Active Learning Strategy for Multi-Label Classification Enhancement

M. Li, J. Qiu, and W. Shi
IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR), 280-286, 2024.
Paper

2023

A Multi-granularity Decade-Long Geo-Tagged Twitter Dataset for Spatial Computing

Y. Feng, Z. Meng, C. Clemmer, H. Fan, and Y. Huang
ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL), 2023.
Paper Data

PIDray: A Large-scale X-ray Benchmark for Real-World Prohibited Item Detection

L. Zhang, L. Jiang, R. Ji, and H. Fan
International Journal of Computer Vision (IJCV), 131: 3170-3192, 2023.
Paper Code-Data

Towards Transferable Targeted Adversarial Examples

Z. Wang, H. Yang, Y. Feng, P. Sun, H. Guo, Z. Zhang, and K. Ren
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Paper

Addressing Weak Decision Boundaries in Image Classification by Leveraging Web Search and Generative Models

P. Dammu, Y. Feng and C. Shah
International Joint Conference on Artificial Intelligence (IJCAI), 2023.
Paper

Collaborative Three-Stream Transformers for Video Captioning

H. Wang, L. Zhang, H. Fan, and T. Luo
Computer Vision and Image Understanding (CVIU), 235: 103799, 2023.
Paper Code

Unsupervised Domain Adaptive Detection with Network Stability Analysis

W. Zhou*, H. Fan*, T. Luo, and L. Zhang
IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
Paper Code

Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers

B. Gu, H. Fan, and L. Zhang
IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
Paper Code

Accurate and Fast Compressed Video Captioning

Y. Shen, X. Gu, K. Xu, H. Fan, L. Wen, and L. Zhang
IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
Paper Code

PlanarTrack: A Large-scale Challenging Benchmark for Planar Object Tracking

X. Liu*, X. Liu*, Z. Yi*, X. Zhou*, T. Le, L. Zhang, Y. Huang, Q. Yang, and H. Fan
IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
Paper Code

Towards Fairness-aware Adversarial Network Pruning

L. Zhang, Z. Wang, X. Dong, Y. Feng, X. Pang, Z. Zhang, and K. Ren
IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
Paper

AnimalTrack: A Benchmark for Multi-Animal Tracking in the Wild

L. Zhang*, J. Gao*, Z. Xiao, and H. Fan
International Journal of Computer Vision (IJCV), 131: 496-513, 2023.
Paper Code

FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs

B. Zhang, J. Tian, S. Di, X. Yu, Y. Feng, X. Liang, D. Tao, and F. Cappello
ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2023.
Paper

Investigating Code Generation Performance of ChatGPT with Crowdsourcing Social Data

Y. Feng, S. Vanam, M. Cherukupally, W. Zheng, M. Qiu and H. Chen
47th IEEE Computer Software and Applications Conference (COMPSAC), 2023.
Best Track Paper Award
Paper Data

All: Supporting experiential accessibility education and inclusive software development

W. Shi, H. Moses, Q. Yu, S. Malachowsky, and D. E. Krutz
ACM Transactions on Software Engineering and Methodology (TOSEM), 33(2): 1-30, 2023.
Paper

Actively testing your model while it learns: realizing label-efficient learning in practice

D. Yu, W. Shi, and Q. Yu
Advances in Neural Information Processing Systems (NeurIPS), 36: 31404-31427, 2023.
Paper

STARS: spatial-temporal active re-sampling for label-efficient learning from noisy annotations

D. Yu, W. Shi, and Q. Yu
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 37(9): 10980-10988, 2023.
Paper

Discover-then-rank unlabeled support vectors in the dual space for multi-class active learning

D. Yu and W. Shi
Fortieth International Conference on Machine Learning (ICML), PMLR, 2023.
Paper

2022

SwinTrack: A Simple and Strong Baseline for Transformer Tracking

L. Lin*, H. Fan*, Z. Zhang, Y. Xu, and H. Ling
Advances in Neural Information Processing Systems (NeurIPS), 2022.
Paper Code

Divert More Attention to Vision-Language Tracking

M. Guo*, Z. Zhang*, H. Fan, and L. Jing
Advances in Neural Information Processing Systems (NeurIPS), 2022.
Paper Code

High-Fidelity Image Inpainting with GAN Inversion

Y. Yu, L. Zhang, H. Fan, and T. Luo
European Conference on Computer Vision (ECCV), 2022.
Paper

Towards Bridging the Distribution Gap: Instance to Prototype Earth Mover’s Distance for Distribution Alignment

Q. Zhou, R. Wang, G. Zeng, H. Fan, and G. Zheng
Medical Image Analysis (MedIA), 82: 102607, 2022.
Paper

Detection and Tracking Meet Drones Challenge

P. Zhu, L. Wen, D. Du, X. Bian, H. Fan, Q. Hu, and H. Ling
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 44(11): 7380-7399, 2022.
Paper Data

GL-GAN: Adaptive Global and Local Bilevel Optimization for Generative Adversarial Network

Y. Liu, H. Fan, X. Yuan, and J. Xiang
Pattern Recognition (PR), 123: 108375, 2022.
Paper

Learning Target-aware Representation for Visual Tracking via Informative Interactions

M. Guo, Z. Zhang, H. Fan, L. Jing, Y. Lyu, B. Li, and W. Hu
International Joint Conference on Artificial Intelligence (IJCAI), 2022.
Oral presentation
Paper Code

Hierarchical Bayesian multi-kernel learning for integrated classification and summarization of app reviews

M. Alshangiti, W. Shi, E. Lima, X. Liu, and Q. Yu
Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 558-569, 2022.
Paper

2021

Transparent Object Tracking Benchmark

H. Fan, H. Miththanthaya, Harshit, S. Rajan, X. Liu, Z. Zou, Y. Lin, and H. Ling
IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
Paper Code-Data

CRACT: Cascaded Regression-Align-Classification for Robust Visual Tracking

H. Fan and H. Ling
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021.
Paper Project

Selected Publications

(*equal contribution and co-first authors, †equal advising and co-last authors)