Large Language Model (LLM) inference increasingly requires mechanisms that provide runtime visibility into what is actually executing, without exposing model weights or code. We present WAVE, a hardware-grounded monitoring framework that leverages GPU performance counters (PMCs) to observe LLM inference. WAVE builds on the insight that legitimate executions of a given model must satisfy hardware-observable invariants induced by the model’s linear-algebraic structure, such as memory-access volume, instruction mix, and tensor-core utilization. WAVE collects lightweight PMC traces and applies a two-stage pipeline: (1) inferring architectural properties (e.g., parameter count, layer depth, hidden dimension, batch size) from the observed traces; and (2) using an SMT-based consistency checker to assess whether the execution aligns with the provisioned compute and the constraints of the claimed model. We evaluate WAVE on common open-source LLM architectures, such as LLaMA, GPT, and Qwen, across multiple GPU architectures, including NVIDIA Ada Lovelace, Hopper, and Blackwell. Results show that WAVE recovers key model parameters with an average error of 6.8% and identifies disguised executions under realistic perturbations. By grounding oversight in hardware invariants, WAVE provides a practical avenue for continuous, privacy-preserving runtime monitoring of LLM services.
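To make the second stage concrete, the following is a minimal sketch (not WAVE’s actual implementation) of an SMT-based consistency query: given a per-token FLOP estimate derived from tensor-core PMC counters, it asks whether the claimed architecture can explain the observation under a coarse transformer cost model. All numbers, the tolerance, the 2·12·L·d² cost model, and the function names are illustrative assumptions, not values from the paper.

```python
"""Sketch of an SMT consistency check between PMC-derived observations
and a claimed model architecture (illustrative assumptions throughout)."""
from z3 import Real, Solver, sat

# Claimed architecture (hypothetical LLaMA-7B-like values).
claimed_layers = 32
claimed_hidden = 4096

# Per-token forward FLOPs estimated from tensor-core PMC counters
# (hypothetical measurement, not from the paper).
observed_flops_per_token = 13.1e9


def consistent(layers: int, hidden: int, observed: float,
               tolerance: float = 0.10) -> bool:
    """Return True if the observed FLOP rate is explainable by the
    claimed (layers, hidden) under a coarse 2 * 12 * L * d^2 cost model."""
    s = Solver()
    flops = Real("flops_per_token")
    # Coarse decoder-only transformer cost model (assumption):
    # ~12 * L * d^2 parameters in attention + MLP, ~2 FLOPs per parameter.
    s.add(flops == 2 * 12 * layers * hidden * hidden)
    # Require the model-predicted FLOPs to lie within +/- tolerance
    # of the PMC-derived observation.
    s.add(flops >= observed * (1 - tolerance))
    s.add(flops <= observed * (1 + tolerance))
    return s.check() == sat


if __name__ == "__main__":
    ok = consistent(claimed_layers, claimed_hidden, observed_flops_per_token)
    print("execution consistent with claimed model:", ok)
```

In the full system, the solver would carry many more constraints (memory traffic, instruction mix, batch size) over the free architectural variables inferred in stage one; this sketch only illustrates how a single PMC-derived quantity is tested against a claimed configuration.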