CV

Tao Zhang

zhangtaolqy@mail.ustc.edu.cn
Hefei, Anhui, CN

Summary

Ph.D. candidate at the University of Science and Technology of China working on efficient LLM inference serving, AI infrastructure, and multi-agent systems.

Education

  • Institute of Advanced Technology; Future Network Laboratory
    Present
    University of Science and Technology of China
  • School of Communication and Information Engineering
    2023-06
    Chongqing University of Posts and Telecommunications

Skills

Research

  • LLM inference serving
  • AI infrastructure
  • Multi-agent systems
  • Distributed scheduling
  • Multimodal model efficiency

Software Development

  • Backend development
  • Frontend development
  • Android development
  • Distributed databases
  • Traffic data processing

Publications

  • FAESR: Fine-Grained Rate Adaptation for Energy-Aware Super Resolution in Mobile Panoramic Video Streaming
    2025
    IEEE Transactions on Cognitive Communications and Networking
    First author; SCI Zone 1.
  • DisHelis: Optimizing Deployment of Disaggregated LLMs Inference Serving over Heterogeneous Environments via Hierarchical Max-Flow
    2026
    IEEE Transactions on Cognitive Communications and Networking
    First author; SCI Zone 1.
  • Multi-Timescale Joint Optimization of Task Scheduling, Instance Switching, and Resource Scaling for Disaggregated LLM Serving
    2026
    IEEE Transactions on Cognitive Communications and Networking
    First author; SCI Zone 2.
  • HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models
    2026
    CVPR 2026
    Co-first author; Poster.
  • SpecCache: Speculative KV Cache Reuse for Efficient RAG Serving
    2026
    ACL 2026
    Co-first author; Oral.
  • SAVP: Scene-Aware Vision Token Pruning for Efficient Video Large Language Models
    2026
    EMNLP 2026
    Co-first author; Poster.
  • GSTEP: Global Spatio-Temporal Density-Driven Visual Token Pruning for Efficient Video Large Language Models
    2026
    ACM Multimedia 2026
    Co-first author; Poster.
  • LatCom: Latent Compression for Efficient Multi-Agent Collaboration
    2026
    EMNLP 2026
    Co-first author; Poster.

Languages

  • English
    Passed CET-4 and CET-6; proficient in reading professional literature and in academic writing

Interests

  • Research Interests
    Efficient LLM inference serving, AI infrastructure, disaggregated serving, multi-agent collaboration, visual token pruning