GSTEP: Global Spatio-Temporal Density-Driven Visual Token Pruning for Efficient Video Large Language Models

Published in ACM Multimedia 2026