GSTEP: Global Spatio-Temporal Density-Driven Visual Token Pruning for Efficient Video Large Language Models

Published in ACM Multimedia 2026