[AI-NEWS-ENTRY]
Date: 2026-02-20
Title: xAI Grok-4.1 Adds Native Video Understanding and Temporal Action Localization
Content: The Grok-4.1 preview builds dense video captioning, temporal action localization, and event grounding directly into the core model. It can answer detailed questions about video events (“What object was picked up right after the door opened?”) with sub-second temporal precision, and it can generate structured timelines from hour-long footage.
Keywords: video understanding, temporal localization, event grounding, video reasoning, Grok-4.1, multimodal video
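
Illustration: a question such as “What object was picked up right after the door opened?” can be thought of as a lookup over a timeline of timestamped events. The sketch below is a minimal example under stated assumptions, not Grok-4.1's actual output format or API: the `Event` schema, the `first_event_after` helper, the label strings, and the sample timestamps are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One localized action (hypothetical schema); times are seconds from video start."""
    label: str
    start_s: float
    end_s: float


def first_event_after(
    timeline: List[Event], anchor_label: str, wanted_prefix: str
) -> Optional[Event]:
    """Return the earliest event whose label starts with `wanted_prefix` and
    which begins at or after the end of the first `anchor_label` event."""
    ordered = sorted(timeline, key=lambda e: e.start_s)
    # Locate the anchor event (e.g. the door opening); give up if it is absent.
    anchor = next((e for e in ordered if e.label == anchor_label), None)
    if anchor is None:
        return None
    # Scan in time order for the first matching event after the anchor ends.
    for event in ordered:
        if event.start_s >= anchor.end_s and event.label.startswith(wanted_prefix):
            return event
    return None


# Made-up sample timeline, for illustration only.
timeline = [
    Event("pick_up:keys", 3.2, 4.0),
    Event("door_open", 12.4, 13.1),
    Event("pick_up:mug", 13.6, 14.3),
    Event("pick_up:phone", 20.0, 20.8),
]

answer = first_event_after(timeline, "door_open", "pick_up:")
print(answer.label if answer else "no match")
```

Keeping start and end times as floating-point seconds lets the same structure carry sub-second boundaries, and sorting by start time before scanning means the input timeline need not arrive in order.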