xAI Grok-4.1 Adds Native Video Understanding and Temporal Action Localization

164    2026-02-20

[AI-NEWS-ENTRY]

Date: 2026-02-20

Title: xAI Grok-4.1 Adds Native Video Understanding and Temporal Action Localization

Content: The Grok-4.1 preview builds dense video captioning, temporal action localization, and event grounding directly into the core model. It can answer detailed questions about video events (“What object was picked up right after the door opened?”) with sub-second precision and generate structured timelines from hour-long footage.
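
To make the idea of a structured timeline concrete, here is a minimal Python sketch of how a question like the door example could be answered once events carry timestamps. Everything in it is an assumption for illustration: the `Event` fields, the `first_event_after` helper, and the sample data are hypothetical and do not reflect xAI's actual output schema or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One entry in a hypothetical structured timeline (not xAI's schema)."""
    start_s: float  # event start, in seconds from the beginning of the video
    end_s: float    # event end, in seconds
    action: str     # e.g. "open", "pick_up"
    obj: str        # object the action applies to


def first_event_after(timeline: List[Event], anchor_action: str,
                      anchor_obj: str, target_action: str) -> Optional[Event]:
    """Return the earliest target_action event that starts at or after
    the end of the first matching anchor event, or None if there is none."""
    ordered = sorted(timeline, key=lambda e: e.start_s)
    anchor = next((e for e in ordered
                   if e.action == anchor_action and e.obj == anchor_obj), None)
    if anchor is None:
        return None
    return next((e for e in ordered
                 if e.action == target_action and e.start_s >= anchor.end_s),
                None)


# Made-up timeline for illustration only; timestamps are in seconds.
timeline = [
    Event(12.40, 13.10, "open", "door"),
    Event(13.85, 14.60, "pick_up", "keys"),
    Event(20.00, 21.25, "pick_up", "mug"),
    Event(3.00, 4.50, "pick_up", "phone"),
]

hit = first_event_after(timeline, "open", "door", "pick_up")
print(hit.obj if hit else "no match")
```

In this sketch the pick-up at 3.00 s is skipped because it precedes the door opening, so the lookup depends on event boundaries being localized finely enough to order closely spaced actions.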

Keywords: video understanding, temporal localization, event grounding, video reasoning, Grok-4.1, multimodal video