Model Release
Zhipu AI Ships Fast Proprietary Model Update
Chinese AI lab Zhipu AI released a fast proprietary model update on April 3, positioned as a significant but non-frontier advancement. The release adds to a wave of Chinese model activity that included Alibaba Cloud's Qwen team shipping a proprietary update on March 31. Zhipu's update was tracked on LLM Stats alongside Google's open-source variants from the same week. Source →
Enterprise
OpenAI ChatGPT Super App Hits 900M Weekly Users
OpenAI unveiled a ChatGPT super app strategy that consolidates chat, coding, search, and agent capabilities into a unified interface. The company now reports 900 million weekly active users and significant enterprise revenue following its $852 billion valuation round. The move signals a shift toward centralizing AI capabilities into a single consumer and enterprise gateway. Source →
Enterprise
Criteo GO Brings Generative AI to Ad Optimization
Criteo launched Criteo GO, a generative AI advertising optimization platform designed to make enterprise-level marketing tools accessible to startups and SMEs. The product was highlighted among April 2026's most practical AI product launches for smaller businesses. It represents a growing trend of AI tools moving from large enterprise exclusivity to broad commercial availability. Source →
Agents
AWS Launches Autonomous Agents for Operations
Amazon Web Services released autonomous AI agents designed to streamline operational tasks for businesses, part of April 2026's wave of practical AI product launches. The agents target workflow automation rather than conversational AI, aligning with the broader industry shift toward agentic systems that execute multi-step processes. Details on pricing and availability were included in the initial rollout. Source →
Model Release
Claude Mythos 5 Emerges With 10T Parameters
Anthropic's Claude Mythos 5 has surfaced in benchmark tracking with a reported 10 trillion parameters, leading on multi-step planning tasks in cybersecurity and research domains. The model represents a new scale threshold for commercial AI systems. Prediction markets give a Claude Mythos broad release only a 39% probability before April 30, suggesting limited availability so far. Source →
Benchmark
Gemini 3.1 Ultra Scores 94.3% on GPQA Diamond
Google DeepMind's Gemini 3.1 Ultra posted a 94.3% score on the GPQA Diamond benchmark, establishing a new high-water mark for graduate-level reasoning evaluation. The model's native multimodal architecture distinguishes it from competitors that bolt on vision or audio capabilities. It sits alongside GPT-5.4 Thinking and Grok 4.20 at the frontier tier. Source →
Trend
Prediction Markets Price GPT-5.5 at 76% for April
Manifold prediction markets assign a 76% probability to a GPT-5.5 release before April 30, with DeepSeek V4 at 67% and Claude Mythos at 39%. The crowded pipeline reflects an industry release cadence that has accelerated to one significant model update every 72 hours on average. None of these anticipated releases have been confirmed by their respective companies. Source →
Benchmark
Grok 4.20 Claims 78% Non-Hallucination Rate
xAI's Grok 4.20, featuring a four-agent parallel system and a claimed 2 million token context window, reports a 78% non-hallucination rate on factual accuracy benchmarks. The model leads on factual accuracy metrics but trails competitors on reasoning tasks. Its future development path remains uncertain following SpaceX's acquisition of xAI. Source →
Enterprise
Microsoft Copilot Adds Anthropic Claude to Workflows
Microsoft's upgraded Copilot platform now allows multiple AI models, including OpenAI's GPT and Anthropic's Claude, to collaborate within a single workflow. A new Critique feature has one model generate output while another evaluates it, creating an adversarial quality loop. The Cowork agent is rolling out to early access enterprise customers. Source →
Trend
March 2026 Logged Over 30 Model Releases
March 2026 saw more than 30 new AI model releases or significant updates from OpenAI, Anthropic, Google, NVIDIA, and others, creating what developers are calling unprecedented decision fatigue. A single week from March 10 to 16 produced 12 distinct model launches across six labs spanning text, code, image, and audio modalities. Industry analysts now recommend adopting a model portfolio strategy with abstraction layers to manage rapid deprecation cycles. Source →