OpenAI is merging ChatGPT, Codex, and Atlas into a single desktop Superapp. GPT-5.4 launched with 1M token context, scoring 75% on OSWorld-V (above the 72.4% human baseline). Revenue has surpassed $25B and the company is exploring an IPO. This marks OpenAI's pivot from model provider to platform company — a shift worth monitoring closely.
The Strait of Hormuz blockade enters week four with 3,000+ vessels stranded and Brent crude above $110/barrel — up 50% since the conflict began. 84% of Hormuz crude flows to Asia, making this an existential energy threat for the region. Trump is threatening Iranian power plants while simultaneously suggesting winding down operations. This is the dominant macro story driving every market.
Google launched Stitch, a free AI tool that converts plain-language prompts and napkin sketches into production-quality reports. Figma dropped 12% on the news. The finstory AI newsletter tested it on real financials: one photo produced a board-ready CFO report with KPIs, trend charts, and driver callouts — work that normally takes an FP&A analyst 3 hours. The key insight: frame prompts as judgment instructions, not formatting requests.
OpenAI is consolidating its three flagship products into a single unified desktop Superapp. GPT-5.4 launched with 1M token context and scored 75% on OSWorld-V, surpassing the 72.4% human baseline. Revenue exceeds $25B and the company is exploring an IPO. This signals OpenAI's ambition to own the desktop AI layer.
Anthropic launched Sonnet 4.6 and Opus 4.6 with 1M token context windows. Opus 4.6 is now deployed in PowerPoint and Excel for enterprise workflows. Memory features persist across conversations, and Claude Code Channels extend access to Telegram and Discord for always-on agent experiences.
Google released Gemini 3.1 Pro scoring 77.1% on ARC-AGI-2 and launched Stitch, a free AI design tool that converts descriptions into high-fidelity interfaces. Figma dropped 12% on the news. Flash-Lite arrives at $0.25/M tokens, making frontier inference accessible at commodity prices.
Meta released Llama 4 with full agentic capability — the model can plan, execute multi-step tasks, and maintain context across long interactions. This is the biggest open-source AI release in years, giving developers free access to an agent-class model that can compete with proprietary alternatives.
NVIDIA announced Nemotron 3 Super at GTC — a 120B-parameter hybrid MoE model with only 12B active parameters, designed for multi-agent applications in software development and cybersecurity. The architecture delivers large-model quality at small-model inference cost.
Mistral released Small 4, a 119B MoE multimodal reasoning model unveiled at NVIDIA GTC alongside the Forge enterprise platform. Mistral is on track for $1B ARR, competing directly with larger labs while maintaining a lean European operation and open-weight philosophy.
DoorDash launched Tasks, paying its 8 million Dashers to photograph menus, check shelves, and record walkthroughs for real-time business intelligence. The delivery workforce becomes the world's largest AI data collection pipeline — a novel crowdsourced approach to training AI on real-world physical environments.
Chinese lab MiniMax released M2.5, reportedly rivaling Anthropic's Claude Opus 4.6 at significantly lower cost. This continues the pattern of Chinese models delivering frontier-level performance at a fraction of Western prices, pressuring the economics of the entire AI industry.
Cursor released Composer 2 with enhanced AI coding capabilities, delivering frontier-level code generation at improved cost-performance ratios. The upgrade cements Cursor's position as the dominant AI-native IDE for professional developers and pressures competing coding tools.
MIT researchers developed a new AI model for protein-based drug design that could accelerate pharmaceutical R&D pipelines. The model enables more precise targeting of disease mechanisms through computational protein engineering, potentially reducing the time and cost of bringing new therapeutics to clinical trials.
The US-Israel conflict with Iran enters week four. The Strait of Hormuz is blocked with 3,000+ vessels stranded and oil above $110/barrel. Trump threatens to "hit and obliterate" Iranian power plants while the humanitarian toll mounts. No clear off-ramp is in sight as both sides escalate.
The White House is sending mixed signals — Trump publicly suggested winding down Iran operations while simultaneously approving additional Marine deployments. The contradiction underscores the lack of a coherent exit strategy as the conflict enters its fourth week with escalating costs.
Iran launched ballistic missiles at a US-UK military base in the Indian Ocean, a significant escalation in direct military confrontation. The strike is one of the most direct attacks on Western military infrastructure since the conflict began and raises fears that more will follow.
Over 1,000 people have been killed in Lebanon since March 2, including 111 children. The humanitarian crisis deepens as the regional conflict expands beyond its initial scope, with civilian casualties mounting and international calls for a ceasefire going unanswered.
A federal jury found Elon Musk liable for misleading Twitter investors before his $44B acquisition. The verdict is one of the highest-profile securities fraud findings in Silicon Valley history; it carries potentially significant financial implications and raises questions about accountability for market manipulation.
Robert Mueller, who served as FBI Director and later as Special Counsel investigating Russian interference in the 2016 election, has died at age 81. A defining figure in American law enforcement and justice, Mueller's career spanned decades of public service at the highest levels.
Ukrainian forces are using overhead netting as a defensive countermeasure against Russian first-person-view attack drones. The low-tech solution demonstrates battlefield adaptation against increasingly sophisticated drone warfare, as both sides innovate in the evolving conflict.
84% of crude oil transiting the Strait of Hormuz flows to Asian markets, making the blockade an existential energy threat to the region. Asian economies face cascading impacts on manufacturing, transport, and inflation as the crisis enters its fourth week with no resolution in sight.
OpenAI: GPT-5.4 with 1M context, plus a Superapp consolidation merging ChatGPT, Codex, and Atlas. Revenue exceeds $25B with IPO exploration underway. Dominant position in consumer and enterprise AI.
Google: Gemini 3.1 Pro scores 77.1% on ARC-AGI-2. Stitch launch disrupts design tools. Flash-Lite at $0.25/M tokens makes frontier inference accessible at commodity prices.
Meta: Llama 4 is open-source with full agentic capability: it plans, executes, and maintains context. The biggest open-source AI release in years, giving developers free access to an agent-class model.
Mistral: Small 4 (119B MoE) multimodal reasoning model. Forge enterprise platform launched at GTC. On track for $1B ARR with lean European operations and open-weight philosophy.
Anthropic: Opus/Sonnet 4.6 with 1M context. Office integration puts Opus in PowerPoint and Excel. Memory features across conversations. Claude Code Channels for Telegram/Discord. DOD lawsuit support from competitors.
MiniMax M2.5 rivals Opus at lower cost. Cursor Composer 2 upgrade enhances AI coding. Perplexity expanding into health and finance verticals. The challenger tier is increasingly competitive.
AI amplifies capability but over-reliance atrophies judgment. Use AI as a collaborator, not a replacement. Maintain critical thinking skills alongside AI tools to avoid dependency and preserve decision-making quality.
Gumloop enables no-code agentic workflows. n8n provides open-source automation. Claude Code + MCP servers let you build sophisticated agents without traditional coding — lowering the barrier to production-grade AI automation.
NVIDIA DGX Station brings data center AI to the desktop. Apple Private Cloud Compute runs Gemini models with strict privacy guarantees. On-device inference is becoming practical as smaller MoE models reduce compute requirements.
Google Stitch turns sketches into production designs for free. Generative 3D and video tools are maturing rapidly. AI-assisted design is now accessible to non-designers, democratizing visual communication and reporting.
Copilot Agent Mode orchestrates cross-app tasks autonomously within Microsoft 365. Match workflow architecture to the specific use case rather than chasing maximum autonomy — the right level of AI assistance depends on the task.
Claude Opus 4.6 is the current best coding model with 1M context. Cursor Composer 2 and Claude Code Channels expand AI coding accessibility. Together they represent the frontier of AI-assisted software development.