Latest research
May 13, 2026
Introducing AIMIP: The AI weather and climate model intercomparison project
AIMIP is a new open benchmark and dataset for evaluating AI climate models, showing they can match or beat conventional models on some historical climate metrics while still struggling to generalize reliably to long-term warming trends and unseen climate scenarios.
May 8, 2026
EMO: Pretraining mixture of experts for emergent modularity
EMO is a new mixture-of-experts model trained so modular expert groups emerge from data, enabling users to select small task-specific expert subsets while preserving near full-model performance.
May 5, 2026
MolmoAct 2: An open foundation for robots that work in the real world
MolmoAct 2 is a fully open robotics foundation model that brings faster, stronger 3D action reasoning to real-world robot tasks, alongside a major new bimanual manipulation dataset for researchers to study, reproduce, and build on.
April 30, 2026
AstaBench update: New results, plus adoption from industry
AstaBench’s latest update adds new frontier-model results, including GPT-5.5, and highlights growing adoption from groups including the UK AISI, General Reasoning, Elicit, SciSpace, Distyl AI, and EvoScientist.
April 23, 2026
Introducing OlmoEarth embeddings: Custom embedding exports from OlmoEarth Studio for downstream analysis
OlmoEarth Studio now lets users export custom Earth-observation embeddings from our OlmoEarth foundation models and use them for tasks like similarity search, few-shot mapping, change detection, and unsupervised exploration.
April 23, 2026
OlmPool: How small architectural choices compound to undermine long context extension
OlmPool is a controlled suite of 26 models showing how small architecture choices can compound to make long-context extension much harder, even when training data and extension recipes are held constant.
April 20, 2026
Train separately, merge together: Modular post-training with mixture-of-experts
BAR is a recipe for post-training language models one capability at a time: train domain experts independently, merge them into a single mixture-of-experts model, and upgrade any expert without impacting the others.
April 13, 2026
Evaluating agents for scientific discovery
Two benchmarks developed at Ai2 – ScienceWorld and DiscoveryWorld – reveal that even incredibly strong AI science agents struggle with problems human scientists solve routinely.
April 7, 2026
Introducing WildDet3D: Open-world 3D detection from a single image
WildDet3D is an open model that predicts 3D bounding boxes from a single image. It generalizes across cameras and object categories, folds in depth signals when available, and comes with a new dataset of verified 3D annotations.