Latest research

May 13, 2026

Introducing AIMIP: The AI weather and climate model intercomparison project

AIMIP is a new open benchmark and dataset for evaluating AI climate models, showing they can match or beat conventional models on some historical climate metrics while still struggling to generalize reliably to long-term warming trends and unseen climate scenarios.

May 8, 2026

EMO: Pretraining mixture of experts for emergent modularity

EMO is a new mixture-of-experts model trained so that modular expert groups emerge from the data, letting users select small task-specific expert subsets while preserving near-full-model performance.

May 5, 2026

MolmoAct 2: An open foundation for robots that work in the real world

MolmoAct 2 is a fully open robotics foundation model that brings faster, stronger 3D action reasoning to real-world robot tasks, alongside a major new bimanual manipulation dataset for researchers to study, reproduce, and build on.

April 30, 2026

AstaBench update: New results, plus adoption from industry

AstaBench’s latest update adds new frontier-model results, including GPT-5.5, and highlights growing adoption from groups including the UK AISI, General Reasoning, Elicit, SciSpace, Distyl AI, and EvoScientist.

April 23, 2026

Introducing OlmoEarth embeddings: Custom embedding exports from OlmoEarth Studio for downstream analysis

OlmoEarth Studio now lets users export custom Earth-observation embeddings from our OlmoEarth foundation models and use them for tasks like similarity search, few-shot mapping, change detection, and unsupervised exploration.

April 23, 2026

OlmPool: How small architectural choices compound to undermine long context extension

OlmPool is a controlled suite of 26 models showing how small architecture choices can compound to make long-context extension much harder, even when training data and extension recipes are held constant.

April 20, 2026

Train separately, merge together: Modular post-training with mixture-of-experts

BAR is a recipe for post-training language models one capability at a time: train domain experts independently, merge them into a single mixture-of-experts model, and upgrade any expert without affecting the others.

April 13, 2026

Evaluating agents for scientific discovery

Two benchmarks developed at Ai2, ScienceWorld and DiscoveryWorld, reveal that even strong AI science agents struggle with problems that human scientists solve routinely.

April 7, 2026

Introducing WildDet3D: Open-world 3D detection from a single image

WildDet3D is an open model that predicts 3D bounding boxes from a single image. It generalizes across cameras and object categories and folds in depth signals when available. It comes with a new dataset of verified 3D annotations.
