Select the IBM® Granite®, open-source or third-party model best suited for your business, and deploy it on-premises or in the cloud.
Choose the model that best fits your specific use case, budget considerations, regional interests and risk profile.
Tailored for business, the IBM Granite family of open, performant and trusted models delivers exceptional performance at a competitive price, without compromising safety.
Llama models are open, efficient large language models designed for versatility and strong performance across a wide range of natural language tasks.
Mistral models are fast, performant, open-weight language models designed for modularity and optimized for text generation, reasoning and multilingual applications.
There are several foundation models from other providers available on watsonx.ai.
What happens when you train a powerful AI model with your own unique data? Better customer experiences and faster value with AI. Explore these stories and see how.
Wimbledon used watsonx.ai foundation models to train its AI to create tennis commentary.
The Recording Academy used AI Stories with IBM watsonx to generate and scale editorial content around GRAMMY nominees.
The Masters uses watsonx.ai to bring AI-powered hole insights combined with expert opinions to digital platforms.
AddAI.Life uses watsonx.ai to access selected open-source large language models to build higher quality virtual assistants.
| Model | Provider | Use cases | Context length (tokens) | Price per Resource Unit (USD)* |
| --- | --- | --- | --- | --- |
| granite-3-3-8b-instruct | IBM | Supports reasoning and planning, questions and answers (Q&A), fill-in-the-middle, summarization, classification, generation, extraction, RAG and coding tasks. | 128k | 0.20 |
| granite-3-2-8b-instruct | IBM | Supports reasoning and planning, Q&A, summarization, classification, generation, extraction, RAG and coding tasks. | 128k | 0.20 |
| granite-vision-3-2-2b | IBM | Supports image-to-text use cases for chart, graph and infographic analysis, and context Q&A. | 16,384 | 0.10 |
| granite-3-2b-instruct (v3.1) | IBM | Supports Q&A, summarization, classification, generation, extraction, RAG and coding tasks. | 128k | 0.10 |
| granite-3-8b-instruct (v3.1) | IBM | Supports Q&A, summarization, classification, generation, extraction, RAG and coding tasks. | 128k | 0.20 |
| granite-guardian-3-8b (v3.1) | IBM | Supports detection of HAP or PII, jailbreaking, bias, violence and other harmful content. | 128k | 0.20 |
| granite-guardian-3-2b (v3.1) | IBM | Supports detection of HAP or PII, jailbreaking, bias, violence and other harmful content. | 128k | 0.10 |
| granite-13b-instruct | IBM | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 8192 | 0.60 |
| granite-8b-code-instruct | IBM | Task-specific code model that generates, explains and translates code from a natural language prompt. | 128k | 0.60 |
| granite-20b-multilingual | IBM | Supports Q&A, summarization, classification, generation, extraction, translation and RAG tasks in French, German, Portuguese, Spanish and English. | 8192 | 0.60 |
| granite-34b-code-instruct | IBM | Task-specific code model that generates, explains and translates code from a natural language prompt. | 8192 | 0.60 |
| granite-20b-code-instruct | IBM | Task-specific code model that generates, explains and translates code from a natural language prompt. | 8192 | 0.60 |
| granite-3b-code-instruct | IBM | Task-specific code model that generates, explains and translates code from a natural language prompt. | 128k | 0.60 |
| granite-8b-japanese | IBM | Supports Q&A, summarization, classification, generation, extraction, translation and RAG tasks in Japanese. | 4096 | 0.60 |
*Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.
| Model | Provider | Use cases | Context length (tokens) | Price per Resource Unit (USD)* |
| --- | --- | --- | --- | --- |
| llama-4-scout-17b-16e-instruct | Meta | Multimodal reasoning, long-context processing (10M tokens), code generation and analysis, multilingual operations (200 languages supported), STEM and logical reasoning. | 128k | Free preview |
| llama-4-maverick-17b-128e-instruct-fp8 | Meta | Multimodal reasoning, long-context processing (10M tokens), code generation and analysis, multilingual operations (200 languages supported), STEM and logical reasoning. | 128k | Input: 0.35 / Output: 1.40 |
| llama-3-3-70b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k | 0.71 |
| llama-3-2-90b-vision-instruct | Meta | Supports image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, context Q&A and object identification. | 128k | 2.00 |
| llama-3-2-11b-vision-instruct | Meta | Supports image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, context Q&A and object identification. | 128k | 0.35 |
| llama-guard-3-11b-vision | Meta | Supports image filtering, HAP or PII detection and harmful content filtering. | 128k | 0.35 |
| llama-3-2-1b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k | 0.10 |
| llama-3-2-3b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k | 0.15 |
| llama-3-405b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k | Input: 5.00 / Output: 16.00 |
| llama-3-1-70b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k | 1.80 |
| llama-3-1-8b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k | 0.60 |
| llama-3-70b-instruct | Meta | Supports RAG, generation, summarization, classification, Q&A, extraction, translation and code generation tasks. | 8192 | 1.80 |
| codellama-34b-instruct | Meta | Task-specific code model that generates and translates code from a natural language prompt. | 16,384 | 1.80 |
| Model | Provider | Use cases | Context length (tokens) | Price per Resource Unit (USD)* |
| --- | --- | --- | --- | --- |
| mistral-medium-2505 | Mistral AI | Supports coding, image captioning, image-to-text transcription, function calling, data extraction and processing, context Q&A and mathematical reasoning. | 128k | Input: 3.00 / Output: 10.00 |
| mistral-small-3-1-24b-instruct-2503 | Mistral AI | Supports image captioning, image-to-text transcription, function calling, data extraction and processing, context Q&A and object identification. | 128k | Input: 0.10 / Output: 0.30 |
| pixtral-12b | Mistral AI | Supports image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, context Q&A and object identification. | 128k | 0.35 |
| mistral-large-2 | Mistral AI | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in French, German, Italian, Spanish and English. | 128k* | Input: 3.00 / Output: 10.00 |
| Mistral-Small-24B-Instruct-2501 | Mistral AI | Supports language tasks, agentic workflows, RAG and more in dozens of languages with a fast response time. | 32,768 | 0.35 |
| mixtral-8x7b-instruct | Mistral AI | Supports Q&A, summarization, classification, generation, extraction, RAG and code generation tasks. | 32,768 | 0.60 |
| Model | Provider | Use cases | Context length (tokens) | Price per Resource Unit (USD)* |
| --- | --- | --- | --- | --- |
| allam-1-13b-instruct | SDAIA | Supports Q&A, summarization, classification, generation, extraction, RAG and translation in Arabic. | 4096 | 1.80 |
| jais-13b-chat (Arabic) | core42 | Supports Q&A, summarization, classification, generation, extraction and translation in Arabic. | 2048 | 1.80 |
| flan-t5-xl-3b | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. Available for prompt-tuning. | 4096 | 0.60 |
| flan-t5-xxl-11b | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 | 1.80 |
| flan-ul2-20b | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 | 5.00 |
| elyza-japanese-llama-2-7b-instruct | ELYZA | Supports Q&A, summarization, RAG, classification, generation, extraction and translation tasks. | 4096 | 1.80 |
Use IBM developed and open-sourced embedding models, deployed in IBM watsonx.ai, for retrieval augmented generation, semantic search and document comparison tasks. Or choose a third-party embedding model provider.
| Model | Provider | Use cases | Maximum input tokens | Price per Resource Unit (USD)* |
| --- | --- | --- | --- | --- |
| granite-embedding-107m-multilingual | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| granite-embedding-278m-multilingual | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-125m-english-rtrvr-v2 | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-125m-english-rtrvr | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-30m-english-rtrvr-v2 | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-30m-english-rtrvr | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| Model | Provider | Use cases | Maximum input tokens | Price per Resource Unit (USD)* |
| --- | --- | --- | --- | --- |
| all-mini-l6-v2 | Microsoft | Retrieval augmented generation, semantic search and document comparison tasks. | 256 | 0.10 |
| all-minilm-l12-v2 | OS-NLP-CV | Retrieval augmented generation, semantic search and document comparison tasks. | 256 | 0.10 |
| multilingual-e5-large | Intel | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
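All of the embedding models above map text to fixed-length vectors, and the retrieval tasks they list reduce to nearest-neighbor search over those vectors. The sketch below shows the similarity step only, using toy 3-dimensional vectors as stand-ins for real embedding output (for example, vectors returned by a slate or granite-embedding model via the watsonx.ai API, which is not shown here):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, doc_vecs, top_k=3):
    """Return the indices of the top_k documents most similar to the query."""
    scored = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
                    reverse=True)
    return scored[:top_k]

# Toy vectors standing in for real embeddings:
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.05, 0.0]
print(semantic_search(query, docs, top_k=2))  # → [0, 1]
```

In practice the document vectors would be precomputed and stored in a vector index; only the query is embedded at search time.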
IBM believes in the creation, deployment and utilization of AI models that advance innovation across the enterprise responsibly. The IBM watsonx AI portfolio has an end-to-end process for building and testing foundation models and generative AI. For IBM-developed models, we search for and remove duplicated data, and we employ URL blocklists, filters for objectionable content and document quality, sentence splitting and tokenization techniques, all before model training.
During the data training process, we work to prevent misalignments in the model outputs and use supervised fine-tuning to enable better instruction following, so that the model can be used to complete enterprise tasks through prompt engineering. We are continuing to develop the Granite models in several directions, including other modalities, industry-specific content and more data annotations for training, while also deploying regular, ongoing data protection safeguards for IBM-developed models.
Given the rapidly changing generative AI technology landscape, our end-to-end processes are expected to continuously evolve and improve. As a testament to the rigor IBM puts into the development and testing of its foundation models, the company provides its standard contractual intellectual property indemnification for IBM-developed models, similar to those it provides for IBM hardware and software products.
Moreover, contrary to some other providers of large language models and consistent with the IBM standard approach on indemnification, IBM does not require its customers to indemnify IBM for a customer’s use of IBM-developed models. Also, consistent with the IBM approach to its indemnification obligation, IBM does not cap its indemnification liability for the IBM-developed models.
The watsonx models currently under these protections include:
(1) the Slate family of encoder-only models
(2) the Granite family of decoder-only models
*Context length as supported by the model provider; the actual context length available on the platform is limited. For more information, please see Documentation.
Inference is billed in Resource Units. 1 Resource Unit is 1,000 tokens. Input and completion tokens are charged at the same rate. 1,000 tokens are generally about 750 words.
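The Resource Unit arithmetic above can be sketched in a few lines; the prices in the example are per-RU rates taken from the tables on this page, and the helper function name is illustrative:

```python
def inference_cost(input_tokens, output_tokens, input_price, output_price=None):
    """Estimate inference cost in USD.

    Prices are per Resource Unit (RU), where 1 RU = 1,000 tokens.
    Input and completion tokens bill at the same rate unless the model
    lists a separate output rate (e.g. llama-3-405b-instruct).
    """
    if output_price is None:
        output_price = input_price
    return (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price

# granite-3-3-8b-instruct at 0.20 USD/RU, 2,000 prompt + 500 generated tokens:
print(inference_cost(2000, 500, 0.20))          # → 0.5

# llama-3-405b-instruct with split input/output rates:
print(inference_cost(1000, 1000, 5.00, 16.00))  # → 21.0
```

Since 1,000 tokens is roughly 750 words, the first example prices about 1,900 words of combined prompt and completion.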
Not all models are available in all regions. See our documentation for details.
Context length is expressed in tokens.
The IBM statements regarding its plans, directions and intent are subject to change or withdrawal without notice at its sole discretion. See Pricing for more details. Unless otherwise specified under Software pricing, all features, capabilities and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities are the same.