New from Meta

Llama 4 Maverick

17B x 128 Experts Instruct

Meta's latest flagship MoE model with 128 specialized experts and a 128K-token context length. Strong instruction following with efficient inference: only 17B parameters are active per token.

Model Specifications

Parameters: 17B × 128 experts
Architecture: Mixture of Experts (MoE)
Context Length: 128K tokens
Active Parameters: ~17B per token
Developer: Meta AI
License: Llama License

Why Choose Llama 4 Maverick

128 Experts

Massive MoE architecture with 128 specialized experts.

128K Context

Industry-leading context length for complex tasks.

Instruction Tuned

Fine-tuned for following complex instructions accurately.

Efficient Inference

Only 17B parameters active per token despite 128 experts.
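To illustrate why only a fraction of the model runs per token, here is a minimal sketch of top-k expert routing, the mechanism behind sparse MoE inference. The function and array names are illustrative, not Meta's implementation.

```python
import numpy as np

def route_tokens(hidden, gate_w, top_k=1):
    """Toy MoE router: score all experts per token, keep only the top-k.
    Only the chosen experts' weights participate in that token's forward
    pass, which is why active parameters stay far below the total."""
    logits = hidden @ gate_w                          # (tokens, num_experts)
    return np.argsort(logits, axis=-1)[:, -top_k:]    # top-k expert ids per token

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 16))     # 4 tokens, hidden dim 16 (toy sizes)
gate_w = rng.normal(size=(16, 128))   # router weights for 128 experts
experts = route_tokens(hidden, gate_w)
print(experts.shape)  # (4, 1): one expert selected per token
```

With top_k=1 and 128 experts, each token touches roughly 1/128 of the expert weights, which is the intuition behind 17B active parameters per token.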

Pricing Options

Serverless API (Recommended)

Pay per token with auto-scaling.

₹30 / 1M input tokens
₹60 / 1M output tokens
  • Auto-scaling
  • No minimum commitment
  • 99.9% uptime
  • Rate limits apply
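A quick way to estimate serverless spend from the listed rates (₹30 input, ₹60 output per 1M tokens); the helper below is just illustrative arithmetic, not part of the platform.

```python
def serverless_cost(input_tokens, output_tokens,
                    in_rate=30.0, out_rate=60.0):
    """Estimate serverless cost in INR at per-1M-token rates."""
    return (input_tokens / 1_000_000) * in_rate \
         + (output_tokens / 1_000_000) * out_rate

# e.g. summarizing a 100K-token document into a 2K-token answer
# costs about ₹3.12 at the listed rates.
print(round(serverless_cost(100_000, 2_000), 2))
```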

Dedicated Instance

Reserved GPU for consistent performance

₹350/hour
  • 4x H100 GPUs
  • No rate limits
  • Fine-tuning support
  • Private deployment
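Serverless access to hosted models typically goes through an OpenAI-style chat endpoint. The sketch below only builds a request payload; the model id and parameter names are assumptions, not confirmed by this page, so check the API docs for the exact values.

```python
import json

def build_chat_request(prompt, max_tokens=512):
    """Assemble a chat-completion payload in the common OpenAI-compatible
    shape. "llama-4-maverick-instruct" is a hypothetical model id."""
    return {
        "model": "llama-4-maverick-instruct",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = json.dumps(build_chat_request("Summarize this report."))
print(payload)
```

The serialized payload would be POSTed to the provider's chat-completions URL with an API key in the Authorization header.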

Use Cases

Long Document Analysis

Process documents up to 128K tokens in a single context.

Code Understanding

Analyze entire codebases with extended context.

Research Assistance

Summarize and analyze lengthy research papers.

Conversational AI

Build chatbots with excellent instruction following.

Ready to Deploy Llama 4 Maverick?

Experience Meta's most capable open model with 128K context.