Product · Apr 14, 2026

New models on Cloudach: Llama 3.1, Command R+, and DBRX

We're adding four models to the Cloudach catalog today: Llama 3.1 8B, Llama 3.1 70B, Command R+, and DBRX. All four are available immediately through our OpenAI-compatible API, with no SDK changes required.


Llama 3.1 — 8B and 70B

Meta's Llama 3.1 release is a significant upgrade over Llama 3. The headline change is the context window: both the 8B and 70B variants now support 128K tokens, up from 8K in Llama 3, which makes them practical for long-document summarisation, multi-turn agents, and work across large codebases.

Beyond context, Meta improved multilingual performance and updated the instruction tuning. The 70B model in particular is competitive with proprietary frontier models on several reasoning and coding benchmarks.
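
For long-document work it helps to check that a prompt is likely to fit before sending it. The sketch below is a rough pre-flight check, not an exact count. It assumes about 4 characters per token (a common rule of thumb for English text; the real figure depends on the tokenizer and the language) and treats the 128K window as 128,000 tokens. The helper names are hypothetical and not part of any Cloudach SDK.

```python
# Rough pre-flight check for a long prompt. Both constants are assumptions:
# the true token count depends on the model's tokenizer.
CONTEXT_WINDOW = 128_000   # "128K" from the model cards, taken as 128,000 tokens
CHARS_PER_TOKEN = 4        # rule-of-thumb average for English text


def estimate_tokens(text: str) -> int:
    """Approximate the token count of `text` from its character length."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, max_output_tokens: int = 1_000) -> bool:
    """Return True if the prompt plus the reserved output budget fits the window."""
    return estimate_tokens(text) + max_output_tokens <= CONTEXT_WINDOW
```

Because the estimate is coarse, leave generous headroom when a document is close to the limit.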

Llama 3.1 8B Instruct
8B params · 128K context
chat · code · fast · long-context
llama31-8b
$0.10 / $0.12 per 1M tokens (in/out)

Llama 3.1 70B Instruct
70B params · 128K context
chat · code · powerful · long-context
llama31-70b
$0.75 / $0.95 per 1M tokens (in/out)
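
To compare the two Llama 3.1 variants on price, the per-token rates above can be turned into a per-request estimate. This is a minimal sketch: the prices are copied from the cards in this post, and the function name is hypothetical.

```python
# Prices in USD per 1M tokens as (input, output), copied from the model cards.
PRICES = {
    "llama31-8b": (0.10, 0.12),
    "llama31-70b": (0.75, 0.95),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Example: summarise a 100,000-token document into 1,000 tokens.
for model_id in PRICES:
    print(f"{model_id}: ${estimate_cost(model_id, 100_000, 1_000):.4f}")
```

For that workload the estimate is roughly one cent on the 8B model and roughly 7.6 cents on the 70B model, so the 8B variant is a reasonable first choice when its quality is sufficient.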

Command R+

Cohere's Command R+ is a 104-billion-parameter model purpose-built for retrieval-augmented generation (RAG), tool use, and multi-step agentic workflows. Its 128K context window lets you pass large document sets directly into the prompt without chunking, and its native tool-calling support maps cleanly to OpenAI's function-calling format.

If you're building a knowledge base assistant, a customer-support bot backed by live data, or an orchestration layer for multi-tool agents, Command R+ is worth evaluating.

Command R+
104B params · 128K context
chat · rag · agents · long-context
command-r-plus
$1.20 / $1.50 per 1M tokens (in/out)
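
As an illustration of the tool-calling flow described above, here is a sketch that uses the OpenAI function-calling schema. It assumes the Cloudach endpoint accepts the standard `tools` parameter and returns `tool_calls` in the usual OpenAI shape. The `get_order_status` tool and its stub data are hypothetical, and the demo at the end uses a hand-built message so the dispatch logic runs without a network call.

```python
import json
from types import SimpleNamespace


def get_order_status(order_id: str) -> dict:
    """Hypothetical tool: look up an order (stub data for illustration)."""
    return {"order_id": order_id, "status": "shipped"}


# Tool definition in the OpenAI function-calling format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the shipping status of an order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]


def ask(client, question: str):
    """Send a question with the tool attached; `client` is an openai.OpenAI instance."""
    return client.chat.completions.create(
        model="command-r-plus",
        messages=[{"role": "user", "content": question}],
        tools=TOOLS,
    )


def run_tool_calls(message) -> list[dict]:
    """Execute each tool call on an assistant message and build tool-role replies."""
    results = []
    for call in message.tool_calls or []:
        args = json.loads(call.function.arguments)
        if call.function.name == "get_order_status":
            output = get_order_status(**args)
        else:
            output = {"error": f"unknown tool {call.function.name}"}
        results.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(output),
        })
    return results


# Offline demo: a stand-in for an assistant message that requests one tool call.
fake_message = SimpleNamespace(tool_calls=[SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_order_status",
                             arguments='{"order_id": "A123"}'),
)])
tool_messages = run_tool_calls(fake_message)
print(tool_messages[0]["content"])
```

In a real exchange you would append the assistant message and these tool messages to the conversation, then call the API again so the model can write its final answer.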

DBRX

Databricks' DBRX is a 132-billion-parameter mixture-of-experts (MoE) model that activates only 36B parameters per forward pass. That keeps inference cost and latency close to those of a dense model of roughly 40B parameters, while quality matches or exceeds Llama 2 70B and Mixtral on coding, reasoning, and general-knowledge benchmarks.

DBRX is particularly strong on code generation and SQL tasks — a natural fit for data-engineering and analytics use cases.

DBRX Instruct
132B (36B active) params · 32K context
chat · code · powerful
dbrx
$1.10 / $1.40 per 1M tokens (in/out)
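
For the SQL use case mentioned above, one simple pattern is to pass the table schema in the system message and the question in the user message. The sketch below is one way to structure such a request, not a recommended or official prompt. The function names and prompt wording are hypothetical, and `temperature=0` is an assumption aimed at more repeatable output.

```python
def build_sql_messages(schema_ddl: str, question: str) -> list[dict]:
    """Build chat messages that ask the model for a single SQL query."""
    system = (
        "You translate questions into a single SQL query. "
        "Use only the tables and columns in the schema. "
        "Return SQL only, with no explanation.\n\n"
        f"Schema:\n{schema_ddl.strip()}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question.strip()},
    ]


def generate_sql(client, schema_ddl: str, question: str) -> str:
    """Request a query from DBRX; `client` is an openai.OpenAI instance."""
    response = client.chat.completions.create(
        model="dbrx",
        messages=build_sql_messages(schema_ddl, question),
        temperature=0,
    )
    return response.choices[0].message.content
```

Treat the returned SQL as untrusted input: review it, or run it against a read-only replica, before executing it on production data.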

Getting started

All four models are available now in your model catalog. Deploy any of them and you'll get an OpenAI-compatible endpoint instantly — swap in the model ID and your Cloudach API key and you're done:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cloudach.com/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="llama31-70b",   # or command-r-plus, dbrx, llama31-8b
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
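
Because only the model ID changes between models, it is easy to send the same prompt to all four and compare the replies side by side. A minimal sketch, assuming a `client` configured as in the snippet above; the helper name is hypothetical.

```python
# Model IDs from the cards in this post.
MODEL_IDS = ["llama31-8b", "llama31-70b", "command-r-plus", "dbrx"]


def compare_models(client, prompt: str, models=MODEL_IDS) -> dict[str, str]:
    """Send one prompt to each model and return the replies keyed by model ID."""
    replies = {}
    for model in models:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        replies[model] = response.choices[0].message.content
    return replies
```

The calls run one after another, so total latency is the sum of the four requests.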

Questions? Reach us at support@cloudach.com.