Blog

From the team

Product · Apr 14, 2026

New models on Cloudach: Llama 3.1, Command R+, and DBRX

Four new models are now available: Llama 3.1 8B and 70B with 128K context, Cohere Command R+ for RAG and agents, and Databricks DBRX for coding and reasoning.

ML · Apr 14, 2026

Fine-tune Llama 3 on your own data with Cloudach

A practical guide to fine-tuning Llama 3 with LoRA on Cloudach. It covers why fine-tuning beats prompting for domain tasks, how LoRA works under the hood, and how to get from raw data to a deployed model.

ML · Apr 14, 2026

How to choose the right open-source LLM

A practical decision framework for picking the right open-source LLM. Includes a decision tree, a use-case matrix, benchmark comparisons, and cost tradeoffs for Mistral, Llama 3, Mixtral, and more.

Engineering · Apr 10, 2026

How we achieve sub-100ms TTFT on Llama 3 with vLLM

A deep dive into our inference stack: continuous batching, flash attention, and tensor parallelism tuning.

Product · Apr 5, 2026

Cloudach is now in public beta

We're opening up the platform to all developers. Deploy your first model free, with no credit card required.

Engineering · Mar 28, 2026

Building an OpenAI-compatible API gateway from scratch

How we designed our API layer to be a drop-in replacement for the OpenAI SDK with any open-source model.