Beginner~10 minPython

LangChain integration

Use Cloudach as a ChatOpenAI provider in LangChain. Because Cloudach is fully OpenAI-compatible, you only need to change two config values. This guide covers basic chat, streaming, and building LCEL chains.

Overview

What you'll learn:

Configure ChatOpenAI to use Cloudach models (Llama 3, Mistral, Mixtral)
Stream tokens as they're generated
Build a prompt → model → parser chain with LangChain Expression Language (LCEL)
Run a complete multi-question demo script

You need a Cloudach API key. Sign up free — no credit card required.

Install

pip install langchain langchain-openai

Set your API key in the environment (recommended) or pass it directly in code:

export CLOUDACH_API_KEY="sk-cloudach-YOUR_KEY"

Step 1 — Basic usage

Instantiate ChatOpenAI with openai_api_base pointing at Cloudach and your Cloudach API key. Everything else — invoke, batch, tool calling — works identically to the OpenAI version.

import os
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(
    model="llama3-70b",
    openai_api_key=os.environ["CLOUDACH_API_KEY"],
    openai_api_base="https://api.cloudach.com/v1",
    temperature=0.7,
)

messages = [
    SystemMessage(content="You are a concise technical assistant."),
    HumanMessage(content="What is the difference between a process and a thread?"),
]

response = llm.invoke(messages)
print(response.content)

Model choice: Use llama3-8b for fast, high-volume pipelines and llama3-70b or mixtral-8x7b for complex reasoning.

Step 2 — Streaming

Pass streaming=True to the constructor, then call .stream() on your chain or LLM. Each chunk is a BaseMessageChunk with a .content string.

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(
    model="llama3-70b",
    openai_api_key=os.environ["CLOUDACH_API_KEY"],
    openai_api_base="https://api.cloudach.com/v1",
    streaming=True,
)

for chunk in llm.stream([HumanMessage(content="Write a haiku about open source software.")]):
    print(chunk.content, end="", flush=True)
print()

Step 3 — LCEL chains

LangChain Expression Language (LCEL) lets you compose prompts, models, and parsers with the | pipe operator. The chain is lazy and composable — the same chain can be invoked, streamed, or batched.

Basic chain

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(
    model="llama3-70b",
    openai_api_key=os.environ["CLOUDACH_API_KEY"],
    openai_api_base="https://api.cloudach.com/v1",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise technical writer. Answer in 2–3 sentences."),
    ("human", "{question}"),
])

chain = prompt | llm | StrOutputParser()

# Invoke (returns a string)
answer = chain.invoke({"question": "What is retrieval-augmented generation?"})
print(answer)

Stream a chain

for token in chain.stream({"question": "Explain LLM temperature in plain English."}):
    print(token, end="", flush=True)
print()

Batch multiple inputs

questions = [
    {"question": "What is a vector database?"},
    {"question": "What is a transformer?"},
    {"question": "What is fine-tuning?"},
]

answers = chain.batch(questions)
for q, a in zip(questions, answers):
    print(f"Q: {q['question']}")
    print(f"A: {a}\n")

Step 4 — Complete working script

Save as cloudach_langchain.py and run with:

CLOUDACH_API_KEY=sk-cloudach-YOUR_KEY python cloudach_langchain.py

#!/usr/bin/env python3
"""Cloudach + LangChain integration demo.

Install:
    pip install langchain langchain-openai

Run:
    CLOUDACH_API_KEY=sk-cloudach-YOUR_KEY python cloudach_langchain.py
"""
import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# ── Configure ───────────────────────────────────────────────────────────────
llm = ChatOpenAI(
    model="llama3-70b",
    openai_api_key=os.environ["CLOUDACH_API_KEY"],
    openai_api_base="https://api.cloudach.com/v1",
    temperature=0.7,
    streaming=True,
)

# ── Build a chain ───────────────────────────────────────────────────────────
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Be concise and direct."),
    ("human", "{question}"),
])

chain = prompt | llm | StrOutputParser()

# ── Run it ───────────────────────────────────────────────────────────────────
questions = [
    "What makes Llama 3 different from GPT-4?",
    "Give me a Python one-liner to flatten a list of lists.",
    "Explain tokens in 30 words.",
]

for q in questions:
    print(f"\nQ: {q}\nA: ", end="", flush=True)
    for token in chain.stream({"question": q}):
        print(token, end="", flush=True)
    print()

Available models

Model ID	Context	Best for
`llama3-8b`	8K	Fast responses, high-volume pipelines
`llama3-70b`	8K	Complex reasoning, nuanced answers
`mistral-7b`	32K	Long documents, code generation
`mixtral-8x7b`	32K	Highest accuracy, complex tasks

What's next

LlamaIndex integration — use Cloudach in RAG pipelines and query engines
Rate limits — plan your retry logic
SDK compatibility — other frameworks that work with Cloudach
support@cloudach.com — questions or feedback

API Docs Changelog support@cloudach.com