Examples

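Each pattern below is a snippet you can paste into a file and run. Pick the shape that matches your use case. The Rust snippets use ? and so belong inside a function that returns a Result; fragments that reference doc, query, question, or answer assume they were set up as in the earlier examples on this page. If you haven’t done the tutorial yet, start there. This page is the “now how do I do X” reference.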

The repo has many more focused examples for benchmarking, evaluation, and calibration work: crates/examples/ ships 38 Rust examples, and python/examples/ has the Python ones.

Tested against RedHop 0.3.x.

Single-document QA

One file, many questions. Index once on startup, query for the lifetime of the process.

import redhop

doc = redhop.Document.from_file("contract.pdf")

for q in ["What is the governing law?",
          "What is the refund window?",
          "What is the liability cap?"]:
    ctx = doc.context(q)
    print(f"\nQ: {q}")
    print(f"A-context: {ctx.text()[:200]}...")
    print(f"   cited: {ctx.citations[0]['source']} p{ctx.citations[0]['page']}")

import { Document } from "redhop";

const doc = Document.fromFile("contract.pdf");

for (const q of ["What is the governing law?",
                 "What is the refund window?",
                 "What is the liability cap?"]) {
  const ctx = doc.context(q);
  console.log(`\nQ: ${q}`);
  console.log(`A-context: ${ctx.text.slice(0, 200)}...`);
  console.log(`   cited: ${ctx.citations[0].source} p${ctx.citations[0].page}`);
}

use redhop::read_file;

let mut doc = read_file("contract.pdf")?;

for q in ["What is the governing law?",
          "What is the refund window?",
          "What is the liability cap?"] {
    let ctx = doc.context(q)?;
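    // Take up to 200 characters; a byte-range slice can panic in the middle of a multi-byte character.
    let preview: String = ctx.text().chars().take(200).collect();
    println!("\nQ: {}\nA-context: {}...", q, preview);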
}

Folder of files with persistent index

Use this when the corpus is a directory of files (a docs folder, a knowledge base, a Git repo of markdown). Enabling persist (persist=True in Python, persist: true in Node.js and Rust) writes an on-disk index, so a reload only pays for the files that changed: sub-second for a 5,000-file repo where 3 files changed.

import redhop

doc = redhop.Document.from_folder("./docs", options=redhop.FolderOptions(persist=True))
print(f"indexed: {doc.n_chunks} chunks across {doc.n_files} files")

ctx = doc.context("How do I configure retries?")
for c in ctx.citations:
    print(f"  {c['source']} → {c['heading']}")

Honors .gitignore by default. Custom ignore patterns:

doc = redhop.Document.from_folder("./repo", options=redhop.FolderOptions(persist=True, ignore=["*.lock", "node_modules/**", "target/**"]))

import { Document } from "redhop";

const doc = Document.fromFolder("./docs", { persist: true });
console.log(`indexed: ${doc.chunkCount} chunks`);

const ctx = doc.context("How do I configure retries?");
for (const c of ctx.citations) {
  console.log(`  ${c.source} → ${c.heading ?? "—"}`);
}

Custom ignore globs:

const doc = Document.fromFolder("./repo", {
  persist: true,
  ignore: ["*.lock", "node_modules/**", "target/**"],
});

use redhop::{read_folder_with, FolderOptions};

let mut doc = read_folder_with("./docs", &FolderOptions {
    persist: true,
    ignore: vec!["*.lock".into(), "node_modules/**".into(), "target/**".into()],
    ..Default::default()
})?;

let ctx = doc.context("How do I configure retries?")?;
for c in &ctx.citations {
    println!("  {} → {:?}", c.source, c.heading);
}
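
To make the "only pays for the files that changed" idea concrete, here is a minimal sketch of change detection against a saved manifest of content hashes. It illustrates the general technique and is not RedHop's actual on-disk format or logic; the function names file_digest and diff_against_manifest are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Content hash of one file; it changes whenever the bytes change."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def diff_against_manifest(
    root: Path, manifest: dict[str, str]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare files under `root` with a saved {relative_path: digest} manifest.

    Returns (changed_or_new, deleted, updated_manifest). Only the first list
    would need re-chunking and re-indexing on reload.
    """
    current = {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    changed = [name for name, digest in current.items() if manifest.get(name) != digest]
    deleted = [name for name in manifest if name not in current]
    return changed, deleted, current
```

Hashing every file is still linear in corpus size; a real index would likely check modification times first and hash only the candidates.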

Semantic retrieval

The default lexical tier handles most document QA. For synonym-heavy corpora where the query and the answer share no surface words (HR FAQs, support tickets, multilingual content), reach for hybrid. First run downloads roughly 80 MB for the embedding model, then caches it.

import redhop

doc = redhop.Document.from_file("support_kb.md", options=redhop.DocumentOptions(retrieval="hybrid", model="bge-small"))

ctx = doc.context("why did the worker leave?")
# Will find "terminated employee" even though the words don't overlap.
print(ctx.text())

const doc = Document.fromFile("support_kb.md", {
  retrieval: "hybrid",
  model: "bge-small",
});

const ctx = doc.context("why did the worker leave?");
console.log(ctx.text);

use redhop::{read_file_with, LoadOptions};

let mut doc = read_file_with("support_kb.md", &LoadOptions {
    retrieval: Some("hybrid".into()),
    model: Some("bge-small".into()),
    ..Default::default()
})?;

let ctx = doc.context("why did the worker leave?")?;

See Choosing a configuration for when to reach for hybrid vs lexical vs cross-encoder rerank.
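
One rough way to judge whether a corpus is "synonym-heavy" is to check how often a query and its known answer passage share no content words at all. The sketch below uses only the standard library; the stopword list, the sample passages, and the function names are assumptions made for illustration and are not part of RedHop.

```python
import re

# Tiny illustrative stopword list; a real check would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "for", "to", "did", "why", "what", "how"}


def content_terms(text: str) -> set[str]:
    """Lowercased word tokens with stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def zero_overlap_rate(pairs: list[tuple[str, str]]) -> float:
    """Fraction of (query, answer passage) pairs that share no content terms."""
    misses = sum(1 for query, passage in pairs if not content_terms(query) & content_terms(passage))
    return misses / len(pairs)


# Made-up sample pairs: the first has no surface overlap, the second does.
pairs = [
    ("why did the worker leave?", "Terminated employee records are retained for seven years."),
    ("What is the refund window?", "The refund window is 30 days."),
]
rate = zero_overlap_rate(pairs)
print(f"{rate:.0%} of pairs share no content terms")
```

A high rate on a labeled sample suggests lexical matching alone will miss answers; a low rate suggests the default tier may be enough.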

Structural expansion

Useful for documents with parallel clauses: contracts with regional overrides (“EU override of §X”, “UK override of §X”), policies with per-region sub-sections. The top retrieval hit is correct on its own, but the section heading and adjacent chunks add context the LLM needs to disambiguate. neighbors includes adjacent chunks, and include_heading prepends the section title.

doc = redhop.Document.from_file("msa.pdf", options=redhop.DocumentOptions(retrieval="hybrid", model="bge-small"))

ctx = doc.context(
    "What law applies in the UK?",
    neighbors=1,           # ± 1 chunk around each hit
    include_heading=True,  # prepend the section heading
)

const doc = Document.fromFile("msa.pdf", {
  retrieval: "hybrid",
  model: "bge-small",
});

const ctx = doc.context(
  "What law applies in the UK?",
  undefined,   // budget -- default
  1,           // neighbors
  true,        // includeHeading
);

use redhop::{read_file_with, LoadOptions, ContextOptions};

let mut doc = read_file_with("msa.pdf", &LoadOptions {
    retrieval: Some("hybrid".into()),
    model: Some("bge-small".into()),
    ..Default::default()
})?;

let ctx = doc.context_with("What law applies in the UK?", &ContextOptions {
    neighbors: 1,
    include_heading: true,
    ..Default::default()
})?;
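
As a mental model for what neighbors and include_heading add, the sketch below expands a set of hit positions by ± N chunks and prefixes each new section with its heading. The chunk layout (a list of dicts with heading and text) and both function names are hypothetical; RedHop's internal representation may differ.

```python
def expand_hits(hits: list[int], n_chunks: int, neighbors: int) -> list[int]:
    """Indices to include: each hit plus `neighbors` chunks on either side, clamped to the document."""
    selected: set[int] = set()
    for hit in hits:
        lo = max(0, hit - neighbors)
        hi = min(n_chunks - 1, hit + neighbors)
        selected.update(range(lo, hi + 1))
    return sorted(selected)


def render(chunks: list[dict], indices: list[int], include_heading: bool) -> str:
    """Join the selected chunks, emitting a heading line whenever the section changes."""
    parts: list[str] = []
    last_heading = None
    for i in indices:
        chunk = chunks[i]
        if include_heading and chunk["heading"] != last_heading:
            parts.append(f"[{chunk['heading']}]")
            last_heading = chunk["heading"]
        parts.append(chunk["text"])
    return "\n".join(parts)


# Made-up contract chunks with parallel regional clauses.
chunks = [
    {"heading": "12 Governing law", "text": "This Agreement is governed by the laws of New York."},
    {"heading": "12.1 EU override", "text": "For EU customers, Irish law applies."},
    {"heading": "12.2 UK override", "text": "For UK customers, the laws of England and Wales apply."},
    {"heading": "13 Notices", "text": "Notices must be in writing."},
]
indices = expand_hits([2], len(chunks), neighbors=1)
print(render(chunks, indices, include_heading=True))
```

With the heading attached, the model can tell the UK clause from the near-identical EU clause that sits next to it.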

Cross-encoder reranking

Adds a second-stage scorer that reads each (query, passage) pair together rather than independently, giving more precise ranking on paraphrase-mismatch corpora. The cost is five to ten times the query latency. Enable it only when you’ve measured that it helps on your corpus.

doc = redhop.Document.from_file("kb.md", options=redhop.DocumentOptions(retrieval="hybrid", model="bge-small", rerank="cross-encoder"))     # adds the reranker stage

const doc = Document.fromFile("kb.md", {
  retrieval: "hybrid",
  model: "bge-small",
  rerank: "cross-encoder",
});

let mut doc = read_file_with("kb.md", &LoadOptions {
    retrieval: Some("hybrid".into()),
    model: Some("bge-small".into()),
    rerank: Some("cross-encoder".into()),
    ..Default::default()
})?;
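
Measuring whether reranking helps can be as simple as comparing a hit rate on a small labeled set with and without the reranker. The harness below is a sketch: the retrievers are canned stand-ins (plain dict lookups), and the query and chunk ids are invented. In practice each retriever would wrap a document loaded with or without the rerank option and return the ids of the chunks it ranked.

```python
from typing import Callable

Retriever = Callable[[str], list[str]]  # query -> ranked chunk ids


def hit_rate_at_k(retrieve: Retriever, labeled: dict[str, str], k: int) -> float:
    """Fraction of queries whose gold chunk id appears in the top k results."""
    hits = sum(1 for query, gold in labeled.items() if gold in retrieve(query)[:k])
    return hits / len(labeled)


# Canned rankings standing in for two configurations (hypothetical data).
baseline_rankings = {"q1": ["c3", "c1", "c2"], "q2": ["c5", "c4"]}
reranked_rankings = {"q1": ["c1", "c3", "c2"], "q2": ["c5", "c4"]}
labeled = {"q1": "c1", "q2": "c4"}  # query -> gold chunk id

baseline = hit_rate_at_k(baseline_rankings.__getitem__, labeled, k=1)
reranked = hit_rate_at_k(reranked_rankings.__getitem__, labeled, k=1)
print(f"hit@1 baseline={baseline:.2f} reranked={reranked:.2f}")
```

Pair this with a latency measurement (for example time.perf_counter around each query) so the accuracy gain can be weighed against the slowdown.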

Query rewrites with per-stage audit

Compose a chain of query rewriters (template stripping, vocabulary expansion, your own) and read the per-stage trace off the same Decision Report. Useful for templated workloads (legal QA, support triage) where the query carries boilerplate the index doesn’t, or where the query and answer share no surface vocabulary.

import redhop

stripper = redhop.Stripper(["highlight", "the", "parts", "of", "this", "contract"])
vocab    = redhop.Vocabulary({"change of control": ["merger", "successor", "acquisition"]})

doc = redhop.Document.from_file("contract.pdf")
ctx = doc.context_with_rewrites(
    'Highlight the parts of this contract related to "Change of Control".',
    [stripper, vocab],
)

for rec in ctx.report.query_rewrites:
    print(rec.stage, "matched=", rec.matched, "added=", rec.added, "removed=", rec.removed)
# strip       matched=['highlight', 'the', 'parts', 'of', 'this', 'contract']
#             added=[]
#             removed=['highlight', 'the', 'parts', 'of', 'this', 'contract']
# vocabulary  matched=['change of control']
#             added=['merger', 'successor', 'acquisition']
#             removed=[]

Each stage’s matched / added / removed is the audit trail. Score the lift with redhop.evaluate(query, ctx, gold_chunks=gold_ids) against a small labeled set: no LLM judge, milliseconds per query.
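
If you plan to write your own rewriter, it may help to see the shape of a stage in isolation. The sketch below is a standalone illustration of two stages that each return the rewritten terms plus a matched / added / removed record. It does not use RedHop's rewriter interface, which is not shown on this page; the class and function names are hypothetical, and the choice to strip only a leading run of boilerplate terms is an assumption made so that "of" inside "change of control" survives.

```python
import re
from dataclasses import dataclass, field


@dataclass
class RewriteRecord:
    stage: str
    matched: list[str] = field(default_factory=list)
    added: list[str] = field(default_factory=list)
    removed: list[str] = field(default_factory=list)


def strip_prefix(terms: list[str], boilerplate: set[str]):
    """Drop leading boilerplate terms; stop at the first term not in the set."""
    cut = 0
    while cut < len(terms) and terms[cut] in boilerplate:
        cut += 1
    removed = terms[:cut]
    return terms[cut:], RewriteRecord("strip", matched=list(removed), removed=list(removed))


def expand_vocabulary(terms: list[str], expansions: dict[str, list[str]]):
    """Append synonyms for every expansion phrase that occurs as whole words."""
    text = f" {' '.join(terms)} "
    matched = [phrase for phrase in expansions if f" {phrase} " in text]
    added = [syn for phrase in matched for syn in expansions[phrase]]
    return terms + added, RewriteRecord("vocabulary", matched=matched, added=added)


query = 'Highlight the parts of this contract related to "Change of Control".'
terms = re.findall(r"[a-z0-9]+", query.lower())
terms, strip_rec = strip_prefix(terms, {"highlight", "the", "parts", "of", "this", "contract"})
terms, vocab_rec = expand_vocabulary(terms, {"change of control": ["merger", "successor", "acquisition"]})

for rec in (strip_rec, vocab_rec):
    print(rec.stage, "matched=", rec.matched, "added=", rec.added, "removed=", rec.removed)
```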

Custom citations rendering

ctx.citations is a list. Render it however your UI needs. A common pattern is footnote-style citations beneath the LLM answer.


ctx = doc.context(question)
# ... LLM call returns `answer` ...

# Footnote-style output
print(f"{answer}\n\n--- Sources ---")
for i, c in enumerate(ctx.citations, 1):
    where = c['source']
    if c.get('page'):    where += f", p.{c['page']}"
    if c.get('heading'): where += f" → {c['heading']}"
    print(f"[{i}] {where}")

const ctx = doc.context(question);
// ... LLM call returns `answer` ...

console.log(`${answer}\n\n--- Sources ---`);
ctx.citations.forEach((c, i) => {
  let where = c.source;
  if (c.page) where += `, p.${c.page}`;
  if (c.heading) where += ` → ${c.heading}`;
  console.log(`[${i + 1}] ${where}`);
});

Diagnostics before generating

doc.analyze(query) returns the same report shape as context() but without assembling the context, so it’s cheap. Use it to decide whether to call the LLM at all: skip the call if the corpus clearly doesn’t contain the answer, or surface a “no confident match” state in the UI.


# Inside your request handler (the early return below needs an enclosing function):
report = doc.analyze("Where is the company headquartered?")

if report.n_input_chunks == 0:
    print("no relevant content -- skip the LLM call")
    return None

if report.evidence_density < 0.1:
    print(f"low confidence (density {report.evidence_density:.2f}) -- caller may want to ask differently")

// Inside your request handler (the early return below needs an enclosing function):
const report = doc.analyze("Where is the company headquartered?");

if (report.nInputChunks === 0) {
  console.log("no relevant content -- skip the LLM call");
  return null;
}
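
The two checks above can be folded into one small gating function that returns an explicit state for the caller or UI to act on. This is a sketch: ReportStub stands in for the real report object and carries only the two fields used above (n_input_chunks and evidence_density), and the Gate enum and the 0.1 default threshold are illustrative choices, not RedHop defaults.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    SKIP = "skip"                      # nothing relevant: do not call the LLM
    LOW_CONFIDENCE = "low_confidence"  # call it, but flag the answer or ask the user to rephrase
    PROCEED = "proceed"


@dataclass
class ReportStub:
    """Stand-in for the report; only the two fields this sketch reads."""
    n_input_chunks: int
    evidence_density: float


def gate(report, min_density: float = 0.1) -> Gate:
    if report.n_input_chunks == 0:
        return Gate.SKIP
    if report.evidence_density < min_density:
        return Gate.LOW_CONFIDENCE
    return Gate.PROCEED


print(gate(ReportStub(n_input_chunks=0, evidence_density=0.0)))
print(gate(ReportStub(n_input_chunks=4, evidence_density=0.05)))
print(gate(ReportStub(n_input_chunks=4, evidence_density=0.42)))
```

Tune min_density against your own labeled queries rather than treating 0.1 as a universal cutoff.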

Already have chunks? Skip the loader

If you already have retrieved chunks from your own pipeline (BM25 in Postgres, an existing vector store), and you only want RedHop’s context assembly and the Decision Report, use the lower-level build_context. As of 0.3.0, wrap each chunk in the typed redhop.Chunk(...) so source / id / metadata flow through to citations:


import redhop

chunks = [
    redhop.Chunk(
        "...",
        source="doc1.pdf",
        id="c1",
        metadata={"page": 12},
    ),
    redhop.Chunk(
        "...",
        source="doc2.pdf",
        id="c2",
        metadata={"page": 3},
    ),
    # ... your retrieved set ...
]

ctx = redhop.build_context(
    query="What is the governing law?",
    retrieved_chunks=chunks,
    strategy="auto",
    token_budget=8192,
)

print(ctx.text())
print(ctx.report)

You can also call redhop.analyze_context(query, chunks) for diagnostics without assembly.

const { buildContext, Chunk } = require("redhop");

const chunks = [
  new Chunk("...", { source: "doc1.pdf", id: "c1", metadata: { page: 12 } }),
  new Chunk("...", { source: "doc2.pdf", id: "c2", metadata: { page: 3 } }),
];

const ctx = buildContext("What is the governing law?", chunks, {
  strategy: "auto",
  tokenBudget: 8192,
});
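
For intuition about what a token budget implies for a ranked chunk list, here is a deliberately naive greedy packer. It is not the strategy behind strategy="auto", which this page does not describe. The whitespace token count is a crude stand-in for a real tokenizer, and pack_under_budget is a hypothetical name.

```python
def pack_under_budget(ranked: list[tuple[str, str]], token_budget: int) -> list[str]:
    """Greedy packing: walk (id, text) pairs in rank order and keep each one that still fits.

    Token cost is approximated by whitespace-separated words.
    """
    used = 0
    kept: list[str] = []
    for chunk_id, text in ranked:
        cost = len(text.split())
        if used + cost > token_budget:
            continue  # too big for the remaining budget; a later, smaller chunk may still fit
        kept.append(chunk_id)
        used += cost
    return kept


# Made-up ranked chunks with costs of 4, 6, and 2 "tokens".
ranked = [
    ("c1", "governed by Delaware law"),
    ("c2", "refunds are issued within thirty days"),
    ("c3", "liability capped"),
]
print(pack_under_budget(ranked, token_budget=6))
```

Note the trade-off: skipping a large high-ranked chunk in favor of smaller low-ranked ones fills the budget but can drop the best evidence, which is one reason assembly strategies differ.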

LLM provider integration

RedHop hands you a prompt string. Call any provider directly. A few of the common ones:

from openai import OpenAI

ctx = doc.context(query)
response = OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"{ctx.text()}\n\nQuestion: {query}"}],
)
answer = response.choices[0].message.content

import anthropic

ctx = doc.context(query)
response = anthropic.Anthropic().messages.create(
    model="claude-haiku-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"{ctx.text()}\n\nQuestion: {query}"}],
)
answer = response.content[0].text

import requests

ctx = doc.context(query)
response = requests.post("http://localhost:11434/api/generate", json={
    "model": "llama3",
    "prompt": f"{ctx.text()}\n\nQuestion: {query}",
    "stream": False,
})
answer = response.json()["response"]

import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";
import { Document } from "redhop";

// In your API route handler:
const doc = Document.fromFile("contract.pdf");
const ctx = doc.context(query);

const result = streamText({
  model: openai("gpt-4o-mini"),
  prompt: `${ctx.text}\n\nQuestion: ${query}`,
});

return result.toDataStreamResponse();
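
Since every provider call above builds the same prompt, one option is to keep prompt construction in a single function and pass the provider in as a callable. This sketch uses only the standard library; build_prompt, answer, and fake_complete are hypothetical names, and the stub provider exists only so the flow can be exercised without network access.

```python
from typing import Callable


def build_prompt(context_text: str, query: str) -> str:
    """Same layout as the provider examples above: context first, then the question."""
    return f"{context_text}\n\nQuestion: {query}"


def answer(context_text: str, query: str, complete: Callable[[str], str]) -> str:
    """`complete` is any function that sends a prompt to a model and returns its text."""
    return complete(build_prompt(context_text, query))


def fake_complete(prompt: str) -> str:
    """Stub provider for tests; replace with a wrapper around your real client."""
    return f"(stub answer for a {len(prompt)}-char prompt)"


print(answer("Governing law: Delaware.", "What is the governing law?", fake_complete))
```

Swapping providers then means swapping the complete argument, and the prompt layout can be unit-tested on its own.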

More examples (in the repo)

The website surfaces the patterns you’ll most often reach for. The repo’s examples/ tree has runnable demos in all three languages: same 11 scenarios per language, real-world flavored (legal contract, support FAQ, on-call runbook, multi-hop research, chat history):

Python: examples/python/ (01_quickstart.py … 11_folder_indexing.py)
Node.js: examples/nodejs/ (.cjs mirrors of the Python set)
Rust: examples/rust/ via cargo run -p redhop-rust-examples --example <NN_name> --release

Each language folder’s README explains how to run, what each demo covers, and the relevant finding in docs/findings/ where applicable.

For measurement probes (the evidence-layer probes behind the findings: CUAD harnesses, multilingual sweeps, the four-corner-observation falsifications), see crates/examples/examples/ (~60 Rust files, run with cargo run -p redhop-examples --example <name>). Those are evidence, not how-to: they prove things on real workloads.

Where to go next

Deploy RedHop to production: patterns for shipping these as services
Choosing a configuration: when to pick lexical / hybrid / +rerank
Options reference: every parameter, every default