Examples
Each pattern below is a complete snippet you can paste into a file and execute. Pick the shape that matches your use case. If you haven’t done the tutorial yet, start there. This page is the “now how do I do X” reference.
The repo has many more focused examples for benchmarking, evaluation, and calibration work: crates/examples/ ships 38 Rust examples, and python/examples/ has the Python ones.
Tested against RedHop 0.3.x.
Single-document QA
One file, many questions. Index once on startup, query for the lifetime of the process.

import redhop

doc = redhop.Document.from_file("contract.pdf")

for q in ["What is the governing law?", "What is the refund window?", "What is the liability cap?"]:
    ctx = doc.context(q)
    print(f"\nQ: {q}")
    print(f"A-context: {ctx.text()[:200]}...")
    print(f" cited: {ctx.citations[0]['source']} p{ctx.citations[0]['page']}")

import { Document } from "redhop";

const doc = Document.fromFile("contract.pdf");

for (const q of ["What is the governing law?", "What is the refund window?", "What is the liability cap?"]) {
  const ctx = doc.context(q);
  console.log(`\nQ: ${q}`);
  console.log(`A-context: ${ctx.text.slice(0, 200)}...`);
  console.log(` cited: ${ctx.citations[0].source} p${ctx.citations[0].page}`);
}

use redhop::read_file;

let mut doc = read_file("contract.pdf")?;

for q in ["What is the governing law?", "What is the refund window?", "What is the liability cap?"] {
    let ctx = doc.context(q)?;
    // Take the first 200 characters; slicing by byte index can panic mid-codepoint.
    let preview: String = ctx.text().chars().take(200).collect();
    println!("\nQ: {}\nA-context: {}...", q, preview);
}

Folder of files with persistent index
Use this when the corpus is a directory of files (a docs folder, a knowledge
base, a Git repo of markdown). persist=true writes an on-disk index
so a reload only pays for the files that changed: sub-second for a
5,000-file repo where 3 files changed.
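How a persistent index can skip unchanged files is easiest to see in miniature. The sketch below is not RedHop's implementation. It assumes a hypothetical JSON manifest of content hashes (the names `scan`, `changed_files`, and the manifest file are illustrative only) and reports which files would need re-chunking on reload.

```python
import hashlib
import json
from pathlib import Path


def scan(folder: Path) -> dict[str, str]:
    """Map each file's relative path to a SHA-256 digest of its contents."""
    return {
        str(p.relative_to(folder)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest_path: Path) -> list[str]:
    """Return paths that are new or modified since the saved manifest,
    then overwrite the manifest. Hypothetical helper, not a RedHop API."""
    previous = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    current = scan(folder)
    dirty = [path for path, digest in current.items() if previous.get(path) != digest]
    manifest_path.write_text(json.dumps(current))
    return dirty
```

Keep the manifest outside the scanned folder so it is not picked up as a document. This toy version hashes every file on each run and ignores deletions; a production index would more likely compare size and modification time first and only hash when those differ.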
import redhop

doc = redhop.Document.from_folder("./docs", persist=True)
print(f"indexed: {doc.n_chunks} chunks across {doc.n_files} files")

ctx = doc.context("How do I configure retries?")
for c in ctx.citations:
    print(f" {c['source']} → {c['heading']}")

Honors .gitignore by default. Custom ignore patterns:

doc = redhop.Document.from_folder(
    "./repo",
    persist=True,
    ignore=["*.lock", "node_modules/**", "target/**"],
)

import { Document } from "redhop";

const doc = Document.fromFolder("./docs", { persist: true });
console.log(`indexed: ${doc.chunkCount} chunks`);

const ctx = doc.context("How do I configure retries?");
for (const c of ctx.citations) {
  console.log(` ${c.source} → ${c.heading ?? "—"}`);
}

Custom ignore globs:

const doc = Document.fromFolder("./repo", {
  persist: true,
  ignore: ["*.lock", "node_modules/**", "target/**"],
});

use redhop::{read_folder_with, FolderOptions};

let mut doc = read_folder_with("./docs", &FolderOptions {
    persist: true,
    ignore: vec!["*.lock".into(), "node_modules/**".into(), "target/**".into()],
    ..Default::default()
})?;

let ctx = doc.context("How do I configure retries?")?;
for c in &ctx.citations {
    println!(" {} → {:?}", c.source, c.heading);
}

Semantic retrieval
The default lexical tier handles most document QA. For synonym-heavy corpora where the query and the answer share no surface words (HR FAQs, support tickets, multilingual content), reach for hybrid. The first run downloads roughly 80 MB for the embedding model, then caches it.

import redhop

doc = redhop.Document.from_file("support_kb.md", retrieval="hybrid", model="bge-small")

ctx = doc.context("why did the worker leave?")
# Will find "terminated employee" even though the words don't overlap.
print(ctx.text())

import { Document } from "redhop";

const doc = Document.fromFile("support_kb.md", {
  retrieval: "hybrid",
  model: "bge-small",
});

const ctx = doc.context("why did the worker leave?");
console.log(ctx.text);

use redhop::{read_file_with, LoadOptions};

let mut doc = read_file_with("support_kb.md", &LoadOptions {
    retrieval: Some("hybrid".into()),
    model: Some("bge-small".into()),
    ..Default::default()
})?;

let ctx = doc.context("why did the worker leave?")?;

See Choosing a configuration for when to reach for hybrid vs lexical vs cross-encoder rerank.
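To see why a purely lexical tier can miss this kind of query, here is a toy illustration (not RedHop's scorer). It counts plain token overlap between the query and two made-up passages. The tokenizer, the `overlap` function, and both passages are assumptions invented for the example.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately naive tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query: str, passage: str) -> int:
    """Number of distinct words the query and the passage share."""
    return len(tokens(query) & tokens(passage))


query = "why did the worker leave?"
relevant = "Terminated employee records are retained for seven years."
distractor = "The worker safety manual covers ladders."

print(overlap(query, relevant))    # 0: the right passage shares no words
print(overlap(query, distractor))  # 2: the wrong passage shares "the" and "worker"
```

A word-overlap signal ranks the distractor above the relevant passage here, which is the gap an embedding-based tier is meant to close.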
Structural expansion
Useful for documents with parallel clauses: contracts with regional
overrides (“EU override of §X”, “UK override of §X”), policies with
per-region sub-sections. The top retrieval hit is correct on its own,
but the section heading and adjacent chunks add context the LLM needs
to disambiguate. neighbors includes adjacent chunks, and include_heading
prepends the section title.
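As a rough mental model of what these two options do, the sketch below widens each hit by ± `neighbors` positions and optionally prefixes the heading. Everything in it is an assumption for illustration: the `Passage` dataclass, the `expand_hits` function, and the sample clauses are hypothetical and say nothing about RedHop internals.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    heading: str
    text: str


def expand_hits(passages, hits, neighbors=0, include_heading=False):
    """Render each hit plus its +/- `neighbors` window, in document order."""
    keep = set()
    for i in hits:
        lo = max(0, i - neighbors)                  # clamp at document start
        hi = min(len(passages) - 1, i + neighbors)  # clamp at document end
        keep.update(range(lo, hi + 1))
    rendered = []
    for i in sorted(keep):  # overlapping windows are merged by the set
        p = passages[i]
        rendered.append(f"[{p.heading}] {p.text}" if include_heading else p.text)
    return rendered


# Made-up clauses, purely for demonstration.
passages = [
    Passage("§12 Governing law", "This Agreement is governed by the laws of Delaware."),
    Passage("§12.1 EU override of §12", "For EU customers, Irish law applies."),
    Passage("§12.2 UK override of §12", "For UK customers, the laws of England and Wales apply."),
    Passage("§13 Notices", "Notices must be sent in writing."),
]

for line in expand_hits(passages, hits=[2], neighbors=1, include_heading=True):
    print(line)
```

With the hit on the UK clause, the window also pulls in the EU override and the following section, and the headings make it visible which regional clause each sentence belongs to.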
import redhop

doc = redhop.Document.from_file("msa.pdf", retrieval="hybrid", model="bge-small")

ctx = doc.context(
    "What law applies in the UK?",
    neighbors=1,           # ± 1 chunk around each hit
    include_heading=True,  # prepend the section heading
)

import { Document } from "redhop";

const doc = Document.fromFile("msa.pdf", {
  retrieval: "hybrid",
  model: "bge-small",
});

const ctx = doc.context(
  "What law applies in the UK?",
  undefined, // budget -- default
  1,         // neighbors
  true,      // includeHeading
);

use redhop::{read_file_with, LoadOptions, ContextOptions};

let mut doc = read_file_with("msa.pdf", &LoadOptions {
    retrieval: Some("hybrid".into()),
    model: Some("bge-small".into()),
    ..Default::default()
})?;

let ctx = doc.context_with("What law applies in the UK?", &ContextOptions {
    neighbors: 1,
    include_heading: true,
    ..Default::default()
})?;

Cross-encoder reranking
Adds a second-stage scorer that reads each (query, passage) pair
together rather than independently, giving more precise ranking on
paraphrase-mismatch corpora. The cost is five to ten times the
query latency. Enable it only when you’ve measured that it helps on
your corpus.
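One way to do that measurement is a small harness that runs the same labeled queries through two configurations and compares hit rate and latency. The sketch below is library-agnostic and rests on assumptions: `retrieve_fn` stands in for whatever wraps your own query call and returns ranked chunk ids, and the two toy retrievers and labeled pairs exist only to make the example run.

```python
import statistics
import time


def benchmark(retrieve_fn, labeled, k=3):
    """Return (hit rate at k, median latency in ms) over (query, gold_id) pairs."""
    hits, latencies = 0, []
    for query, gold_id in labeled:
        start = time.perf_counter()
        ranked = retrieve_fn(query)
        latencies.append((time.perf_counter() - start) * 1000)
        hits += gold_id in ranked[:k]  # True counts as 1
    return hits / len(labeled), statistics.median(latencies)


# Stand-in retrievers (hypothetical): replace with calls into your own setup.
def baseline(query):
    return ["c7", "c2", "c9", "c1"]


def reranked(query):
    return ["c1", "c7", "c2", "c9"]


labeled = [("What is the refund window?", "c1"), ("Who signs?", "c2")]

for name, fn in [("baseline", baseline), ("reranked", reranked)]:
    hit_rate, median_ms = benchmark(fn, labeled)
    print(f"{name}: hit@3={hit_rate:.2f} median={median_ms:.3f} ms")
```

If the reranked configuration does not raise the hit rate on your own labeled set, the extra latency is not buying anything.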
import redhop

doc = redhop.Document.from_file(
    "kb.md",
    retrieval="hybrid",
    model="bge-small",
    rerank="cross-encoder",  # adds the reranker stage
)

import { Document } from "redhop";

const doc = Document.fromFile("kb.md", {
  retrieval: "hybrid",
  model: "bge-small",
  rerank: "cross-encoder",
});

use redhop::{read_file_with, LoadOptions};

let mut doc = read_file_with("kb.md", &LoadOptions {
    retrieval: Some("hybrid".into()),
    model: Some("bge-small".into()),
    rerank: Some("cross-encoder".into()),
    ..Default::default()
})?;

Query rewrites with per-stage audit
Compose a chain of query rewriters (template stripping, vocabulary expansion, your own) and read the per-stage trace off the same Decision Report. Useful for templated workloads (legal QA, support triage) where the query carries boilerplate the index doesn’t, or where the query and answer share no surface vocabulary.

import redhop

stripper = redhop.Stripper(["highlight", "the", "parts", "of", "this", "contract"])
vocab = redhop.Vocabulary({"change of control": ["merger", "successor", "acquisition"]})

doc = redhop.Document.from_file("contract.pdf")
ctx = doc.context_with_rewrites(
    'Highlight the parts of this contract related to "Change of Control".',
    [stripper, vocab],
)

for rec in ctx.report.query_rewrites:
    print(rec.stage, "matched=", rec.matched, "added=", rec.added, "removed=", rec.removed)
# strip       matched=['highlight', 'the', 'parts', 'of', 'this', 'contract']
#             added=[]
#             removed=['highlight', 'the', 'parts', 'of', 'this', 'contract']
# vocabulary  matched=['change of control']
#             added=['merger', 'successor', 'acquisition']
#             removed=[]

Each stage’s matched / added / removed is the audit trail. Score
the lift with redhop.evaluate(query, ctx, gold_chunks=gold_ids) against
a small labeled set: no LLM judge, milliseconds per query.
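To make the trace concrete, here is a toy re-creation of the two stages in plain Python. It is a sketch of the idea, not RedHop's implementation: `RewriteRecord`, `strip_stage`, and `vocabulary_stage` are hypothetical names, and matching vocabulary phrases against the original query text (so that stripping "of" does not break "change of control") is an assumption about how the stages might cooperate.

```python
import re
from dataclasses import dataclass


@dataclass
class RewriteRecord:
    stage: str
    matched: list
    added: list
    removed: list


def strip_stage(tokens, stop_terms):
    """Drop boilerplate tokens; record each dropped term once, in order seen."""
    stop = set(stop_terms)
    removed = list(dict.fromkeys(t for t in tokens if t in stop))
    kept = [t for t in tokens if t not in stop]
    return kept, RewriteRecord("strip", list(removed), [], list(removed))


def vocabulary_stage(tokens, original, vocab):
    """Append expansion terms for every phrase found in the original query."""
    matched = [phrase for phrase in vocab if phrase in original.lower()]
    added = [term for phrase in matched for term in vocab[phrase]]
    return tokens + added, RewriteRecord("vocabulary", matched, added, [])


query = 'Highlight the parts of this contract related to "Change of Control".'
tokens = re.findall(r"[a-z0-9]+", query.lower())

tokens, strip_rec = strip_stage(
    tokens, ["highlight", "the", "parts", "of", "this", "contract"]
)
tokens, vocab_rec = vocabulary_stage(
    tokens, query, {"change of control": ["merger", "successor", "acquisition"]}
)

print(tokens)
# ['related', 'to', 'change', 'control', 'merger', 'successor', 'acquisition']
for rec in (strip_rec, vocab_rec):
    print(rec.stage, "matched=", rec.matched, "added=", rec.added, "removed=", rec.removed)
```

Because every stage returns its own record alongside the rewritten tokens, a surprising retrieval result can be traced back to the stage that introduced or removed the term.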
Custom citations rendering
ctx.citations is a list. Render it however your UI needs. A
common pattern is footnote-style citations beneath the LLM answer.
ctx = doc.context(question)
# ... LLM call returns `answer` ...

# Footnote-style output
print(f"{answer}\n\n--- Sources ---")
for i, c in enumerate(ctx.citations, 1):
    where = c['source']
    if c.get('page'):
        where += f", p.{c['page']}"
    if c.get('heading'):
        where += f" → {c['heading']}"
    print(f"[{i}] {where}")

const ctx = doc.context(question);
// ... LLM call returns `answer` ...

console.log(`${answer}\n\n--- Sources ---`);
ctx.citations.forEach((c, i) => {
  let where = c.source;
  if (c.page) where += `, p.${c.page}`;
  if (c.heading) where += ` → ${c.heading}`;
  console.log(`[${i + 1}] ${where}`);
});

Diagnostics before generating
doc.analyze(query) returns the same report shape as context() but
without assembling the context, so it’s cheap. Useful when you want to
decide whether to call the LLM at all (skip the call if the corpus
clearly doesn’t contain the answer, surface a “no confident match”
state in the UI, etc.).
# Inside your request handler (the `return` needs an enclosing function):
report = doc.analyze("Where is the company headquartered?")

if report.n_input_chunks == 0:
    print("no relevant content -- skip the LLM call")
    return None

if report.evidence_density < 0.1:
    print(f"low confidence (density {report.evidence_density:.2f}) -- caller may want to ask differently")

// Inside your request handler (the `return` needs an enclosing function):
const report = doc.analyze("Where is the company headquartered?");

if (report.nInputChunks === 0) {
  console.log("no relevant content -- skip the LLM call");
  return null;
}

Already have chunks? Skip the loader
If you already have retrieved chunks from your own pipeline (BM25 in
Postgres, an existing vector store), and you only want RedHop’s
context assembly and the Decision Report, use the lower-level
build_context. As of 0.3.0, wrap each chunk in the typed
redhop.Chunk(...) so source / id / metadata flow through to
citations:
import redhop

chunks = [
    redhop.Chunk(
        "...",
        source="doc1.pdf",
        id="c1",
        metadata={"page": 12},
    ),
    redhop.Chunk(
        "...",
        source="doc2.pdf",
        id="c2",
        metadata={"page": 3},
    ),
    # ... your retrieved set ...
]

ctx = redhop.build_context(
    query="What is the governing law?",
    retrieved_chunks=chunks,
    strategy="auto",
    token_budget=8192,
)

print(ctx.text())
print(ctx.report)

You can also call redhop.analyze_context(query, chunks) for diagnostics
without assembly.
const { buildContext, Chunk } = require("redhop");

const chunks = [
  new Chunk("...", { source: "doc1.pdf", id: "c1", metadata: { page: 12 } }),
  new Chunk("...", { source: "doc2.pdf", id: "c2", metadata: { page: 3 } }),
];

const ctx = buildContext("What is the governing law?", chunks, {
  strategy: "auto",
  tokenBudget: 8192,
});

LLM provider integration
RedHop hands you a prompt string. Call any provider directly. A few of the common ones:
from openai import OpenAI

ctx = doc.context(query)
response = OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"{ctx.text()}\n\nQuestion: {query}"}],
)
answer = response.choices[0].message.content

import anthropic

ctx = doc.context(query)
response = anthropic.Anthropic().messages.create(
    model="claude-haiku-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"{ctx.text()}\n\nQuestion: {query}"}],
)
answer = response.content[0].text

import requests

ctx = doc.context(query)
response = requests.post("http://localhost:11434/api/generate", json={
    "model": "llama3",
    "prompt": f"{ctx.text()}\n\nQuestion: {query}",
    "stream": False,
})
answer = response.json()["response"]

import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";
import { Document } from "redhop";

// In your API route handler:
const doc = Document.fromFile("contract.pdf");
const ctx = doc.context(query);

const result = streamText({
  model: openai("gpt-4o-mini"),
  prompt: `${ctx.text}\n\nQuestion: ${query}`,
});

return result.toDataStreamResponse();

More examples (in the repo)
The website surfaces the patterns you’ll most often reach for. The repo’s
examples/ tree has runnable demos in all three languages: same 11 scenarios per
language, real-world flavored (legal contract, support FAQ, on-call
runbook, multi-hop research, chat history):
- Python: examples/python/ (01_quickstart.py … 11_folder_indexing.py)
- Node.js: examples/nodejs/ (.cjs mirrors of the Python set)
- Rust: examples/rust/ via cargo run -p redhop-rust-examples --example <NN_name> --release
Each language folder’s README explains how to run, what each demo
covers, and the relevant finding in docs/findings/ where applicable.
For measurement probes (the evidence-layer probes behind the
findings: CUAD harnesses, multilingual sweeps, the four-corner-observation
falsifications), see
crates/examples/examples/
(~60 Rust files, run with cargo run -p redhop-examples --example <name>).
Those are evidence, not how-to: they prove things on real workloads.
Where to go next
- Deploy RedHop to production: patterns for shipping these as services
- Choosing a configuration: when to pick lexical / hybrid / +rerank
- Options reference: every parameter, every default