
Examples

Each pattern below is a complete snippet you can paste into a file and execute. Pick the shape that matches your use case. If you haven’t done the tutorial yet, start there. This page is the “now how do I do X” reference.

The repo has many more focused examples for benchmarking, evaluation, and calibration work: crates/examples/ ships 38 Rust examples, and python/examples/ has the Python ones.

Tested against RedHop 0.3.x.

One file, many questions. Index once on startup, query for the lifetime of the process.

import redhop
doc = redhop.Document.from_file("contract.pdf")
for q in ["What is the governing law?",
          "What is the refund window?",
          "What is the liability cap?"]:
    ctx = doc.context(q)
    print(f"\nQ: {q}")
    print(f"A-context: {ctx.text()[:200]}...")
    print(f" cited: {ctx.citations[0]['source']} p{ctx.citations[0]['page']}")

Use this when the corpus is a directory of files (a docs folder, a knowledge base, a Git repo of markdown). persist=True writes an on-disk index, so a reload only pays for the files that changed: sub-second for a 5,000-file repo where 3 files changed.

import redhop
doc = redhop.Document.from_folder("./docs", persist=True)
print(f"indexed: {doc.n_chunks} chunks across {doc.n_files} files")
ctx = doc.context("How do I configure retries?")
for c in ctx.citations:
    print(f" {c['source']}{c['heading']}")

Honors .gitignore by default. Custom ignore patterns:

doc = redhop.Document.from_folder("./repo",
                                  persist=True,
                                  ignore=["*.lock", "node_modules/**", "target/**"])
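To picture why a persisted index makes reloads cheap, here is a minimal sketch of change detection against a stored manifest of content hashes. It is an illustration only and not RedHop's on-disk format or algorithm: `changed_files`, `file_digest`, and the JSON manifest are hypothetical names and choices made for this example.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """SHA-256 of the file contents, used as the change fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(root, manifest_path):
    """Return (changed, deleted) relative paths since the last run.

    Hypothetical helper: the manifest is a JSON map of relative path to
    digest. Keep the manifest outside `root` so it is not scanned itself.
    """
    root = Path(root)
    manifest_path = Path(manifest_path)
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    # New or modified files need re-chunking; unchanged ones can be reused.
    changed = [name for name, digest in new.items() if old.get(name) != digest]
    deleted = [name for name in old if name not in new]
    manifest_path.write_text(json.dumps(new, indent=2))
    return changed, deleted
```

On the first run every file counts as changed. On later runs only edited or new files are reported, which is the work a reload would have to redo.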

The default lexical tier handles most document QA. For synonym-heavy corpora where the query and the answer share no surface words (HR FAQs, support tickets, multilingual content), reach for hybrid. First run downloads roughly 80 MB for the embedding model, then caches it.

import redhop
doc = redhop.Document.from_file("support_kb.md",
                                retrieval="hybrid",
                                model="bge-small")
ctx = doc.context("why did the worker leave?")
# Will find "terminated employee" even though the words don't overlap.
print(ctx.text())

See Choosing a configuration for when to reach for hybrid vs lexical vs cross-encoder rerank.

Useful for documents with parallel clauses: contracts with regional overrides (“EU override of §X”, “UK override of §X”), policies with per-region sub-sections. The top retrieval hit is correct on its own, but the section heading and adjacent chunks add context the LLM needs to disambiguate. neighbors includes adjacent chunks, and include_heading prepends the section title.

import redhop
doc = redhop.Document.from_file("msa.pdf",
                                retrieval="hybrid", model="bge-small")
ctx = doc.context(
    "What law applies in the UK?",
    neighbors=1,           # ± 1 chunk around each hit
    include_heading=True,  # prepend the section heading
)
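The effect of neighbors and include_heading can be sketched in plain Python. This is an illustration of the idea, not RedHop's assembly code: chunks are modeled as hypothetical (heading, text) tuples, and the choice to print each heading once per run of chunks is an assumption of this sketch.

```python
def expand_with_neighbors(hit_indices, n_chunks, neighbors=1):
    """Widen each hit to +/- `neighbors` chunks, clipped to the document."""
    selected = set()
    for i in hit_indices:
        lo = max(0, i - neighbors)
        hi = min(n_chunks - 1, i + neighbors)
        selected.update(range(lo, hi + 1))
    return sorted(selected)  # document order, overlaps merged


def assemble(chunks, hit_indices, neighbors=1, include_heading=True):
    """Join the selected chunks, labelling each run with its heading."""
    parts = []
    last_heading = None
    for i in expand_with_neighbors(hit_indices, len(chunks), neighbors):
        heading, text = chunks[i]
        if include_heading and heading != last_heading:
            parts.append(f"[{heading}]")
            last_heading = heading
        parts.append(text)
    return "\n".join(parts)


chunks = [
    ("§12 Governing law", "New York law governs."),
    ("EU override of §12", "Irish law governs in the EU."),
    ("UK override of §12", "English law governs in the UK."),
    ("§13 Notices", "Notices must be in writing."),
]
print(assemble(chunks, hit_indices=[2], neighbors=1))
```

With the heading labels in place, the model can tell the UK clause apart from the EU clause sitting right next to it.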

Adds a second-stage scorer that reads each (query, passage) pair together rather than independently, giving more precise ranking on paraphrase-mismatch corpora. The cost is five to ten times the query latency. Enable it only when you’ve measured that it helps on your corpus.

import redhop
doc = redhop.Document.from_file("kb.md",
                                retrieval="hybrid",
                                model="bge-small",
                                rerank="cross-encoder")  # adds the reranker stage
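One way to put a number on the latency side of that trade-off is a small timing harness, sketched below. `median_latency_ms` is a hypothetical helper, not part of RedHop. It only measures cost; whether ranking quality improves still has to be checked against a labeled set.

```python
import statistics
import time


def median_latency_ms(run_query, queries, repeats=3):
    """Median wall-clock time in milliseconds of run_query over all queries."""
    samples = []
    for q in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(q)
            samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Usage sketch, assuming two Document objects built with and without rerank:
#   base = median_latency_ms(doc_hybrid.context, queries)
#   rerank = median_latency_ms(doc_rerank.context, queries)
#   print(f"rerank costs {rerank / base:.1f}x the baseline latency")
```

The median is used rather than the mean so that a single slow first call, such as a model download or cold cache, does not dominate the result.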

Compose a chain of query rewriters (template stripping, vocabulary expansion, your own) and read the per-stage trace off the same Decision Report. Useful for templated workloads (legal QA, support triage) where the query carries boilerplate the index doesn’t, or where the query and answer share no surface vocabulary.

import redhop
stripper = redhop.Stripper(["highlight", "the", "parts", "of", "this", "contract"])
vocab = redhop.Vocabulary({"change of control": ["merger", "successor", "acquisition"]})
doc = redhop.Document.from_file("contract.pdf")
ctx = doc.context_with_rewrites(
'Highlight the parts of this contract related to "Change of Control".',
[stripper, vocab],
)
for rec in ctx.report.query_rewrites:
    print(rec.stage, "matched=", rec.matched, "added=", rec.added, "removed=", rec.removed)
# strip       matched=['highlight', 'the', 'parts', 'of', 'this', 'contract']
#             added=[]
#             removed=['highlight', 'the', 'parts', 'of', 'this', 'contract']
# vocabulary  matched=['change of control']
#             added=['merger', 'successor', 'acquisition']
#             removed=[]

Each stage’s matched / added / removed is the audit trail. Score the lift with redhop.evaluate(query, ctx, gold_chunks=gold_ids) against a small labeled set: no LLM judge, milliseconds per query.
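To make the matched / added / removed trace concrete, here is a self-contained sketch of a two-stage rewrite chain in plain Python. It is not RedHop's implementation: `strip_stage`, `vocabulary_stage`, `run_chain`, and `RewriteRecord` are hypothetical names, and leaving double-quoted spans untouched during stripping is an assumption made so that the quoted phrase survives for the vocabulary stage.

```python
import re
from dataclasses import dataclass


@dataclass
class RewriteRecord:
    stage: str
    matched: list
    added: list
    removed: list


def strip_stage(query, stop_terms):
    """Drop boilerplate terms, leaving double-quoted spans untouched."""
    stop = {t.lower() for t in stop_terms}
    kept, removed = [], []
    # Odd-numbered pieces of a split on '"' sit inside quotes.
    for i, piece in enumerate(query.split('"')):
        inside_quotes = i % 2 == 1
        for token in re.findall(r"\w+", piece.lower()):
            if token in stop and not inside_quotes:
                removed.append(token)
            else:
                kept.append(token)
    return " ".join(kept), RewriteRecord("strip", list(removed), [], list(removed))


def vocabulary_stage(query, expansions):
    """Append the listed expansion terms for every phrase found in the query."""
    matched, added = [], []
    lowered = query.lower()
    for phrase, terms in expansions.items():
        if phrase.lower() in lowered:
            matched.append(phrase)
            added.extend(terms)
    return " ".join([query, *added]), RewriteRecord("vocabulary", matched, added, [])


def run_chain(query, stages):
    """Apply each stage in order and collect one record per stage."""
    records = []
    for stage in stages:
        query, record = stage(query)
        records.append(record)
    return query, records


stages = [
    lambda q: strip_stage(q, ["highlight", "the", "parts", "of", "this", "contract"]),
    lambda q: vocabulary_stage(q, {"change of control": ["merger", "successor", "acquisition"]}),
]
final_query, records = run_chain(
    'Highlight the parts of this contract related to "Change of Control".', stages
)
for rec in records:
    print(rec.stage, "matched=", rec.matched, "added=", rec.added, "removed=", rec.removed)
print(final_query)
```

Because every stage returns its own record, a surprising rewrite can be traced to the stage that caused it.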

ctx.citations is a list. Render it however your UI needs. A common pattern is footnote-style citations beneath the LLM answer.

ctx = doc.context(question)
# ... LLM call returns `answer` ...
# Footnote-style output
print(f"{answer}\n\n--- Sources ---")
for i, c in enumerate(ctx.citations, 1):
    where = c['source']
    if c.get('page'): where += f", p.{c['page']}"
    if c.get('heading'): where += f" → {c['heading']}"
    print(f"[{i}] {where}")
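If several retrieved chunks come from the same page and section, the footnote list repeats itself. The sketch below wraps the same rendering in a function and collapses duplicate locations. `format_footnotes` is a hypothetical helper, and it assumes each citation is a dict with a 'source' key and optional 'page' and 'heading' keys, as in the snippet above.

```python
def format_footnotes(answer, citations):
    """Render the answer followed by a de-duplicated, numbered source list."""
    seen = {}
    for c in citations:
        where = c["source"]
        if c.get("page"):
            where += f", p.{c['page']}"
        if c.get("heading"):
            where += f" → {c['heading']}"
        seen.setdefault(where, None)  # dicts keep first-seen order
    lines = [answer, "", "--- Sources ---"]
    lines += [f"[{i}] {where}" for i, where in enumerate(seen, 1)]
    return "\n".join(lines)


citations = [
    {"source": "msa.pdf", "page": 4, "heading": "Governing law"},
    {"source": "msa.pdf", "page": 4, "heading": "Governing law"},
    {"source": "faq.md"},
]
print(format_footnotes("English law applies.", citations))
```

De-duplication renumbers the list, so skip it if the answer text already refers to citations by their original position.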

doc.analyze(query) returns the same report shape as context() but without assembling the context, so it’s cheap. Useful when you want to decide whether to call the LLM at all (skip the call if the corpus clearly doesn’t contain the answer, surface a “no confident match” state in the UI, etc.).

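def check_corpus(doc, query):
    # Wrapped in a function so the early return is valid; the name is illustrative.
    report = doc.analyze(query)
    if report.n_input_chunks == 0:
        print("no relevant content -- skip the LLM call")
        return None
    if report.evidence_density < 0.1:
        print(f"low confidence (density {report.evidence_density:.2f}) -- caller may want to ask differently")
    return report

report = check_corpus(doc, "Where is the company headquartered?")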
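For a UI that needs an explicit “no confident match” state, the same two checks can be turned into a three-way routing decision. This is a sketch under assumptions: `Route`, `route`, and `FakeReport` are hypothetical names, `FakeReport` stands in for the real report so the example runs without RedHop, and the 0.1 threshold is only the example value used above, not a recommended setting.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SKIP = "skip"                      # nothing relevant: do not call the LLM
    LOW_CONFIDENCE = "low_confidence"  # call it, but flag the answer in the UI
    ANSWER = "answer"                  # enough evidence: answer normally


@dataclass
class FakeReport:
    """Stand-in exposing the two report fields used in the snippet above."""
    n_input_chunks: int
    evidence_density: float


def route(report, min_density=0.1):
    """Map a report to a routing decision for the caller."""
    if report.n_input_chunks == 0:
        return Route.SKIP
    if report.evidence_density < min_density:
        return Route.LOW_CONFIDENCE
    return Route.ANSWER


print(route(FakeReport(n_input_chunks=0, evidence_density=0.0)))
print(route(FakeReport(n_input_chunks=4, evidence_density=0.05)))
print(route(FakeReport(n_input_chunks=4, evidence_density=0.42)))
```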

If you already have retrieved chunks from your own pipeline (BM25 in Postgres, an existing vector store), and you only want RedHop’s context assembly and the Decision Report, use the lower-level build_context. As of 0.3.0, wrap each chunk in the typed redhop.Chunk(...) so source / id / metadata flow through to citations:

import redhop
chunks = [
    redhop.Chunk(
        "...",
        source="doc1.pdf",
        id="c1",
        metadata={"page": 12},
    ),
    redhop.Chunk(
        "...",
        source="doc2.pdf",
        id="c2",
        metadata={"page": 3},
    ),
    # ... your retrieved set ...
]
ctx = redhop.build_context(
    query="What is the governing law?",
    retrieved_chunks=chunks,
    strategy="auto",
    token_budget=8192,
)
print(ctx.text())
print(ctx.report)

You can also call redhop.analyze_context(query, chunks) for diagnostics without assembly.

RedHop hands you a prompt string. Call any provider directly. For example, with OpenAI:

from openai import OpenAI
ctx = doc.context(query)
response = OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"{ctx.text()}\n\nQuestion: {query}"}],
)
answer = response.choices[0].message.content
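Since the context is just a string, the prompt assembly can be kept separate from the provider call. The helper below is a hypothetical sketch, not a RedHop API. It builds the role/content message list used in the OpenAI call above, with an optional system message.

```python
def build_messages(context_text, query, system=None):
    """Build a chat-style message list from assembled context and a question."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append(
        {"role": "user", "content": f"{context_text}\n\nQuestion: {query}"}
    )
    return messages


messages = build_messages(
    "Section 12: English law governs this agreement.",
    "What is the governing law?",
    system="Answer only from the provided context.",
)
print(messages)
```

Some providers take the system prompt as a separate argument rather than as a message, so adapt the first entry to the SDK you use.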

The website surfaces the patterns you’ll most often reach for. The repo’s examples/ tree has runnable demos in all three languages: the same 11 scenarios per language, real-world flavored (legal contract, support FAQ, on-call runbook, multi-hop research, chat history).

Each language folder’s README explains how to run, what each demo covers, and the relevant finding in docs/findings/ where applicable.

For measurement probes (the evidence-layer probes behind the findings: CUAD harnesses, multilingual sweeps, the four-corner-observation falsifications), see crates/examples/examples/ (~60 Rust files, run with cargo run -p redhop-examples --example <name>). Those are evidence, not how-to: they prove things on real workloads.