Options

Every parameter has an evidence-backed default — Document.from_text(text) with nothing else is a complete call. This page is the full reference for when you want to tune something.

Chunking (index-time — fixed at construction):

| param | default | what it does |
| --- | --- | --- |
| source | "document" | a label for the document |
| chunk_size | 128 | target tokens per chunk |
| chunk_overlap | 1 | sentences of overlap between adjacent chunks |
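
To make the two numbers concrete, here is a minimal sketch of sentence-overlap chunking. It is not redhop's chunker: `chunk_sentences` is a hypothetical helper, the sentence splitter is a naive regex, and "tokens" are approximated by whitespace-separated words.

```python
import re


def chunk_sentences(text, chunk_size=128, chunk_overlap=1):
    """Greedily pack sentences into chunks of roughly chunk_size tokens.

    Each new chunk starts by repeating the last chunk_overlap sentences
    of the previous one. Illustration only, not the library's algorithm.
    """
    # Naive sentence split; a real chunker would be more careful.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    def n_tokens(sentence):
        # Whitespace words stand in for model tokens.
        return len(sentence.split())

    chunks, current = [], []
    for sentence in sentences:
        size = sum(n_tokens(s) for s in current)
        if current and size + n_tokens(sentence) > chunk_size:
            chunks.append(" ".join(current))
            # Carry the trailing sentences forward as overlap.
            current = current[-chunk_overlap:] if chunk_overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


demo_chunks = chunk_sentences("A b c. D e f. G h i. J k l.", chunk_size=6, chunk_overlap=1)
```

With one sentence of overlap, each chunk in `demo_chunks` begins with the sentence that ended the previous chunk, which is the continuity the `chunk_overlap` default is meant to provide.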

Assembly (how the context is built from candidates):

| param | default | what it does |
| --- | --- | --- |
| strategy | "auto" | auto · reasoning_preserving · distractor_filtered · redundancy_pruned · max_density · raw_topk |
| token_budget | 8192 | default assembled-context budget (override per call) |
| candidate_k | 20 | how many candidates to retrieve before assembly |

Retrieval (lexical by default; the embedding options apply only to the dense modes, retrieval="hybrid" and retrieval="semantic"; see Retrieval options):

| param | default | what it does |
| --- | --- | --- |
| retrieval | "lexical" | "lexical" (BM25), "hybrid" (BM25 prune → dense rerank), or "semantic" (dense over every chunk) |
| model | None | name of a built-in embedding model to download: "bge-small" / "bge-base" |
| candidate_pool | 50 | hybrid only: BM25 prune depth before the dense rerank |
| rerank | None | optional cross-encoder second stage: "cross-encoder" (auto-downloaded) re-scores the candidate pool by jointly reading each (query, passage) pair. Works on any tier; off by default (details) |
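
The "hybrid" shape (cheap lexical prune, then dense rerank of the survivors) can be sketched as below. This is an illustration under assumptions, not redhop's code: `hybrid_search`, `lexical_score`, and `toy_embed` are hypothetical names, a plain term-overlap count stands in for BM25, and the sketch assumes the reranked pool is then cut to `candidate_k`.

```python
import math


def lexical_score(query, passage):
    """Stand-in for BM25: number of distinct query terms found in the passage."""
    return len(set(query.lower().split()) & set(passage.lower().split()))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query, passages, embed, candidate_pool=50, candidate_k=20):
    # Stage 1: lexical prune down to candidate_pool passage indices.
    ranked = sorted(range(len(passages)),
                    key=lambda i: lexical_score(query, passages[i]),
                    reverse=True)
    pool = ranked[:candidate_pool]
    # Stage 2: embed only the survivors and rerank by cosine similarity.
    q = embed(query)
    pool.sort(key=lambda i: cosine(q, embed(passages[i])), reverse=True)
    return pool[:candidate_k]


def toy_embed(text):
    """Three-dimensional toy embedding, just enough to run the example."""
    words = text.lower().split()
    return [words.count("termination"), words.count("notice"), words.count("fee")]


passages = [
    "termination requires thirty days notice",
    "notice of fee changes",
    "the fee schedule is attached",
    "governing law is delaware",
]
top = hybrid_search("termination notice", passages, toy_embed,
                    candidate_pool=2, candidate_k=1)
```

The point of the prune is cost: only `candidate_pool` passages are embedded per query, whereas the "semantic" mode scores every chunk.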

The rest are for bringing your own embedding model instead of model= (advanced — see Bring your own model):

| param | default | what it does |
| --- | --- | --- |
| embedder_model | None | path to a local embedding model |
| embedder_tokenizer | None | path to its tokenizer |
| embedder_dim | 384 | the model's output dimension |
| embedder_pooling | "cls" | "cls" or "mean" (matches the model family) |
| embedder_query_prefix | None | query prefix for asymmetric models (E5: "query: ") |
| embedder_passage_prefix | None | passage prefix for asymmetric models (E5: "passage: ") |
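
If the pooling and prefix options are unfamiliar, this small sketch shows what they mean in general terms. `pool_vectors` and `with_prefix` are hypothetical helpers written for this page, not part of redhop, and the mean here ignores padding masks that a real embedder would apply.

```python
def pool_vectors(token_vectors, pooling="cls"):
    """Collapse per-token vectors into a single embedding."""
    if pooling == "cls":
        # Use the first position (the [CLS] token in BERT-style models).
        return list(token_vectors[0])
    if pooling == "mean":
        # Average each dimension across all tokens.
        n = len(token_vectors)
        return [sum(column) / n for column in zip(*token_vectors)]
    raise ValueError(f"unknown pooling: {pooling!r}")


def with_prefix(text, prefix=None):
    """Prepend the model's expected marker, if it uses one."""
    return f"{prefix}{text}" if prefix else text


tokens = [[1.0, 2.0], [3.0, 4.0]]
cls_vec = pool_vectors(tokens, "cls")
mean_vec = pool_vectors(tokens, "mean")
query_text = with_prefix("what notice is required?", "query: ")
```

Picking the wrong pooling or omitting a prefix the model was trained with usually still runs but degrades similarity scores, which is why these should match the model family.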

When you build a Document from chunks you have already made, the constructor takes the same assembly and retrieval options as from_text; the chunking params don't apply.

The file loaders (Loaders) take all the same options. from_folder adds recursive, gitignore, ignore, persist, and index_dir — see from_folder.

doc.context(query, budget=None, neighbors=0, include_heading=False)

| param | default | what it does |
| --- | --- | --- |
| query | (required) | the question |
| budget | the doc's token_budget | per-call budget override; query-time, no re-indexing |
| neighbors | 0 | also include the N adjacent chunks on each side of every selected chunk (same file); see Structural expansion |
| include_heading | False | also include each selected chunk's section heading |

Also: doc.analyze(query) returns the same Decision Report without building a context (pure diagnostics), and doc.n_chunks / len(doc) (Node: doc.chunkCount; Rust: doc.len()) give the chunk count.

A retrieved chunk often answers the question but drops you mid-section — the sentence before it set up the definition, the heading tells you which contract clause it is. neighbors= and include_heading= pull that structure back in, deterministically and with no model: they’re read straight from document order and headings.

ctx = doc.context("what notice is required to terminate?", neighbors=1, include_heading=True)

What makes this more than a ±1 hack is how it cooperates with the finite-attention core. A neighbor pulled in for continuity has little query overlap, so the normal distractor filter would throw it away. Instead, expansion chunks are justified by structure, not relevance — exempt from that filter, but still:

  • bounded by the token budget (companions fill only leftover space after the real evidence is placed),
  • deduplicated (overlapping windows merge),
  • emitted in document order, so each hit reads as a contiguous window rather than scattered fragments,
  • accounted for — they appear in ctx.citations (with their own heading/line) and as ctx.report.n_expanded, and the Decision Report prints a Structural expand: +N line.
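
The four properties above can be sketched in a few lines. This is an illustration of the idea, not redhop's implementation: `expand_with_neighbors` is a hypothetical function, chunks are represented only by their index and token count, and the same-file restriction is left out.

```python
def expand_with_neighbors(selected, chunk_tokens, neighbors=1, budget=8192):
    """Add up to `neighbors` adjacent chunks on each side of every hit.

    selected: indices of the chunks chosen as evidence.
    chunk_tokens: token count of every chunk, in document order.
    """
    keep = set(selected)                       # a set merges overlapping windows
    used = sum(chunk_tokens[i] for i in keep)  # real evidence is placed first
    for hit in selected:
        for offset in range(1, neighbors + 1):
            for j in (hit - offset, hit + offset):
                if not 0 <= j < len(chunk_tokens) or j in keep:
                    continue
                # Companions only fill whatever budget is left over.
                if used + chunk_tokens[j] <= budget:
                    keep.add(j)
                    used += chunk_tokens[j]
    return sorted(keep)                        # emit in document order


sizes = [100] * 10
roomy = expand_with_neighbors([5, 3], sizes, neighbors=1, budget=8192)
tight = expand_with_neighbors([5, 3], sizes, neighbors=1, budget=300)
```

With a roomy budget the two hits merge into one contiguous window; with a tight one, the evidence stays and only the companions that still fit are added.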

Both are off by default, so plain context(query) is unchanged. Start with neighbors=1, include_heading=True for contracts and docs where surrounding context matters.

doc.context(query) returns a built context with everything you need to prompt a model and show your work:

| on the result | what it gives |
| --- | --- |
| ctx.text() | the assembled context as one prompt string |
| ctx.chunks | the selected chunks' text, in order |
| ctx.citations | provenance per selected chunk: {source, page, heading, line, text} (details) |
| ctx.report | the Decision Report: auto_decision, total_tokens, retained_evidence_ratio, … |
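
One way to turn those pieces into a prompt is sketched below. It works on plain values so it runs without the library: pass it the string from ctx.text() and the citation records. `build_prompt` and `format_sources` are hypothetical helpers, and the sketch assumes each citation can be read like a dict with the five fields listed above; adapt the field access if your binding exposes them as attributes.

```python
def format_sources(citations):
    """Render one numbered provenance line per citation."""
    lines = []
    for n, c in enumerate(citations, start=1):
        parts = [c["source"]]
        if c.get("page") is not None:
            parts.append(f"p. {c['page']}")
        if c.get("heading"):
            parts.append(c["heading"])
        if c.get("line") is not None:
            parts.append(f"line {c['line']}")
        lines.append(f"[{n}] " + ", ".join(parts))
    return "\n".join(lines)


def build_prompt(question, context_text, citations):
    """Assemble a grounded prompt: context, then sources, then the question."""
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_text}\n\n"
        f"Sources:\n{format_sources(citations)}\n\n"
        f"Question: {question}"
    )


example_citations = [
    {"source": "contract.pdf", "page": 3, "heading": "Termination",
     "line": 42, "text": "Either party may terminate on 30 days notice."},
]
prompt = build_prompt("What notice is required to terminate?",
                      example_citations[0]["text"], example_citations)
```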

Lower-level: redhop.build_context(query, chunks, ...)

Already have chunks and want to tune the assembly thresholds Document keeps fixed? The low-level API exposes them (Python; the same build_context / ContextConfig surface is re-exported from the redhop Rust crate):

| param | default | what it does |
| --- | --- | --- |
| strategy | "reasoning_preserving" | same strategy values as above |
| token_budget | 8192 | assembled-context budget |
| distractor_min_grounding | 0.10 | grounding bar below which a chunk is "junk" |
| link_min_jaccard | 0.12 | linkage at/above which a low-relevance chunk is rescued as a second hop |
| auto_passthrough_max_tokens | 1500 | auto: pass through at/below this size, prune above |
| redundancy_max_cosine | 0.92 | dedup threshold |
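
Read as decision rules, three of those thresholds look roughly like the sketch below. It only mirrors the table's wording: `jaccard`, `triage`, and `auto_mode` are hypothetical names, and how redhop actually computes grounding or linkage scores is not shown here, so both are taken as inputs.

```python
def jaccard(a, b):
    """Token-set overlap between two texts, from 0.0 to 1.0."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def triage(grounding, best_link,
           distractor_min_grounding=0.10, link_min_jaccard=0.12):
    """Classify one chunk from its grounding score and its best linkage."""
    if grounding >= distractor_min_grounding:
        return "keep"      # clears the grounding bar on its own
    if best_link >= link_min_jaccard:
        return "rescue"    # low relevance, but linked: a second hop
    return "drop"          # below both bars: treated as junk


def auto_mode(total_tokens, auto_passthrough_max_tokens=1500):
    """At or below the limit, pass candidates through; above it, prune."""
    return "passthrough" if total_tokens <= auto_passthrough_max_tokens else "prune"


overlap = jaccard("a b c", "b c d")
```

Raising `distractor_min_grounding` drops more chunks outright, while lowering `link_min_jaccard` rescues more of them as second hops, so the two are best tuned together.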

analyze_context(...) and context_economics(...) give the diagnostics without assembling a context.

Next: Retrieval & context tips — the operational laws behind these defaults · Benchmarks — every number, reproducible.