
RedHop: A Simpler Haystack Alternative for Document RAG

If you’re searching for a Haystack alternative, you’re probably hitting one of these walls:

  • The pipeline DAG is heavy for simple cases. Two pipelines (indexing + query), components for each step, explicit socket wiring with connect() — for one PDF and one question.
  • Document store assumed from day one. Even the in-memory store is its own object you manage. For prototyping document QA, you want the file in, the answer out — without standing up infrastructure first.
  • Verbose for a small surface. Twenty-plus lines for a basic RAG path that mirrors what other libraries do in five. Production-grade, but heavy for the common case.
  • Python only. No TypeScript or Rust story; if you’re shipping to a non-Python service, you’re rewriting from scratch.
  • No visibility into the retrieval decision. The pipeline runs and returns a result; when the wrong chunk surfaces, you instrument it yourself.

RedHop is a focused alternative: an in-process retrieval + context library that does one thing — turn a document and a question into the right LLM prompt context — and tells you exactly what it kept, dropped, and why.

import redhop
doc = redhop.Document.from_file("contract.pdf")
ctx = doc.context("What is the governing law?")
answer = llm.generate(ctx.text())
print(ctx.report) # what was kept, dropped, and why

That’s the whole surface. Three calls. No pipelines, no components, no document store. Python, Node, and Rust over a Rust core — all in-process.


Should you switch from Haystack to RedHop?

The honest answer: it depends on what you’re building.

| If you need… | Pick |
| --- | --- |
| Document QA with citations and a Decision Report | RedHop |
| In-process retrieval, no document store, no infra | RedHop |
| The same API in Python, Node, and Rust | RedHop |
| Multi-step pipelines with branching, loops, conditionals | Haystack |
| Component reuse and swappable pieces in production | Haystack |
| Strong evaluation framework (haystack-experimental) | Haystack |
| deepset Cloud (hosted, managed) | Haystack (via deepset) |
| Mature production deployments at scale | Haystack |

Haystack is a production-grade pipeline framework built for composable NLP/RAG workflows. RedHop is a library that does one bounded step: here’s the file, here’s the question, give me the right context with a decision report. If you need Haystack’s pipeline composition, stay there. If you just need the three-call shape with observability, RedHop is simpler.


Same contract.pdf. Same question. Here is the RedHop version; the Haystack equivalent is walked through step by step further down this page.

import redhop
from openai import OpenAI

query = "What is the governing law?"
ctx = redhop.Document.from_file("contract.pdf").context(query)
# parsed, chunked, retrieved, and token-budgeted internally
response = OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"{ctx.text()}\n\nQuestion: {query}"}],
)
print(response.choices[0].message.content)

What you stand up: nothing. Point it at the file and ask; parsing, chunking, retrieval, and token-budgeting happen inside — and every call returns a Decision Report explaining what it kept and why.
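To make "token-budgeting" concrete, here is a minimal sketch of one common approach: walk the ranked chunks best-first and keep whatever still fits. This is not RedHop's actual implementation. The function name, the `max_tokens` parameter, and the whitespace word count standing in for a real tokenizer are all assumptions made for illustration.

```python
def assemble_within_budget(ranked_chunks, max_tokens, count_tokens=None):
    """Greedily keep the highest-ranked chunks that fit within max_tokens.

    ranked_chunks is assumed to be sorted best-first.
    Returns (kept, dropped, tokens_used).
    """
    if count_tokens is None:
        # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
        count_tokens = lambda text: len(text.split())

    kept, dropped, used = [], [], 0
    for chunk in ranked_chunks:
        cost = count_tokens(chunk)
        if used + cost <= max_tokens:
            kept.append(chunk)
            used += cost
        else:
            # Too large for the remaining budget; a smaller, later chunk may still fit.
            dropped.append(chunk)
    return kept, dropped, used
```

Returning the dropped chunks alongside the kept ones is what makes a "what was kept, dropped, and why" report possible in the first place.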

Haystack’s component model is well-engineered — every step is a discrete piece with named input/output sockets, which makes it easy to swap pieces in production. But for one PDF and one question, that machinery is overhead. RedHop has one concept: document → context. Everything else is an implementation detail.

The broader head-to-head benchmark on the Comparison page covers LangChain and LlamaIndex specifically — Haystack isn’t in those numbers yet, so the comparison above is structural (code-vs-code) rather than measured retention / answer-quality scores.


What Haystack gives you that RedHop doesn’t

To be clear: Haystack offers things RedHop doesn’t even try to provide:

  • Composable pipelines with arbitrary branching. Multi-step retrieval, conditional routing, loop-based agentic flows — Haystack’s pipeline graph supports all of it. RedHop is one path: chunk → BM25 (or hybrid) → assemble.
  • A large component ecosystem. Many converters, preprocessors, embedders, retrievers, rankers, generators — for Postgres, Elasticsearch, Pinecone, Weaviate, OpenSearch, Qdrant, you name it. RedHop has built-in parsers for PDF / DOCX / PPTX / XLSX / Markdown / code, BM25 by default, optional ONNX embeddings. That’s it.
  • deepset Cloud. Managed Haystack hosting with a UI, evaluation dashboards, prompt management. RedHop is OSS only, in-process; you run it.
  • Strong evaluation framework. Haystack ships with eval harnesses (haystack-experimental) for retrieval and answer quality metrics across components. RedHop ships with the Decision Report on every call and benchmark scripts in the repo, but no formal eval harness yet.
  • A mature production track record. deepset has been shipping Haystack since 2019; battle-tested at enterprise scale. RedHop is alpha.

If you need any of the above, stay on Haystack — or use the two together (RedHop as a component inside a Haystack pipeline for the document-context step).


What RedHop gives you that Haystack doesn’t

Every doc.context(query) returns a ctx.report describing exactly what happened — what was kept, what was dropped, whether the engine intervened, why it chose what it chose.

RedHop Decision Report
======================
Decision: Auto → passthrough (small context, no intervention needed)
Why:
- 1,240 tokens — below the dilution gate (1,500 tokens)
- pruning a small clean context risks dropping reasoning evidence
Result:
- kept all 8 retrieved chunks
- evidence retained 100%, second-hop links preserved
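The passthrough decision in the sample report reads like a threshold check, and it can be sketched that way. This is a guess at the shape of the logic, not RedHop's engine: only the 1,500-token gate and the "passthrough" outcome come from the report, while the function name and the "prune" label for the other branch are hypothetical.

```python
DILUTION_GATE_TOKENS = 1500  # threshold quoted in the sample report


def auto_decision(total_tokens, gate=DILUTION_GATE_TOKENS):
    """Return 'passthrough' for contexts below the gate, else 'prune'.

    'prune' is a hypothetical label for "the engine intervenes".
    """
    return "passthrough" if total_tokens < gate else "prune"
```

With the 1,240 tokens from the sample report, this returns "passthrough", matching the decision shown.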

Haystack returns a result dict with whatever the last pipeline component produced; observability is what you instrument yourself. With RedHop, the report is structured data on every call — auto_decision, total_tokens, n_input_chunks, n_selected, retained_evidence_ratio, second_hop_rescue_count. You can also run doc.analyze(query) to get the same diagnostics without assembling a context.
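Because the report is structured, it can feed a monitoring check instead of being read by eye. The sketch below assumes the fields are available as a plain dict. The field names come from the list above, but the dict shape, the `report_warnings` helper, the 0.9 threshold, and the sample values for `auto_decision` and `second_hop_rescue_count` are assumptions.

```python
def report_warnings(report, min_retained=0.9):
    """Return human-readable warnings for a report given as a plain dict."""
    warnings = []
    ratio = report["retained_evidence_ratio"]
    if ratio < min_retained:
        warnings.append(f"evidence retained {ratio:.0%} is below {min_retained:.0%}")
    dropped = report["n_input_chunks"] - report["n_selected"]
    if dropped > 0:
        warnings.append(f"{dropped} of {report['n_input_chunks']} chunks dropped")
    return warnings


# Values mirror the sample report: 1,240 tokens, all 8 chunks kept, 100% retained.
sample = {
    "auto_decision": "passthrough",  # assumed string form
    "total_tokens": 1240,
    "n_input_chunks": 8,
    "n_selected": 8,
    "retained_evidence_ratio": 1.0,
    "second_hop_rescue_count": 0,    # assumed value
}
print(report_warnings(sample))
```

A check like this can run on every call in production and log only the queries where retrieval looks suspicious.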

Haystack’s default in-memory document store is its own object that lives separately from the pipeline; you wire components to it explicitly via DocumentWriter on the indexing side and InMemoryEmbeddingRetriever on the query side. In the example further down this page, that comes to two pipelines, eight components, and six connect() calls.

RedHop’s default tier is BM25 — no document store, no separate index object you manage, no pipeline DAG to wire. Zero model download, zero embedding cost, sub-100ms warm queries. Most document QA — code, API references, runbooks, financial reports, handbooks — works on lexical alone, because the words in the question are usually the words in the answer.
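For readers who have not looked inside lexical retrieval, here is a minimal textbook BM25 scorer. It is a generic sketch, not RedHop's Rust implementation: the whitespace tokenizer, the `k1` and `b` defaults, and the function name are assumptions, and it expects a non-empty list of non-empty documents.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Non-negative IDF variant (the form used by Lucene).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is normalized by document length via b.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "this agreement is governed by the law of delaware",
    "payment is due within thirty days",
    "either party may terminate with notice",
]
print(bm25_scores("governing law", docs))
```

Only the first document scores above zero here, and only because of "law": "governing" does not match "governed" without stemming. That is the trade-off in one example: lexical scoring is cheap and precise when the words line up, and it is where the hybrid tier earns its keep when they don't.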

If you need semantic retrieval, opt into retrieval="hybrid" with a small embedding model (bge-small, ~80MB, auto-downloaded). Even then, retrieval is exact cosine over your in-memory chunks — no ANN index, no vector store, no embedded service.
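"Exact cosine, no ANN index" means a brute-force scan: score every chunk vector against the query and sort. A minimal pure-Python sketch of that idea follows; the function names are hypothetical and the toy two-dimensional vectors stand in for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_exact(query_vec, chunk_vecs, k=3):
    """Score every chunk against the query and return the k best as (index, score)."""
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(chunk_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [(i, score) for score, i in scored[:k]]


print(top_k_exact([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], k=2))
```

The scan is linear in the number of chunks, which is why it works for documents and folders held in memory and why a vector store only becomes necessary at a much larger scale.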

Load. Ask. Read. That’s the API.

doc = redhop.Document.from_file("contract.pdf") # load (or .from_folder, .from_text, .from_bytes)
ctx = doc.context("What is the governing law?") # ask
print(ctx.text()) # the prompt for your LLM
for c in ctx.citations: ... # source / page / heading / line per chunk
print(ctx.report) # the decision

Compare that to Haystack’s Pipeline → components → DocumentStore → run flow, with its nested input dict. Each piece is its own concept with its own configuration surface.

Haystack is Python-only — no official TypeScript port, no Rust. RedHop ships the same surface in Python, Node, and Rust over a single Rust core. Prototype in Python, ship the same API in your Rust service or Electron app.

RedHop runs in your process. No service to call, no hosted endpoint, no API key. The optional embedding model is downloaded once and runs locally via ONNX. Your documents never leave the box. For finance / legal / health teams with data residency requirements, this is the shape of the answer.


If you’ve got an existing Haystack RAG pipeline doing document QA, here’s the equivalent in RedHop.

Haystack:

from haystack import Pipeline
from haystack.components.converters import PyPDFToDocument
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.embedders import OpenAIDocumentEmbedder
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
doc_store = InMemoryDocumentStore()
indexing = Pipeline()
indexing.add_component("converter", PyPDFToDocument())
indexing.add_component("splitter", DocumentSplitter(split_by="word", split_length=200))
indexing.add_component("embedder", OpenAIDocumentEmbedder())
indexing.add_component("writer", DocumentWriter(document_store=doc_store))
indexing.connect("converter", "splitter")
indexing.connect("splitter", "embedder")
indexing.connect("embedder", "writer")
indexing.run({"converter": {"sources": ["contract.pdf"]}})

RedHop:

import redhop
doc = redhop.Document.from_file("contract.pdf")

That’s it. PDF parsing, chunking, indexing — all behind the API. No embedding call (default tier is BM25). For semantic retrieval add retrieval="hybrid", model="bge-small" to the constructor.
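This page does not say how the hybrid tier combines lexical and semantic results. One widely used option is reciprocal rank fusion (RRF), sketched below purely as an illustration of what "hybrid" can mean. Nothing here is confirmed to be RedHop's method; the function name and the conventional `k=60` constant are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of chunk ids into a single ranking.

    Each chunk earns 1 / (k + rank) from every list it appears in.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)


lexical = ["a", "b", "c"]   # e.g. a BM25 ordering
semantic = ["b", "c", "a"]  # e.g. an embedding ordering
print(reciprocal_rank_fusion([lexical, semantic]))
```

RRF only needs ranks, not comparable scores, which is why it is a popular way to merge BM25 and embedding results.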

Haystack:

from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
query = "What is the governing law?"
template = [ChatMessage.from_user(
    "Answer using only the context.\n\n"
    "{% for d in documents %}{{d.content}}\n{% endfor %}\n"
    "Question: {{query}}"
)]
# Reuses Pipeline and doc_store from the indexing step above
querying = Pipeline()
querying.add_component("embedder", OpenAITextEmbedder())
querying.add_component("retriever", InMemoryEmbeddingRetriever(document_store=doc_store))
querying.add_component("prompt", ChatPromptBuilder(template=template))
querying.add_component("llm", OpenAIChatGenerator(model="gpt-4o-mini"))
querying.connect("embedder.embedding", "retriever.query_embedding")
querying.connect("retriever.documents", "prompt.documents")
querying.connect("prompt.prompt", "llm.messages")
answer = querying.run({"embedder": {"text": query}, "prompt": {"query": query}})["llm"]["replies"][0].text

RedHop (LLM-agnostic — bring your own):

from openai import OpenAI

ctx = doc.context("What is the governing law?")
answer = OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"{ctx.text()}\n\nQuestion: What is the governing law?"}],
).choices[0].message.content

Haystack wraps the LLM call in its pipeline. RedHop hands you the prompt string and lets you call any provider directly — no component wrapping, no socket wiring.
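Since the hand-off is just a string, the provider-specific part can stay tiny. The helper below is a sketch: `build_messages` is a hypothetical name, and it simply produces the role/content message list used in the OpenAI example above, with an optional system instruction.

```python
def build_messages(context_text, question, system_prompt=None):
    """Build a chat message list from a context string and a question."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append(
        {"role": "user", "content": f"{context_text}\n\nQuestion: {question}"}
    )
    return messages


msgs = build_messages(
    "Section 12: This Agreement is governed by the laws of Delaware.",
    "What is the governing law?",
    system_prompt="Answer using only the context.",
)
print(msgs)
```

Swapping providers then means changing only the client call that receives `msgs`, not the retrieval code.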

Haystack:

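# The retriever's output is consumed by the prompt builder, so ask run() to
# return it as well, e.g. with include_outputs_from={"retriever"}.
retrieved = querying.run(...)["retriever"]["documents"]
for d in retrieved:
    print(d.meta, d.content)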

RedHop:

for c in ctx.citations:
    print(c["source"], c["page"], c["heading"], c["line"])

Same shape, simpler keys. source plus whichever of page / heading / line the format provides — no separate metadata layer.
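If you want a one-line citation for display, a small formatter over those keys is enough. This sketch assumes each citation behaves like a dict in which page, heading, and line may be missing or None depending on the format. The `format_citation` helper and its label text are hypothetical.

```python
def format_citation(citation):
    """Render a citation dict as 'source, page N, section X, line N'.

    Only the locators that are present (and not None) are included.
    """
    parts = [str(citation["source"])]
    for key, label in (("page", "page"), ("heading", "section"), ("line", "line")):
        value = citation.get(key)
        if value is not None:
            parts.append(f"{label} {value}")
    return ", ".join(parts)


print(format_citation({"source": "contract.pdf", "page": 12, "heading": "Governing Law"}))
print(format_citation({"source": "main.py", "line": 42}))
```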

Haystack:

from pathlib import Path
indexing.run({"converter": {"sources": list(Path("./docs").glob("**/*.pdf"))}})

RedHop:

doc = redhop.Document.from_folder("./docs", persist=True)

from_folder honors .gitignore, accepts custom ignore patterns, and optionally writes an incremental on-disk index — reload is O(changed files), not O(all files).
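The "O(changed files)" idea can be sketched with a manifest of file signatures: remember each file's modification time and size, and only re-index files whose signature differs. This is an illustration of the general technique, not RedHop's on-disk index format. The function name, the manifest shape, and the PDF-only default pattern are assumptions.

```python
from pathlib import Path


def scan_for_changes(folder, manifest, pattern="**/*.pdf"):
    """Compare files under folder with the manifest from a previous scan.

    manifest maps path -> (mtime_ns, size).
    Returns (changed_paths, new_manifest).
    """
    changed, new_manifest = [], {}
    for path in sorted(Path(folder).glob(pattern)):
        if not path.is_file():
            continue
        stat = path.stat()
        signature = (stat.st_mtime_ns, stat.st_size)
        new_manifest[str(path)] = signature
        if manifest.get(str(path)) != signature:
            changed.append(str(path))  # new or modified: needs re-indexing
    return changed, new_manifest
```

The scan still stats every file, which is cheap; the expensive work (parsing and chunking) is limited to the paths in `changed`. Files that disappear simply drop out of the new manifest.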


| Workload | RedHop | Haystack |
| --- | --- | --- |
| Document QA with one or many files | ✅ shorter, observable | ✅ verbose but flexible |
| Multi-step pipelines / conditional flows | ❌ out of scope | ✅ flagship |
| Production deployment at enterprise scale | ⚠️ alpha | ✅ mature |
| Hosted / managed RAG with a dashboard | ❌ | ✅ deepset Cloud |
| Visibility into retrieval decisions | ✅ Decision Report | ❌ DIY observability |
| In-process, no document store, no infra | ✅ | ❌ |
| Same API in Python / Node / Rust | ✅ | ❌ Python only |
| Strong evaluation harness | ⚠️ benchmark scripts | ✅ haystack-experimental |
| Apache-2.0, no commercial gating | ✅ | ✅ (deepset Cloud is paid) |

If your workload sits firmly in document QA and you’ve been wondering why Haystack’s pipeline model feels heavy for a file-in-answer-out flow — RedHop is the alternative you’re looking for. If you’re building multi-step RAG, branching flows, conditional routing, or deploying to enterprise infrastructure — Haystack’s pipeline composition is the better tool.


pip install redhop # Python
cargo add redhop --features files,semantic # Rust
npm install redhop # Node.js -- on npm

Open source under Apache-2.0. Bug reports and use-case feedback welcome at github.com/vysakh0/redhop.