Enterprise RAG Chatbot (Agentic RAG — In Development/Testing)
Agentic, self-hosted RAG system that consumes the KBMS corpus (15 apps, 2,000+ docs) to deliver cited, policy-compliant answers with no-answer fallback, guardrails, and nightly evals.
Problem
Analysts lacked a safe, reliable way to query 2,000+ internal docs. They needed answers that were not only cited but also policy-compliant, from a system that declines to answer rather than guess.
Approach
Agentic loop: ingestion → enrichment → embeddings + vector DB → retrieval + optional rerank → synthesis → guardrails (must-cite, PII) → templated LLM answer. Backed by nightly evals (groundedness, citation accuracy, latency) and observability logs.
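The retrieval → synthesis → guardrail → fallback portion of the loop can be sketched as below. This is an illustrative, self-contained toy (all names are hypothetical, and a simple lexical-overlap scorer stands in for the real embedding search and vector DB):

```python
# Toy sketch of the retrieval -> synthesis -> guardrail loop.
# All names are hypothetical; lexical overlap stands in for vector search.
from dataclasses import dataclass

NO_ANSWER = "I can't answer that from the corpus."

@dataclass
class Doc:
    doc_id: str
    text: str

def retrieve(query: str, corpus: list[Doc], k: int = 3) -> list[Doc]:
    # Stand-in scorer: rank docs by query-token overlap (real system: embeddings).
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q & set(d.text.lower().split())))
    return scored[:k]

def guardrail(answer: str, sources: list[Doc]) -> str:
    # Must-cite rule: an answer with no supporting sources becomes a no-answer.
    return answer if sources else NO_ANSWER

def answer(query: str, corpus: list[Doc]) -> str:
    hits = retrieve(query, corpus)
    # Drop zero-overlap hits so irrelevant queries trigger the fallback.
    q = set(query.lower().split())
    hits = [d for d in hits if q & set(d.text.lower().split())]
    if not hits:
        return guardrail("", [])
    # "Synthesis" here is just echoing the top hit with citations appended.
    cites = ", ".join(d.doc_id for d in hits)
    return guardrail(f"{hits[0].text} [{cites}]", hits)
```

The key property the sketch preserves is that the fallback is structural, not prompt-based: if nothing survives retrieval plus filtering, the guardrail emits the no-answer response instead of letting the model improvise.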
Results
Produces cited, policy-compliant answers, with a measurable no-answer fallback rate and guardrail-driven self-critique. A nightly evaluation harness tracks groundedness, citation accuracy, and latency. Status: internal testing.
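To make the tracked metrics concrete, here is a minimal sketch of how citation accuracy and groundedness can be scored. These are simplified stand-ins for the Ragas metrics the harness actually uses (token-overlap support instead of LLM-judged faithfulness), and all names are hypothetical:

```python
# Simplified metric sketches (real harness: Ragas). Names are hypothetical.

def citation_accuracy(cited_ids: list[str], retrieved_ids: set[str]) -> float:
    # Fraction of cited doc ids that actually appear in the retrieved set.
    if not cited_ids:
        return 0.0
    return sum(c in retrieved_ids for c in cited_ids) / len(cited_ids)

def groundedness(answer_sents: list[str], context: str, threshold: float = 0.5) -> float:
    # Fraction of answer sentences whose tokens are mostly covered by the
    # retrieved context; a crude proxy for "supported by evidence".
    ctx = set(context.lower().split())
    def supported(sent: str) -> bool:
        toks = set(sent.lower().split())
        return bool(toks) and len(toks & ctx) / len(toks) >= threshold
    if not answer_sents:
        return 0.0
    return sum(supported(s) for s in answer_sents) / len(answer_sents)
```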
Architecture
- Vector DB for embeddings + metadata filters
- Retrieval middleware + optional reranker
- Guardrails for must-cite + PII/secret filtering
- Policy-driven no-answer fallback
- Evaluation harness (Ragas nightly metrics)
- Observability: Langfuse/Phoenix + CI logging
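The guardrail layer above (must-cite plus PII/secret filtering) can be sketched as a small output filter. The patterns below are illustrative examples only (email and SSN-shaped strings), not the production rule set, and the function names are hypothetical:

```python
# Illustrative guardrail sketch; patterns and names are hypothetical examples.
import re
from typing import Optional

PII_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email-shaped strings
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # SSN-shaped strings
]

def redact(text: str) -> str:
    # Replace any PII-shaped match before the answer leaves the system.
    for pattern in PII_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

def enforce_must_cite(answer: str, citations: list[str]) -> Optional[str]:
    # No citations -> None, which the caller maps to the no-answer fallback.
    return answer if citations else None
```

Running redaction after synthesis but before the templated answer means even a hallucinated or quoted secret never reaches the user verbatim.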
Key Challenges
- Groundedness on edge cases
- Near-duplicate handling
- Chunk size vs recall
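The chunk-size-vs-recall trade-off above comes down to how the corpus is split: smaller chunks match queries more precisely but dilute surrounding context, and overlap softens boundary losses at the cost of index size. A minimal token-window chunker (hypothetical, for illustration) looks like:

```python
# Minimal sliding-window chunker; a hypothetical sketch of the trade-off knobs.

def chunk(tokens: list, size: int, overlap: int = 0) -> list[list]:
    # Larger `size` keeps more context per chunk; larger `overlap` reduces the
    # chance a relevant passage is cut in half at a chunk boundary.
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, len(tokens), step)]
```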
Lessons Learned
- Prioritize "no-answer" quality
- Version prompts and track deltas
- Log everything
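"Version prompts and track deltas" can be done without any extra infrastructure by content-addressing the prompt template and keying eval results on that id, so a metric regression is always traceable to the exact prompt edit that caused it. A minimal sketch (hypothetical names):

```python
# Hypothetical sketch: content-addressed prompt versions for eval tracking.
import hashlib

def prompt_version(template: str) -> str:
    # Any edit to the template, however small, yields a new version id.
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]

def log_eval(run_log: dict, template: str, metrics: dict) -> None:
    # Group nightly eval metrics under the prompt version so deltas between
    # prompt revisions are directly comparable.
    run_log.setdefault(prompt_version(template), []).append(metrics)
```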
Interested in similar results?
Let's discuss how we can implement AI automation for your team.