Enterprise RAG Chatbot (Agentic RAG — In Development/Testing)

Window: 15 apps • 2,000+ docs • 4-week development
Baseline → After: no metrics reported yet (status: internal testing)
Method: Eval harness • groundedness • citation accuracy • latency • guardrail compliance

Agentic, self-hosted RAG system that consumes the KBMS corpus (15 apps, 2,000+ docs) to deliver cited, policy-compliant answers with no-answer fallback, guardrails, and nightly evals.

Role: Architect • Team: 1 • Duration: 4 weeks
Python
embeddings + vector store
retrieval middleware
LLM templates
logging/CI

Problem

Analysts lacked a safe, reliable way to query 2,000+ internal docs — they needed answers that were not only cited, but also policy-compliant and self-correcting.

Approach

Agentic loop: ingestion → enrichment → embeddings + vector DB → retrieval + optional rerank → synthesis → guardrails (must-cite, PII) → templated LLM answer. Backed by nightly evals (groundedness, citation accuracy, latency) and observability logs.
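
The tail of this loop (retrieval, synthesis, must-cite guardrail, no-answer fallback) can be sketched as follows. This is a minimal illustration, not the project's actual code: `Chunk`, `answer_with_guardrails`, the `[doc-id]` citation format, and the `min_score` threshold are all assumptions, and `retrieve` and `synthesize` stand in for the real retrieval middleware and templated LLM call.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    text: str
    score: float  # retrieval similarity, assumed to lie in [0, 1]


# Hypothetical fallback message and citation format ("[doc-id]").
NO_ANSWER = "I could not find a supported answer in the knowledge base."
CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def answer_with_guardrails(question, retrieve, synthesize, min_score=0.35):
    """Return a cited answer, or NO_ANSWER when the evidence is too weak."""
    # Keep only chunks that clear the (assumed) relevance threshold.
    chunks = [c for c in retrieve(question) if c.score >= min_score]
    if not chunks:
        return NO_ANSWER

    draft = synthesize(question, chunks)

    # Must-cite guardrail: at least one citation, and every cited id
    # must belong to a chunk that was actually retrieved.
    cited = set(CITATION.findall(draft))
    allowed = {c.doc_id for c in chunks}
    if not cited or not cited <= allowed:
        return NO_ANSWER
    return draft
```

Passing `retrieve` and `synthesize` in as callables keeps the guardrail logic testable without a vector store or an LLM.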

Results

Produces cited, policy-compliant answers with measurable no-answer fallback and self-critique via guardrails. Evaluation harness tracks groundedness, citation accuracy, latency. Status: Internal testing.
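
As a rough illustration of what such a harness records per test case, the sketch below computes citation accuracy and latency. The definition used here (share of cited ids that appear in the retrieved set) is an assumption, as are the names `citation_accuracy` and `evaluate` and the `[doc-id]` citation format. It does not use or imitate the Ragas API.

```python
import re
import time

# Assumed citation format: document ids in square brackets, e.g. "[kb-12]".
CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def citation_accuracy(answer, retrieved_ids):
    """Fraction of cited ids that point at a retrieved document."""
    cited = CITATION.findall(answer)
    if not cited:
        return 0.0  # an uncited answer scores zero under this definition
    return sum(1 for doc_id in cited if doc_id in retrieved_ids) / len(cited)


def evaluate(cases, answer_fn):
    """Run answer_fn over the cases and collect one metrics row per case."""
    rows = []
    for case in cases:
        start = time.perf_counter()
        answer = answer_fn(case["question"])
        latency_s = time.perf_counter() - start
        rows.append({
            "question": case["question"],
            "citation_accuracy": citation_accuracy(answer, case["retrieved_ids"]),
            "latency_s": latency_s,
        })
    return rows
```

A nightly job could write these rows to the observability logs so that regressions show up as a change between runs.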

Architecture

  • Vector DB for embeddings + metadata filters
  • Retrieval middleware + optional reranker
  • Guardrails for must-cite + PII/secret filtering
  • Policy-driven no-answer fallback
  • Evaluation harness (Ragas nightly metrics)
  • Observability: Langfuse/Phoenix + CI logging
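
To make the first component concrete, here is a small in-memory sketch of similarity search restricted by metadata filters. It assumes an index of plain dicts with `id`, `vector`, and `meta` keys; the `search` function and the `app` metadata field are hypothetical, and a real vector database would apply the filter and ranking server-side.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_vec, filters, k=3):
    """Return the top-k (id, score) pairs among rows matching every filter."""
    # Metadata filter first: drop rows from other apps, versions, etc.
    candidates = [
        row for row in index
        if all(row["meta"].get(key) == value for key, value in filters.items())
    ]
    # Then rank the survivors by similarity to the query embedding.
    scored = [(row["id"], cosine(row["vector"], query_vec)) for row in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Filtering before ranking keeps documents from unrelated apps out of the candidate set, whatever their similarity score.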

Key Challenges

  • Groundedness on edge cases
  • Near-duplicate handling
  • Chunk size vs recall
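
One common way to approach near-duplicate handling is to compare word shingles with Jaccard similarity and keep only the first of any near-identical chunks. The sketch below shows that technique under assumed settings (3-word shingles, a 0.8 threshold). It is not necessarily what this project uses, and the pairwise comparison is only practical for small batches.

```python
def shingles(text, n=3):
    """Set of overlapping n-word tuples from the lowercased text."""
    words = text.lower().split()
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(texts, threshold=0.8):
    """Keep each text only if it is below the threshold against all kept texts."""
    kept, kept_shingles = [], []
    for text in texts:
        current = shingles(text)
        if all(jaccard(current, other) < threshold for other in kept_shingles):
            kept.append(text)
            kept_shingles.append(current)
    return kept
```

Running this before embedding avoids filling the top-k results with copies of the same passage, which also interacts with the chunk-size trade-off: smaller chunks tend to produce more near-duplicates.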

Lessons Learned

  • Prioritize "no-answer" quality
  • Version prompts and track deltas
  • Log everything
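
The second and third lessons can be combined in a small sketch: derive a version id from the prompt template's content, attach it to every log record, and compute metric deltas between two versions. The function names and the record layout are assumptions for illustration, not the project's logging schema.

```python
import hashlib
import json


def prompt_version(template):
    """Short content hash, so any edit to the template yields a new version id."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]


def log_record(template, metrics):
    """Serialize one eval run as a JSON line tagged with its prompt version."""
    record = {"prompt_version": prompt_version(template), "metrics": metrics}
    return json.dumps(record, sort_keys=True)


def metric_deltas(baseline, candidate):
    """Per-metric change (candidate minus baseline) for metrics present in both."""
    return {
        name: round(candidate[name] - baseline[name], 4)
        for name in baseline
        if name in candidate
    }
```

Because the version id comes from the template text itself, a change in eval scores can be traced to a specific prompt edit without a separate version registry.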
