Project

RAG Research Agent
(Chroma + LangGraph + Tools)

This project presents a production-style Retrieval Augmented Generation (RAG) system built with LangGraph, LangChain, ChromaDB, and tool routing. The agent follows a knowledge-base-first reasoning policy: it first retrieves from a local vector database, then decides whether retrieval is sufficient, and only falls back to external search when the evidence is weak. The system exposes both a FastAPI backend and an interactive UI, making it suitable for demonstrating controllable LLM-agent workflows, tool usage, structured responses, and practical deployment patterns for AI research assistants.

Agent Pattern

KB-first RAG with tool routing

External Tool Policy

Web fallback only when retrieval is weak

Deployment Style

FastAPI backend + interactive UI

LLM Setup

Local vLLM-compatible model endpoint

Frameworks

LangGraph + LangChain

Retrieval Layer

ChromaDB + embeddings

UI + API

Interactive demo + FastAPI

Main Goal

Controllable tool-using LLM agent

What happens when “Live Demo” is clicked

The button opens the hosted Hugging Face demo for the RAG agent. When the page opens, you may need to wait a short time while the Space finishes building or waking up. After it is ready, click the app UI at the top, enter a natural-language research or technical question in the chat box, and submit it to run the demo.

The agent first queries the local knowledge base stored in a vector database. If the retrieved evidence looks strong enough, it answers directly from the knowledge base. If retrieval is weak, the reasoning flow can route to the external search tool as a fallback, then return a structured answer.

This behavior demonstrates that the project is not just a simple chat interface. It shows retrieval logic, reasoning control, tool usage, fallback policy, and practical AI system design for research-assistant scenarios.

Brief instructions for the demo

1. Open the live demo page.

2. Enter a research or technical question in the chat box.

3. The agent will first search its local knowledge base.

4. If local retrieval is weak, it can route to an external search tool.

5. The final answer is returned through the interactive UI.

Demo interface

The screenshot below shows the public interface of the RAG LangGraph agent demo hosted online.

RAG LangGraph Agent demo interface

Architecture summary

User Question

Knowledge Base Retrieval (ChromaDB)

LangGraph Agent Decision

Tool Routing

• KB Query

• Optional Web Search

• Final Response

Why this project matters

This project demonstrates an important modern AI system pattern: an LLM is not used alone, but is combined with retrieval, tool access, and a controlled reasoning flow.

From a portfolio perspective, it shows practical experience in agent design, vector retrieval, API-based deployment, UI integration, and orchestration logic rather than only prompt-based interaction.

These same patterns appear in enterprise copilots, research assistants, search systems, and internal knowledge agents.

Key features

  • Knowledge-base-first retrieval policy
  • LangGraph state-based reasoning flow
  • Deterministic tool routing
  • ChromaDB vector retrieval with embeddings
  • Structured FastAPI backend responses
  • Interactive UI for public demonstration
  • External search fallback with controlled usage

Technology stack

  • Python
  • LangGraph
  • LangChain
  • ChromaDB
  • SentenceTransformers
  • FastAPI
  • Streamlit / hosted demo UI
  • vLLM-compatible local model serving
  • Tavily or Serper external search integration

Implementation overview

The backend runs as a FastAPI service and exposes the agent logic through an API. The language model is served through an OpenAI-compatible endpoint, allowing local inference with a vLLM-based setup.

The reasoning controller is implemented using LangGraph. The graph manages the sequence of retrieval, evaluation, possible tool usage, and final response generation.

The front-end demo allows users to test the complete pipeline through a lightweight web interface, making the project suitable for portfolio presentation and technical interviews.