A local-only RAG (Retrieval-Augmented Generation) evaluation system with a web-based UI that leverages DeepEval for evaluation metrics.
Note: This project is being built with Claude Code, Anthropic's AI-powered coding assistant.
RAG Evaluator provides a comprehensive framework for evaluating RAG systems. It runs entirely on your local machine with SQLite for data persistence, requiring no cloud infrastructure or external services.
- DeepEval Integration: Evaluate RAG responses using industry-standard metrics
  - Answer Relevancy
  - Faithfulness
  - Contextual Relevancy/Precision/Recall
  - Hallucination Detection
  - And more
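To illustrate the shape of a threshold-based metric check, here is a toy lexical-overlap stand-in. This is not DeepEval's actual computation (its metrics are LLM-judged); the function names are hypothetical and exist only to sketch the idea of scoring a response against an input and a pass threshold.

```python
def lexical_relevancy(question: str, answer: str) -> float:
    """Fraction of question words that also appear in the answer."""
    q_words = set(question.lower().split())
    a_words = set(answer.lower().split())
    if not q_words:
        return 0.0
    return len(q_words & a_words) / len(q_words)


def passes(question: str, answer: str, threshold: float = 0.5) -> bool:
    # Mirrors the metric/threshold pattern: a score is computed, then
    # compared against a configurable pass threshold.
    return lexical_relevancy(question, answer) >= threshold
```

A real evaluation run would hand an input, the generated answer, and the retrieval context to a DeepEval metric and record the resulting score instead.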
- Web-Based Dashboard: React UI for managing evaluations
  - Configure RAG systems to test
  - Create and organize test cases
  - View detailed results and metrics
  - Track trends over time
- LLM Provider Support: Configurable LLM providers via the Web UI
  - OpenAI: GPT-4, GPT-4-turbo, GPT-3.5-turbo
  - Ollama: Llama2, Mistral, CodeLlama, and other local models
  - Integrated via LangChain (langchain-openai, langchain-ollama)
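A minimal sketch of how provider selection might be resolved from the configuration variables shown later in this README (`LLM_PROVIDER`, `LLM_MODEL`, etc.). The function name and returned fields are assumptions for illustration; the actual backend wiring through LangChain may differ.

```python
def resolve_llm_config(env: dict[str, str]) -> dict[str, str]:
    """Map environment-style settings to a provider configuration."""
    provider = env.get("LLM_PROVIDER", "openai")
    if provider == "openai":
        return {
            "provider": "openai",
            "model": env.get("LLM_MODEL", "gpt-4"),
            "api_key": env.get("OPENAI_API_KEY", ""),
        }
    if provider == "ollama":
        return {
            "provider": "ollama",
            "model": env.get("LLM_MODEL", "llama2"),
            # Ollama's default local endpoint
            "base_url": env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        }
    raise ValueError(f"Unknown LLM_PROVIDER: {provider}")
```

With a config like this in hand, the backend would construct the matching LangChain chat model (`langchain-openai` or `langchain-ollama`).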
- MCP Server Support: Extensible integrations via the Model Context Protocol
  - RAG System MCP for querying systems under test
  - Vector DB MCP for Chroma database access
  - LLM Provider MCP for OpenAI and Ollama interactions
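MCP is built on JSON-RPC 2.0, so a tool invocation against, say, the RAG System MCP server is ultimately a JSON-RPC `tools/call` request. The sketch below only shows that message shape; the tool name `query_rag` and its arguments are hypothetical.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 tool-call request as used by MCP."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)
```

An MCP client library would normally build and transport these messages for you; the point here is just that each integration is a server exposing named tools.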
- Skills: Specialized workflows for common tasks
  - Report generation
  - Data visualization
  - Comparison analysis
- Local-First Architecture: All data stays on your machine
  - SQLite database for persistence
  - No external dependencies required
  - Privacy-focused design
  - No authentication required; designed for single-user local development
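The local-persistence idea can be sketched with nothing but the standard library's `sqlite3` module. The table name and columns below are hypothetical; the real schema is defined by the backend's SQLAlchemy models.

```python
import sqlite3


def save_result(conn: sqlite3.Connection, test_case: str,
                metric: str, score: float) -> None:
    """Persist one metric score for a test case."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS results (
               test_case TEXT, metric TEXT, score REAL)"""
    )
    conn.execute(
        "INSERT INTO results (test_case, metric, score) VALUES (?, ?, ?)",
        (test_case, metric, score),
    )
    conn.commit()


def average_score(conn: sqlite3.Connection, metric: str) -> float:
    """Average score across all test cases for one metric."""
    row = conn.execute(
        "SELECT AVG(score) FROM results WHERE metric = ?", (metric,)
    ).fetchone()
    return row[0] if row[0] is not None else 0.0
```

Because everything lives in a single SQLite file under `data/`, backing up or wiping an evaluation history is just a file operation.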
| Layer | Technologies |
|---|---|
| Frontend | React 18+, TypeScript, Tailwind CSS, Recharts |
| Backend | FastAPI, SQLAlchemy, Pydantic |
| Database | SQLite |
| Vector DB | Chroma (via langchain-chroma) |
| Evaluation | DeepEval, LangChain |
| LLM Providers | OpenAI (langchain-openai), Ollama (langchain-ollama) |
| Integrations | MCP (Model Context Protocol) |
- Python 3.11+
- Node.js 18+
- uv (Python package manager) - Install from astral.sh/uv
```bash
# Clone the repository
git clone <repository-url>
cd rag-evaluator

# Set up the backend with uv
cd backend
uv sync  # Creates a virtual environment and installs dependencies

# Start the backend
uv run uvicorn app.main:app --reload --port 8000

# In a new terminal, set up the frontend
cd frontend
npm install
npm run dev
```

Create a `.env` file in the backend directory:
```env
# For OpenAI
LLM_PROVIDER=openai
LLM_MODEL=gpt-4
OPENAI_API_KEY=your-openai-api-key

# For Ollama (local LLM)
LLM_PROVIDER=ollama
LLM_MODEL=llama2
OLLAMA_BASE_URL=http://localhost:11434
```

You can also configure the LLM provider through the Web UI under System Configuration > LLM Provider Settings.
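For reference, a `.env` file like the one above is just `KEY=VALUE` lines. The backend most likely loads it with a library such as python-dotenv or pydantic-settings (an assumption); this stdlib-only parser exists only to make the format concrete.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    settings: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings
```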
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
```
rag-evaluator/
├── backend/
│   ├── app/
│   │   ├── api/v1/        # API endpoints
│   │   ├── core/          # Configuration, database
│   │   ├── models/        # SQLAlchemy models
│   │   ├── schemas/       # Pydantic schemas
│   │   └── services/      # Business logic
│   ├── skills/            # Skill definitions
│   └── mcp_servers/       # MCP server implementations
├── frontend/
│   ├── src/
│   │   ├── components/    # React components
│   │   ├── hooks/         # Custom hooks
│   │   ├── services/      # API clients
│   │   └── types/         # TypeScript types
│   └── public/
├── docs/                  # Architecture documentation
└── data/                  # SQLite database (created at runtime)
```
This application is designed for local development only and does not include authentication or authorization. The application:

- Binds to localhost (127.0.0.1) only
- Stores API keys in plain text in `.env` files
- Trusts the local user completely
- Relies on OS-level file permissions for security

Do not expose this application to a network or deploy it to a shared environment.
Licensed under the MIT License.