Experiments & Research
Where we push boundaries with LLMs, multi-agent systems, RAG architectures, and computer vision. Not every experiment ships — but every one teaches.
Local LLM Code Reviewer
Automated code review powered by locally-hosted LLMs. Privacy-first — zero data leaves your machine.
Voice-to-SQL Agent
Speak a query in natural language and get SQL results back. Function-calling agents translate spoken intent into SQL queries.
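A minimal sketch of the function-calling step, assuming speech has already been transcribed and the model has emitted a JSON tool call. The tool name `top_orders`, the `orders` table, and the `dispatch` helper are hypothetical and chosen only for illustration. The model picks a tool and its arguments, and the SQL itself stays parameterized on our side.

```python
import json
import sqlite3

def top_orders(conn, min_total, limit):
    # Hypothetical tool: parameterized SQL, so model output never becomes raw SQL text.
    cur = conn.execute(
        "SELECT customer, total FROM orders "
        "WHERE total >= ? ORDER BY total DESC LIMIT ?",
        (min_total, limit),
    )
    return cur.fetchall()

# Registry of tools the agent is allowed to call (an assumption of this sketch).
TOOLS = {"top_orders": top_orders}

def dispatch(conn, tool_call_json):
    # Parse the model's tool call and route it to the matching function.
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]  # unknown tool names raise KeyError
    return fn(conn, **call["arguments"])

def demo_db():
    # Small in-memory database used only for this example.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("ana", 120.0), ("bo", 40.0), ("cy", 75.5)],
    )
    return conn
```

For a spoken request like "show me orders over fifty", the model might emit `{"name": "top_orders", "arguments": {"min_total": 50, "limit": 5}}`, which `dispatch` turns into a query result.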
RAG Evaluation Harness
Automated benchmark suite comparing chunking strategies, embedding models, and retrieval methods across 12 metrics.
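As a rough illustration of what such a harness compares, the sketch below pairs one chunking strategy with two retrieval metrics. The source does not say which 12 metrics or which chunkers are used, so `fixed_chunks`, `recall_at_k`, and `reciprocal_rank` are assumptions picked because they are common choices.

```python
def fixed_chunks(text, size, overlap=0):
    # Split text into word windows of `size`, sharing `overlap` words between neighbors.
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def recall_at_k(ranked, relevant, k):
    # Fraction of relevant items that appear in the top-k results.
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

def reciprocal_rank(ranked, relevant):
    # 1 / rank of the first relevant result, or 0.0 if none was retrieved.
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0
```

Running the same queries through each chunker and embedding model, then averaging these scores per configuration, gives a simple comparison table.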
Multi-Agent Debate System
Three AI agents argue opposing perspectives on any topic, then a judge agent synthesizes the strongest arguments.
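The debate loop can be sketched as a small orchestrator. This is an illustration under assumptions, not the project's implementation: each agent is modeled as a plain callable (in practice it would wrap an LLM call), and `run_debate`, the round count, and the transcript format are hypothetical.

```python
from typing import Callable, Dict, List

# An agent takes (topic, transcript so far) and returns its next argument.
Agent = Callable[[str, List[str]], str]
# The judge takes (topic, full transcript) and returns a synthesis.
Judge = Callable[[str, List[str]], str]

def run_debate(topic: str, debaters: Dict[str, Agent], judge: Judge, rounds: int = 2) -> str:
    transcript: List[str] = []
    for _ in range(rounds):
        # Each debater sees everything said so far before responding.
        for name, agent in debaters.items():
            argument = agent(topic, list(transcript))
            transcript.append(f"{name}: {argument}")
    # The judge reads the whole transcript and produces the final synthesis.
    return judge(topic, transcript)

def make_stub(stance: str) -> Agent:
    # Stand-in for an LLM-backed agent; reports how many turns it has seen.
    return lambda topic, transcript: f"{stance} on {topic} (turn {len(transcript) + 1})"

def stub_judge(topic: str, transcript: List[str]) -> str:
    # Stand-in judge: simply counts the arguments it was given.
    return f"Synthesis of {len(transcript)} arguments on {topic}"
```

Passing each agent a copy of the transcript keeps one agent from mutating what the others see.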
Document Vision Parser
Extract structured data from receipts, invoices, and forms using vision models — no OCR templates required.
Agent Memory Architecture
Persistent memory layer for AI agents — episodic, semantic, and procedural memory with forgetting curves.
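One common way to model a forgetting curve is exponential decay in the style of Ebbinghaus, where retention falls as `exp(-elapsed / stability)`. The sketch below assumes that form. The field names, the `boost` factor on recall, and the pruning threshold are hypothetical, not taken from the project.

```python
import math
from dataclasses import dataclass

@dataclass
class MemoryItem:
    text: str
    kind: str           # "episodic", "semantic", or "procedural"
    stability: float    # seconds; larger means slower forgetting (assumed parameter)
    last_access: float  # timestamp in seconds

def retention(item: MemoryItem, now: float) -> float:
    # Exponential forgetting curve: 1.0 right after access, decaying toward 0.
    elapsed = max(0.0, now - item.last_access)
    return math.exp(-elapsed / item.stability)

def recall(item: MemoryItem, now: float, boost: float = 1.5) -> None:
    # Recalling a memory resets its clock and makes it decay more slowly.
    item.stability *= boost
    item.last_access = now

def prune(items, now: float, threshold: float = 0.2):
    # Keep only memories whose retention is still at or above the threshold.
    return [m for m in items if retention(m, now) >= threshold]
```

Giving semantic and procedural memories a larger starting `stability` than episodic ones is one simple way to make the three types age differently.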
Real-time Sentiment Stream
Live sentiment analysis on streaming text — Twitter feeds, chat messages, or support tickets with sub-100ms latency.
PDF → Knowledge Graph
Upload any PDF and watch it transform into an interactive knowledge graph with entity relationships and citations.
Browser Automation Agent
Give it a goal in plain English — it navigates websites, fills forms, extracts data, and completes tasks autonomously.
Prompt Optimization Lab
Automated prompt engineering — evolves prompts using genetic algorithms and A/B testing against eval suites.
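The genetic-algorithm loop can be sketched with a toy setup. Here a prompt is a list of instruction fragments, and `score` is a stand-in for running a real eval suite; the fragments, the scoring rule, and every parameter are assumptions made for illustration. The A/B testing step is left out.

```python
import random

FRAGMENTS = [
    "Answer concisely.",
    "Think step by step.",
    "Cite your sources.",
    "Use bullet points.",
    "State your assumptions.",
]

def score(prompt):
    # Stand-in for an eval suite: reward two target fragments, penalize length.
    wanted = {"Think step by step.", "Cite your sources."}
    return sum(1.0 for f in set(prompt) if f in wanted) - 0.1 * len(prompt)

def crossover(a, b, rng):
    # Join a prefix of one parent with a suffix of the other.
    child = a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
    return child or [rng.choice(FRAGMENTS)]  # never return an empty prompt

def mutate(prompt, rng, rate=0.3):
    # Replace each fragment with a random one with probability `rate`.
    return [rng.choice(FRAGMENTS) if rng.random() < rate else f for f in prompt]

def evolve(generations=20, pop_size=12, seed=0):
    rng = random.Random(seed)
    pop = [[rng.choice(FRAGMENTS) for _ in range(rng.randint(1, 3))]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # elitism: the top half survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=score)
```

Because the elite survive each generation unchanged, the best score in the population never decreases. Joining the winning fragments with spaces yields the final prompt text.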