# AI Chat (RAG)
The AI Chat feature lets you ask natural language questions about your codebase. It uses Retrieval-Augmented Generation (RAG) to find relevant code and provide accurate, source-referenced answers.

## How It Works

1. A **code graph** is built for the repository (functions, classes, modules)
2. **Embeddings** are generated for each code symbol using an LLM embedding model
3. When you ask a question, your query is **embedded** and compared against the code embeddings
4. The **top 8 most relevant** code snippets are retrieved
5. These snippets are sent as context to the LLM along with your question
6. The LLM generates a response **grounded in your actual code**
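Steps 3 and 4 above amount to a top-k nearest-neighbor search over embedding vectors. The following is a minimal Python sketch of that idea using toy vectors and hypothetical symbol names, not the feature's actual implementation:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_embedding, code_embeddings, k=8):
    """Rank code symbols by similarity to the query and keep the top k."""
    scored = [
        (symbol, cosine_similarity(query_embedding, emb))
        for symbol, emb in code_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional vectors standing in for real embedding-model output
code_embeddings = {
    "auth.validate_token": [0.9, 0.1, 0.0],
    "db.connect":          [0.1, 0.9, 0.1],
    "routes.register":     [0.2, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # e.g. "How does authentication work?"
top = retrieve_top_k(query, code_embeddings, k=2)
print([symbol for symbol, _ in top])
# ['auth.validate_token', 'routes.register']
```

Real embeddings have hundreds or thousands of dimensions, but the ranking step works the same way: the snippets whose vectors point in the most similar direction to the query's vector win.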
## Getting Started

### 1. Select a Repository

Navigate to **AI Chat** in the sidebar. You'll see a grid of repository cards. Click one to open the chat interface.
### 2. Build Embeddings

Before chatting, you need to build embeddings for the repository:

1. Click **Build Embeddings**
2. Wait for the process to complete — a progress bar shows `X/Y chunks`
3. Once the status shows **Embeddings ready**, the chat input is enabled
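Conceptually, the build loop embeds each code chunk and reports progress as it goes. Here is an illustrative sketch with a stand-in embedder; a real build would call the configured `LITELLM_EMBED_MODEL` instead:

```python
def build_embeddings(chunks, embed_fn, report=print):
    """Embed each code chunk, reporting progress as 'X/Y chunks'."""
    index = {}
    total = len(chunks)
    for i, (symbol, text) in enumerate(chunks, start=1):
        index[symbol] = embed_fn(text)
        report(f"{i}/{total} chunks")
    return index

# Stand-in embedder for illustration; the real one returns a
# high-dimensional vector from the embedding model.
fake_embed = lambda text: [float(len(text))]

chunks = [("main", "fn main() {}"), ("helper", "fn helper() {}")]
index = build_embeddings(chunks, fake_embed)
```

The resulting index maps each symbol to its vector, which is what the retrieval step searches over when you ask a question.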
::: info
Embedding builds require:

- A code graph already built for the repository (via the Graph feature)
- A configured embedding model (`LITELLM_EMBED_MODEL`)

The default model is `text-embedding-3-small`.
:::
### 3. Ask Questions

Type your question in the input area and press Enter (or click Send). Examples:

- "How does authentication work in this codebase?"
- "What functions handle database connections?"
- "Explain the error handling pattern used in this project"
- "Where are the API routes defined?"
- "What does the `process_scan` function do?"
## Understanding Responses

### Answer

The AI response is a natural language answer to your question, grounded in the actual source code of your repository.

### Source References

Below each response, you'll see source references showing exactly which code was used to generate the answer:

- **Symbol name** — The qualified name of the function/class/module
- **File path** — Where the code is located, with line range
- **Code snippet** — The first ~10 lines of the relevant code
- **Relevance score** — How closely the code matched your question (0.0 to 1.0)
## Conversation Context

The chat maintains conversation history within a session. You can ask follow-up questions that reference previous answers. The system sends the last 10 messages as context to maintain coherence.
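The "last 10 messages" behavior is a simple sliding window over the history. A minimal sketch of that windowing, with made-up message contents:

```python
def context_window(history, limit=10):
    """Return the most recent messages to send as chat context."""
    return history[-limit:]

# 15 illustrative messages; only the newest 10 are forwarded to the LLM
history = [{"role": "user", "content": f"question {i}"} for i in range(15)]
window = context_window(history)
print(len(window))            # 10
print(window[0]["content"])   # question 5
```

The practical consequence: a follow-up that refers to something said more than 10 messages ago may no longer be in the model's context, so restate important details in long sessions.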
## Configuration

| Variable | Description | Default |
|----------|-------------|---------|
| `LITELLM_URL` | LiteLLM proxy URL | `http://localhost:4000` |
| `LITELLM_API_KEY` | API key for the LLM provider | — |
| `LITELLM_MODEL` | Model for chat responses | `gpt-4o` |
| `LITELLM_EMBED_MODEL` | Model for code embeddings | `text-embedding-3-small` |
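One way to picture how these variables resolve, sketched in Python (the variable names and defaults come from the table above; the helper function itself is illustrative, not part of the product):

```python
import os

def litellm_config():
    """Read LiteLLM settings, falling back to the documented defaults."""
    return {
        "url": os.environ.get("LITELLM_URL", "http://localhost:4000"),
        "api_key": os.environ.get("LITELLM_API_KEY"),  # no default: must be set
        "model": os.environ.get("LITELLM_MODEL", "gpt-4o"),
        "embed_model": os.environ.get(
            "LITELLM_EMBED_MODEL", "text-embedding-3-small"
        ),
    }
```

Note that `LITELLM_API_KEY` has no fallback, so it is the one variable you always need to provide.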
## Tips

- **Be specific** — "How does the JWT validation middleware work?" is better than "Tell me about auth"
- **Reference filenames** — "What does `server.rs` do?" helps retrieval find relevant code
- **Ask about patterns** — "What error handling pattern does this project use?" works well with RAG
- **Rebuild after changes** — If the repository has been updated significantly, rebuild embeddings to include new code