Merged
6 changes: 6 additions & 0 deletions azure-ai-search/.env.example
@@ -0,0 +1,6 @@
# Azure Search Configuration
AZURE_SEARCH_ENDPOINT=--your-azure-search-endpoint--
AZURE_SEARCH_API_KEY=--your-azure-search-api-key--

# Alternative: Using Managed Identity (set service name instead of API key)
# AZURE_SEARCH_SERVICE_NAME=your-search-service
3 changes: 3 additions & 0 deletions azure-ai-search/.gitignore
@@ -0,0 +1,3 @@
/dist
/node_modules
.env
8 changes: 8 additions & 0 deletions azure-ai-search/.npmignore
@@ -0,0 +1,8 @@
src/
tools/
prompts/
type.ts
server.ts
tsconfig.json
pnpm-lock.yaml
node_modules/
123 changes: 123 additions & 0 deletions azure-ai-search/CLAUDE.md
@@ -0,0 +1,123 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Development Commands

- **Build**: `pnpm build` - Compiles TypeScript to JavaScript using Rollup
- **Start Development**: `pnpm start` - Runs the server directly with ts-node
- **Start Production**: `pnpm start:prod` - Runs the built server from dist/
- **Inspect**: `pnpm inspect` - Runs the MCP inspector for debugging

## Architecture Overview

This is an MCP (Model Context Protocol) server that provides Azure AI Search integration. It currently implements Phases 1 through 4 of the roadmap: search and retrieval, index management and discovery, document management, and vector and semantic search.

### Current Implementation Status

**✅ Phase 1.1 - Configuration & Authentication**
- Azure Search Documents SDK integrated (`@azure/search-documents`)
- Dual authentication support (API key + Managed Identity)
- Environment variables: `AZURE_SEARCH_ENDPOINT`, `AZURE_SEARCH_API_KEY`

**✅ Phase 1.2 - Core Search Tools (Retrieval)**
- `search-documents` - Full-text search with filtering, faceting, highlighting
- `get-document` - Retrieve specific document by key
- `suggest` - Search suggestions using configured suggesters
- `autocomplete` - Auto-completion for partial search terms

**✅ Phase 2.1 - Index Management & Discovery**
- `list-indexes` - List all available search indexes
- `get-index-schema` - Get complete index schema and field definitions
- `get-index-statistics` - Get index usage statistics and document counts

**✅ Phase 2.2 - Dynamic Resources**
- Auto-discovery of available indexes at startup
- Dynamic resources for each index: schema, statistics, sample documents
- Resource URIs: `azure-search://indexes`, `azure-search://index/{name}/schema`, etc.
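
The resource URI scheme above can be sketched as a pair of helpers that build and parse per-index URIs. The function and type names are hypothetical and are not taken from `resources/index-resources.ts`. The `sample` kind is assumed from the "sample documents" resource mentioned above.

```typescript
// Hypothetical helpers for the azure-search:// URI scheme (names are illustrative).
type IndexResourceKind = "schema" | "statistics" | "sample";

/** Builds a per-index resource URI, e.g. azure-search://index/hotels/schema */
function indexResourceUri(indexName: string, kind: IndexResourceKind): string {
  return `azure-search://index/${encodeURIComponent(indexName)}/${kind}`;
}

/** Parses a per-index resource URI; returns null when the URI does not match. */
function parseIndexResourceUri(
  uri: string
): { indexName: string; kind: IndexResourceKind } | null {
  const match = /^azure-search:\/\/index\/([^/]+)\/(schema|statistics|sample)$/.exec(uri);
  if (!match) return null;
  return {
    indexName: decodeURIComponent(match[1]),
    kind: match[2] as IndexResourceKind,
  };
}
```

The list-level URI `azure-search://indexes` intentionally does not match the per-index pattern, so a resource handler could route it separately.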

**✅ Phase 3.1 - Document Management (COMPLETE)**
- `upload-documents` - Upload/create documents (batch operations up to 1000)
- `merge-documents` - Partial update of existing documents
- `delete-documents` - Delete documents by key values (batch operations)

**✅ Phase 4.1 - Vector Search (COMPLETE)**
- `vector-search` - Pure vector similarity search using k-nearest neighbors
- `hybrid-search` - Combined text and vector search for enhanced relevance
- Support for multiple vector queries and exhaustive search modes

**✅ Phase 4.2 - Semantic Search (COMPLETE)**
- `semantic-search` - Azure AI semantic search with natural language understanding
- Semantic answers extraction from search results
- Semantic captions with highlighting support
- Integration with Azure's semantic configurations

### Core Components

- **server.ts** - Main MCP server entry point with tool registration
- **lib/azure-search-client.ts** - Azure Search client wrapper with error handling
- **tools/search-tools.ts** - Search tool implementations with validation
- **tools/index-tools.ts** - Index management tool implementations
- **resources/index-resources.ts** - Dynamic resource registration for discovered indexes
- **types.ts** - Zod schemas for Azure AI Search parameters and responses

### Key Architecture Patterns

1. **Lazy Loading**: Azure Search clients instantiated only when first accessed
2. **Client Caching**: Search clients cached per index name for efficiency
3. **Dual Authentication**: Supports both API key and DefaultAzureCredential
4. **Type Safety**: All parameters validated with Zod schemas
5. **Error Handling**: Consistent success/error response format
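
Patterns 1 and 2 can be illustrated with a small generic cache. This is a minimal sketch, not the code in `lib/azure-search-client.ts`: the class name `LazyClientCache` is hypothetical, and the factory stands in for wherever the real server constructs a `SearchClient` for a given index.

```typescript
// Hypothetical sketch of lazy loading plus per-index caching.
class LazyClientCache<TClient> {
  private readonly clients = new Map<string, TClient>();
  private readonly createClient: (indexName: string) => TClient;

  constructor(createClient: (indexName: string) => TClient) {
    this.createClient = createClient;
  }

  /** Returns the cached client for an index, creating it on first access. */
  get(indexName: string): TClient {
    let client = this.clients.get(indexName);
    if (client === undefined) {
      client = this.createClient(indexName);
      this.clients.set(indexName, client);
    }
    return client;
  }
}

// Usage with a stand-in client object; nothing is created until get() is called.
const cache = new LazyClientCache((indexName: string) => ({ indexName }));
const hotelsClient = cache.get("hotels");
```

Because the factory is injected, the same cache can be exercised in tests without any Azure credentials.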

### Configuration

#### Environment Variables
```env
AZURE_SEARCH_ENDPOINT=https://your-service.search.windows.net
AZURE_SEARCH_API_KEY=your-api-key
```

#### Alternative: Managed Identity
```env
AZURE_SEARCH_ENDPOINT=https://your-service.search.windows.net
# No API key needed - uses DefaultAzureCredential
```
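
The choice between the two configurations above can be sketched as a pure function over the environment. This is an assumption about how the selection might work, not the server's actual logic. The names `SearchAuth` and `resolveSearchAuth` are hypothetical. In a real server the `apiKey` branch would typically wrap the key in `AzureKeyCredential`, and the other branch would use `DefaultAzureCredential`.

```typescript
// Hypothetical sketch: decide which authentication mode to use.
type SearchAuth =
  | { kind: "apiKey"; endpoint: string; apiKey: string }
  | { kind: "managedIdentity"; endpoint: string };

function resolveSearchAuth(env: Record<string, string | undefined>): SearchAuth {
  const endpoint = env.AZURE_SEARCH_ENDPOINT?.trim();
  if (!endpoint) {
    throw new Error("AZURE_SEARCH_ENDPOINT is required");
  }
  const apiKey = env.AZURE_SEARCH_API_KEY?.trim();
  // An API key, when present, takes precedence (an assumption of this sketch).
  return apiKey
    ? { kind: "apiKey", endpoint, apiKey }
    : { kind: "managedIdentity", endpoint };
}
```

Passing the environment in as a plain object (for example `process.env`) keeps the function easy to test.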

### Available Tools

#### Core Search Tools
- **search-documents** - Full-text search with filtering, faceting, highlighting
- **get-document** - Retrieve specific document by primary key
- **suggest** - Search suggestions using configured suggester with fuzzy matching
- **autocomplete** - Auto-complete partial search terms with multiple modes
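
As an illustration of the "multiple modes" mentioned for **autocomplete**, the sketch below validates tool arguments by hand. The repository validates with Zod schemas; this dependency-free version only stands in for that idea. The argument names (`indexName`, `searchText`, `suggesterName`, `mode`) are assumptions. The three mode values are the ones Azure AI Search defines.

```typescript
// Hand-rolled validation sketch; the real server uses Zod schemas instead.
const AUTOCOMPLETE_MODES = ["oneTerm", "twoTerms", "oneTermWithContext"] as const;
type AutocompleteMode = (typeof AUTOCOMPLETE_MODES)[number];

interface AutocompleteArgs {
  indexName: string;
  searchText: string;
  suggesterName: string;
  mode: AutocompleteMode;
}

function parseAutocompleteArgs(input: Record<string, unknown>): AutocompleteArgs {
  const requireString = (name: string): string => {
    const value = input[name];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`${name} must be a non-empty string`);
    }
    return value;
  };
  // Default to "oneTerm" when no mode is given.
  const mode = input.mode ?? "oneTerm";
  if (typeof mode !== "string" || !AUTOCOMPLETE_MODES.includes(mode as AutocompleteMode)) {
    throw new Error(`mode must be one of: ${AUTOCOMPLETE_MODES.join(", ")}`);
  }
  return {
    indexName: requireString("indexName"),
    searchText: requireString("searchText"),
    suggesterName: requireString("suggesterName"),
    mode: mode as AutocompleteMode,
  };
}
```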

#### Index Management Tools
- **list-indexes** - List all available search indexes
- **get-index-schema** - Get complete index schema and field definitions
- **get-index-statistics** - Get index usage statistics and document counts

#### Document Management Tools
- **upload-documents** - Upload/create documents (batch operations up to 1000)
- **merge-documents** - Partial update of existing documents
- **delete-documents** - Delete documents by key values (batch operations)
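
Given the 1000-document batch limit noted above, a caller holding a larger set would need to split it before invoking the tools. Whether the server splits batches itself is not stated here, so the helper below is a hypothetical client-side sketch.

```typescript
// Hypothetical helper: split documents into batches of at most 1000.
const MAX_BATCH_SIZE = 1000;

function toBatches<T>(documents: readonly T[], size: number = MAX_BATCH_SIZE): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < documents.length; start += size) {
    batches.push(documents.slice(start, start + size));
  }
  return batches;
}
```

Each resulting batch could then be passed to `upload-documents`, `merge-documents`, or `delete-documents` in turn.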

#### Vector Search Tools (Phase 4)
- **vector-search** - Pure vector similarity search using k-nearest neighbors
- **hybrid-search** - Combined text and vector search for enhanced relevance
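
To make "k-nearest neighbors" concrete, here is a brute-force sketch of what an exhaustive vector search computes conceptually. Azure AI Search performs this on the service side, with the similarity metric set by the index configuration. Cosine similarity is assumed here only for illustration, and all names are hypothetical.

```typescript
// Conceptual sketch of exhaustive k-NN; not how the service is implemented.
interface StoredVector {
  key: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

/** Scores every stored vector against the query and keeps the k best. */
function exhaustiveKnn(query: number[], docs: StoredVector[], k: number) {
  return docs
    .map((d) => ({ key: d.key, score: cosineSimilarity(query, d.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A hybrid search additionally runs a text query and merges both rankings, which this sketch does not attempt.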

#### Semantic Search Tools (Phase 4)
- **semantic-search** - Azure AI semantic search with natural language understanding
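
The sketch below shows how tool arguments might be mapped to semantic query options. The local interface mirrors, to the best of my understanding, the `semanticSearchOptions` shape in version 12 of `@azure/search-documents`. Treat the field names as an assumption to verify against the installed SDK version. The argument names and defaults are hypothetical.

```typescript
// Hypothetical mapping from tool arguments to semantic query options.
// The option shape is an assumption; check it against the installed SDK.
interface SemanticSearchArgs {
  configurationName: string;
  answerCount?: number;
  highlightCaptions?: boolean;
  top?: number;
}

interface SemanticQueryOptionsSketch {
  queryType: "semantic";
  top: number;
  semanticSearchOptions: {
    configurationName: string;
    answers: { answerType: "extractive"; count: number };
    captions: { captionType: "extractive"; highlight: boolean };
  };
}

function buildSemanticOptions(args: SemanticSearchArgs): SemanticQueryOptionsSketch {
  if (args.configurationName.trim() === "") {
    throw new Error("configurationName is required");
  }
  return {
    queryType: "semantic",
    top: args.top ?? 10,
    semanticSearchOptions: {
      configurationName: args.configurationName,
      answers: { answerType: "extractive", count: args.answerCount ?? 1 },
      captions: { captionType: "extractive", highlight: args.highlightCaptions ?? true },
    },
  };
}
```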

### Build System

Uses Rollup with TypeScript compilation. External dependencies are not bundled to reduce size. Some TypeScript warnings exist but don't affect functionality.

### Next Steps (Roadmap)

See `ROADMAP.md` for complete implementation plan:
- ✅ **Phase 1**: Core search and retrieval (COMPLETE)
- ✅ **Phase 2**: Index management and discovery (COMPLETE)
- ✅ **Phase 3**: Document upload/management (COMPLETE)
- ✅ **Phase 4**: Vector and semantic search (COMPLETE)
- 🎯 **Phase 5**: Advanced index operations (create/update/delete indexes)
- 📊 **Phase 6**: Analytics and performance monitoring
21 changes: 21 additions & 0 deletions azure-ai-search/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 IgnitionAI

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
181 changes: 181 additions & 0 deletions azure-ai-search/ROADMAP.md
@@ -0,0 +1,181 @@
# Azure AI Search MCP - Roadmap

## ✅ Current State - PHASES 1-4 COMPLETE!
The Azure AI Search MCP server is now **functional and complete** for all advanced operations:
- ✅ Search & Retrieval (Phase 1)
- ✅ Index Management & Discovery (Phase 2)
- ✅ Document Management (Phase 3)
- ✅ Vector & Semantic Search (Phase 4)

---

## ✅ Phase 1: Foundation & Retrieval (COMPLETE) 🔍

### ✅ 1.1 Configuration & Authentication
- ✅ `@azure/search-documents` dependency added
- ✅ types.ts cleaned up with Zod schemas for Azure AI Search
- ✅ Dual authentication (API key + Managed Identity)
- ✅ Environment variables: `AZURE_SEARCH_ENDPOINT`, `AZURE_SEARCH_API_KEY`

### ✅ 1.2 Core Search Tools (Retrieval)
- ✅ **search-documents** - Full search with filters, facets, highlighting
- ✅ **get-document** - Document retrieval by key
- ✅ **suggest** - Search suggestions with fuzzy matching
- ✅ **autocomplete** - Term auto-completion

### ✅ 1.3 Index Discovery Resources
- ✅ **list-indexes** - Lists all available indexes
- ✅ **get-index-schema** - Retrieves the complete schema of an index
- ✅ Dynamic resources for each discovered index

---

## ✅ Phase 2: Index Management & Discovery (COMPLETE) ⚙️

### ✅ 2.1 Index Operations
- ✅ **get-index-statistics** - Statistics and usage for an index
- ✅ **get-index-schema** - Detailed schema with fields, analyzers, etc.
- ✅ Dynamic resource registration at startup

### ✅ 2.2 Dynamic Resources
- ✅ Auto-discovery of indexes at startup
- ✅ MCP resources created automatically:
  - `azure-search://indexes` - Complete list
  - `azure-search://index/{name}/schema` - Schema per index
  - `azure-search://index/{name}/statistics` - Statistics per index
  - `azure-search://index/{name}/sample` - Sample documents

---

## ✅ Phase 3: Document Management (COMPLETE) 📄

### ✅ 3.1 Document Operations
- ✅ **upload-documents** - Upload/create documents (batches of up to 1000)
- ✅ **merge-documents** - Partial update of existing documents
- ✅ **delete-documents** - Delete documents by key (batches of up to 1000)

### ✅ 3.2 Document Processing
- ✅ Full document validation against Zod schemas
- ✅ Batch error handling with per-document details
- ✅ Support for Azure AI Search types (text, vector, etc.)
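
Per-document batch error reporting can be sketched as a summary over indexing results. The `IndexingResultLike` interface below is a local stand-in modeled on the per-document result fields the Azure SDK returns (`key`, `succeeded`, `statusCode`, `errorMessage`). The summary shape and function name are hypothetical.

```typescript
// Hypothetical summary of a batch indexing response with per-document details.
interface IndexingResultLike {
  key: string;
  succeeded: boolean;
  statusCode: number;
  errorMessage?: string;
}

interface BatchSummary {
  total: number;
  succeeded: number;
  failed: number;
  failures: { key: string; statusCode: number; error: string }[];
}

function summarizeBatch(results: IndexingResultLike[]): BatchSummary {
  const failures = results
    .filter((r) => !r.succeeded)
    .map((r) => ({
      key: r.key,
      statusCode: r.statusCode,
      error: r.errorMessage ?? "unknown error",
    }));
  return {
    total: results.length,
    succeeded: results.length - failures.length,
    failed: failures.length,
    failures,
  };
}
```

Returning the failing keys lets a caller retry only the documents that were rejected.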

## ✅ Phase 4: Vector & Semantic Search (COMPLETE) 🤖

### ✅ 4.1 Vector Search Enhancement
- ✅ **vector-search** - Native vector search with k-NN
- ✅ **hybrid-search** - Hybrid search (text + vector)
- ✅ **knn-search** - Built into vector-search (`k` parameter)
- ✅ **vector-filtering** - OData filter support on vector results
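
Since filters are OData expressions, a caller usually has to build the filter string safely. The helpers below are a hypothetical sketch, not part of this server. They rely on the OData rule that a single quote inside a string literal is escaped by doubling it.

```typescript
// Hypothetical helpers for building simple OData filter strings.
function odataString(value: string): string {
  // Single quotes inside OData string literals are escaped by doubling them.
  return `'${value.replace(/'/g, "''")}'`;
}

function eqFilter(field: string, value: string | number | boolean): string {
  // Allow simple field names and slash-separated paths such as address/city.
  if (!/^[A-Za-z_][A-Za-z0-9_]*(\/[A-Za-z_][A-Za-z0-9_]*)*$/.test(field)) {
    throw new Error(`invalid field name: ${field}`);
  }
  const literal = typeof value === "string" ? odataString(value) : String(value);
  return `${field} eq ${literal}`;
}

/** Combines clauses with "and", parenthesizing each one. */
function allOf(...clauses: string[]): string {
  return clauses.map((c) => `(${c})`).join(" and ");
}

// Example: a filter that could accompany a vector query.
const filter = allOf(eqFilter("category", "Boutique"), eqFilter("rating", 4));
```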

### ✅ 4.2 Semantic Search
- ✅ **semantic-search** - Azure semantic search with configuration
- ✅ **semantic-answers** - Semantic answers extracted automatically
- ✅ **semantic-captions** - Semantic captions with highlighting
- ✅ **semantic-ranking** - Built-in semantic ranking

## Phase 5: Advanced Index Operations (Useful) ⚙️

### 5.1 Index Lifecycle
- [ ] **create-index** - Create an index with a complete schema
- [ ] **update-index** - Update the schema of an existing index
- [ ] **delete-index** - Delete an index
- [ ] **index-aliases** - Manage index aliases

### 5.2 Skillsets & Enrichment
- [ ] **list-skillsets** - List available skillsets
- [ ] **get-skillset** - Skillset details
- [ ] **run-indexer** - Run an indexer
- [ ] **indexer-status** - Indexer status

## Phase 6: Analytics & Performance (Optional) 📊

### 6.1 Search Analytics
- [ ] **search-analytics** - Search metrics
- [ ] **query-performance** - Query performance
- [ ] **index-health** - Index health
- [ ] **usage-statistics** - Usage statistics

### 6.2 Monitoring Tools
- [ ] **connection-health** - Connectivity test
- [ ] **quota-usage** - Quota usage
- [ ] **service-statistics** - Service statistics

## Technical Architecture

### File Structure
```
azure-ai-search/
├── server.ts                   # MCP entry point
├── types.ts                    # Zod schemas for AI Search
├── lib/
│   └── azure-search-client.ts  # Azure AI Search client
├── tools/
│   ├── search-tools.ts         # Search tools
│   ├── index-tools.ts          # Index management
│   ├── document-tools.ts       # Document management
│   └── analytics-tools.ts      # Analytics (Phase 6)
├── resources/
│   ├── index-resources.ts      # Dynamic index resources
│   └── search-resources.ts     # Search resources
└── prompts/
    └── search-prompts.ts       # Search prompts
```

### Architectural Patterns
- **Lazy Loading**: Azure Search client instantiated on demand
- **Dynamic Resources**: Automatic discovery of indexes
- **Error Handling**: Consistent response format with success/error
- **Authentication**: API Key + Managed Identity support
- **Validation**: Strict Zod schemas for all parameters
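
A consistent success/error response format could look like the following discriminated union. The exact shape used by the server is not shown here, so `ToolResult` and `toToolResult` are hypothetical names for a minimal sketch of the pattern.

```typescript
// Hypothetical sketch of a uniform success/error envelope for tool responses.
type ToolResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

/** Runs an async operation and converts thrown errors into an error result. */
async function toToolResult<T>(operation: () => Promise<T>): Promise<ToolResult<T>> {
  try {
    return { success: true, data: await operation() };
  } catch (err) {
    return {
      success: false,
      error: err instanceof Error ? err.message : String(err),
    };
  }
}
```

Callers can then branch on `result.success` without wrapping every tool invocation in its own `try`/`catch`.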

### Environment Configuration
```env
AZURE_SEARCH_ENDPOINT=https://myservice.search.windows.net
AZURE_SEARCH_API_KEY=your-api-key
# OR for Managed Identity:
AZURE_SEARCH_SERVICE_NAME=myservice
```

## 🎯 Recommendations for Next Steps

### Phase 4 Priority: Vector & Semantic Search (now complete)
**Vector Search** was the logical next step because it is:
- A **strong trend** in generative AI and RAG
- A **major value-add** for AI applications
- **Already supported** by Azure AI Search
- **Complementary** to the existing features

### Phase 5 Useful: Index Operations
Index creation and management directly from MCP:
- **Complete workflow** from start to finish
- **Productivity** for developers
- **Lifecycle management** for indexes

### Phase 6 Optional: Analytics
Monitoring and metrics for optimization:
- **Debugging** and troubleshooting
- **Performance tuning**
- **Usage insights**

## Development Priorities

1. ✅ **Phase 1** (COMPLETE): Working retrieval
2. ✅ **Phase 2** (COMPLETE): Basic index management
3. ✅ **Phase 3** (COMPLETE): Document management
4. ✅ **Phase 4** (COMPLETE): Vector & Semantic Search
5. ⚙️ **Phase 5** (Useful): Advanced Index Operations
6. 📊 **Phase 6** (Optional): Analytics & Performance

## Success Criteria

### Phase 1 (MVP)
- [ ] Simple search working on existing indexes
- [ ] Auto-discovery of available indexes
- [ ] Robust error handling
- [ ] Clear documentation with examples

### Later Phases
- [ ] Full index lifecycle management
- [ ] Support for efficient batch operations
- [ ] Integration with generative AI tools
- [ ] Built-in metrics and monitoring