---
name: bigquery-ai
description: Execute generative AI operations in BigQuery - text generation, embeddings, vector search, and RAG workflows using Gemini, Claude, and other LLMs. Use when working with AI/ML inference, semantic search, or building RAG applications in BigQuery.
license: Apache-2.0
compatibility: BigQuery, Vertex AI, Gemini, Cloud AI APIs
metadata:
  author: Google Cloud
  version: "2.0"
  category: generative-ai
adk:
  config:
    timeout_seconds: 180
    max_parallel_calls: 10
    allow_network: true
  allowed_callers:
    - bigquery_agent
    - ai_agent
    - rag_agent
---

# BigQuery AI Skill

Execute generative AI operations directly in BigQuery using SQL. This skill covers text generation, embeddings, vector search, and retrieval-augmented generation (RAG) workflows.

## When to Use This Skill

Use this skill when you need to:
- Generate text using LLMs (Gemini, Claude, Llama, Mistral) on BigQuery data
- Create embeddings for semantic search and similarity matching
- Build vector search and RAG pipelines entirely in SQL
- Process documents, translate text, or analyze images at scale
- Connect BigQuery to Vertex AI models for inference

## Core Capabilities

| Capability | Function | Description |
|------------|----------|-------------|
| Text Generation | `AI.GENERATE_TEXT` | Generate text using remote LLM models |
| Embeddings | `ML.GENERATE_EMBEDDING` | Create vector embeddings from text/images |
| Vector Search | `VECTOR_SEARCH` | Find semantically similar items |
| Semantic Search | `AI.SEARCH` | Search with automatically generated embeddings |
| Remote Models | `CREATE MODEL` | Connect to Vertex AI endpoints |

## Quick Start

### 1. Create a Remote Model

```sql
-- Create a remote model over an existing Cloud resource connection
CREATE OR REPLACE MODEL `project.dataset.gemini_model`
  REMOTE WITH CONNECTION `project.region.connection_id`
  OPTIONS (ENDPOINT = 'gemini-2.0-flash');
```

### 2. Generate Text

```sql
SELECT ml_generate_text_result
FROM ML.GENERATE_TEXT(
  MODEL `project.dataset.gemini_model`,
  (SELECT 'Summarize this text: ' || content AS prompt FROM my_table),
  STRUCT(256 AS max_output_tokens, 0.2 AS temperature)
);
```

### 3. Create Embeddings

```sql
-- Create embedding model
CREATE OR REPLACE MODEL `project.dataset.embedding_model`
  REMOTE WITH CONNECTION DEFAULT
  OPTIONS (ENDPOINT = 'text-embedding-005');

-- Generate embeddings
SELECT * FROM ML.GENERATE_EMBEDDING(
  MODEL `project.dataset.embedding_model`,
  (SELECT content FROM my_table)
);
```

### 4. Vector Search

```sql
SELECT base.id, base.content, distance
FROM VECTOR_SEARCH(
  TABLE `project.dataset.embeddings`, 'embedding',
  (SELECT embedding FROM query_embeddings),
  top_k => 10,
  distance_type => 'COSINE'
);
```

## AI Functions Reference

### AI.GENERATE_TEXT

Full control over text generation with model parameters:

```sql
SELECT * FROM AI.GENERATE_TEXT(
  MODEL `project.dataset.model`,
  (SELECT prompt FROM prompts_table),
  STRUCT(
    512 AS max_output_tokens,
    0.7 AS temperature,
    0.95 AS top_p,
    TRUE AS ground_with_google_search
  )
);
```

**Key Parameters:**
- `max_output_tokens`: 1-8192 (default: 128)
- `temperature`: 0.0-1.0 (default: 0, higher = more creative)
- `top_p`: 0.0-1.0 (default: 0.95)
- `ground_with_google_search`: Enable web grounding

### ML.GENERATE_EMBEDDING

Generate vector embeddings for semantic operations:

```sql
SELECT * FROM ML.GENERATE_EMBEDDING(
  MODEL `project.dataset.embedding_model`,
  (SELECT id, text_column AS content FROM source_table)
)
WHERE LENGTH(ml_generate_embedding_status) = 0; -- Keep only rows that embedded successfully
```

**Supported Models:**
- `text-embedding-005` (recommended)
- `text-embedding-004`
- `text-multilingual-embedding-002`
- `multimodalembedding@001` (text + images; see the sketch below)
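
For image or mixed-media embeddings with `multimodalembedding@001`, the input typically comes from an object table over Cloud Storage. A minimal sketch, assuming an existing object table (`product_images` is an illustrative name):

```sql
-- Remote model over the multimodal embedding endpoint
CREATE OR REPLACE MODEL `project.dataset.multimodal_embedding_model`
  REMOTE WITH CONNECTION DEFAULT
  OPTIONS (ENDPOINT = 'multimodalembedding@001');

-- Embed images referenced by an object table (hypothetical table name)
SELECT uri, ml_generate_embedding_result
FROM ML.GENERATE_EMBEDDING(
  MODEL `project.dataset.multimodal_embedding_model`,
  TABLE `project.dataset.product_images`
);
```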

### VECTOR_SEARCH

Find nearest neighbors using embeddings:

```sql
SELECT query.id, base.id, base.content, distance
FROM VECTOR_SEARCH(
  TABLE `project.dataset.base_embeddings`, 'embedding',
  TABLE `project.dataset.query_embeddings`,
  top_k => 5,
  distance_type => 'COSINE',
  options => '{"fraction_lists_to_search": 0.01}'
);
```

**Distance Types:** `COSINE`, `EUCLIDEAN`, `DOT_PRODUCT`
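
To sanity-check which distance type suits a set of embeddings without running a full search, `ML.DISTANCE` computes pairwise distances directly (a sketch; the table name is assumed):

```sql
-- Pairwise cosine distance between stored embeddings (illustrative table)
SELECT a.id AS id_a, b.id AS id_b,
       ML.DISTANCE(a.embedding, b.embedding, 'COSINE') AS cosine_distance
FROM `project.dataset.doc_embeddings` a
JOIN `project.dataset.doc_embeddings` b
  ON a.id < b.id
LIMIT 100;
```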

## Supported Models

| Provider | Models | Use Case |
|----------|--------|----------|
| Google | Gemini 2.0, 1.5 Pro/Flash | Text generation, multimodal |
| Anthropic | Claude 3.5, 3 Opus/Sonnet | Complex reasoning |
| Meta | Llama 3.1, 3.2 | Open-source alternative |
| Mistral | Mistral Large, Medium | European compliance |
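
Non-Google models are wired up the same way; only the endpoint changes. A minimal sketch for Claude, assuming the model is enabled in the Vertex AI Model Garden (the endpoint string below is an assumption; check Model Garden for the exact identifier):

```sql
-- Remote model over an Anthropic Claude endpoint (endpoint name assumed)
CREATE OR REPLACE MODEL `project.dataset.claude_model`
  REMOTE WITH CONNECTION `project.region.connection_id`
  OPTIONS (ENDPOINT = 'claude-3-5-sonnet');
```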

## Prerequisites

1. **BigQuery Connection**: Create a Cloud resource connection
2. **IAM Permissions**: Grant `bigquery.connectionUser` on the connection and `aiplatform.user` to the connection's service account
3. **APIs Enabled**: BigQuery API, Vertex AI API, BigQuery Connection API
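
Once these are in place, a one-row generation call is a quick way to verify the wiring end to end (a sketch reusing the Quick Start model name):

```sql
-- Smoke test: a returned row with an empty status column means the
-- connection, IAM bindings, and APIs are all working.
SELECT ml_generate_text_result, ml_generate_text_status
FROM ML.GENERATE_TEXT(
  MODEL `project.dataset.gemini_model`,
  (SELECT 'Reply with OK.' AS prompt),
  STRUCT(16 AS max_output_tokens)
);
```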

## References

Load detailed documentation as needed:

- `TEXT_GENERATION.md` - Complete AI.GENERATE_TEXT guide with all parameters
- `EMBEDDINGS.md` - Embedding models, multimodal embeddings, best practices
- `VECTOR_SEARCH.md` - Vector indexes, search optimization, recall tuning
- `REMOTE_MODELS.md` - CREATE MODEL syntax for all supported providers
- `RAG_WORKFLOW.md` - End-to-end RAG implementation patterns
- `CLOUD_AI_SERVICES.md` - Translation, NLP, document processing, vision

## Scripts

Helper scripts for common operations:

- `setup_remote_model.py` - Create remote model connections
- `generate_embeddings.py` - Batch embedding generation
- `semantic_search.py` - Build semantic search pipelines
- `rag_pipeline.py` - Complete RAG workflow setup

## Common Patterns

### Batch Text Classification

```sql
SELECT id, content, ml_generate_text_llm_result AS category
FROM ML.GENERATE_TEXT(
  MODEL `project.dataset.gemini`,
  (SELECT id, content,
     CONCAT('Classify this text into one of: Tech, Sports, Politics\n\nText: ', content) AS prompt
   FROM articles),
  STRUCT(TRUE AS flatten_json_output)  -- Return plain text instead of raw JSON
);
```

### Semantic Similarity Search

```sql
-- Find documents similar to a free-text query
WITH query_embedding AS (
  SELECT ml_generate_embedding_result AS embedding
  FROM ML.GENERATE_EMBEDDING(
    MODEL `project.dataset.embedding_model`,
    (SELECT 'machine learning best practices' AS content)
  )
)
SELECT d.title, d.content, distance
FROM VECTOR_SEARCH(
  TABLE `project.dataset.doc_embeddings`, 'embedding',
  (SELECT embedding FROM query_embedding),
  top_k => 10
)
JOIN `project.dataset.documents` d ON d.id = base.id;
```

### RAG with Context Injection

```sql
-- Retrieve relevant context, then generate a grounded answer
WITH context AS (
  SELECT STRING_AGG(base.content, '\n\n') AS retrieved_context
  FROM VECTOR_SEARCH(
    TABLE `project.dataset.knowledge_base`, 'embedding',
    (SELECT ml_generate_embedding_result AS embedding
     FROM ML.GENERATE_EMBEDDING(
       MODEL `project.dataset.embedding_model`,
       (SELECT @query AS content))),
    top_k => 5
  )
)
SELECT ml_generate_text_result AS answer
FROM ML.GENERATE_TEXT(
  MODEL `project.dataset.gemini`,
  (SELECT CONCAT(
    'Answer based on context:\n\n', retrieved_context,
    '\n\nQuestion: ', @query
  ) AS prompt FROM context)
);
```

## Error Handling

Check status columns for errors:

```sql
-- Text generation errors
SELECT * FROM ML.GENERATE_TEXT(...)
WHERE ml_generate_text_status != '';

-- Embedding errors
SELECT * FROM ML.GENERATE_EMBEDDING(...)
WHERE LENGTH(ml_generate_embedding_status) > 0;
```

## Performance Tips

1. **Use Vector Indexes**: Create indexes for tables with >100K embeddings (see the sketch after this list)
2. **Batch Requests**: Process multiple rows in a single function call
3. **Filter Before AI**: Apply WHERE clauses before expensive AI operations
4. **Cache Embeddings**: Store embeddings in tables rather than regenerating them
5. **Tune Search**: Adjust `fraction_lists_to_search` to trade speed against recall
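
A minimal sketch combining tips 1 and 5, with index and table names assumed: build an IVF index once, then control how much of it each query scans.

```sql
-- One-time: build an IVF vector index (names are illustrative)
CREATE VECTOR INDEX doc_embedding_idx
ON `project.dataset.doc_embeddings`(embedding)
OPTIONS (index_type = 'IVF', distance_type = 'COSINE');

-- Per query: a larger fraction scans more lists (higher recall, slower);
-- a smaller fraction is faster but may miss neighbors.
SELECT base.id, distance
FROM VECTOR_SEARCH(
  TABLE `project.dataset.doc_embeddings`, 'embedding',
  (SELECT embedding FROM query_embeddings),
  top_k => 10,
  options => '{"fraction_lists_to_search": 0.05}'
);
```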

## Limitations

- Max 10,000 rows per AI function call; batch larger tables (see the sketch below)
- Embedding dimensions vary by model (768-3072)
- Rate limits apply based on Vertex AI quotas
- Some models require specific regions
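
When a source table exceeds the per-call row limit, the work can be split into key ranges with BigQuery scripting. A hypothetical sketch, assuming an INT64 `id` key and illustrative table names:

```sql
-- Batch embedding generation in id ranges (all names are assumptions)
DECLARE lo INT64 DEFAULT 0;
DECLARE hi INT64;
DECLARE step INT64 DEFAULT 5000;
SET hi = (SELECT MAX(id) FROM `project.dataset.articles`);

WHILE lo <= hi DO
  INSERT INTO `project.dataset.article_embeddings` (id, content, embedding)
  SELECT id, content, ml_generate_embedding_result
  FROM ML.GENERATE_EMBEDDING(
    MODEL `project.dataset.embedding_model`,
    (SELECT id, content FROM `project.dataset.articles`
     WHERE id >= lo AND id < lo + step)
  )
  WHERE LENGTH(ml_generate_embedding_status) = 0;  -- skip failed rows
  SET lo = lo + step;
END WHILE;
```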