| 1 | +--- |
| 2 | +id: multi-tenant-patterns |
| 3 | +title: Multi-tenant application patterns |
| 4 | +sidebar_label: Multi-tenant patterns |
| 5 | +description: Learn how to build multi-tenant applications using Temporal with task queue isolation patterns, worker design, and best practices. |
| 6 | +slug: /production-deployment/multi-tenant-patterns |
| 7 | +toc_max_heading_level: 4 |
| 8 | +keywords: |
| 9 | + - multi-tenant |
| 10 | + - task queues |
| 11 | + - worker patterns |
| 12 | + - SaaS |
| 13 | +tags: |
| 14 | + - Multitenancy |
| 15 | + - Best Practices |
| 16 | +--- |
| 17 | + |
| 18 | +import { RelatedReadContainer, RelatedReadItem } from '@site/src/components'; |
| 19 | + |
| 20 | +Many SaaS providers and large enterprise platform teams use a single Temporal [Namespace](/namespaces) with [per-tenant Task Queues](#1-task-queues-per-tenant-recommended) to power their multi-tenant applications. This approach maximizes resource efficiency while maintaining logical separation between tenants. |
| 21 | + |
| 22 | +This guide covers architectural patterns, design considerations, and practical examples for building multi-tenant applications with Temporal. |
| 23 | + |
| 24 | +## Architectural principles |
| 25 | + |
| 26 | +When designing a multi-tenant Temporal application, follow these principles: |
| 27 | + |
| 28 | +- **Define your tenant model** - Determine what constitutes a tenant in your business (customers, pricing tiers, teams, etc.) |
| 29 | +- **Prefer simplicity** - Start with the simplest pattern that meets your needs |
| 30 | +- **Understand Temporal limits** - Design within the constraints of your Temporal deployment |
| 31 | +- **Test at scale** - Performance testing must drive your capacity decisions |
| 32 | +- **Plan for growth** - Consider how you'll onboard new tenants and scale workers |
| 33 | + |
| 34 | +## Architectural patterns |
| 35 | + |
| 36 | +There are three main patterns for multi-tenant applications in Temporal, listed from most to least recommended: |
| 37 | + |
| 38 | +### 1. Task queues per tenant (Recommended) |
| 39 | + |
| 40 | +**Use different [Task Queues](/task-queue) for each tenant's [Workflows](/workflows) and [Activities](/activities).** |
| 41 | + |
| 42 | +This is the recommended pattern for most use cases. Each tenant gets dedicated Task Queue(s), with [Workers](/workers) polling multiple tenant Task Queues in a single process. |
| 43 | + |
| 44 | +**Pros:** |
| 45 | +- Strong isolation between tenants |
| 46 | +- Efficient resource utilization |
| 47 | +- Flexible worker scaling |
| 48 | +- Easy to add new tenants |
| 49 | +- Can handle thousands of tenants per [Namespace](/namespaces) |
| 50 | + |
| 51 | +**Cons:** |
| 52 | +- Requires worker configuration management |
| 53 | +- Potential for uneven resource distribution |
| 54 | +- Need to prevent "noisy neighbor" issues at the worker level |
| 55 | + |
| 56 | +<RelatedReadContainer> |
| 57 | + <RelatedReadItem path="#task-queue-isolation-pattern" text="Task Queue Isolation Pattern Details" archetype="feature-guide" /> |
| 58 | +</RelatedReadContainer> |
| 59 | + |
| 60 | +### 2. Shared Workflow Task Queues, separate Activity Task Queues |
| 61 | + |
| 62 | +**Share [Workflow Task Queues](/task-queue) but use different [Activity Task Queues](/task-queue) per tenant.** |
| 63 | + |
| 64 | +Use this pattern when [Workflows](/workflows) are lightweight but [Activities](/activities) have heavy resource requirements or external dependencies that need isolation. |
| 65 | + |
| 66 | +**Pros:** |
| 67 | +- Easier worker management than full isolation |
| 68 | +- Activity-level tenant isolation |
| 69 | +- Good for compute-intensive Activities |
| 70 | + |
| 71 | +**Cons:** |
| 72 | +- Less isolation than pattern #1 |
| 73 | +- Workflow visibility is shared |
| 74 | +- More complex to reason about |
| 75 | + |
| 76 | +### 3. Namespace per tenant |
| 77 | + |
| 78 | +**Use a separate [Namespace](/namespaces) for each tenant.** |
| 79 | + |
| 80 | +This pattern is practical only for a small number (fewer than 50) of high-value tenants because of its operational overhead. |
| 81 | + |
| 82 | +**Pros:** |
| 83 | +- Complete isolation between tenants |
| 84 | +- Per-tenant rate limiting |
| 85 | +- Maximum security |
| 86 | + |
| 87 | +**Cons:** |
| 88 | +- Higher operational overhead |
| 89 | +- Credential and connectivity management per [Namespace](/namespaces) |
| 90 | +- Requires more [Workers](/workers) (minimum 2 per Namespace for high availability) |
| 91 | +- Expensive at scale |
| 92 | + |
| 93 | +<RelatedReadContainer> |
| 94 | + <RelatedReadItem path="/evaluate/development-production-features/multi-tenancy#namespace-isolation" text="Namespace Isolation in Temporal Cloud" archetype="cloud-guide" /> |
| 95 | +</RelatedReadContainer> |
| 96 | + |
| 97 | +## Task Queue isolation pattern |
| 98 | + |
| 99 | +This section details the recommended pattern for most multi-tenant applications. |
| 100 | + |
| 101 | +### Worker design |
| 102 | + |
| 103 | +When a [Worker](/workers) starts up: |
| 104 | + |
| 105 | +1. **Load tenant configuration** - Retrieve the list of tenants this Worker should handle (from config file, API, or database) |
| 106 | +2. **Create [Task Queues](/task-queue)** - For each tenant, generate a unique Task Queue name (e.g., `customer-{tenant-id}`) |
| 107 | +3. **Register [Workflows](/workflows) and [Activities](/activities)** - Register your Workflow and Activity implementations once, passing the tenant-specific Task Queue name |
| 108 | +4. **Poll multiple Task Queues** - A single Worker process polls all assigned tenant Task Queues |
| 109 | + |
| 110 | +```go |
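| 111 | +// Example: Go Worker process polling multiple tenant Task Queues (c is an existing client.Client) |
| 112 | +for _, tenant := range assignedTenants { |
| 113 | + taskQueue := fmt.Sprintf("customer-%s", tenant.ID) |
| 114 | + w := worker.New(c, taskQueue, worker.Options{}) // named w to avoid shadowing the worker package |
| 115 | + w.RegisterWorkflow(YourWorkflow) |
| 116 | + w.RegisterActivity(YourActivity) |
| 117 | + if err := w.Start(); err != nil { log.Fatalf("unable to start Worker for %s: %v", taskQueue, err) } // Start does not block |
| 118 | +} |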
| 119 | +``` |
| 120 | + |
| 121 | +### Routing requests to Task Queues |
| 122 | + |
| 123 | +Your application needs to route [Workflow](/workflows) starts and other operations to the correct tenant [Task Queue](/task-queue): |
| 124 | + |
| 125 | +```go |
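| 126 | +// Example: Starting a Workflow on a specific tenant's Task Queue (c is an existing client.Client) |
| 127 | +workflowOptions := client.StartWorkflowOptions{ |
| 128 | + ID: workflowID, |
| 129 | + TaskQueue: fmt.Sprintf("customer-%s", tenantID), // same naming scheme the Workers use |
| 130 | +} |
| 131 | +run, err := c.ExecuteWorkflow(ctx, workflowOptions, YourWorkflow) // check err before using run |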
| 132 | +``` |
| 133 | + |
| 134 | +Consider creating an API or service that: |
| 135 | +- Maps tenant IDs to Task Queue names |
| 136 | +- Tracks which [Workers](/workers) are handling which tenants |
| 137 | +- Allows both your application and Workers to read the mappings of: |
| 138 | + 1. Tenant IDs to Task Queues |
| 139 | + 1. Workers to tenants |
| 140 | + |
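The mapping service described above could start as something like the following sketch. It is a minimal in-memory illustration, not a prescribed design: the `Registry` type and its methods are hypothetical names, the `customer-{tenant-id}` naming follows the convention used earlier in this guide, and a real service would back this with a database or configuration store and add locking.

```go
package main

import (
	"fmt"
	"sort"
)

// TaskQueueFor maps a tenant ID to its Task Queue name using the
// customer-{tenant-id} convention from this guide.
func TaskQueueFor(tenantID string) string {
	return fmt.Sprintf("customer-%s", tenantID)
}

// Registry is a hypothetical in-memory store of Worker-to-tenant assignments.
// It is not safe for concurrent use.
type Registry struct {
	tenantsByWorker map[string]map[string]bool
}

// NewRegistry returns an empty Registry.
func NewRegistry() *Registry {
	return &Registry{tenantsByWorker: make(map[string]map[string]bool)}
}

// Assign records that workerID handles tenantID. A tenant may be assigned
// to several Workers, for example for high availability.
func (r *Registry) Assign(workerID, tenantID string) {
	if r.tenantsByWorker[workerID] == nil {
		r.tenantsByWorker[workerID] = make(map[string]bool)
	}
	r.tenantsByWorker[workerID][tenantID] = true
}

// TaskQueuesFor returns the sorted Task Queue names a Worker should poll.
func (r *Registry) TaskQueuesFor(workerID string) []string {
	queues := make([]string, 0, len(r.tenantsByWorker[workerID]))
	for tenantID := range r.tenantsByWorker[workerID] {
		queues = append(queues, TaskQueueFor(tenantID))
	}
	sort.Strings(queues)
	return queues
}

// WorkersFor returns the sorted IDs of the Workers handling a tenant.
func (r *Registry) WorkersFor(tenantID string) []string {
	var workers []string
	for workerID, tenants := range r.tenantsByWorker {
		if tenants[tenantID] {
			workers = append(workers, workerID)
		}
	}
	sort.Strings(workers)
	return workers
}

func main() {
	r := NewRegistry()
	r.Assign("worker-1", "123")
	r.Assign("worker-2", "123")
	r.Assign("worker-1", "456")
	fmt.Println("worker-1 polls:", r.TaskQueuesFor("worker-1"))
	fmt.Println("tenant 123 is served by:", r.WorkersFor("123"))
}
```

Keeping the name derivation in one function means the application that starts Workflows and the Workers that poll for them cannot drift apart on Task Queue naming.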
| 141 | +### Capacity planning |
| 142 | + |
| 143 | +Key questions to answer through performance testing: |
| 144 | + |
| 145 | +**[Namespace](/namespaces) capacity:** |
| 146 | +- How many concurrent [Task Queue](/task-queue) pollers can your Namespace support? |
| 147 | +- What are your [Actions Per Second (APS)](/cloud/limits#actions-per-second) limits? |
| 148 | +- What are your [Operations Per Second (OPS)](/references/operation-list) limits? |
| 149 | + |
| 150 | +**[Worker](/workers) capacity:** |
| 151 | +- How many tenants can a single Worker process handle? |
| 152 | +- What are the CPU and memory requirements per tenant? |
| 153 | +- How many concurrent [Workflow](/workflows) executions per tenant? |
| 154 | +- How many concurrent [Activity](/activities) executions per tenant? |
| 155 | + |
| 156 | +**SDK configuration to tune:** |
| 157 | +- `MaxConcurrentWorkflowTaskExecutionSize` |
| 158 | +- `MaxConcurrentActivityExecutionSize` |
| 159 | +- `MaxConcurrentWorkflowTaskPollers` |
| 160 | +- `MaxConcurrentActivityTaskPollers` |
| 161 | +- Worker replicas (in Kubernetes deployments) |
| 162 | + |
| 163 | +### Provisioning new tenants |
| 164 | + |
| 165 | +Automate tenant onboarding with a Temporal [Workflow](/workflows): |
| 166 | + |
| 167 | +1. Create a tenant onboarding Workflow that: |
| 168 | + - Validates tenant information |
| 169 | + - Provisions infrastructure |
| 170 | + - Deploys/updates [Worker](/workers) configuration |
| 171 | + - Triggers Worker restarts or scaling |
| 172 | + - Verifies the tenant is operational |
| 173 | + |
| 174 | +2. Store tenant-to-Worker mappings in a database or configuration service |
| 175 | + |
| 176 | +3. Update Worker deployments to pick up new tenant assignments |
| 177 | + |
| 178 | +## Practical example |
| 179 | + |
| 180 | +**Scenario:** A SaaS company has 1,000 customers and expects to grow to 5,000 customers over 3 years. They have 2 [Workflows](/workflows) and ~25 [Activities](/activities) per Workflow. All customers are on the same tier (no segmentation yet). |
| 181 | + |
| 182 | +### Assumptions |
| 183 | + |
| 184 | +| Item | Value | |
| 185 | +|------|-------| |
| 186 | +| Current customers | 1,000 | |
| 187 | +| Workflow Task Queues per customer | 1 | |
| 188 | +| Activity Task Queues per customer | 1 | |
| 189 | +| Max Task Queue pollers per Namespace | 5,000 | |
| 190 | +| SDK concurrent Workflow task pollers | 5 | |
| 191 | +| SDK concurrent Activity task pollers | 5 | |
| 192 | +| Max concurrent Workflow executions | 200 | |
| 193 | +| Max concurrent Activity executions | 200 | |
| 194 | + |
| 195 | +### Capacity calculations |
| 196 | + |
| 197 | +**[Task Queue](/task-queue) poller limits:** |
| 198 | +- Each [Worker](/workers) uses 10 pollers per tenant (5 Workflow + 5 Activity) |
| 199 | +- Maximum Workers in [Namespace](/namespaces): 5,000 pollers ÷ 10 = **500 Workers** |
| 200 | + |
| 201 | +**Worker capacity:** |
| 202 | +- Each Worker can theoretically handle 200 [Workflows](/workflows) and 200 [Activities](/activities) concurrently |
| 203 | +- Conservative estimate: **250 tenants per Worker** (accounting for overhead) |
| 204 | +- For 1,000 customers: **4 Workers minimum** (plus replicas for HA) |
| 205 | +- For 5,000 customers: **20 Workers minimum** (plus replicas for HA) |
| 206 | + |
| 207 | +**Namespace capacity:** |
| 208 | +- At 250 tenants per Worker, each group of tenants needs 2 Workers (for HA) |
| 209 | +- Maximum tenants in Namespace: (500 Workers ÷ 2) × 250 = **62,500 tenants** |
| 210 | + |
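The Worker-count arithmetic above can be reproduced with a small helper. This is only a sketch of the ceiling division and HA multiplication: `WorkersNeeded` is a hypothetical name, and the inputs (250 tenants per Worker, 2 replicas per group) are the assumptions from this example, not measured values. It does not model the poller budget.

```go
package main

import "fmt"

// WorkersNeeded returns the minimum number of Workers for the given tenant
// count (rounded up), and the total once each group of tenants is served by
// the given number of replicas.
func WorkersNeeded(tenants, tenantsPerWorker, replicas int) (minimum, withReplicas int) {
	if tenants <= 0 || tenantsPerWorker <= 0 {
		return 0, 0
	}
	if replicas < 1 {
		replicas = 1
	}
	// Ceiling division: a partially filled group still needs a Worker.
	minimum = (tenants + tenantsPerWorker - 1) / tenantsPerWorker
	return minimum, minimum * replicas
}

func main() {
	for _, customers := range []int{1000, 5000} {
		base, total := WorkersNeeded(customers, 250, 2)
		fmt.Printf("%d customers: %d Workers minimum, %d with 2 replicas per group\n",
			customers, base, total)
	}
}
```

Rerunning the helper with a tenants-per-Worker figure taken from load testing shows how sensitive the Worker count is to that single assumption.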
| 211 | +:::note |
| 212 | +These are theoretical calculations based on SDK defaults. **Always perform load testing** to determine actual capacity for your specific workload. Monitor CPU, memory, and Temporal metrics during testing. |
| 213 | + |
| 214 | +While testing, also pay attention to your [metrics capacity and cardinality](/cloud/metrics/openmetrics/api-reference#managing-high-cardinality). |
| 215 | +::: |
| 216 | + |
| 217 | +### Worker assignment strategies |
| 218 | + |
| 219 | +**Option 1: Static configuration** |
| 220 | +- Each [Worker](/workers) reads a config file listing assigned tenant IDs |
| 221 | +- Simple to implement |
| 222 | +- Requires deployment to add tenants |
| 223 | + |
| 224 | +**Option 2: Dynamic API** |
| 225 | +- Workers call an API on startup to get assigned tenants |
| 226 | +- Workers identified by static ID (1 to N) |
| 227 | +- API returns tenant list based on Worker ID |
| 228 | +- More flexible, no deployment needed for new tenants |
| 229 | + |
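One way the dynamic API in Option 2 could decide which tenants belong to a Worker is a round-robin split over the sorted tenant list, keyed by the Worker's static ID. The sketch below assumes exactly that scheme; `TenantsForWorker` is a hypothetical function, and other strategies (for example, weighting by tenant load) may fit better in practice.

```go
package main

import (
	"fmt"
	"sort"
)

// TenantsForWorker returns the tenants assigned to the Worker with the given
// static ID (1 to workerCount). Tenants are sorted and dealt out round-robin,
// so every caller computes the same assignment from the same inputs.
func TenantsForWorker(tenantIDs []string, workerID, workerCount int) ([]string, error) {
	if workerCount < 1 || workerID < 1 || workerID > workerCount {
		return nil, fmt.Errorf("worker ID %d is outside 1..%d", workerID, workerCount)
	}
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]string(nil), tenantIDs...)
	sort.Strings(sorted)

	var assigned []string
	for i, id := range sorted {
		if i%workerCount == workerID-1 {
			assigned = append(assigned, id)
		}
	}
	return assigned, nil
}

func main() {
	tenants := []string{"acme", "globex", "initech", "umbrella", "hooli"}
	for id := 1; id <= 2; id++ {
		assigned, err := TenantsForWorker(tenants, id, 2)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("worker %d handles tenants %v\n", id, assigned)
	}
}
```

With this scheme, changing the tenant list or the Worker count moves some tenants between Workers, so Workers need to re-read their assignment (for example, on restart) after either changes.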
| 230 | +## Best practices |
| 231 | + |
| 232 | +### Monitoring |
| 233 | + |
| 234 | +Track these [metrics](/references/sdk-metrics) per tenant: |
| 235 | +- [Workflow completion](/cloud/metrics/openmetrics/metrics-reference#workflow-completion-metrics) rates |
| 236 | +- [Activity execution](/cloud/metrics/openmetrics/metrics-reference#task-queue-metrics) rates |
| 237 | +- [Task Queue backlog](/cloud/metrics/openmetrics/metrics-reference#task-queue-metrics) |
| 238 | +- [Worker resource utilization](/references/sdk-metrics#worker_task_slots_used) |
| 239 | +- [Workflow failure rates](/encyclopedia/detecting-workflow-failures) |
| 240 | + |
| 241 | +### Handling noisy neighbors |
| 242 | + |
| 243 | +Even with [Task Queue](/task-queue) isolation, monitor for tenants that: |
| 244 | +- Generate excessive load |
| 245 | +- Have high failure rates |
| 246 | +- Cause [Worker](/workers) resource exhaustion |
| 247 | + |
| 248 | +Strategies: |
| 249 | +- Implement per-tenant rate limiting in your application |
| 250 | +- Move problematic tenants to dedicated Workers |
| 251 | +- Set tight [Workflow](/workflows) and [Activity](/activities) timeouts so that runaway executions fail fast |
| 252 | + |
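Per-tenant rate limiting in the application can be as simple as a token bucket per tenant, checked before a Workflow is started. The following is a minimal sketch under that assumption, using only the standard library; `TenantLimiter` and its methods are hypothetical names, and the rate and burst values are placeholders rather than recommendations.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket holds the token state for one tenant.
type bucket struct {
	tokens float64
	last   time.Time
}

// TenantLimiter is a hypothetical token-bucket limiter with one bucket per tenant.
type TenantLimiter struct {
	mu      sync.Mutex
	rate    float64 // tokens refilled per second
	burst   float64 // maximum tokens a tenant can hold
	buckets map[string]*bucket
}

// NewTenantLimiter creates a limiter that gives every tenant the same rate and burst.
func NewTenantLimiter(ratePerSecond float64, burst int) *TenantLimiter {
	return &TenantLimiter{
		rate:    ratePerSecond,
		burst:   float64(burst),
		buckets: make(map[string]*bucket),
	}
}

// Allow reports whether the tenant may start one more operation right now.
func (l *TenantLimiter) Allow(tenantID string) bool {
	return l.AllowAt(tenantID, time.Now())
}

// AllowAt is Allow with an explicit clock reading, which keeps tests deterministic.
func (l *TenantLimiter) AllowAt(tenantID string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	b, ok := l.buckets[tenantID]
	if !ok {
		// A tenant seen for the first time starts with a full bucket.
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[tenantID] = b
	}
	// Refill in proportion to the elapsed time, capped at the burst size.
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * l.rate
		if b.tokens > l.burst {
			b.tokens = l.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	limiter := NewTenantLimiter(1, 2) // placeholder values: 1 start per second, burst of 2
	for i := 0; i < 3; i++ {
		fmt.Println("customer-123 allowed:", limiter.Allow("customer-123"))
	}
	fmt.Println("customer-456 allowed:", limiter.Allow("customer-456"))
}
```

Because each tenant has its own bucket, one tenant exhausting its budget does not affect the others. A rejected request can be queued, retried later, or returned to the caller as a throttling error, depending on the application.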
| 253 | +### Tenant lifecycle |
| 254 | + |
| 255 | +Plan for: |
| 256 | +- **Onboarding** - Automated provisioning [Workflow](/workflows) |
| 257 | +- **Scaling** - When to add new [Workers](/workers) for growing tenants |
| 258 | +- **Offboarding** - Graceful tenant removal and data cleanup |
| 259 | +- **Rebalancing** - Redistributing tenants across Workers |
| 260 | + |
| 261 | +### Search Attributes |
| 262 | + |
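| 263 | +Use [Search Attributes](/search-attribute) to enable tenant-scoped queries. A custom Search Attribute such as `TenantId` must be created in the Namespace before Workflows can set it: |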
| 264 | +```go |
| 265 | +// Add tenant ID as a Search Attribute |
| 266 | +searchAttributes := map[string]interface{}{ |
| 267 | + "TenantId": tenantID, |
| 268 | +} |
| 269 | +``` |
| 270 | + |
| 271 | +This allows filtering [Workflows](/workflows) by tenant in the UI and SDK: |
| 272 | +```sql |
| 273 | +TenantId = 'customer-123' AND ExecutionStatus = 'Running' |
| 274 | +``` |
| 275 | + |
| 276 | +## Related resources |
| 277 | + |
| 278 | +<RelatedReadContainer> |
| 279 | + <RelatedReadItem path="/evaluate/development-production-features/multi-tenancy" text="Multi-tenancy Overview" archetype="feature-guide" /> |
| 280 | + <RelatedReadItem path="/cloud/limits" text="Temporal Cloud Limits" archetype="cloud-guide" /> |
| 281 | + <RelatedReadItem path="/visibility" text="Visibility and Search Attributes" archetype="feature-guide" /> |
| 282 | +</RelatedReadContainer> |