Commit 2a30712

bechols, claude, and brianmacdonald-temporal authored
update multitenant docs, add best practice guide (#4090)
* update multitenant docs, add best practice guide
* fix links
* Apply suggestions from code review
* Apply suggestions from code review
* Fix broken links after cloud docs path change

  Update links from /production-deployment/cloud/... to /cloud/... to reflect the cloud docs directory restructuring.

  Co-Authored-By: Claude Opus 4.5 <[email protected]>
* Revise multi-tenant feature page for open source and Cloud

  - Split namespace isolation into open source and Cloud subsections
  - Add open source benefits: Workflow ID uniqueness, resource isolation, configuration boundaries, custom Authorizer access control
  - Clarify what Cloud adds: API keys/mTLS, built-in RBAC, rate limits, HA replication, Nexus
  - Condense application multi-tenancy section, link to best practices

  Co-Authored-By: Claude Opus 4.5 <[email protected]>
* Move Nexus to open source section (available in both)

  Nexus is GA for self-hosted and Cloud. Added it to the open source section and removed from Cloud-only features.

  Co-Authored-By: Claude Opus 4.5 <[email protected]>
* Update multi-tenant-patterns.mdx

---------

Co-authored-by: Claude Opus 4.5 <[email protected]>
Co-authored-by: Brian MacDonald <[email protected]>
1 parent e5fd867 commit 2a30712

4 files changed

+346
-19
lines changed

Lines changed: 34 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -1,35 +1,53 @@
11
---
22
id: multi-tenancy
33
title: Multi-tenancy - Temporal feature
4-
description: Learn about Temporal Cloud's multi-tenant architecture and how it enhances scalability, efficiency, and cost-effectiveness.
4+
description: Learn about Temporal's namespace isolation for multi-tenancy and how to build multi-tenant applications.
55
sidebar_label: Multi-tenancy
66
tags:
77
- Temporal Cloud
8+
- Multitenancy
89
keywords:
910
- multi-tenant
1011
- Temporal Cloud
11-
- cloud architecture
12-
- scalability
13-
- cost-effectiveness
14-
- noisy neighbor
15-
- database performance
16-
- high throughput
12+
- namespace isolation
13+
- multi-tenant applications
14+
- tenant isolation
1715
---
1816

1917
import { RelatedReadContainer, RelatedReadItem } from '@site/src/components';
2018

21-
A Namespace is a unit of isolation within the Temporal Platform -- but even a single Namespace is still multi-tenant.
22-
Multi-tenancy ensures extra capacity is available for all customers during traffic spikes.
19+
Multi-tenancy in Temporal operates at two levels:
2320

24-
However, multi-tenancy can also presents the challenge of "noisy neighbors", where high-traffic tenants consume excess resources, causing slower performance for other tenants.
25-
This is a common problem for database scaling.
21+
## Namespace isolation
2622

27-
Temporal's write-heavy workload, where changes in execution state are constantly written to the persistence layer, demands a database that supports reliably high throughput with low latency for multiple customers, concurrently and fairly.
23+
[Namespaces](/namespaces) are Temporal's unit of isolation, providing logical separation for multi-tenant deployments in both open source Temporal and Temporal Cloud.
2824

29-
With Temporal Cloud, customers pay for consumption instead of entire sets of hardware, providing a cost-effective solution.
30-
Temporal Cloud's architecture scales to handle multiple tenants efficiently.
25+
### Open source Temporal
26+
27+
Namespaces in self-hosted Temporal provide:
28+
29+
- **Workflow ID uniqueness**: Temporal guarantees unique Workflow IDs within a Namespace. Different Namespaces can have Workflows with the same ID without conflict.
30+
- **Resource isolation**: Traffic from one Namespace does not impact other Namespaces on the same Temporal Service.
31+
- **Configuration boundaries**: Settings like [Retention Period](/temporal-service/temporal-server#retention-period) and [Archival](/temporal-service/archival) destination are configured per Namespace.
32+
- **Access control**: Use a custom [Authorizer](/self-hosted-guide/security#authorization) on your Frontend Service to restrict who can access each Namespace.
33+
- **Inter-namespace communication**: Use [Nexus](/evaluate/nexus) for controlled communication between Namespaces.
34+
35+
### Temporal Cloud
36+
37+
Temporal Cloud builds on these capabilities with additional isolation guarantees:
38+
39+
- **Independent authentication** via [API keys](/cloud/api-keys) or [mTLS certificates](/cloud/certificates)
40+
- **Built-in [role-based access controls](/cloud/users#namespace-level-permissions)** without custom Authorizer configuration
41+
- **Separate [rate limits](/cloud/limits#namespace-level)** to prevent noisy neighbor problems
42+
- **[High availability replication](/cloud/high-availability)** across regions
3143

3244
<RelatedReadContainer>
33-
<RelatedReadItem path="https://docs.temporal.io/cloud/security#namespace-isolation" text="Namespace Isolation" archetype="cloud-guide" />
34-
<RelatedReadItem path="https://docs.temporal.io/cloud/pricing" text="Cost-effective Consumption" archetype="cloud-guide" />
45+
<RelatedReadItem path="/cloud/security#namespace-isolation" text="Namespace Isolation Details" archetype="cloud-guide" />
46+
<RelatedReadItem path="/cloud/pricing" text="Temporal Cloud Pricing" archetype="cloud-guide" />
3547
</RelatedReadContainer>
48+
49+
## Application multi-tenancy
50+
51+
Many organizations use Temporal to power their own multi-tenant SaaS applications, isolating their customers' workloads using Task Queues, Search Attributes, and Worker design patterns.
52+
53+
See the [multi-tenant application patterns guide](/production-deployment/multi-tenant-patterns) for detailed recommendations on architecting multi-tenant applications with Temporal.

docs/evaluate/temporal-cloud/security.mdx

Lines changed: 29 additions & 3 deletions
@@ -10,6 +10,7 @@ keywords:
1010
- security
1111
- temporal cloud
1212
tags:
13+
- Multitenancy
1314
- Security
1415
- Temporal Cloud
1516
---
@@ -50,9 +51,34 @@ By deploying a [Codec Server](/production-deployment/data-encryption) you can se
5051
### Namespace isolation
5152

5253
The base unit of isolation in a Temporal environment is a [Namespace](/namespaces).
53-
Each Temporal Cloud account can have multiple Namespaces.
54-
A Namespace (regardless of account) cannot interact with other Namespaces.
55-
Each Namespace is available through a secure gRPC endpoint and an HTTPS (TLS) endpoint.
54+
Each Temporal Cloud account can have multiple Namespaces, and each Namespace is isolated to ensure your workloads remain secure and performant.
55+
56+
#### Authentication
57+
58+
Each Namespace is secured with your choice of authentication method:
59+
- **mTLS certificates** - Namespace-specific X.509 certificates for mutual TLS authentication
60+
- **API keys** - Namespace-scoped API keys for authentication
61+
62+
See [API Keys](/cloud/api-keys) and [mTLS Certificates](/cloud/certificates) for more details on configuring authentication for your Namespace.
63+
64+
#### Rate limiting
65+
66+
Temporal Cloud protects each Namespace with separate rate limits to prevent noisy neighbor problems:
67+
- **Actions Per Second (APS)** - Limits the rate of [actions](/best-practices/managing-aps-limits) performed in your Workflows
68+
- **Operations Per Second (OPS)** - Limits the rate of all [operations](/references/operation-list) that create load on Temporal Server
69+
70+
These per-Namespace rate limits ensure that one Namespace experiencing a traffic spike cannot impact the performance or reliability of other Namespaces, whether those Namespaces belong to a single Temporal Cloud account or separate ones.
71+
72+
See [Rate limiting](/cloud/limits) for more information about Temporal Cloud limits, and [Monitoring trends against limits](/cloud/service-health#rps-aps-rate-limits) for monitoring best practices.
73+
74+
#### Inter-Namespace communication
75+
76+
Namespaces are isolated by default. The only way for Workflows in one Namespace to interact with Workflows in another Namespace is through [Temporal Nexus](/nexus), which provides controlled, secure cross-Namespace communication via Nexus Endpoints.
77+
78+
See [Nexus Security](/nexus/security) for details on how Nexus enables secure inter-Namespace communication.
79+
80+
#### Logical segregation
81+
5682
Temporal Cloud is a multi-tenant service.
5783
Namespaces in the same environment are logically segregated.
5884
Namespaces do not share data processing or data storage across regional boundaries.
Lines changed: 282 additions & 0 deletions
@@ -0,0 +1,282 @@
1+
---
2+
id: multi-tenant-patterns
3+
title: Multi-tenant application patterns
4+
sidebar_label: Multi-tenant patterns
5+
description: Learn how to build multi-tenant applications using Temporal with task queue isolation patterns, worker design, and best practices.
6+
slug: /production-deployment/multi-tenant-patterns
7+
toc_max_heading_level: 4
8+
keywords:
9+
- multi-tenant
10+
- task queues
11+
- worker patterns
12+
- SaaS
13+
tags:
14+
- Multitenancy
15+
- Best Practices
16+
---
17+
18+
import { RelatedReadContainer, RelatedReadItem } from '@site/src/components';
19+
20+
Many SaaS providers and large enterprise platform teams use a single Temporal [Namespace](/namespaces) with [per-tenant Task Queues](#1-task-queues-per-tenant-recommended) to power their multi-tenant applications. This approach maximizes resource efficiency while maintaining logical separation between tenants.
21+
22+
This guide covers architectural patterns, design considerations, and practical examples for building multi-tenant applications with Temporal.
23+
24+
## Architectural principles
25+
26+
When designing a multi-tenant Temporal application, follow these principles:
27+
28+
- **Define your tenant model** - Determine what constitutes a tenant in your business (customers, pricing tiers, teams, etc.)
29+
- **Prefer simplicity** - Start with the simplest pattern that meets your needs
30+
- **Understand Temporal limits** - Design within the constraints of your Temporal deployment
31+
- **Test at scale** - Performance testing must drive your capacity decisions
32+
- **Plan for growth** - Consider how you'll onboard new tenants and scale workers
33+
34+
## Architectural patterns
35+
36+
There are three main patterns for multi-tenant applications in Temporal, listed from most to least recommended:
37+
38+
### 1. Task queues per tenant (Recommended)
39+
40+
**Use different [Task Queues](/task-queue) for each tenant's [Workflows](/workflows) and [Activities](/activities).**
41+
42+
This is the recommended pattern for most use cases. Each tenant gets dedicated Task Queue(s), with [Workers](/workers) polling multiple tenant Task Queues in a single process.
43+
44+
**Pros:**
45+
- Strong isolation between tenants
46+
- Efficient resource utilization
47+
- Flexible worker scaling
48+
- Easy to add new tenants
49+
- Can handle thousands of tenants per [Namespace](/namespaces)
50+
51+
**Cons:**
52+
- Requires worker configuration management
53+
- Potential for uneven resource distribution
54+
- Need to prevent "noisy neighbor" issues at the worker level
55+
56+
<RelatedReadContainer>
57+
<RelatedReadItem path="#task-queue-isolation-pattern" text="Task Queue Isolation Pattern Details" archetype="feature-guide" />
58+
</RelatedReadContainer>
59+
60+
### 2. Shared Workflow Task Queues, separate Activity Task Queues
61+
62+
**Share [Workflow Task Queues](/task-queue) but use different [Activity Task Queues](/task-queue) per tenant.**
63+
64+
Use this pattern when [Workflows](/workflows) are lightweight but [Activities](/activities) have heavy resource requirements or external dependencies that need isolation.
65+
66+
**Pros:**
67+
- Easier worker management than full isolation
68+
- Activity-level tenant isolation
69+
- Good for compute-intensive Activities
70+
71+
**Cons:**
72+
- Less isolation than pattern #1
73+
- Workflow visibility is shared
74+
- More complex to reason about
75+
76+
### 3. Namespace per tenant
77+
78+
**Use a separate [Namespace](/namespaces) for each tenant.**
79+
80+
This pattern is practical only for a small number (fewer than 50) of high-value tenants because of its operational overhead.
81+
82+
**Pros:**
83+
- Complete isolation between tenants
84+
- Per-tenant rate limiting
85+
- Maximum security
86+
87+
**Cons:**
88+
- Higher operational overhead
89+
- Credential and connectivity management per [Namespace](/namespaces)
90+
- Requires more [Workers](/workers) (minimum 2 per Namespace for high availability)
91+
- Expensive at scale
92+
93+
<RelatedReadContainer>
94+
<RelatedReadItem path="/evaluate/development-production-features/multi-tenancy#namespace-isolation" text="Namespace Isolation in Temporal Cloud" archetype="cloud-guide" />
95+
</RelatedReadContainer>
96+
97+
## Task Queue isolation pattern
98+
99+
This section details the recommended pattern for most multi-tenant applications.
100+
101+
### Worker design
102+
103+
When a [Worker](/workers) starts up:
104+
105+
1. **Load tenant configuration** - Retrieve the list of tenants this Worker should handle (from config file, API, or database)
106+
2. **Create [Task Queues](/task-queue)** - For each tenant, generate a unique Task Queue name (e.g., `customer-{tenant-id}`)
107+
3. **Register [Workflows](/workflows) and [Activities](/activities)** - Register your Workflow and Activity implementations once, passing the tenant-specific Task Queue name
108+
4. **Poll multiple Task Queues** - A single Worker process polls all assigned tenant Task Queues
109+
110+
```go
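// Example: Go worker polling multiple tenant Task Queues (c is a shared client.Client)
for _, tenant := range assignedTenants {
	taskQueue := fmt.Sprintf("customer-%s", tenant.ID)

	// Name the variable w so it does not shadow the worker package.
	w := worker.New(c, taskQueue, worker.Options{})
	w.RegisterWorkflow(YourWorkflow)
	w.RegisterActivity(YourActivity)

	// Start does not block, so the loop can go on to the next tenant's Task Queue.
	if err := w.Start(); err != nil {
		log.Fatalf("unable to start worker for %s: %v", taskQueue, err)
	}
}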
```
120+
121+
### Routing requests to Task Queues
122+
123+
Your application needs to route [Workflow](/workflows) starts and other operations to the correct tenant [Task Queue](/task-queue):
124+
125+
```go
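// Example: Starting a Workflow for a specific tenant (c is a client.Client)
taskQueue := fmt.Sprintf("customer-%s", tenantID)
workflowOptions := client.StartWorkflowOptions{
	ID:        workflowID,
	TaskQueue: taskQueue,
}
run, err := c.ExecuteWorkflow(ctx, workflowOptions, YourWorkflow)
if err != nil {
	log.Fatalf("unable to start workflow for tenant %s: %v", tenantID, err)
}
log.Printf("started workflow %s on %s", run.GetID(), taskQueue)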
```
133+
134+
Consider creating an API or service that:
135+
- Maps tenant IDs to Task Queue names
136+
- Tracks which [Workers](/workers) are handling which tenants
137+
- Allows both your application and Workers to read the mappings of:
138+
1. Tenant IDs to Task Queues
139+
1. Workers to tenants
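
A minimal in-memory sketch of such a mapping service might look like the following. The `Registry` type and its methods are hypothetical names introduced for illustration and are not part of any Temporal SDK; the `customer-` prefix follows the naming used earlier in this guide. A production version would likely sit behind an API and persist its state in a database.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Registry is a hypothetical in-memory store for the two mappings described
// above: tenant ID -> Task Queue name, and Worker ID -> assigned tenant IDs.
type Registry struct {
	mu      sync.RWMutex
	tenants map[string]string   // tenant ID -> Task Queue name
	workers map[string][]string // Worker ID -> tenant IDs
}

// NewRegistry returns an empty Registry.
func NewRegistry() *Registry {
	return &Registry{
		tenants: make(map[string]string),
		workers: make(map[string][]string),
	}
}

// TaskQueueName applies the naming convention used in this guide.
func TaskQueueName(tenantID string) string {
	return fmt.Sprintf("customer-%s", tenantID)
}

// Assign records that workerID serves tenantID and returns the Task Queue name.
func (r *Registry) Assign(workerID, tenantID string) string {
	r.mu.Lock()
	defer r.mu.Unlock()
	tq := TaskQueueName(tenantID)
	r.tenants[tenantID] = tq
	for _, existing := range r.workers[workerID] {
		if existing == tenantID {
			return tq // already assigned; keep the list free of duplicates
		}
	}
	r.workers[workerID] = append(r.workers[workerID], tenantID)
	return tq
}

// TaskQueueFor is what the application would call before starting a Workflow.
func (r *Registry) TaskQueueFor(tenantID string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	tq, ok := r.tenants[tenantID]
	return tq, ok
}

// TaskQueuesFor is what a Worker would call at startup to learn what to poll.
func (r *Registry) TaskQueuesFor(workerID string) []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	queues := make([]string, 0, len(r.workers[workerID]))
	for _, tenantID := range r.workers[workerID] {
		queues = append(queues, r.tenants[tenantID])
	}
	sort.Strings(queues)
	return queues
}

func main() {
	reg := NewRegistry()
	reg.Assign("worker-1", "acme")
	reg.Assign("worker-1", "globex")
	fmt.Println(reg.TaskQueuesFor("worker-1")) // prints [customer-acme customer-globex]
}
```

Keeping both mappings behind one interface means the application and the Workers read the same source of truth, so a tenant can be moved to another Worker without changing its Task Queue name.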
140+
141+
### Capacity planning
142+
143+
Key questions to answer through performance testing:
144+
145+
**[Namespace](/namespaces) capacity:**
146+
- How many concurrent [Task Queue](/task-queue) pollers can your Namespace support?
147+
- What are your [Actions Per Second (APS)](/cloud/limits#actions-per-second) limits?
148+
- What are your [Operations Per Second (OPS)](/references/operation-list) limits?
149+
150+
**[Worker](/workers) capacity:**
151+
- How many tenants can a single Worker process handle?
152+
- What are the CPU and memory requirements per tenant?
153+
- How many concurrent [Workflow](/workflows) executions per tenant?
154+
- How many concurrent [Activity](/activities) executions per tenant?
155+
156+
**SDK configuration to tune:**
157+
- `MaxConcurrentWorkflowTaskExecutionSize`
158+
- `MaxConcurrentActivityExecutionSize`
159+
- `MaxConcurrentWorkflowTaskPollers`
160+
- `MaxConcurrentActivityTaskPollers`
161+
- Worker replicas (in Kubernetes deployments)
162+
163+
### Provisioning new tenants
164+
165+
Automate tenant onboarding with a Temporal [Workflow](/workflows):
166+
167+
1. Create a tenant onboarding Workflow that:
168+
- Validates tenant information
169+
- Provisions infrastructure
170+
- Deploys/updates [Worker](/workers) configuration
171+
- Triggers Worker restarts or scaling
172+
- Verifies the tenant is operational
173+
174+
2. Store tenant-to-Worker mappings in a database or configuration service
175+
176+
3. Update Worker deployments to pick up new tenant assignments
177+
178+
## Practical example
179+
180+
**Scenario:** A SaaS company has 1,000 customers and expects to grow to 5,000 customers over 3 years. They have 2 [Workflows](/workflows) and ~25 [Activities](/activities) per Workflow. All customers are on the same tier (no segmentation yet).
181+
182+
### Assumptions
183+
184+
| Item | Value |
|------|-------|
| Current customers | 1,000 |
| Workflow Task Queues per customer | 1 |
| Activity Task Queues per customer | 1 |
| Max Task Queue pollers per Namespace | 5,000 |
| SDK concurrent Workflow task pollers | 5 |
| SDK concurrent Activity task pollers | 5 |
| Max concurrent Workflow executions | 200 |
| Max concurrent Activity executions | 200 |
194+
195+
### Capacity calculations
196+
197+
**[Task Queue](/task-queue) poller limits:**
198+
- Each [Worker](/workers) uses 10 pollers per tenant (5 Workflow + 5 Activity)
199+
- Maximum Workers in [Namespace](/namespaces): 5,000 pollers ÷ 10 = **500 Workers**
200+
201+
**Worker capacity:**
202+
- Each Worker can theoretically handle 200 [Workflows](/workflows) and 200 [Activities](/activities) concurrently
203+
- Conservative estimate: **250 tenants per Worker** (accounting for overhead)
204+
- For 1,000 customers: **4 Workers minimum** (plus replicas for HA)
205+
- For 5,000 customers: **20 Workers minimum** (plus replicas for HA)
206+
207+
**Namespace capacity:**
208+
- At 250 tenants per Worker, you need 2 Workers per group of tenants (for HA)
209+
- Maximum tenants in Namespace: (500 Workers ÷ 2) × 250 = **62,500 tenants**
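
When checking a plan against the poller limit, it may help to count pollers per Worker-tenant assignment as well as per Worker. Read literally, "10 pollers per tenant" means that every Worker replica runs 10 pollers for each tenant it serves, so poller usage grows with the number of assignments. The sketch below is one way to run that check under the assumptions in the table; the `Plan` type and its field names are hypothetical, and the 5,000 poller limit is the assumed value from this example rather than a documented Temporal limit.

```go
package main

import "fmt"

// Plan holds the assumptions from the table above. All values are inputs to
// validate with load testing, not guarantees.
type Plan struct {
	Tenants            int // number of tenants (customers)
	TenantsPerWorker   int // tenants assigned to one Worker process
	ReplicasPerGroup   int // Workers serving the same group of tenants (for HA)
	PollersPerTenant   int // pollers one Worker runs for one tenant
	NamespacePollerCap int // assumed poller limit for the Namespace
}

// Workers returns the number of Worker processes needed:
// ceil(Tenants / TenantsPerWorker) groups, times the replicas per group.
func (p Plan) Workers() int {
	groups := (p.Tenants + p.TenantsPerWorker - 1) / p.TenantsPerWorker
	return groups * p.ReplicasPerGroup
}

// Pollers returns the total concurrent pollers, assuming every replica in a
// group polls for every tenant in that group.
func (p Plan) Pollers() int {
	return p.Tenants * p.ReplicasPerGroup * p.PollersPerTenant
}

// FitsNamespace reports whether the plan stays within the assumed poller limit.
func (p Plan) FitsNamespace() bool {
	return p.Pollers() <= p.NamespacePollerCap
}

func main() {
	plan := Plan{
		Tenants:            1000,
		TenantsPerWorker:   250,
		ReplicasPerGroup:   2,
		PollersPerTenant:   10, // 5 Workflow + 5 Activity
		NamespacePollerCap: 5000,
	}
	fmt.Println("workers:", plan.Workers())      // prints workers: 8
	fmt.Println("pollers:", plan.Pollers())      // prints pollers: 20000
	fmt.Println("fits:", plan.FitsNamespace())   // prints fits: false
}
```

Under this reading, the poller limit rather than Worker capacity is the first constraint to hit, which is one reason to confirm the real poller behavior and limits of your deployment during load testing, and to consider lowering the per-Task-Queue poller counts when a Worker serves many tenants.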
210+
211+
:::note
212+
These are theoretical calculations based on SDK defaults. **Always perform load testing** to determine actual capacity for your specific workload. Monitor CPU, memory, and Temporal metrics during testing.
213+
214+
While testing, also pay attention to your [metrics capacity and cardinality](/cloud/metrics/openmetrics/api-reference#managing-high-cardinality).
215+
:::
216+
217+
### Worker assignment strategies
218+
219+
**Option 1: Static configuration**
220+
- Each [Worker](/workers) reads a config file listing assigned tenant IDs
221+
- Simple to implement
222+
- Requires deployment to add tenants
223+
224+
**Option 2: Dynamic API**
225+
- Workers call an API on startup to get assigned tenants
226+
- Workers identified by static ID (1 to N)
227+
- API returns tenant list based on Worker ID
228+
- More flexible, no deployment needed for new tenants
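
One way the assignment API in Option 2 could decide which tenants belong to which Worker ID is a deterministic hash of the tenant ID. This is a sketch under that assumption, not a prescribed method; `workerFor` and `tenantsFor` are hypothetical helper names, and `workerCount` is assumed to be greater than zero.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// workerFor maps a tenant ID to a Worker ID in the range 1..workerCount using
// an FNV-1a hash, so every caller computes the same assignment.
func workerFor(tenantID string, workerCount int) int {
	h := fnv.New32a()
	h.Write([]byte(tenantID)) // Write on a hash.Hash never returns an error
	return int(h.Sum32()%uint32(workerCount)) + 1
}

// tenantsFor returns the tenants assigned to workerID, preserving input order.
func tenantsFor(workerID, workerCount int, allTenants []string) []string {
	var assigned []string
	for _, tenantID := range allTenants {
		if workerFor(tenantID, workerCount) == workerID {
			assigned = append(assigned, tenantID)
		}
	}
	return assigned
}

func main() {
	tenants := []string{"acme", "globex", "initech", "umbrella", "hooli"}
	const workerCount = 3
	for id := 1; id <= workerCount; id++ {
		fmt.Printf("worker %d -> %v\n", id, tenantsFor(id, workerCount, tenants))
	}
}
```

Plain modulo hashing reshuffles most tenants whenever `workerCount` changes, so a real service might store explicit assignments or use consistent hashing to keep rebalancing small.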
229+
230+
## Best practices
231+
232+
### Monitoring
233+
234+
Track these [metrics](/references/sdk-metrics) per tenant:
235+
- [Workflow completion](/cloud/metrics/openmetrics/metrics-reference#workflow-completion-metrics) rates
236+
- [Activity execution](/cloud/metrics/openmetrics/metrics-reference#task-queue-metrics) rates
237+
- [Task Queue backlog](/cloud/metrics/openmetrics/metrics-reference#task-queue-metrics)
238+
- [Worker resource utilization](/references/sdk-metrics#worker_task_slots_used)
239+
- [Workflow failure rates](/encyclopedia/detecting-workflow-failures)
240+
241+
### Handling noisy neighbors
242+
243+
Even with [Task Queue](/task-queue) isolation, monitor for tenants that:
244+
- Generate excessive load
245+
- Have high failure rates
246+
- Cause [Worker](/workers) resource exhaustion
247+
248+
Strategies:
249+
- Implement per-tenant rate limiting in your application
250+
- Move problematic tenants to dedicated Workers
251+
- Set tight [Workflow](/workflows) and [Activity](/activities) timeouts so that stuck executions release Worker capacity promptly
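
Per-tenant rate limiting in the application can be as simple as one token bucket per tenant, checked before starting a Workflow or accepting a request. The sketch below uses only the Go standard library; `TenantLimiter` is a hypothetical type, and the rate and burst values are placeholders to tune for your workload. The `golang.org/x/time/rate` package offers a ready-made limiter if you prefer not to maintain your own.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket tracks the remaining tokens for one tenant.
type bucket struct {
	tokens float64
	last   time.Time
}

// TenantLimiter is a hypothetical per-tenant token bucket limiter.
type TenantLimiter struct {
	mu      sync.Mutex
	rate    float64 // tokens added per second
	burst   float64 // bucket capacity
	buckets map[string]*bucket
}

// NewTenantLimiter creates a limiter that gives every tenant the same budget.
func NewTenantLimiter(ratePerSecond, burst float64) *TenantLimiter {
	return &TenantLimiter{
		rate:    ratePerSecond,
		burst:   burst,
		buckets: make(map[string]*bucket),
	}
}

// Allow reports whether tenantID may start one more operation right now.
func (l *TenantLimiter) Allow(tenantID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	now := time.Now()
	b, ok := l.buckets[tenantID]
	if !ok {
		// A new tenant starts with a full bucket.
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[tenantID] = b
	}

	// Refill for the time elapsed since the last call, capped at burst.
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * l.rate
		if b.tokens > l.burst {
			b.tokens = l.burst
		}
		b.last = now
	}

	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	limiter := NewTenantLimiter(5, 2) // 5 operations per second, burst of 2
	for i := 0; i < 3; i++ {
		fmt.Println("customer-123 allowed:", limiter.Allow("customer-123"))
	}
	// A different tenant has its own bucket and is unaffected.
	fmt.Println("customer-456 allowed:", limiter.Allow("customer-456"))
}
```

Because each tenant has its own bucket, a burst from one tenant is rejected at the application edge and never reaches the shared Workers.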
252+
253+
### Tenant lifecycle
254+
255+
Plan for:
256+
- **Onboarding** - Automated provisioning [Workflow](/workflows)
257+
- **Scaling** - When to add new [Workers](/workers) for growing tenants
258+
- **Offboarding** - Graceful tenant removal and data cleanup
259+
- **Rebalancing** - Redistributing tenants across Workers
260+
261+
### Search Attributes
262+
263+
Use [Search Attributes](/search-attribute) to enable tenant-scoped queries:
264+
```go
// Add tenant ID as a Search Attribute
searchAttributes := map[string]interface{}{
	"TenantId": tenantID,
}
```
270+
271+
This allows filtering [Workflows](/workflows) by tenant in the UI and SDK:
272+
```sql
TenantId = 'customer-123' AND ExecutionStatus = 'Running'
```
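
If the query string is built in code, it is worth validating the tenant ID before interpolating it. The helper below is a sketch that assumes a custom `TenantId` Search Attribute of type Keyword has already been registered on the Namespace; `runningWorkflowsQuery` is a hypothetical name, and the allowed character set is an assumption to adjust to your own tenant ID format.

```go
package main

import (
	"errors"
	"fmt"
)

// runningWorkflowsQuery builds the filter shown above for one tenant. It
// rejects tenant IDs containing anything other than letters, digits, '-' and
// '_', so that a tenant ID can never alter the structure of the query.
func runningWorkflowsQuery(tenantID string) (string, error) {
	if tenantID == "" {
		return "", errors.New("tenant ID must not be empty")
	}
	for _, r := range tenantID {
		allowed := r == '-' || r == '_' ||
			(r >= '0' && r <= '9') ||
			(r >= 'a' && r <= 'z') ||
			(r >= 'A' && r <= 'Z')
		if !allowed {
			return "", fmt.Errorf("invalid character %q in tenant ID", r)
		}
	}
	return fmt.Sprintf("TenantId = '%s' AND ExecutionStatus = 'Running'", tenantID), nil
}

func main() {
	query, err := runningWorkflowsQuery("customer-123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(query) // prints TenantId = 'customer-123' AND ExecutionStatus = 'Running'
}
```

The resulting string can be passed wherever a List Filter is accepted, such as the Web UI search box or an SDK list call.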
275+
276+
## Related resources
277+
278+
<RelatedReadContainer>
279+
<RelatedReadItem path="/evaluate/development-production-features/multi-tenancy" text="Multi-tenancy Overview" archetype="feature-guide" />
280+
<RelatedReadItem path="/cloud/limits" text="Temporal Cloud Limits" archetype="cloud-guide" />
281+
<RelatedReadItem path="/visibility" text="Visibility and Search Attributes" archetype="feature-guide" />
282+
</RelatedReadContainer>
