This benchmark suite measures and compares point-read latency across multiple Azure data systems, specifically Cosmos DB serverless and Azure Storage (Blob), under identical, region-aligned conditions.
It was designed to validate the hypothesis that for workloads dominated by single-record lookups, a SQL + Blob Storage architecture offers lower latency and greater cost efficiency than a fully document-based store such as Cosmos DB.
Enterprise systems often evolve toward schema-flexible or semi-structured data stores. However, most production access patterns remain point reads by ID, for example:
- Retrieving a user profile by UserID
- Fetching a configuration document by TenantID
- Looking up a device telemetry snapshot by (LocationID, DeviceID)
While NoSQL systems such as Cosmos DB provide global distribution and flexible schema, they incur structural latency costs tied to consistency, partition routing, and RU accounting.
This project quantifies those trade-offs through direct, controlled measurements. The benchmark proceeds in three steps:
- Generate a synthetic dataset of small JSON documents (~1 KB each) simulating IoT telemetry data.
- Ingest the dataset into two different storage architectures:
  - SQL + Blob Storage (Document Registry pattern)
  - Cosmos DB container with hierarchical partitioning
- Run concurrent point-read benchmarks against both systems, measuring per-read latency.
Architecture A: SQL + Blob Storage (Document Registry pattern)
| Component | Description |
|---|---|
| SQL Table | Maps (PartitionId, DocumentId) -> BlobUri |
| Blob Storage | Stores the actual JSON documents (~1 KB each) |
| Access Pattern | Two-step lookup: SQL SELECT -> Blob Download |
| Expected Behavior | Minimal control-plane overhead, network-limited latency |
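For illustration, a minimal C# sketch of this two-step read path (the DocumentRegistry table, its column names, and a directly resolvable blob URI, e.g. via SAS, are assumptions for the example, not the benchmark's actual schema):

```csharp
// Sketch of the Document Registry read path: SQL SELECT for the blob URI, then a blob download.
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Microsoft.Data.SqlClient;

public static class RegistryReader
{
    public static async Task<string> ReadDocumentAsync(
        string sqlConnectionString, string partitionId, string documentId)
    {
        // Step 1: resolve the blob URI from the SQL registry (illustrative table/column names).
        await using var conn = new SqlConnection(sqlConnectionString);
        await conn.OpenAsync();
        await using var cmd = new SqlCommand(
            "SELECT BlobUri FROM DocumentRegistry WHERE PartitionId = @p AND DocumentId = @d", conn);
        cmd.Parameters.AddWithValue("@p", partitionId);
        cmd.Parameters.AddWithValue("@d", documentId);
        var blobUri = (string?)await cmd.ExecuteScalarAsync()
            ?? throw new InvalidOperationException("Document not found in registry.");

        // Step 2: download the ~1 KB JSON payload (assumes the URI is directly resolvable, e.g. SAS).
        var blob = new BlobClient(new Uri(blobUri));
        var content = await blob.DownloadContentAsync();
        return content.Value.Content.ToString();
    }
}
```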
Architecture B: Cosmos DB with hierarchical partitioning
| Component | Description |
|---|---|
| Container | iotreadings, with hierarchical partition keys (stateId, deviceId) |
| Item | JSON document identical to Blob content |
| Access Pattern | Direct ReadItemAsync(id, partitionKey) |
| Expected Behavior | Consistent reads, higher per-request overhead |
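The corresponding single-call Cosmos DB read, sketched with the Microsoft.Azure.Cosmos SDK (hierarchical partition keys require v3.31+; the iotdb/iotreadings names match the CLI defaults documented below):

```csharp
// Sketch of the Cosmos DB point read with a hierarchical partition key (stateId, deviceId).
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class CosmosReader
{
    public static async Task<string> ReadDocumentAsync(
        CosmosClient client, string stateId, string deviceId, string id)
    {
        Container container = client.GetContainer("iotdb", "iotreadings");

        // Both levels of the hierarchical partition key must be supplied for a point read.
        PartitionKey partitionKey = new PartitionKeyBuilder()
            .Add(stateId)
            .Add(deviceId)
            .Build();

        ItemResponse<dynamic> response = await container.ReadItemAsync<dynamic>(id, partitionKey);
        return response.Resource.ToString();
    }
}
```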
Weather Data - IoT Telemetry Simulation
- States: 50 U.S. states (AK, TX, CA, etc.)
- Devices per State: 1000
- Reads per Device: 100
- Total Documents: ~5,000,000 (50 states × 1,000 devices × 100 readings)
- Document Size: < 1 KB each
Example document:
```json
{
  "id": "AK_sensor-AK-0000_00003",
  "stateId": "AK",
  "deviceId": "sensor-AK-0000",
  "timestamp": "20251117011804035",
  "temperature": 19.073329476673777,
  "humidity": 33.547862546307904,
  "battery": 81,
  "status": "ok",
  "geo": {
    "lat": 0,
    "lon": 0
  }
}
```
Documents are grouped by state and device to simulate IoT telemetry ingestion.
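For reference, a hedged C# sketch of the document shape above (field names mirror the sample JSON; the id/timestamp formats and value ranges are inferred from it, not taken from the generator's code):

```csharp
// Sketch of the synthetic reading shape; formats are inferred from the example document.
using System;

public record GeoPoint(double lat, double lon);

public record IotReading(
    string id, string stateId, string deviceId, string timestamp,
    double temperature, double humidity, int battery, string status, GeoPoint geo);

public static class ReadingFactory
{
    public static IotReading Create(string stateId, int deviceIndex, int readingIndex, Random rng)
    {
        string deviceId = $"sensor-{stateId}-{deviceIndex:D4}";       // e.g. sensor-AK-0000
        return new IotReading(
            id: $"{stateId}_{deviceId}_{readingIndex:D5}",            // e.g. AK_sensor-AK-0000_00003
            stateId: stateId,
            deviceId: deviceId,
            timestamp: DateTime.UtcNow.ToString("yyyyMMddHHmmssfff"), // matches "20251117011804035"
            temperature: rng.NextDouble() * 40,                       // illustrative range
            humidity: rng.NextDouble() * 100,                         // illustrative range
            battery: rng.Next(0, 101),
            status: "ok",
            geo: new GeoPoint(0, 0));
    }
}
```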
This section explains how to build, configure, and run the suite end-to-end: synthetic data generation, Cosmos DB bulk upload, and the cross-store latency benchmark.
Contents
- Prerequisites
- Required environment variables
- Benchmark Runner
- CLI Usage
- Generate synthetic data
- Upload to Cosmos DB
- Run the benchmark
- Results Summary
- Observations & Interpretation
Prerequisites
- .NET SDK 9.0+
- Azure resources (same region recommended, e.g., East US 2):
  - Azure Storage Account (Blob)
    - [VNET integrated with private endpoint and public access, Default Access Tier: Hot, Replication: Zone-redundant (ZRS)]
  - Azure Cosmos DB for NoSQL (Serverless)
    - (database and container will be auto-created)
    - [VNET integrated with private endpoint and public access, 4,000 RU/s throughput]
  - SQL Server / Azure SQL (for the registry table)
- Azure VM for running the benchmark (recommended)
- Network access (SQL firewall rules as needed)
Required environment variables
Set these for default runs (or pass explicit CLI options):
- AZURE_SQL_CONNECTION_STRING
- Example: Server=tcp:<your-server>.database.windows.net,1433;Initial Catalog=<your-database>;User ID=<your-user>;Password=<your-password>;Encrypt=True;...
- AZURE_STORAGE_CONNECTION_STRING
- From your Storage Account “Access keys”
- AZURE_COSMOSDB_CONNECTION_STRING
- From Cosmos DB Keys (primary connection string)
How to set (examples):
- Windows (PowerShell): $env:AZURE_SQL_CONNECTION_STRING="..."
- Linux/macOS: export AZURE_SQL_CONNECTION_STRING="..."
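The CLI options described below fall back to these variables when not provided explicitly. Conceptually (a sketch of the documented behavior, not the tool's actual code):

```csharp
// Sketch of the documented fallback: an explicit CLI value wins, otherwise the environment variable.
using System;

public static class Config
{
    public static string ResolveConnectionString(string? cliValue, string envVarName) =>
        !string.IsNullOrWhiteSpace(cliValue)
            ? cliValue
            : Environment.GetEnvironmentVariable(envVarName)
              ?? throw new InvalidOperationException(
                     $"Set {envVarName} or pass the corresponding CLI option.");
}

// Hypothetical usage:
// var sqlConn = Config.ResolveConnectionString(options.SqlConnectionString, "AZURE_SQL_CONNECTION_STRING");
```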
Benchmark Runner
The Benchmark Runner performs concurrent point-read tests against both systems.
- Sample Size: 50,000 to 100,000 random document IDs
- Parallelism: Configurable (default 20)
- Region: East US 2 (VM, Cosmos DB, and Storage Account co-located)
- Metrics Captured:
- Per-read latency (ms)
- Payload size
- P50 / P95 / P99 latency percentiles
- Average latency (mean)
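As a rough sketch of how such metrics can be captured (not the benchmark's actual implementation), each read is timed individually and percentiles are taken over the sorted sample:

```csharp
// Sketch of per-read timing and nearest-rank percentiles over a latency sample.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class LatencyStats
{
    public static async Task<double> MeasureAsync(Func<Task> pointRead)
    {
        var sw = Stopwatch.StartNew();
        await pointRead();                       // one read against SQL+Blob or Cosmos DB
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static double Percentile(IReadOnlyList<double> sortedAscending, double p)
    {
        // Nearest-rank percentile over an ascending-sorted sample.
        int rank = (int)Math.Ceiling(p / 100.0 * sortedAscending.Count);
        return sortedAscending[Math.Clamp(rank - 1, 0, sortedAscending.Count - 1)];
    }
}

// Usage over a collected list of latencies (milliseconds):
// var sorted = latencies.OrderBy(x => x).ToList();
// Console.WriteLine($"avg={latencies.Average():F2} p50={LatencyStats.Percentile(sorted, 50):F2} " +
//                   $"p95={LatencyStats.Percentile(sorted, 95):F2} p99={LatencyStats.Percentile(sorted, 99):F2}");
```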
CLI Usage

Build and publish the project. From the project root directory, run:

```
dotnet publish .\Riatix.DocumentRegistry.Benchmark\Riatix.DocumentRegistry.Benchmark.csproj -c Release -o ./publish
```

The dotnet publish command restores, builds, and publishes the project.

Navigate to the publish directory:

```
cd ./publish
```
The following commands are available: generate, cosmosdbupload, and benchmark.

Generate synthetic data

```
dotnet rixbm.dll generate --target-container "iotreadings"
```

See help for options:

```
dotnet rixbm.dll generate --help
```
```
Description:
  Generate synthetic data for testing.

Usage:
  rixbm generate [options]

Options:
  --target-container <target-container> (REQUIRED)  The local path [<executing_directory>\<target-container>] or Azure storage container target to save generated files. The container will be created in Azure Storage account if it does not exist.
  --storage-connection-string <storage-connection-string>  The connection string for Azure Blob Storage. If not provided, a default value from environment variable named 'AZURE_STORAGE_CONNECTION_STRING' will be used.
  --sql-connection-string <sql-connection-string>  The connection string for SQL Database. If not provided, a default value from environment variable named 'AZURE_SQL_CONNECTION_STRING' will be used.
  --report-dir <report-dir>  The directory to save reports. [default: <executing_directory>\reports]
  --devices-per-state <devices-per-state>  The number of devices to generate per state. The 'state' represents the US states. [default: 1000]
  --readings-per-device <readings-per-device>  The number of readings to generate per device. [default: 100]
  --batch-size <batch-size>  The number of metadata rows to insert per SQL batch. [default: 500]
  --write-to-blob  Whether to write generated files to Azure Blob Storage. Files will always be written to local disk.
  --test-run  Whether to run the sample test. Overrides --readings-per-device, --devices-per-state, and --batch-size.
```
Note: If you do not pass --write-to-blob, files are saved to local disk at <executing_directory>\iotreadings and metadata is still inserted into the SQL database. You can upload those files to Azure Blob Storage later with your own tooling or a tool like AzCopy (see the example below).
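For example, locally generated files could be pushed to the container later with AzCopy (illustrative command; substitute your storage account and SAS token):

```
azcopy copy "<executing_directory>\iotreadings" "https://<storage-account>.blob.core.windows.net/iotreadings?<SAS>" --recursive
```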
Upload to Cosmos DB

```
dotnet rixbm.dll cosmosdbupload --source-dir "<executing_directory>\iotreadings"
```

See help for options:

```
dotnet rixbm.dll cosmosdbupload --help
```
```
Description:
  Upload files to Cosmos DB.

Usage:
  rixbm cosmosdbupload [options]

Options:
  --source-dir <source-dir>  The directory containing the files to upload. [default: <executing_directory>\iotreadings]
  --cosmosdb-connection-string <cosmosdb-connection-string>  The connection string for Cosmos DB. If not provided, a default value from environment variable named 'AZURE_COSMOSDB_CONNECTION_STRING' will be used.
  --cosmosdb-name <cosmosdb-name>  The name of the Cosmos DB database. If not provided, a default value of 'iotdb' will be used. [default: iotdb]
  --cosmosdb-container-name <cosmosdb-container-name>  The name of the Cosmos DB container. If not provided, a default value of 'iotreadings' will be used. [default: iotreadings]
  --cosmosdb-retries-per-document <cosmosdb-retries-per-document>  The number of retries for each document upload. [default: 5]
  --reports-dir <reports-dir>  The directory to save reports. [default: <executing_directory>\reports]
  --test-run  Whether to run the sample test.
```
Run the benchmark

```
dotnet rixbm.dll benchmark --sample-size 1000
```

See help for options:

```
dotnet rixbm.dll benchmark --help
```
```
Description:
  Run benchmark tests.

Usage:
  rixbm benchmark [options]

Options:
  --storage-connection-string <storage-connection-string>  The connection string for Azure Blob Storage. If not provided, a default value from environment variable named 'AZURE_STORAGE_CONNECTION_STRING' will be used.
  --sql-connection-string <sql-connection-string>  The connection string for SQL Database. If not provided, a default value from environment variable named 'SQL_CONNECTION_STRING' will be used.
  --cosmosdb-connection-string <cosmosdb-connection-string>  The connection string for Cosmos DB. If not provided, a default value from environment variable named 'AZURE_COSMOSDB_CONNECTION_STRING' will be used.
  --cosmosdb-name <cosmosdb-name>  The name of the Cosmos DB database. If not provided, a default value of 'iotdb' will be used. [default: iotdb]
  --cosmosdb-container-name <cosmosdb-container-name>  The name of the Cosmos DB container. If not provided, a default value of 'iotreadings' will be used. [default: iotreadings]
  --sample-size <sample-size>  The number of documents to use for the benchmark tests. [default: 500000]
  --parallelism <parallelism>  The degree of parallelism to use for the benchmark tests. Defaults to the number of processors on the machine. [default: 20]
  --test-run  Whether to run the sample test.
```
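For example, a run approximating the configuration behind the published results (100,000 sampled documents, parallelism of 20):

```
dotnet rixbm.dll benchmark --sample-size 100000 --parallelism 20
```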
Benchmark results are written to:

```
<executing_directory>\reports\latency_results.csv
<executing_directory>\reports\latency_summary.json
```
Results Summary

Environment:
- VM: East US 2 (same region as Cosmos DB and Storage Account)
- Runtime: .NET 9.0
- OS: Unix 6.14.0.1012 [Ubuntu 24 LTS]
Networking: Both Storage Account and Cosmos DB configured with public access enabled.
| Metric | Storage Account | Cosmos DB | Ratio (Cosmos/Blob) |
|---|---|---|---|
| Average | 7.70 ms | 93.33 ms | 12.1x slower |
| P50 | 3.87 ms | 82.45 ms | 21.3x slower |
| P95 | 30.05 ms | 213.43 ms | 7.1x slower |
| P99 | 58.15 ms | 264.87 ms | 4.6x slower |
| Samples | 100,000 | 100,000 | - |
Latency Comparison Chart: (figure omitted; see the table above)
Networking: Both Storage Account and Cosmos DB configured with private endpoints (VNET integration).
| Metric | Storage (Public) | Storage (Private) | Cosmos DB (Public) | Cosmos DB (Private) | Delta (Private - Public, Storage) |
|---|---|---|---|---|---|
| Average (ms) | 7.70 | 16.05 | 93.33 | 93.60 | +8.35 (2.1x slower) |
| P50 (ms) | 3.87 | 10.10 | 82.45 | 81.74 | +6.23 (2.6x slower) |
| P95 (ms) | 30.05 | 52.92 | 213.43 | 214.97 | +22.87 (1.8x slower) |
| P99 (ms) | 58.15 | 76.57 | 264.87 | 269.94 | +18.42 (1.3x slower) |
Observations & Interpretation

Observation: Even under identical regional conditions, Cosmos DB point reads were consistently 10 to 20 times slower than equivalent Blob Storage reads at the mean and median, and roughly 5 to 7 times slower at the P95/P99 tail. A plausible explanation is that Cosmos DB's consistency and metadata guarantees introduce a structural latency floor.
SQL + Blob Storage:
- Low, network-bound latency (3 to 8 ms median)
- Predictable cost per operation (no RUs)
- Simple, deterministic architecture
- Ideal for large-scale read-heavy workloads
Cosmos DB:
- Predictable but higher base latency (80 to 100 ms)
- RU-based cost and consistency trade-offs
- Suited for multi-region writes and flexible schema ingestion
| Use Case | Recommended Store |
|---|---|
| Point reads by ID, read-heavy | SQL + Blob Storage |
| Analytical or transactional workloads | Cosmos DB |
| Multi-region conflict resolution | Cosmos DB |
| Massive immutable reads | Blob Storage |
Output files:
| File | Description |
|---|---|
| latency_results.csv | Raw per-sample latency measurements |
| latency_summary.json | Summary statistics (avg, P50, P95, P99) |
Example latency_summary.json:

```json
{
  "StorageAccount": {
    "averageMs": 7.70,
    "p50Ms": 3.87,
    "p95Ms": 30.05,
    "p99Ms": 58.15,
    "samples": 100000,
    "generatedUtc": "2025-11-18T19:36:24Z",
    "machine": "rixbmvm",
    "os": "Unix 6.14.0.1012",
    "clr": "9.0.11"
  },
  "CosmosDB": {
    "averageMs": 93.33,
    "p50Ms": 82.45,
    "p95Ms": 213.43,
    "p99Ms": 264.87,
    "samples": 100000,
    "generatedUtc": "2025-11-18T19:36:24Z",
    "machine": "rixbmvm",
    "os": "Unix 6.14.0.1012",
    "clr": "9.0.11"
  }
}
```
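For downstream analysis, the summary can be loaded with System.Text.Json; a minimal sketch in which the class shape simply mirrors the example above (treat it as an assumption, not a contract):

```csharp
// Sketch: load latency_summary.json and print the headline numbers per store.
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

var json = File.ReadAllText("reports/latency_summary.json");
var summary = JsonSerializer.Deserialize<Dictionary<string, LatencySummaryEntry>>(
    json, new JsonSerializerOptions { PropertyNameCaseInsensitive = true });

foreach (var (store, s) in summary!)
    Console.WriteLine($"{store}: avg={s.AverageMs} ms, p50={s.P50Ms} ms, p95={s.P95Ms} ms, p99={s.P99Ms} ms");

// Only the numeric fields used above are modeled; extra properties are ignored.
public sealed class LatencySummaryEntry
{
    public double AverageMs { get; set; }
    public double P50Ms { get; set; }
    public double P95Ms { get; set; }
    public double P99Ms { get; set; }
    public int Samples { get; set; }
}
```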
If this architecture is a viable fit for your point-read workloads, it could be extended into a highly resilient, globally distributed data access layer:
- Multi-Cloud SQL + Blob replication with geo-failover
- Cross-Cloud data access layer for hybrid scenarios
- Integration with edge computing nodes for local caching
- Integrate Azure Identity (service principals, managed identities) in place of connection strings for secure access (see the sketch below).
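A minimal sketch of what that could look like, assuming the Azure.Identity, Azure.Storage.Blobs, and Microsoft.Azure.Cosmos packages (the endpoints are placeholders):

```csharp
// Sketch: token-based authentication via DefaultAzureCredential instead of connection strings.
using System;
using Azure.Identity;
using Azure.Storage.Blobs;
using Microsoft.Azure.Cosmos;

var credential = new DefaultAzureCredential();

// Blob Storage via Azure AD (account name is a placeholder).
var blobService = new BlobServiceClient(
    new Uri("https://<storage-account>.blob.core.windows.net"), credential);

// Cosmos DB via Azure AD (account endpoint is a placeholder).
var cosmos = new CosmosClient("https://<cosmos-account>.documents.azure.com:443/", credential);
```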
- Benchmark Summary:
- Latency Results: latency_results.csv
- Latency Summary: latency_summary.json
- Storage Account Requests: (chart omitted)
- Cosmos DB Requests: (chart omitted)
MIT License (c) 2025 Riatix. Created for empirical analysis of Azure data store performance. Use freely with attribution.



