This chapter will teach you everything about databases in Azure, starting from absolute basics. We’ll explain what databases are, why you need them, and how to choose and use the right database service for your needs.
Without a database (storing data in text files or JSON files):
Problems:1. Slow searching - Finding one customer in 1 million records = read entire file - Takes seconds or minutes2. No concurrent access - Two people update same file = data corruption - Must lock file (only one person at a time)3. No relationships - How do you connect customers to their orders? - Must manually search and link data4. No data integrity - Nothing prevents duplicate customers - Nothing ensures email format is valid - Easy to corrupt data5. No backup/recovery - File corrupted = everything lost - No automatic backups
With a database:
Solutions:1. Fast searching - Indexes allow instant lookups (milliseconds) - Can search by name, email, ID, anything2. Concurrent access - 1000s of users can read/write simultaneously - Database handles conflicts automatically3. Relationships - Customer has many Orders (foreign keys) - Database enforces relationships4. Data integrity - Constraints prevent bad data - Transactions ensure consistency - No duplicates (unique constraints)5. Backup/recovery - Automatic backups every day - Point-in-time restore - Disaster recovery built-in
Customers:- ID, Name, Email, Phone, Address, Created DateProducts:- ID, Name, Description, Price, Stock Quantity, CategoryOrders:- ID, Customer ID, Order Date, Total Amount, StatusOrder Items (what's in each order):- ID, Order ID, Product ID, Quantity, PriceReviews:- ID, Product ID, Customer ID, Rating, Comment, Date
Without Database (text files):
customers.txt:1,John Doe,john@email.com,555-1234,123 Main St,2024-01-152,Jane Smith,jane@email.com,555-5678,456 Oak Ave,2024-01-16orders.txt:1,1,2024-02-01,99.99,Shipped2,2,2024-02-02,149.99,DeliveredHow do you find all orders for customer "John Doe"?- Read customers.txt, find John's ID (1)- Read ALL of orders.txt, find orders with Customer ID = 1- If 1 million orders = very slow!
With Database (SQL):
-- Find all orders for John Doe (instant, even with millions of orders)SELECT o.*FROM Orders oJOIN Customers c ON o.CustomerID = c.IDWHERE c.Name = 'John Doe';-- Result in milliseconds, even with 10 million orders
What it is: Data stored in tables with rows and columns, connected by relationships.Real-World Analogy: Filing Cabinet with Cross-References
Think of a library:- Customer card catalog (one drawer)- Book catalog (another drawer)- Checkout cards (third drawer)Cross-references:- Customer card references their checkouts- Checkout card references the book- Everything linked by ID numbersSame concept as relational databases!
When to Use Relational:
✅ Data has clear relationships (customers → orders → items)✅ Need ACID guarantees (banking, e-commerce)✅ Need complex queries (joins, aggregations)✅ Data structure is consistent✅ Strong data integrity requiredExamples:- E-commerce orders- Banking transactions- Employee management- Inventory systems- ERP/CRM systems
What it is: Flexible data storage without fixed schemas.Real-World Analogy: Boxes with Labels
Think of storage boxes:- Each box (document) can contain different things- Box 1: Customer info + preferences + purchase history (all in one)- Box 2: Different customer with different fields- No need for everything to match- Just search by label
When to Use NoSQL:
✅ Flexible schema (fields change frequently)✅ Need massive scale (millions of records)✅ Global distribution (users worldwide)✅ Semi-structured data (JSON documents)✅ High read/write throughputExamples:- User profiles- Product catalogs- Social media posts- IoT sensor data- Gaming leaderboards- Real-time analytics
Azure Options:
Cosmos DB (multi-model: document, key-value, graph, column)
Modern E-Commerce Application:Azure SQL Database:- Orders (need ACID guarantees)- Inventory (need transactions)- Payments (critical consistency)Cost: ~$200/month (Standard S3 tier, 100 DTUs)Cosmos DB:- Product catalog (flexible schema)- User sessions (high throughput)- Shopping cart (global distribution)Cost: ~$25-200/month (depends on RU/s provisioned)Blob Storage:- Product images- User uploadsCost: ~$18/month per TB (Hot tier)Redis Cache:- Session data, frequently read products, cart totals- Eliminates 80% of database readsCost: ~$15/month (Basic C0) to ~$55/month (Standard C1)Why both?- Use the right tool for each job- SQL for critical data that needs ACID guarantees- NoSQL for scale, flexibility, and global distribution- Storage for files and unstructured data- Redis for sub-millisecond reads and reducing database load
Cost Tip: The biggest database cost mistake is provisioning Cosmos DB with a fixed RU/s allocation and forgetting about it. A common pattern: developer sets 10,000 RU/s during a load test (584/month),thenneverscalesitbackdown.AlwaysuseautoscaleprovisioningforCosmosDB−−itscalesbetween1058/month during quiet periods.
Traditional SQL databases like SQL Server or PostgreSQL are designed to run on a single machine. When you hit the limit of that machine (even if it’s 128 vCPUs), you have to Shard.
The Problem: You split your “Customers” into two databases: A-M in DB1, N-Z in DB2.
The Principal’s Headache: How do you do a JOIN between a customer in DB1 and an order in DB2? You can’t. Your application code now has to handle the routing and the distributed logic.
Real-World Analogy: Imagine a library that has grown too large for one building. You split it into two buildings: Fiction (Building A) and Non-Fiction (Building B). Simple lookups still work (“find this novel” goes to Building A). But cross-building queries become painful: “Find all books by authors who also wrote non-fiction” now requires a trip to both buildings, collecting results, and merging them manually. This is the fundamental challenge of sharding — and why most teams avoid it until they absolutely must.Practical Tip: Before you shard, exhaust these options first (in order of complexity and cost):
Query optimization: Free. A single bad query can consume 90% of your database capacity. Use Azure SQL’s Query Performance Insights to find the top 5 most expensive queries — fixing them often eliminates the scaling need entirely.
Caching: ~$15-100/month. Put Redis in front of your database for frequently-read data (cuts DB load by 80%+). Start with Azure Cache for Redis Basic tier for dev, Standard for production.
Read replicas: Included free in Premium/Business Critical tiers. Offload read traffic to up to 4 replicas. This is the easiest “scaling” win because it requires zero application changes if you use read-only connection strings.
Vertical scaling: Move to a bigger Azure SQL tier (up to 128 vCPUs, 4 TB RAM on Business Critical). Simple but expensive — doubling compute roughly doubles cost.
Sharding: Only when all the above have been exhausted and you are still bottlenecked on writes. Sharding adds permanent application complexity that you will carry forever. Consider Cosmos DB instead if you can tolerate eventual consistency.
You can add hundreds of Read Replicas to handle users browsing your site. But every Write (an order) must go to the Single Primary. This is the bottleneck for every relational database in the cloud.
Azure SQL: Choose C (Consistency). If the primary fails, the database is unavailable for a few seconds until a new primary is elected.
Cosmos DB: Choose A (Availability). It can be configured for “Multi-Master” writes, allowing you to write to any region, even if other regions are down.
[!IMPORTANT]
Pro Insight: NewSQL
Azure SQL Hyperscale is Microsoft’s answer to the “Distributed SQL” problem. It separates Compute from Storage, allowing the storage to grow to 100 TB while providing multiple read-compute nodes, effectively blurring the line between traditional SQL and cloud-scale distribution.
DTU = Bundled measure of CPU, memory, I/OTiers:- Basic: 5 DTUs, 2 GB max- Standard: 10-3,000 DTUs, up to 1 TB- Premium: 125-4,000 DTUs, up to 4 TBExample: S3 (100 DTUs) = ~$150/month
When to use: Simple workloads, predictable performance
Virtual Cores (more control)
Choose: CPU cores + Memory + Storage separatelyTiers:- General Purpose: Balanced, SSD, 99.99% SLA- Business Critical: High performance, local SSD, 99.995% SLA- Hyperscale: Up to 100 TB, rapid scaleExample: 4 vCores + 32 GB + 500 GB = ~$700/month
When to use: Complex workloads, need specific resources
[!WARNING]
Gotcha: DTU vs vCore
DTU is cheaper for small databases but scales poorly. vCore is more expensive but allows you to save money with “Reserved Instances” (1-3 year commitment). Moving from DTU to vCore is easy, but going back can be tricky.
[!TIP]
Jargon Alert: ACID vs BASEACID (Azure SQL): Data is always correct immediately (Bank transactions).
BASE (Cosmos DB): Data is eventually correct (Social media feed). Cosmos DB creates copies around the world, so it might take a few milliseconds for a “Like” to show up in Japan if it happened in Brazil.
Architecture:- Compute and storage together- Local SSD with 4 replicas- Synchronous replication- 99.995% SLAFailover time: <10 secondsRPO: 0 (zero data loss)Bonus: Read-only replica included (free)
To a pro, indexes aren’t just “go-faster stripes”. They are physical data structures (B-Trees) that occupy disk space.
The Covered Index: The INCLUDE clause in the example above is critical. By including FirstName and LastName in the index, the query can get all the data it needs directly from the index leaf nodes without having to go back to the main table (the “Key Lookup”). This can turn a 500ms query into a 5ms query.
SARGable Queries: “Search ARGument-able”. If you use a function on a column in your WHERE clause (e.g., WHERE YEAR(JoinDate) = 2024), the database CANNOT use an index on JoinDate. It must scan every row.
Write Penalty: Every index you add makes your INSERT and UPDATE operations slower. In high-traffic write systems, “Over-indexing” is as dangerous as “Under-indexing”.
3. Read Scale-Out
Use read-only replica for reporting:
// Write connection (primary)string connStrWrite = "Server=tcp:myserver.database.windows.net;Database=mydb;ApplicationIntent=ReadWrite;";// Read connection (replica)string connStrRead = "Server=tcp:myserver.database.windows.net;Database=mydb;ApplicationIntent=ReadOnly;";// Routes to read-only replica (no load on primary)using (var conn = new SqlConnection(connStrRead)){ // Run reports without impacting production}
In a distributed system, you don’t just have CAP (Consistency, Availability, Partition Tolerance). You also have PACELC:
If there is a Partition, you choose between Availability or Consistency.
Else, you choose between Latency or Consistency.
Cosmos DB allows you to tune this exactly for your application.
Level
Consistency
Latency
RU Cost
Use Case
Strong
Highest
Highest
2.0x
FinTech / Transfers
Bounded
Guaranteed Lag
Medium
1.0x
Stock Tickers
Session
Read-Your-Own-Write
Low
1.0x
Social Media / Cart
Prefix
Ordering Guaranteed
Lowest
1.0x
Chat / Comments
Eventual
No Guarantees
Lowest
1.0x
View Analytics
[!WARNING]
Performance Gotcha: The Strong Consistency Tax
Strong consistency requires a majority of replicas to ACK a write before success. If you have replicas in US, Europe, and Asia, a “Strong” write will take hundreds of milliseconds (speed of light limit). NEVER use Strong consistency for global applications unless absolutely necessary.
Tiers:- Basic: 1-2 vCores, up to 1 TB- General Purpose: 2-64 vCores, up to 16 TB- Memory Optimized: 2-32 vCores, high memoryFeatures:✅ Automatic backups (7-35 days)✅ Automatic patching✅ Built-in HA❌ No read replicas (except cross-region)❌ No zone redundancyStatus: Being retired (use Flexible Server)
Advanced features, better performance
Features:✅ Zone-redundant HA (99.99% SLA)✅ Read replicas (same region)✅ Burstable VMs (B-series)✅ Stop/Start (save costs)✅ Custom maintenance window✅ Better performance (up to 3x faster)Price: Similar to Single ServerUse: All new deployments
Blocking: Check for long-running transactions that are locking rows. Use sys.dm_tran_locks.
CPU Throttling: If your compute utilization is at 100%, Azure will “throttle” your queries. The solution is usually vertical scaling (more vCores) or fixing a missing index.
Connection pooling: Is your app opening a new connection for every request? This is expensive. Use a connection pool (like HikariCP or Entity Framework pooling).
Cloud databases sometimes reset connections during maintenance.
The Fix: Implement Retry Logic in your application code. Most Azure SDKs have this built-in, but you must enable it. Use “Exponential Backoff” to avoid overwhelming the database after a reset.
[!TIP]
Pro Tool: Query Store
In Azure SQL, enable Query Store. It’s a “black box flight recorder” for your database that tracks every query’s performance history. It can even automatically roll back a bad execution plan!
Your e-commerce app uses Azure SQL for everything. Performance is degrading at 50,000 daily users. How do you redesign the data layer?
Strong Candidate Answer:
Diagnosis first: Use Query Performance Insight to find the top 5 resource-consuming queries. Check DTU utilization — if above 80%, the workload is mismatched. In my experience, 90% of DB performance problems come from 3-5 bad queries.
The anti-pattern: Using one database for different access patterns. Orders need ACID (SQL). Product catalog needs flexible schema (NoSQL). Sessions need sub-millisecond reads (cache). Analytics needs columnar scans (data warehouse).
Redesign: Orders stay in Azure SQL. Product catalog moves to Cosmos DB (flexible JSON, global reads). Sessions move to Redis ($40/month, sub-millisecond). Analytics to Synapse or read replicas.
Cost impact: Single Azure SQL Premium P6 at 3,000/monthbecomesAzureSQLS3(150) + Cosmos DB (100)+Redis(40) + Synapse (50)=340/month with 10x better performance.
Follow-up: The team resists learning Cosmos DB. Everyone knows SQL. How do you handle this?Cosmos DB’s SQL API has query syntax nearly identical to T-SQL. SELECT, WHERE, ORDER BY all work. The main shift is partition keys and avoiding cross-partition queries. A 2-hour workshop plus a 2-day proof-of-concept typically converts skeptics. The learning curve is weeks, not months.
Explain Cosmos DB Request Units. Your bill jumped from $200 to $2,400/month after a feature launch. How do you investigate?
Strong Candidate Answer:
RU basics: 1 RU = one point read of a 1 KB document by ID and partition key. A query scanning 100 documents costs 5-50 RUs. Writes cost 5-10 RUs. You provision RU/s and pay whether you use them or not.
Why the bill jumped: At 200/monththeyhad400RU/s.At2,400/month, ~5,000 RU/s. The new feature likely introduced cross-partition queries (50-100x more expensive), large document writes, or someone manually provisioned high RU/s.
Investigation: Check “Normalized RU Consumption” in metrics. Use diagnostic logs to find queries with highest x-ms-request-charge. Look for missing partition key in WHERE clauses.
Fix: Switch to autoscale throughput. Add partition key to every query. Use projections instead of SELECT *. Add Redis as read-through cache (cache hits cost 0 RU).
Follow-up: Compare Cosmos DB with DynamoDB pricing at 10,000 reads/second.Cosmos DB 10,000 RU/s: ~584/month.DynamoDB10,000RCU(stronglyconsistent):650/month. For eventually consistent reads, DynamoDB drops to ~$325/month. Real cost difference is writes: DynamoDB is 30-40% cheaper for write-heavy workloads. For global distribution, both multiply cost by region count.
Migrate an on-premises SQL Server 2016 database (2 TB, 500 users) to Azure. Compare your options.
Strong Candidate Answer:
Azure SQL Database (PaaS): ~95% compatible. No cross-database queries, no SQL Agent, no CLR. Best for teams willing to refactor. Business Critical 8 vCores: ~$2,900/month.
SQL Server on Azure VM (IaaS): 100% compatible (it IS SQL Server). You manage OS patching, HA, backups. Best for SSIS/SSRS on same instance. Cost: ~$1,000-1,500/month with AHUB.
My recommendation: SQL Managed Instance. A 2 TB database with 500 concurrent users benefits from managed HA and near-100% compatibility. Use Azure Database Migration Service for online migration with under 10 minutes downtime.
Follow-up: The DBA says Managed Instance General Purpose has high I/O latency versus on-premises. How do you address this?Correct — General Purpose uses remote Azure Premium Storage (5-10ms latency vs sub-1ms on-premises SSD). For transactional workloads, upgrade to Business Critical tier which uses local NVMe SSD with sub-2ms latency. It includes built-in read replicas at no extra cost. Cost jumps from 1,400to2,900/month, but if P99 query latency improves 3-5x, the business value justifies it.