Serverless Architecture

Serverless lets you focus on code without managing infrastructure. Azure handles scaling, availability, and operations.

What You’ll Learn

By the end of this chapter, you’ll understand:
  • What “serverless” actually means (spoiler: servers still exist!)
  • Why serverless can save you money and time
  • When to use serverless vs containers vs VMs
  • How to build event-driven applications
  • Real cost comparisons and examples

Introduction: What is Serverless? (Start Here if You’re Completely New)

The Name is Misleading

First, let’s clear up confusion: “Serverless” DOES NOT mean “no servers”. Servers still exist! You just don’t see them, manage them, or pay for them when they’re idle. A better name would be “Server-Invisible” or “Someone-Else-Manages-The-Servers”.

Real-World Analogy: Electricity

Think about how you use electricity:

Before Modern Electricity (Like Traditional Servers):
Your Factory:
- Buy a generator ($10,000)
- Buy fuel continuously ($500/month)
- Hire someone to maintain it ($3,000/month)
- Generator runs 24/7 (even at night when factory is closed)
- You pay for generator even when factory is off
- If generator breaks, your factory stops

Total cost:
- Upfront: $10,000
- Monthly: $3,500
- Wasted energy: ~60% (nights/weekends)
Modern Electricity Grid (Like Serverless):
Your Factory:
- Plug into wall socket
- Pay only for electricity you actually use
- $0.10 per kilowatt-hour
- Someone else maintains power plants
- Power scales automatically (more appliances = more power)
- If power plant breaks, another one takes over (you don't notice)

Total cost:
- Upfront: $0
- Monthly: ~$500 (only when factory runs)
- Wasted energy: 0%
Serverless computing works the same way:
  • Plug in your code
  • Pay only when code runs
  • Azure maintains servers
  • Scales automatically
  • If server fails, another takes over

What is Serverless Computing?

Serverless = A cloud computing model where:
  1. You write code (functions)
  2. Cloud provider runs your code when triggered (HTTP request, file upload, timer, etc.)
  3. You pay only for execution time (per millisecond!)
  4. Provider handles everything else (servers, scaling, maintenance, security patches)
The Magic: Your code can scale from 0 requests/day to 1 million requests/day without changing anything.

The Problem Serverless Solves

Scenario: You built a web API that resizes user-uploaded images.

Traditional VM/Container Approach:
You create a server (VM) that:
- Runs 24/7/365
- Waits for image upload requests
- Processes images when they arrive

Reality:
Monday 9am: 1,000 images/hour → Server handles it
Monday 3pm: 50 images/hour → Server mostly idle (wasting money)
Tuesday 3am: 0 images/hour → Server idle (wasting money)
Black Friday: 10,000 images/hour → Server crashes (need more VMs)

Cost:
- VM runs 24/7: $50/month
- You process images 2 hours/day on average
- Wasted capacity: ~92% of the time
- Wasted money: ~$46/month
Serverless Approach:
You write a function that:
- Runs ONLY when image is uploaded
- Processes the image
- Stops immediately after

Reality:
Monday 9am: 1,000 images → Azure spins up containers automatically
Monday 3pm: 50 images → Azure scales down automatically
Tuesday 3am: 0 images → Nothing runs, $0 cost
Black Friday: 10,000 images → Azure scales to handle it automatically

Cost:
- Pay per execution: $0.20 per million executions
- 100,000 images/month = $0.02/month
- Wasted capacity: 0%
- Savings: $49.98/month (99.96% cheaper!)

Breaking Down “Serverless”

What you DON’T manage (Azure does it):
  ❌ Provisioning servers
  ❌ Installing operating systems
  ❌ Configuring networking
  ❌ Security patches
  ❌ Scaling infrastructure
  ❌ Load balancing
  ❌ Server maintenance
  ❌ Paying for idle time
What you DO manage:
  ✅ Write your code (JavaScript, Python, C#, Java, etc.)
  ✅ Define triggers (HTTP, Timer, File upload, etc.)
  ✅ Deploy your function
  ✅ Monitor execution
Time spent:
  • Traditional VM: 10-20 hours/month managing infrastructure
  • Serverless: 1-2 hours/month managing code

Real-World Example: Photo Sharing App

Without Serverless (Traditional Approach):
User uploads photo (5 MB)

VM receives photo (VM always running)

VM generates 3 thumbnails (small, medium, large)

VM saves thumbnails to storage

VM is idle again (still running, still costing money)

Infrastructure:
- 1 VM: $50/month (runs 24/7)
- Process 1,000 photos/day
- VM busy: ~2 hours/day
- VM idle: ~22 hours/day (wasting $45/month)

Total cost: $50/month
Developer time: 15 hours/month (maintaining VM, scaling, updates)
With Serverless (Azure Functions):
User uploads photo (5 MB) → Triggers Azure Function

Azure Function spins up (50ms)

Generates 3 thumbnails (2 seconds)

Saves thumbnails to Blob Storage

Azure Function shuts down (0ms)

Nothing runs until next photo upload

Infrastructure:
- Azure Function: Runs 2 seconds per photo
- 1,000 photos/day × 2 seconds = 2,000 seconds/day ≈ 33 minutes/day
- 33 minutes/day × 30 days ≈ 16.7 hours/month of execution

Pricing:
- 1,000 photos/day × 30 days = 30,000 executions/month
- Execution time: 30,000 × 2 seconds = 60,000 seconds
- Resource consumption: 60,000 seconds × 1 GB memory (assumed) = 60,000 GB-seconds
- Cost: ~$1/month

Total cost: $1/month (98% savings!)
Developer time: 2 hours/month (just deploying code updates)
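
The pricing arithmetic above can be reproduced in a few lines. This is a rough sketch, not an official calculator: it assumes 1 GB of memory per execution (which is what equating 60,000 seconds with 60,000 GB-seconds implies), uses the Consumption rates quoted elsewhere in this chapter, and ignores the monthly free grant. The function name `monthly_cost` is illustrative.

```python
# Rough Consumption-plan estimate for the photo app (free grant ignored).
PRICE_PER_MILLION_EXECUTIONS = 0.20   # USD, rate quoted in this chapter
PRICE_PER_GB_SECOND = 0.000016        # USD, rate quoted in this chapter


def monthly_cost(executions: int, seconds_each: float, memory_gb: float) -> float:
    """Return the estimated monthly bill in USD."""
    gb_seconds = executions * seconds_each * memory_gb
    execution_charge = executions / 1_000_000 * PRICE_PER_MILLION_EXECUTIONS
    compute_charge = gb_seconds * PRICE_PER_GB_SECOND
    return execution_charge + compute_charge


executions = 1_000 * 30   # 1,000 photos/day for 30 days
# memory_gb=1.0 is an assumption; the example does not state the memory size.
cost = monthly_cost(executions, seconds_each=2.0, memory_gb=1.0)
print(f"{executions} executions -> about ${cost:.2f}/month")
```

Almost all of the bill comes from GB-seconds rather than the per-execution charge, so halving memory or execution time roughly halves the cost.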

Cost Comparison: Real Numbers

API Backend for Mobile App (100,000 users):
| Metric | Traditional VM | Serverless |
| --- | --- | --- |
| Traffic Pattern | Peaks 8am-10pm, dead midnight-6am | Same |
| Average Requests/Day | 500,000 | 500,000 |
| Server Uptime | 24/7 (720 hours/month) | Only when requests come (avg 100 hours/month) |
| Infrastructure Cost | $150/month | $5/month |
| Scaling | Manual (add more VMs) | Automatic (0 to 1000 instances) |
| Idle Cost | $100/month wasted | $0 wasted |
| Dev Time | 20 hours/month | 3 hours/month |

Savings: $145/month + 17 hours/month of developer time

When Traditional Servers Waste Money

Pattern 1: Unpredictable Workloads
Your API:
- Monday: 10,000 requests
- Tuesday: 500 requests
- Wednesday: 25,000 requests
- Friday: 100,000 requests (newsletter sent)

Traditional: Size server for worst case (Friday) = $200/month
Serverless: Pay for actual usage each day = $15/month
Pattern 2: Scheduled Jobs
Nightly data processing job:
- Runs once per day at 2am
- Takes 10 minutes
- Needs powerful machine (16 cores, 64 GB RAM)

Traditional: Server runs 24/7 = $500/month (for 10 min of work!)
Serverless: Runs 10 minutes/day = $5/month (99% savings)
Pattern 3: Event-Driven Tasks
Image processing:
- Users upload images sporadically
- Need instant processing
- 50-500 images/day (varies wildly)

Traditional: Server ready 24/7 = $80/month
Serverless: Runs only on upload = $2/month

When is Serverless NOT the Answer?

Don’t use serverless for:

Long-running processes (>10 minutes)
Example: Video encoding that takes 30 minutes
Problem: Azure Functions on the Consumption plan time out after at most 10 minutes (the Premium plan allows longer runs, up to 60 minutes)
Solution: Use containers (AKS) or VMs
Constant high traffic (millions of requests/hour continuously)
Example: Real-time stock trading platform (10M requests/hour, 24/7)
Problem: Serverless costs more than dedicated servers at this scale
Solution: Use VMs or containers

Cost comparison:
- Serverless: 10M × 24 × 30 = 7.2 billion requests/month = $1,440/month
- Dedicated VMs: 5 VMs × $100 = $500/month (cheaper!)
Stateful applications (need to remember data between requests)
Example: WebSocket connections, game servers, chat servers
Problem: Serverless functions are stateless (forget everything after execution)
Solution: Use VMs, containers, or Azure SignalR Service
Need for specific hardware (GPUs, TPUs, specialized chips)
Example: Machine learning model training with GPUs
Problem: Serverless functions run on standard CPUs
Solution: Use Azure ML, VMs with GPUs, or AKS with GPU nodes
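
The constant-high-traffic comparison above is really a break-even question. The sketch below assumes only the per-execution charge ($0.20 per million, as in that comparison) and the flat $500/month VM fleet from the example; real bills also include GB-seconds, so treat the result as an order-of-magnitude guide. The function names are illustrative.

```python
PRICE_PER_MILLION = 0.20      # USD per million executions (execution charge only)
VM_FLEET_COST = 5 * 100.0     # USD/month, the 5-VM fleet from the example


def serverless_cost(requests_per_month: int) -> float:
    """Execution charge in USD for a month of requests."""
    return requests_per_month / 1_000_000 * PRICE_PER_MILLION


def break_even_requests(fixed_cost: float) -> int:
    """Monthly request count at which serverless matches a fixed monthly cost."""
    return round(fixed_cost / PRICE_PER_MILLION * 1_000_000)


trading_platform = 10_000_000 * 24 * 30            # 7.2 billion requests/month
print(f"${serverless_cost(trading_platform):,.0f}/month on serverless")
print(f"break-even at {break_even_requests(VM_FLEET_COST):,} requests/month")
```

Under these assumptions the crossover sits at 2.5 billion requests per month; below that volume the pay-per-execution model is the cheaper of the two.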

Serverless vs Containers vs VMs: Decision Tree

START: I need to run code in the cloud

Do I need it to run 24/7 with constant traffic?

    ├─ YES → Does my code take >10 minutes to execute?
    │         │
    │         ├─ YES → Use VMs or Containers (AKS)
    │         │
    │         └─ NO → Is traffic VERY high (millions req/hour)?
    │                 │
    │                 ├─ YES → Use VMs (cheaper at massive scale)
    │                 │
    │                 └─ NO → Use Serverless or Containers

    └─ NO → Is my code event-driven (runs occasionally)?

              ├─ YES → Does each execution take <10 minutes?
              │         │
              │         ├─ YES → Use Serverless! ✅ (Best fit)
              │         │
              │         └─ NO → Use Containers (AKS)

              └─ NO → Use Serverless or Containers (both work)
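
The same tree can be written as a small function, which makes the branch order explicit. This is only a transcription of the diagram above; the function and parameter names are made up for illustration.

```python
def choose_compute(constant_traffic: bool,
                   over_10_minutes: bool = False,
                   very_high_traffic: bool = False,
                   event_driven: bool = False) -> str:
    """Walk the decision tree and return the suggested hosting option."""
    if constant_traffic:
        if over_10_minutes:
            return "VMs or Containers (AKS)"
        if very_high_traffic:
            return "VMs"
        return "Serverless or Containers"
    if event_driven:
        # Event-driven work fits serverless as long as each run stays short.
        return "Containers (AKS)" if over_10_minutes else "Serverless"
    return "Serverless or Containers"


print(choose_compute(constant_traffic=False, event_driven=True))
print(choose_compute(constant_traffic=True, very_high_traffic=True))
```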

Common Serverless Use Cases

✅ Perfect for Serverless:
  1. REST APIs (CRUD operations, mobile backends)
  2. File Processing (image resize, PDF generation, video thumbnails)
  3. Scheduled Tasks (nightly reports, data cleanup)
  4. Webhooks (GitHub webhooks, payment notifications)
  5. IoT Processing (sensor data, telemetry)
  6. Real-time Stream Processing (log analysis, event processing)
  7. Chatbots (respond to messages)
⚠️ Maybe Serverless (depends on requirements):
  1. Data Processing Pipelines (if tasks <10 min each)
  2. Backend for SPA (if traffic not extremely high)
  3. Microservices (if services are stateless)
❌ Not Serverless:
  1. Databases (always-on, stateful)
  2. Websocket Servers (long-lived connections)
  3. Game Servers (stateful, real-time)
  4. ML Model Training (long-running, GPU-intensive)

Understanding “Cold Starts”

Cold Start = The delay when the first request arrives after a period of inactivity. What happens:
Your function hasn't run for 20 minutes

Azure scaled it to zero (no containers running)

New request arrives

Azure must:
1. Allocate a server (100ms)
2. Start a container (1-2 seconds)
3. Load your code (500ms)
4. Run your function (your code execution time)

Total: 2-5 seconds for first request

Second request arrives 10 seconds later:

Container still warm (running)

Run your function immediately

Total: Your code execution time only (no delay!)
Real Impact:
API endpoint for mobile app:
- User 1 (first request after idle): 3 seconds ← Cold start
- User 2 (10 seconds later): 50ms ← Warm
- User 3 (15 seconds later): 50ms ← Warm
- [20 minutes of no traffic]
- User 4: 3 seconds ← Cold start again
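
The timeline above can be modelled with a tiny simulation. It is a simplification under stated assumptions: a single instance, a fixed 20-minute idle timeout, and fixed 3-second cold and 50 ms warm latencies taken from the example. Real platforms do not publish an exact idle timeout.

```python
IDLE_TIMEOUT_S = 20 * 60   # assumption: instance is reclaimed after 20 idle minutes
COLD_LATENCY_S = 3.0       # illustrative cold-start latency from the example
WARM_LATENCY_S = 0.05      # illustrative warm latency from the example


def latencies(arrival_times_s):
    """Return the latency each request sees, given sorted arrival times in seconds."""
    result = []
    last_seen = None
    for t in arrival_times_s:
        cold = last_seen is None or t - last_seen >= IDLE_TIMEOUT_S
        result.append(COLD_LATENCY_S if cold else WARM_LATENCY_S)
        last_seen = t
    return result


# User 1 at t=0, User 2 at t=10 s, User 3 at t=25 s, User 4 after a 20-minute gap.
timeline = latencies([0, 10, 25, 25 + 20 * 60])
print(timeline)   # [3.0, 0.05, 0.05, 3.0]
```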
Solutions (ordered by cost):
  1. Accept it: For non-critical workloads (scheduled jobs, webhooks), cold starts are fine. A 3-second delay on a webhook processing background tasks is invisible to users.
  2. Optimize your code: Cold start duration depends heavily on your runtime and package size. Python and Node.js cold start in ~1-2 seconds; .NET and Java cold start in 3-8 seconds. Reduce dependencies — every imported library adds to initialization time.
  3. Keep-alive ping: Create a Timer-triggered function that runs every 5 minutes with an empty body. This keeps an instance warm at little or no cost (timer executions fall within the Consumption plan’s 1 million free requests per month). The tradeoff: it is best-effort. Only the instance that receives the ping stays warm, so instances added during scale-out still cold start.
  4. Premium Plan (Elastic Premium): Keep 1+ instances always warm ($150/month base). This eliminates cold starts entirely and also gives you VNet integration, larger execution limits (up to 60 minutes), and more powerful instances. Best for customer-facing APIs where 3-second delays are unacceptable.
  5. Dedicated Plan (App Service Plan): Run Functions on always-on App Service instances. You pay for the App Service Plan whether functions run or not — identical pricing to hosting a web app. Only makes sense if you already have an App Service Plan with spare capacity.
Cost Comparison for 100,000 requests/month:
Consumption (with cold starts):   ~$0.20/month  + occasional 3s delays
Consumption (with keep-alive):    ~$1.00/month  + minimal cold starts
Premium (1 pre-warmed instance):  ~$150/month   + zero cold starts
The decision: If your function backs a user-facing API, the $150/month Premium plan is almost always worth it. If it processes background jobs, accept the cold starts and save $149.80/month.

Cost Example: Detailed Breakdown

Webhook Handler (GitHub webhook for CI/CD): Usage:
  • 50 developers
  • Each pushes code 10 times/day
  • Each push triggers webhook
  • Total: 500 executions/day = 15,000/month
  • Each execution: 200ms
Serverless Cost:
Pricing:
- First 1 million executions: FREE
- Execution time: 15,000 × 0.2 seconds = 3,000 seconds
- Memory: 512 MB
- GB-seconds: 3,000 × 0.5 = 1,500 GB-seconds
- First 400,000 GB-seconds: FREE
- Cost: $0/month ✅
Traditional VM Cost:
VM (B2s: 2 vCPU, 4 GB RAM):
- Always running: 24/7
- Cost: $30/month
- Wasted capacity: ~99.9% (runs under 1 hour total, idle ~719 hours)
- Wasted money: ~$29.96/month
Savings: $30/month (100% savings) + zero management overhead
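
The $0 result comes from the monthly free grant absorbing the whole workload. The sketch below applies the grant figures quoted above (1 million executions and 400,000 GB-seconds) before charging anything; the function name `billable_cost` is illustrative and the rates are the ones used in this chapter.

```python
FREE_EXECUTIONS = 1_000_000           # monthly free grant quoted above
FREE_GB_SECONDS = 400_000             # monthly free grant quoted above
PRICE_PER_MILLION_EXECUTIONS = 0.20   # USD
PRICE_PER_GB_SECOND = 0.000016        # USD


def billable_cost(executions: int, seconds_each: float, memory_gb: float) -> float:
    """Monthly cost in USD after subtracting the free grant."""
    gb_seconds = executions * seconds_each * memory_gb
    paid_executions = max(0, executions - FREE_EXECUTIONS)
    paid_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    return (paid_executions / 1_000_000 * PRICE_PER_MILLION_EXECUTIONS
            + paid_gb_seconds * PRICE_PER_GB_SECOND)


# 15,000 webhook calls/month, 200 ms each, 512 MB of memory.
webhook = billable_cost(executions=15_000, seconds_each=0.2, memory_gb=0.5)
print(f"${webhook:.2f}/month")   # $0.00/month
```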

1. Azure Functions

Azure Functions is Azure’s serverless compute platform for event-driven code execution.

Hosting Plans

Consumption Plan: pay per execution (true serverless)
  • Automatic scaling (0 to 200 instances)
  • Pay only when code runs
  • 5-minute default execution limit (configurable up to 10 minutes)
  • Cold start (~3-10 seconds)
  • Cost: $0.20 per million executions + $0.000016/GB-s
Use for: Event-driven, unpredictable workloads
[!WARNING] Gotcha: Consumption Plan Timeouts The default timeout for a function is 5 minutes (can increase to 10). If your code takes 11 minutes to run, the platform will kill it mid-execution. For long tasks, use Durable Functions.
[!TIP] Jargon Alert: Cold Start If no one calls your function for 20 minutes, Azure keeps it “cold” (off) to save money. The next person to call it waits 5-10 seconds while Azure boots up a server in the background.

Triggers and Bindings

Common Triggers

  • HTTP (REST APIs)
  • Timer (CRON jobs)
  • Blob Storage (file upload)
  • Queue Storage (messages)
  • Event Hub (streaming)
  • Cosmos DB (change feed)

Output Bindings

  • HTTP response
  • Queue Storage
  • Blob Storage
  • Cosmos DB
  • Table Storage
  • Event Hub

Example Functions

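using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

[FunctionName("HttpExample")]
public static IActionResult Run(
    [HttpTrigger(AuthorizationLevel.Function, "get", "post")] HttpRequest req,
    ILogger log)
{
    log.LogInformation("C# HTTP trigger function processed a request.");

    // Read the optional "name" query parameter; fall back to a default greeting.
    string name = req.Query["name"];
    if (string.IsNullOrEmpty(name)) name = "world";
    return new OkObjectResult($"Hello, {name}");
}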

2. Durable Functions

Durable Functions enable stateful workflows in serverless.

Function Chaining

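using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Extensions.Logging;

// Order, Payment, Inventory, and Shipment are application-defined types (not shown).
[FunctionName("ProcessOrder")]
public static async Task<object> RunOrchestrator(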
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    var order = context.GetInput<Order>();

    // Sequential workflow
    var payment = await context.CallActivityAsync<Payment>("ProcessPayment", order);
    var inventory = await context.CallActivityAsync<Inventory>("ReserveInventory", order);
    var shipment = await context.CallActivityAsync<Shipment>("CreateShipment", order);

    return new { payment, inventory, shipment };
}

[FunctionName("ProcessPayment")]
public static Payment ProcessPayment([ActivityTrigger] Order order, ILogger log)
{
    log.LogInformation($"Processing payment for order {order.Id}");
    // Payment processing logic
    return new Payment { OrderId = order.Id, Status = "Paid" };
}

Fan-Out/Fan-In

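using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

[FunctionName("ProcessBatch")]
public static async Task<long> RunOrchestrator(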
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    var files = context.GetInput<string[]>();

    // Fan-out: Process all files in parallel
    var tasks = new List<Task<long>>();
    foreach (var file in files)
    {
        tasks.Add(context.CallActivityAsync<long>("ProcessFile", file));
    }

    // Fan-in: Wait for all to complete
    var results = await Task.WhenAll(tasks);

    // Return total bytes processed
    return results.Sum();
}
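
For readers who want to try the fan-out/fan-in shape without the Durable Functions runtime, here is a minimal sketch in plain `asyncio`. It is not the Durable Functions API, and it has none of its checkpointing or replay behaviour; `process_file` is a hypothetical stand-in for the `ProcessFile` activity.

```python
import asyncio


async def process_file(name: str) -> int:
    """Hypothetical activity: pretend to process a file and return bytes handled."""
    await asyncio.sleep(0)          # stand-in for real I/O
    return len(name) * 100


async def process_batch(files: list[str]) -> int:
    # Fan-out: start one task per file so they run concurrently.
    tasks = [asyncio.create_task(process_file(f)) for f in files]
    # Fan-in: wait for every task to finish, then aggregate the results.
    results = await asyncio.gather(*tasks)
    return sum(results)


total = asyncio.run(process_batch(["a.csv", "bb.csv", "ccc.csv"]))
print(total)   # 1800
```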

3. Logic Apps

Logic Apps provide visual workflow automation with 400+ connectors.

Use Cases

Integration

Connect SaaS apps (Salesforce, SAP, Office 365)

Automation

Automate business processes and approvals

B2B

EDI and enterprise messaging

Scheduled Tasks

Recurring workflows (reports, backups)

Example Workflow

{
  "definition": {
    "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
    "triggers": {
      "When_a_blob_is_added": {
        "type": "ApiConnection",
        "inputs": {
          "host": {
            "connection": {
              "name": "@parameters('$connections')['azureblob']['connectionId']"
            }
          },
          "method": "get",
          "path": "/datasets/default/triggers/batch/onupdatedfile"
        },
        "recurrence": {
          "frequency": "Minute",
          "interval": 5
        }
      }
    },
    "actions": {
      "Get_blob_content": {
        "type": "ApiConnection",
        "inputs": {
          "host": {
            "connection": {
              "name": "@parameters('$connections')['azureblob']['connectionId']"
            }
          },
          "method": "get",
          "path": "/datasets/default/files/@{encodeURIComponent(encodeURIComponent(triggerBody()?['Path']))}/content"
        }
      },
      "Parse_JSON": {
        "type": "ParseJson",
        "inputs": {
          "content": "@body('Get_blob_content')",
          "schema": {
            "type": "object",
            "properties": {
              "orderId": { "type": "string" },
              "amount": { "type": "number" }
            }
          }
        },
        "runAfter": {
          "Get_blob_content": ["Succeeded"]
        }
      },
      "Condition": {
        "type": "If",
        "expression": {
          "and": [
            {
              "greater": ["@body('Parse_JSON')?['amount']", 1000]
            }
          ]
        },
        "actions": {
          "Send_approval_email": {
            "type": "ApiConnection",
            "inputs": {
              "host": {
                "connection": {
                  "name": "@parameters('$connections')['office365']['connectionId']"
                }
              },
              "method": "post",
              "body": {
                "To": "manager@company.com",
                "Subject": "Approval Required: Order @{body('Parse_JSON')?['orderId']}",
                "Body": "Order amount: $@{body('Parse_JSON')?['amount']}"
              },
              "path": "/v2/Mail"
            }
          }
        },
        "runAfter": {
          "Parse_JSON": ["Succeeded"]
        }
      }
    }
  }
}

3. Messaging & Eventing: The Glue of Serverless

In a serverless world, services don’t talk to each other directly; they use Messages and Events. Choosing the right one is the mark of a Principal Engineer.

Service Bus vs. Event Hub vs. Event Grid

| Service | Best For… | Key Feature | Analogy |
| --- | --- | --- | --- |
| Service Bus | Critical Transactions | FIFO (Ordering), Dead-Letter Queues | Registered Mail (Tracking included) |
| Event Hubs | Big Data Streaming | Millions of events/sec, Partitions | A high-speed highway with many lanes |
| Event Grid | Reactive Programming | Direct routing from Azure resources | A doorbell (Ding! Something happened) |

1. Azure Service Bus (The Reliable Messenger)

Use this when you cannot afford to lose a single message (e.g., an Order).
  • Queues: One-to-one communication.
  • Topics: One-to-many. One order can trigger “Email Service”, “Inventory Service”, and “Shipping Service”.
  • Pro Feature: Dead Letter Queues (DLQ): If a message fails processing 10 times (the default maximum delivery count), Service Bus moves it to a DLQ. This keeps your system running while you debug the “poison message”.
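
The dead-letter behaviour can be illustrated with an in-memory queue. This is a toy model rather than the Service Bus SDK: the `drain` and `handler` functions and the message names are hypothetical, and the threshold of 10 mirrors the delivery count mentioned above.

```python
from collections import deque

MAX_DELIVERY_COUNT = 10   # mirrors the "fails 10 times" threshold in the text


def drain(messages, handler):
    """Deliver each message until it succeeds or exhausts its delivery attempts."""
    queue = deque((m, 0) for m in messages)
    processed, dead_letter = [], []
    while queue:
        message, attempts = queue.popleft()
        try:
            handler(message)
            processed.append(message)
        except Exception:
            attempts += 1
            if attempts >= MAX_DELIVERY_COUNT:
                dead_letter.append(message)        # park the poison message
            else:
                queue.append((message, attempts))  # retry later
    return processed, dead_letter


def handler(message):
    if message == "corrupt-order":
        raise ValueError("cannot parse order")


processed, dead = drain(["order-1", "corrupt-order", "order-2"], handler)
print(processed, dead)   # ['order-1', 'order-2'] ['corrupt-order']
```

The healthy orders are processed even though one message can never succeed, which is the point of moving poison messages aside.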

2. Azure Event Hubs (The Data Firehose)

Use this for logs, telemetry, or clickstream data.
  • Partitioning: This is how Event Hubs scales. If you have 32 partitions, up to 32 function instances can read data in parallel, one per partition.
  • Capture: Automatically save streaming data into Blob Storage or Data Lake for later analysis.
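
To see why partitioning lets readers work in parallel, the sketch below routes events to one of 32 partitions by hashing a key. The SHA-256 hash is a stand-in chosen for illustration, not the hash Event Hubs uses internally, and the device IDs are made up.

```python
import hashlib

PARTITION_COUNT = 32


def partition_for(partition_key: str) -> int:
    """Map a key to a partition with a stable hash (not Event Hubs' real hash)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % PARTITION_COUNT


events = [("device-1", 21.5), ("device-2", 19.0), ("device-1", 21.7)]
by_partition = {}
for device_id, reading in events:
    by_partition.setdefault(partition_for(device_id), []).append((device_id, reading))

# Events that share a key always land in the same partition, in arrival order.
print(by_partition[partition_for("device-1")])
```

Because each key maps to exactly one partition, one reader per partition can process its share independently while per-device ordering is preserved.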

3. Azure Event Grid (The Event Distributor)

The lightest and cheapest way to react to changes.
  • Use Case: When a blob is uploaded, Event Grid tells your Function instantly.
  • Push-Push Model: It pushes the event to your code, rather than your code “polling” for new work.
[!IMPORTANT] Pro Tip: The Decision Tree
  • Need Reliability and Transactions? → Service Bus
  • Need Massive Throughput (Logs/IoT)? → Event Hubs
  • Need to React to Azure Resource Events (e.g., a blob upload)? → Event Grid

4. Event Grid

Event Grid is Azure’s event routing service for reactive programming.

Event Sources → Event Grid → Event Handlers

Example: Image Processing Pipeline

# Create Event Grid subscription
az eventgrid event-subscription create \
  --name process-images \
  --source-resource-id /subscriptions/.../storageAccounts/mystorage \
  --endpoint-type azurefunction \
  --endpoint /subscriptions/.../functions/ProcessImage \
  --included-event-types Microsoft.Storage.BlobCreated \
  --subject-begins-with /blobServices/default/containers/uploads/
When image uploaded → Event Grid triggers function → Process image → Save thumbnail

5. Serverless Patterns

Mobile App → API Management → Azure Functions → Cosmos DB

Benefits:
- Pay per use (cost-effective)
- Auto-scaling (handles traffic spikes)
- No server management

IoT Device → IoT Hub → Azure Function → Time Series Insights

Use case: Real-time telemetry processing

Timer Trigger → Azure Function → Process Data → Storage

Use case: Daily report generation, cleanup tasks

External System → HTTP Function → Process → Database

Use case: GitHub webhooks, payment notifications

6. Best Practices

Keep Functions Small

Single responsibility, fast execution (<5 seconds)

Use Queues

Decouple functions with queues for reliability

Handle Idempotency

Functions may execute multiple times, design for it

Monitor Everything

Application Insights for logging, metrics, traces

Use Managed Identity

No connection strings in code

Consider Cold Starts

Use Premium plan for critical apps
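
The idempotency practice above is easiest to see in code. The sketch below records processed event IDs and ignores repeats; the in-memory `set` stands in for a durable store such as Redis or Cosmos DB, and the event shape and names are hypothetical.

```python
processed_event_ids = set()   # stand-in for a durable store (Redis, Cosmos DB, ...)
balance = {"acct-1": 0}


def handle_payment_event(event: dict) -> bool:
    """Apply the event once; return False when it was a duplicate delivery."""
    if event["id"] in processed_event_ids:
        return False
    balance[event["account"]] += event["amount"]
    processed_event_ids.add(event["id"])
    return True


event = {"id": "evt_123", "account": "acct-1", "amount": 50}
first = handle_payment_event(event)
second = handle_payment_event(event)      # redelivery of the same event
print(first, second, balance["acct-1"])   # True False 50
```

In a real function the check and the write would need to be atomic in the external store, since two instances may receive the same event at the same time.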

7. Interview Questions

Beginner Level

Answer: A delay that occurs when the first request comes in after a period of inactivity. The provider has to provision an execution environment before running the code. Mitigation:
  • Use Premium Plan (pre-warmed instances)
  • Keep alive ping
  • Use Dedicated (App Service) plan
Answer:
  • Trigger: Defines how a function is invoked (e.g., HTTP request, Timer, Blob uploaded). A function must have exactly one trigger.
  • Binding: Connects input/output data resources declaratively (e.g., Read from Cosmos DB, Write to Queue). Optional.

Intermediate Level

Answer: When you need stateful workflows in a stateless environment:
  • Chaining: Function A -> Function B -> Function C
  • Fan-out/Fan-in: Run multiple functions in parallel, wait for all to finish.
  • Human Interaction: Wait for approval trigger (email link).
  • Monitor: Long-running polling process.
Answer:
  • Azure Functions: Code-first. Best for complex logic, custom algorithms, existing libraries.
  • Logic Apps: Design-first (GUI). Best for integration, connecting SaaS apps, orchestrating disparate systems without writing code.

Advanced Level

Answer:
  • Scale Controller: Monitors event rate and adds instances.
  • Maximum Instances: 200 on Windows, 100 on Linux (Consumption plan).
  • Execution Time: Default 5 mins, max 10 mins.
  • Throughput: Limited by scale rate (e.g., 1 instance every few seconds).
If you need longer execution or VNet integration, use Premium.

Troubleshooting: When Serverless Fails

Debugging code that runs on “someone else’s server” can be tricky. Here is the Principal’s playbook.

1. The “Function Host is Restarting” Loop

If your functions keep failing to start:
  • Missing Application Setting: Check if your AzureWebJobsStorage setting is missing or invalid. Functions need a storage account to store their own internal state.
  • Runtime Version Mismatch: Did you deploy Node 20 code to a Function App configured for Node 18? Check the FUNCTIONS_EXTENSION_VERSION and WEBSITE_NODE_DEFAULT_VERSION.

2. The “Timeout” Triage

If your function is being killed mid-process:
  • Consumption Limits: Remember the 5/10 minute limit. If you need 30 minutes, you must move to the Premium Plan or a Dedicated Plan.
  • Zombie Processes: If your code starts a background thread and doesn’t wait for it, the Function Host might shut down before the thread finishes. Always use async/await and wait for all tasks.
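
The zombie-process advice applies in any language. As a minimal sketch in Python's `asyncio` (the handler and helper names are hypothetical), awaiting the background work before returning guarantees it completes while the host is still alive:

```python
import asyncio

audit_log = []


async def write_audit_record(record: str) -> None:
    await asyncio.sleep(0.01)      # stand-in for a slow network call
    audit_log.append(record)


async def handler(order_id: str) -> str:
    # Awaiting here means the record is written before the handler returns.
    # Starting the coroutine as a task and returning immediately would leave
    # it unfinished if the host shut down right after the response.
    await write_audit_record(f"processed {order_id}")
    return "ok"


result = asyncio.run(handler("order-42"))
print(result, audit_log)   # ok ['processed order-42']
```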

3. “403 Forbidden” (Managed Identity)

If your function works on your laptop but fails in Azure:
  • Local Settings: You are likely using your personal credentials locally. In Azure, you need to enable Managed Identity and grant it permissions (e.g., “Storage Blob Data Contributor”) on the target resource.
[!TIP] Pro Tool: Application Insights (Live Metrics) Don’t wait for log ingestion. Use the Live Metrics Stream in Application Insights. It shows you the heartbeat of your function app with near-zero latency. You can see CPU spikes and request failures the second they happen.

8. Key Takeaways

Event-Driven

Serverless is reacting to events (HTTP, Timer, Queue). Don’t poll; wait for the trigger.

Scaling

Scaling is automatic: down to zero when idle (cost = 0) and out to as many as 200 instances under load, though new instances are added gradually rather than instantly.

Stateless by Default

Functions are ephemeral. Use Durable Functions or external storage (Cosmos/Redis) if you need state.

Bindings

Use Input/Output Bindings to reduce boilerplate code. Focus on business logic, not connection management.

Cost Model

Consumption plan is pay-per-execution. Great for sporadic workloads. Use Premium for predictable, high-performance needs.

Interview Deep-Dive

Strong Candidate Answer:
  • Root cause diagnosis: Azure Functions on the Consumption plan have a cold start penalty of 1-10 seconds when no warm instances exist. During traffic spikes, the platform scales by creating new instances, each incurring cold start. Stripe webhooks have a 20-second timeout — if cold start plus processing exceeds 20 seconds, Stripe retries, amplifying load. Additionally, Consumption plan has a default concurrency limit of 100 instances.
  • Immediate fix (switch to Premium plan): Azure Functions Premium plan keeps 1+ instances always warm (eliminates cold start entirely). Pre-provisioned instances handle the baseline load, and additional instances scale with warm starts (sub-second). Cost: ~$175/month for 1 always-ready instance vs $0.20/million executions on Consumption. For payment processing, the reliability justifies the cost.
  • Application-level fix: Implement idempotency. Stripe sends the same webhook multiple times if it does not receive a 200 response within the timeout. Your function must check if it already processed a given event ID (store in Redis or Cosmos DB) before re-processing the payment. Without idempotency, customers get charged multiple times.
  • Architecture improvement: Decouple webhook reception from payment processing. Function A receives the webhook, stores the payload in Service Bus queue, and immediately returns 200 to Stripe (50ms total). Function B processes the payment from the queue asynchronously with retries and dead-letter handling. This way, Stripe always gets a fast 200 response, and payment processing can take as long as needed.
  • Why not just increase the timeout? You cannot control Stripe’s webhook timeout (it is fixed at 20 seconds). And even if you could, long-running synchronous webhook handlers are an anti-pattern because they hold connections open and reduce throughput.
Follow-up: Compare Azure Functions Consumption vs Premium vs Dedicated plans for this payment webhook scenario.
Consumption: $0.20/million executions, scales to zero, but has cold starts (1-10 seconds) and 100-instance limit. Wrong for payment webhooks because cold starts cause timeouts.
Premium: $175/month minimum, always-warm instances, scales to 100+ instances, VNet integration for accessing private databases. Right for payment processing: predictable latency, no cold starts.
Dedicated (App Service plan): Fixed cost ($74+/month), no auto-scaling to zero, but predictable performance. Only use if you already have an App Service plan with spare capacity.
For payment webhooks specifically, Premium plan with the queue-based decoupling pattern is the production-grade answer.
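
The queue-based decoupling from the architecture-improvement point can be sketched with a local queue. This is an illustration only: `queue.Queue` stands in for a Service Bus queue, the two functions stand in for Function A and Function B, and the payload fields are hypothetical.

```python
import queue

payment_queue = queue.Queue()   # stand-in for a Service Bus queue


def receive_webhook(payload: dict) -> int:
    """Function A: enqueue the payload and acknowledge immediately."""
    payment_queue.put(payload)
    return 200


def process_payments() -> list:
    """Function B: drain the queue and do the slow work outside the request path."""
    handled = []
    while not payment_queue.empty():
        payload = payment_queue.get()
        handled.append(payload["event_id"])   # real payment logic would go here
        payment_queue.task_done()
    return handled


status = receive_webhook({"event_id": "evt_1", "amount": 50})
handled = process_payments()
print(status, handled)   # 200 ['evt_1']
```

The receiver does no payment work at all, so its response time does not depend on how long processing takes.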
Strong Candidate Answer:
  • Scenario 1 — Long-running processes (over 10 minutes): Azure Functions Consumption plan has a 10-minute execution timeout (configurable to 60 minutes on Premium). If you need to process a 2-hour video encoding job, Functions will timeout. Use Azure Container Instances or AKS batch jobs instead. Durable Functions can orchestrate long workflows but each individual activity function still has the timeout constraint.
  • Scenario 2 (consistent high throughput with predictable traffic): If your workload runs 24/7 at steady 1,000 requests/second, serverless is more expensive than provisioned compute. At 1,000 req/s, that is 2.6 billion executions/month. Consumption plan: ~$520/month. A 2-vCPU App Service P1v3 handling the same load: $74/month. Serverless pricing wins at low/variable traffic but loses at sustained high throughput.
  • Scenario 3 — Applications requiring persistent connections: WebSocket servers, game servers, or gRPC streaming services need long-lived connections. Functions are designed for short-lived request/response patterns. A WebSocket connection that stays open for 30 minutes per user does not fit the serverless model. Use App Service or AKS for connection-oriented workloads.
  • Bonus scenario — Complex multi-step workflows with shared state: While Durable Functions handle orchestration, complex state machines with 20+ steps, compensation logic, and human approval gates are better served by Azure Logic Apps (visual designer, built-in connectors) or a dedicated workflow engine. Durable Functions code can become unmaintainable at scale.
Follow-up: A startup says “we use Azure Functions for everything because it scales to zero and we pay nothing when idle.” What risk are they not seeing?
The hidden risk is vendor lock-in and architectural debt. Functions with bindings to Event Hub, Cosmos DB, Service Bus, and Blob Storage are deeply coupled to Azure’s proprietary trigger model. If the startup needs to migrate to AWS or GCP (for example, after an acquisition), every function must be rewritten. The second risk is cold start latency: scaling to zero means the first request after idle takes 1-10 seconds. For user-facing APIs, this creates a terrible experience during low-traffic periods. The third risk is observability: debugging a distributed system of 50 functions with asynchronous triggers is significantly harder than debugging 5 microservices with synchronous APIs.
Strong Candidate Answer:
  • Azure Functions strengths: Best .NET support (first-class C# experience), Durable Functions for stateful orchestration (no equivalent in Lambda or Cloud Functions), and deep integration with Azure services via bindings. Premium plan with VNet integration is unique — Lambda requires NAT Gateway for VPC access. Supports running on Kubernetes via KEDA for hybrid scenarios.
  • AWS Lambda strengths: Largest ecosystem, most third-party integrations, SnapStart for Java cold start reduction, Lambda@Edge for CDN-level execution, and Graviton2 (ARM) support for 20% cost savings. Most mature platform with the largest community.
  • GCP Cloud Functions strengths: Best integration with BigQuery and Pub/Sub for data pipeline triggers. Cloud Functions v2 runs on Cloud Run, giving you the same container runtime with serverless scaling. Cheapest for low-volume workloads.
  • My recommendation for multi-cloud: Do not standardize on one serverless platform. Instead, use containers (Docker) as the portability layer and deploy to each cloud’s container serverless offering: Azure Container Apps, AWS App Runner, GCP Cloud Run. These all run standard Docker containers and your code is not locked to any cloud’s function trigger model. Reserve cloud-specific Functions for event-driven glue code (S3 triggers, Blob Storage triggers) that is inherently cloud-specific anyway.
Follow-up: The engineering VP insists on a single platform. If forced to pick one, which do you choose?
If forced to pick one, AWS Lambda: largest ecosystem, most third-party integrations, best documentation, and easiest to hire for. But I would document the trade-off explicitly: you are trading multi-cloud flexibility for ecosystem maturity. And I would still containerize the core business logic so that Lambda functions are thin wrappers calling container-based services, preserving the option to migrate later.

Next Steps

Continue to Chapter 9

Master Azure Monitor, Application Insights, and observability