Serverless Architecture
Serverless lets you focus on code without managing infrastructure. Azure handles scaling, availability, and operations.
What You’ll Learn
By the end of this chapter, you’ll understand:
- What “serverless” actually means (spoiler: servers still exist!)
- Why serverless can save you money and time
- When to use serverless vs containers vs VMs
- How to build event-driven applications
- Real cost comparisons and examples
Introduction: What is Serverless? (Start Here if You’re Completely New)
The Name is Misleading
First, let’s clear up the confusion: “Serverless” DOES NOT mean “no servers”. Servers still exist! You just don’t see them, manage them, or pay for them when they’re idle. A better name would be “Server-Invisible” or “Someone-Else-Manages-The-Servers”.
Real-World Analogy: Electricity
Think about how you use electricity:
Before Modern Electricity (Like Traditional Servers):
- Plug in your code
- Pay only when code runs
- Azure maintains servers
- Scales automatically
- If server fails, another takes over
What is Serverless Computing?
Serverless = A cloud computing model where:
- You write code (functions)
- Cloud provider runs your code when triggered (HTTP request, file upload, timer, etc.)
- You pay only for execution time (per millisecond!)
- Provider handles everything else (servers, scaling, maintenance, security patches)
The Problem Serverless Solves
Scenario: You built a web API that resizes user-uploaded images.
Traditional VM/Container Approach:
Breaking Down “Serverless”
What you DON’T manage (Azure does it):
- ❌ Provisioning servers
- ❌ Installing operating systems
- ❌ Configuring networking
- ❌ Security patches
- ❌ Scaling infrastructure
- ❌ Load balancing
- ❌ Server maintenance
- ❌ Paying for idle time

What you DO manage:
- ✅ Write your code (JavaScript, Python, C#, Java, etc.)
- ✅ Define triggers (HTTP, Timer, File upload, etc.)
- ✅ Deploy your function
- ✅ Monitor execution

Time spent:
- Traditional VM: 10-20 hours/month managing infrastructure
- Serverless: 1-2 hours/month managing code
Real-World Example: Photo Sharing App
Without Serverless (Traditional Approach):
Cost Comparison: Real Numbers
API Backend for Mobile App (100,000 users):

| Metric | Traditional VM | Serverless |
|---|---|---|
| Traffic Pattern | Peaks 8am-10pm, dead midnight-6am | Same |
| Average Requests/Day | 500,000 | 500,000 |
| Server Uptime | 24/7 (720 hours/month) | Only when requests come (avg 100 hours/month) |
| Infrastructure Cost | $150/month | $5/month |
| Scaling | Manual (add more VMs) | Automatic (0 to 1000 instances) |
| Idle Cost | $100/month wasted | $0 wasted |
| Dev Time | 20 hours/month | 3 hours/month |
When Traditional Servers Waste Money
Pattern 1: Unpredictable Workloads
When is Serverless NOT the Answer?
Don’t use serverless for:
- ❌ Long-running processes (>10 minutes)

Serverless vs Containers vs VMs: Decision Tree
Common Serverless Use Cases
✅ Perfect for Serverless:
- REST APIs (CRUD operations, mobile backends)
- File Processing (image resize, PDF generation, video thumbnails)
- Scheduled Tasks (nightly reports, data cleanup)
- Webhooks (GitHub webhooks, payment notifications)
- IoT Processing (sensor data, telemetry)
- Real-time Stream Processing (log analysis, event processing)
- Chatbots (respond to messages)
- Data Processing Pipelines (if tasks <10 min each)
- Backend for SPA (if traffic not extremely high)
- Microservices (if services are stateless)
❌ Not a good fit for serverless:
- Databases (always-on, stateful)
- Websocket Servers (long-lived connections)
- Game Servers (stateful, real-time)
- ML Model Training (long-running, GPU-intensive)
Understanding “Cold Starts”
Cold Start = the delay when the first request arrives after a period of inactivity.
What happens:
- Accept it: For non-critical workloads (scheduled jobs, webhooks), cold starts are fine. A 3-second delay on a webhook that processes background tasks is invisible to users.
- Optimize your code: Cold start duration depends heavily on your runtime and package size. Python and Node.js cold start in ~1-2 seconds; .NET and Java cold start in 3-8 seconds. Reduce dependencies — every imported library adds to initialization time.
- Keep-alive ping: Create a Timer-triggered function that runs every 5 minutes with an empty body. This keeps an instance warm at effectively $0/month (the roughly 8,640 timer executions per month fit well within the Consumption plan’s 1 million free requests). The tradeoff: it is best-effort. It warms only one instance, so instances added during scale-out still cold start, and the platform can recycle the warm instance at any time.
- Premium Plan (Elastic Premium): Keep 1+ instances always warm ($150/month base). This eliminates cold starts entirely and also gives you VNet integration, larger execution limits (up to 60 minutes), and more powerful instances. Best for customer-facing APIs where 3-second delays are unacceptable.
- Dedicated Plan (App Service Plan): Run Functions on always-on App Service instances. You pay for the App Service Plan whether functions run or not — identical pricing to hosting a web app. Only makes sense if you already have an App Service Plan with spare capacity.
Cost Example: Detailed Breakdown
Webhook Handler (GitHub webhook for CI/CD):
Usage:
- 50 developers
- Each pushes code 10 times/day
- Each push triggers webhook
- Total: 500 executions/day = 15,000/month
- Each execution: 200ms
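The arithmetic behind this usage profile can be sketched as follows. This is a rough estimate, not a billing calculator: the prices are the Consumption plan list prices assumed here ($0.20 per million executions, $0.000016 per GB-second), the 400,000 GB-second free grant and the 128 MB (0.125 GB) memory size are assumptions, and `monthly_cost` is a hypothetical helper name.

```python
# Assumed Consumption plan list prices; check the current pricing page.
PRICE_PER_MILLION_EXECUTIONS = 0.20   # USD
PRICE_PER_GB_SECOND = 0.000016        # USD
FREE_EXECUTIONS = 1_000_000           # monthly free grant
FREE_GB_SECONDS = 400_000             # monthly free grant (assumption)


def monthly_cost(executions, seconds_per_execution, memory_gb,
                 apply_free_grant=True):
    """Estimate the monthly Consumption plan bill in USD."""
    gb_seconds = executions * seconds_per_execution * memory_gb
    if apply_free_grant:
        executions = max(0, executions - FREE_EXECUTIONS)
        gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    execution_charge = executions / 1_000_000 * PRICE_PER_MILLION_EXECUTIONS
    compute_charge = gb_seconds * PRICE_PER_GB_SECOND
    return execution_charge + compute_charge


# Webhook handler from the text: 50 developers x 10 pushes/day x 30 days.
executions = 50 * 10 * 30   # 15,000 executions/month
list_price = monthly_cost(executions, 0.2, 0.125, apply_free_grant=False)
billed = monthly_cost(executions, 0.2, 0.125)
print(f"list price: ${list_price:.4f}, billed after free grant: ${billed:.2f}")
```

Under these assumptions the workload costs less than one cent per month at list price and nothing once the free grant is applied, because 15,000 executions and 375 GB-seconds sit far below both free thresholds.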
1. Azure Functions
Azure Functions is Azure’s serverless compute platform for event-driven code execution.
Hosting Plans
- Consumption Plan
- Dedicated (App Service)

Consumption Plan:
- Automatic scaling (0 to 200 instances)
- Pay only when code runs
- 5-minute default execution limit (configurable up to 10 minutes)
- Cold start (~3-10 seconds)
- Cost: $0.20 per million executions plus $0.000016/GB-s
[!WARNING] Gotcha: Consumption Plan Timeouts The default timeout for a function is 5 minutes (can increase to 10). If your code takes 11 minutes to run, the platform will kill it mid-execution. For long tasks, use Durable Functions.
[!TIP] Jargon Alert: Cold Start If no one calls your function for about 20 minutes, Azure unloads it (it goes “cold”) to free resources. The next person to call it waits several seconds (typically 3-10) while Azure allocates an instance and loads your code in the background.
Triggers and Bindings
Common Triggers
- HTTP (REST APIs)
- Timer (CRON jobs)
- Blob Storage (file upload)
- Queue Storage (messages)
- Event Hub (streaming)
- Cosmos DB (change feed)
Output Bindings
- HTTP response
- Queue Storage
- Blob Storage
- Cosmos DB
- Table Storage
- Event Hub
Example Functions
- HTTP Trigger
- Blob Trigger
- Timer Trigger
- Queue Trigger
2. Durable Functions
Durable Functions enable stateful workflows in serverless.
Function Chaining
Fan-Out/Fan-In
3. Logic Apps
Logic Apps provide visual workflow automation with 400+ connectors.
Use Cases
Integration
Automation
B2B
Scheduled Tasks
Example Workflow
3. Messaging & Eventing: The Glue of Serverless
In a serverless world, services don’t talk to each other directly; they use Messages and Events. Choosing the right one is the mark of a Principal Engineer.
Service Bus vs. Event Hubs vs. Event Grid
| Service | Best For… | Key Feature | Analogy |
|---|---|---|---|
| Service Bus | Critical Transactions | FIFO (Ordering), Dead-Letter Queues | Registered Mail (Tracking included) |
| Event Hubs | Big Data Streaming | Millions of events/sec, Partitions | A high-speed highway with many lanes |
| Event Grid | Reactive Programming | Direct routing from Azure resources | A doorbell (Ding! Something happened) |
1. Azure Service Bus (The Reliable Messenger)
Use this when you cannot afford to lose a single message (e.g., an Order).
- Queues: One-to-one communication.
- Topics: One-to-many. One order can trigger “Email Service”, “Inventory Service”, and “Shipping Service”.
- Pro Feature: Dead Letter Queues (DLQ): If a message fails to process 10 times, Service Bus moves it to a DLQ. This keeps your system running while you debug the “poison message”.
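The dead-letter behaviour described above can be illustrated with a small in-memory simulation. This is a sketch of the idea only, not the Service Bus SDK: `drain`, `handle_order`, and the message names are hypothetical, and a real queue tracks delivery counts on the broker side.

```python
from collections import deque

MAX_DELIVERY_COUNT = 10  # mirrors the "fails 10 times" rule described above


def drain(queue, handler, max_delivery_count=MAX_DELIVERY_COUNT):
    """Process every message; park repeat failures in a dead-letter list."""
    dead_letter = []
    delivery_counts = {}
    while queue:
        message = queue.popleft()
        delivery_counts[message] = delivery_counts.get(message, 0) + 1
        try:
            handler(message)
        except Exception:
            if delivery_counts[message] >= max_delivery_count:
                dead_letter.append(message)   # stop retrying the poison message
            else:
                queue.append(message)         # redeliver later
    return dead_letter, delivery_counts


def handle_order(message):
    # Hypothetical handler: one message can never be parsed.
    if message == "order-corrupt":
        raise ValueError("cannot parse order")


orders = deque(["order-1", "order-corrupt", "order-2"])
dead_letter, attempts = drain(orders, handle_order)
print(dead_letter, attempts["order-corrupt"])
```

The healthy orders are processed on their first delivery, while the poison message is retried until its tenth failure and then set aside, so the main queue empties instead of looping forever.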
2. Azure Event Hubs (The Data Firehose)
Use this for logs, telemetry, or clickstream data.
- Partitioning: This is how Event Hubs scales. If you have 32 partitions, up to 32 function instances can read data in parallel.
- Capture: Automatically save streaming data into Blob Storage or Data Lake for later analysis.
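To make the partitioning idea concrete, here is a minimal sketch of key-based partition assignment. The hash function is an illustrative assumption (Event Hubs uses its own internal hashing), and `partition_for` and the device names are hypothetical.

```python
import hashlib
from collections import defaultdict

PARTITION_COUNT = 32


def partition_for(partition_key, partition_count=PARTITION_COUNT):
    """Map a key to a partition with a stable hash (illustrative only)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# 1,000 readings from 100 hypothetical devices.
events = [(f"device-{n % 100}", f"reading-{n}") for n in range(1000)]
partitions = defaultdict(list)
for device_id, payload in events:
    partitions[partition_for(device_id)].append((device_id, payload))

# Every event from one device lands in the same partition, so one reader
# sees that device's events in order while other partitions are read in parallel.
print(len(partitions), "partitions in use")
```

The design point is that ordering is only preserved within a partition, which is why related events should share a partition key.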
3. Azure Event Grid (The Event Distributor)
The lightest and cheapest way to react to changes.
- Use Case: When a blob is uploaded, Event Grid tells your Function instantly.
- Push-Push Model: It pushes the event to your code, rather than your code “polling” for new work.
[!IMPORTANT] Pro Tip: The Decision Tree
- Need Reliability and Transactions? → Service Bus
- Need Massive Throughput (Logs/IoT)? → Event Hubs
- Need to React to Azure Alerts? → Event Grid
4. Event Grid
Event Grid is Azure’s event routing service for reactive programming.
Event Sources → Event Grid → Event Handlers
Example: Image Processing Pipeline
5. Serverless Patterns
1. Backend for Frontend (BFF)
2. Event-Driven Processing
3. Scheduled Jobs
4. Webhook Handler
6. Best Practices
Keep Functions Small
Use Queues
Handle Idempotency
Monitor Everything
Use Managed Identity
Consider Cold Starts
7. Interview Questions
Beginner Level
Q1: What is a Cold Start in Serverless?
- Use Premium Plan (pre-warmed instances)
- Keep alive ping
- Use Dedicated (App Service) plan
Q2: What is the difference between a Trigger and a Binding?
- Trigger: Defines how a function is invoked (e.g., HTTP request, Timer, Blob uploaded). A function must have exactly one trigger.
- Binding: Connects input/output data resources declaratively (e.g., Read from Cosmos DB, Write to Queue). Optional.
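The "exactly one trigger" rule can be expressed as a small check. The dictionaries below loosely follow the shape of a `function.json` bindings array, where trigger types carry names such as `queueTrigger` or `httpTrigger`. The helper `split_bindings` and the queue and blob names are hypothetical, so treat this as a sketch rather than how the Functions host validates configuration.

```python
def split_bindings(bindings):
    """Separate the trigger from input/output bindings and enforce the rule."""
    triggers = [b for b in bindings if b["type"].endswith("Trigger")]
    if len(triggers) != 1:
        raise ValueError(f"expected exactly one trigger, found {len(triggers)}")
    others = [b for b in bindings if not b["type"].endswith("Trigger")]
    return triggers[0], others


# Hypothetical function: runs on a queue message, writes a blob.
bindings = [
    {"type": "queueTrigger", "direction": "in", "name": "msg",
     "queueName": "orders"},
    {"type": "blob", "direction": "out", "name": "receipt",
     "path": "receipts/{id}.json"},
]
trigger, others = split_bindings(bindings)
print(trigger["type"], [b["type"] for b in others])
```

Here the queue message invokes the function (the trigger) and the blob output is declared rather than coded by hand (the binding). Adding a second trigger to the list would raise a `ValueError`.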
Intermediate Level
Q3: When should you use Durable Functions?
- Chaining: Function A -> Function B -> Function C
- Fan-out/Fan-in: Run multiple functions in parallel, wait for all to finish.
- Human Interaction: Wait for approval trigger (email link).
- Monitor: Long-running polling process.
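The first two patterns can be sketched with plain `asyncio`. This is not the Durable Functions API: it shows only the control flow, without the checkpointing and replay that make an orchestration durable. The `resize` and `watermark` activities are hypothetical.

```python
import asyncio


async def resize(image):            # hypothetical activity
    await asyncio.sleep(0.01)       # stands in for real work
    return f"{image}:resized"


async def watermark(image):         # hypothetical activity
    await asyncio.sleep(0.01)
    return f"{image}:watermarked"


async def chain(image):
    """Chaining: each step consumes the previous step's output."""
    return await watermark(await resize(image))


async def fan_out_fan_in(images):
    """Fan-out: start all chains at once. Fan-in: wait for every result."""
    return await asyncio.gather(*(chain(image) for image in images))


results = asyncio.run(fan_out_fan_in(["a.png", "b.png", "c.png"]))
print(results)
```

`asyncio.gather` returns results in the order the inputs were given, which is the fan-in step. In a real orchestration the framework would also persist progress so that a restart resumes instead of starting over.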
Q4: Azure Functions vs Logic Apps - how to choose?
- Azure Functions: Code-first. Best for complex logic, custom algorithms, existing libraries.
- Logic Apps: Design-first (GUI). Best for integration, connecting SaaS apps, orchestrating disparate systems without writing code.
Advanced Level
Q5: Explain the Consumption Plan scaling limits
- Scale Controller: Monitors event rate and adds instances.
- Maximum Instances: 200 instances on Windows, 100 on Linux.
- Execution Time: Default 5 mins, max 10 mins.
- Throughput: Limited by scale rate (e.g., 1 instance every few seconds).
Troubleshooting: When Serverless Fails
Debugging code that runs on “someone else’s server” can be tricky. Here is the Principal’s playbook.
1. The “Function Host is Restarting” Loop
If your functions keep failing to start:
- Missing Application Setting: Check if your `AzureWebJobsStorage` setting is missing or invalid. Functions need a storage account to store their own internal state.
- Runtime Version Mismatch: Did you deploy Node 20 code to a Function App configured for Node 18? Check `FUNCTIONS_EXTENSION_VERSION` and `WEBSITE_NODE_DEFAULT_VERSION`.
2. The “Timeout” Triage
If your function is being killed mid-process:
- Consumption Limits: Remember the 5/10 minute limit. If you need 30 minutes, you must move to the Premium Plan or a Dedicated Plan.
- Zombie Processes: If your code starts a background thread and doesn’t wait for it, the Function Host might shut down before the thread finishes. Always use `async/await` and wait for all tasks.
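The zombie-process problem can be reproduced in miniature with `asyncio`. In this sketch `asyncio.run` plays the role of the Function Host: when the handler returns, any task that was never awaited is cancelled. The handler and `write_audit_record` names are hypothetical.

```python
import asyncio

audit_log = []


async def write_audit_record(event):      # hypothetical background work
    await asyncio.sleep(0.05)
    audit_log.append(event)


async def handler_fire_and_forget(event):
    asyncio.create_task(write_audit_record(event))   # never awaited
    return "done"


async def handler_awaited(event):
    task = asyncio.create_task(write_audit_record(event))
    await task                                        # finish before returning
    return "done"


asyncio.run(handler_fire_and_forget("first"))
lost = list(audit_log)          # the "first" record was never written
asyncio.run(handler_awaited("second"))
print(lost, audit_log)
```

The fire-and-forget handler reports success even though its background work was discarded, which is exactly why the symptom is hard to spot in production logs.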
3. “403 Forbidden” (Managed Identity)
If your function works on your laptop but fails in Azure:
- Local Settings: You are likely using your personal credentials locally. In Azure, you need to enable Managed Identity and grant it permissions (e.g., “Storage Blob Data Contributor”) on the target resource.
[!TIP] Pro Tool: Application Insights (Live Metrics) Don’t wait for log ingestion. Use the Live Metrics Stream in Application Insights. It shows you the heartbeat of your function app with near-zero latency. You can see CPU spikes and request failures the second they happen.
8. Key Takeaways
Event-Driven
Scaling
Stateless by Default
Bindings
Cost Model
Next Steps
Interview Deep-Dive
Your Azure Function processes payment webhooks from Stripe. During Black Friday, you notice 15% of webhook calls are failing with timeout errors. Diagnose and fix this.
- Root cause diagnosis: Azure Functions on the Consumption plan have a cold start penalty of 1-10 seconds when no warm instances exist. During traffic spikes, the platform scales by creating new instances, each incurring a cold start. Stripe webhooks have a 20-second timeout: if cold start plus processing exceeds 20 seconds, Stripe retries, amplifying load. Additionally, the Consumption plan caps scale-out (200 instances on Windows, 100 on Linux).
- Immediate fix, switch to Premium plan: The Azure Functions Premium plan keeps 1+ instances always warm, which removes cold starts for the baseline load. Pre-provisioned instances handle that baseline, and additional instances scale with warm starts (sub-second). Cost: ~$150/month base, versus $0.20 per million executions on Consumption. For payment processing, the reliability justifies the cost.
- Application-level fix: Implement idempotency. Stripe sends the same webhook multiple times if it does not receive a 200 response within the timeout. Your function must check if it already processed a given event ID (store in Redis or Cosmos DB) before re-processing the payment. Without idempotency, customers get charged multiple times.
- Architecture improvement: Decouple webhook reception from payment processing. Function A receives the webhook, stores the payload in Service Bus queue, and immediately returns 200 to Stripe (50ms total). Function B processes the payment from the queue asynchronously with retries and dead-letter handling. This way, Stripe always gets a fast 200 response, and payment processing can take as long as needed.
- Why not just increase the timeout? You cannot control Stripe’s webhook timeout (it is fixed at 20 seconds). And even if you could, long-running synchronous webhook handlers are an anti-pattern because they hold connections open and reduce throughput.
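The idempotency check and the receive/process split described above can be sketched together. Everything here is an in-memory stand-in: the `set` replaces Redis or Cosmos DB, the `deque` replaces a Service Bus queue, and the event fields (`id`, `customer`, `amount`) are hypothetical rather than Stripe's actual payload schema.

```python
from collections import deque

processed_event_ids = set()   # stand-in for Redis or Cosmos DB
payment_queue = deque()       # stand-in for a Service Bus queue
charges = []


def receive_webhook(event):
    """Function A: enqueue the payload and acknowledge immediately."""
    payment_queue.append(event)
    return 200


def process_queue():
    """Function B: process each event at most once, keyed by event ID."""
    while payment_queue:
        event = payment_queue.popleft()
        if event["id"] in processed_event_ids:
            continue                      # duplicate delivery, skip it
        charges.append((event["customer"], event["amount"]))
        processed_event_ids.add(event["id"])


event = {"id": "evt_001", "customer": "cus_42", "amount": 1999}
receive_webhook(event)
receive_webhook(event)        # simulated retry of the same webhook
process_queue()
print(charges)
```

Even though the webhook arrives twice, only one charge is recorded. In a real system the "check, then record" step would need to be atomic in the backing store so that two workers cannot both pass the check.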
When should you NOT use serverless? Give me three scenarios where Azure Functions is the wrong choice.
- Scenario 1, long-running processes (over 10 minutes): On the Consumption plan, executions time out after 5 minutes by default and 10 minutes at most (the Premium plan allows up to 60 minutes). If you need to process a 2-hour video encoding job, Functions will time out. Use Azure Container Instances or AKS batch jobs instead. Durable Functions can orchestrate long workflows, but each individual activity function still has the timeout constraint.
- Scenario 2, consistent high throughput with predictable traffic: If your workload runs 24/7 at a steady 1,000 requests/second, serverless is more expensive than provisioned compute. At 1,000 req/s, that is about 2.6 billion executions/month. On the Consumption plan, execution charges alone come to roughly $520/month (2.6 billion × $0.20 per million), before GB-second charges, against roughly $74/month for provisioned compute. Serverless pricing wins at low/variable traffic but loses at sustained high throughput.
- Scenario 3 — Applications requiring persistent connections: WebSocket servers, game servers, or gRPC streaming services need long-lived connections. Functions are designed for short-lived request/response patterns. A WebSocket connection that stays open for 30 minutes per user does not fit the serverless model. Use App Service or AKS for connection-oriented workloads.
- Bonus scenario — Complex multi-step workflows with shared state: While Durable Functions handle orchestration, complex state machines with 20+ steps, compensation logic, and human approval gates are better served by Azure Logic Apps (visual designer, built-in connectors) or a dedicated workflow engine. Durable Functions code can become unmaintainable at scale.
Compare Azure Functions, AWS Lambda, and GCP Cloud Functions. A multi-cloud company asks which serverless platform to standardize on. What do you recommend?
- Azure Functions strengths: Best .NET support (first-class C# experience), Durable Functions for stateful orchestration (no equivalent in Lambda or Cloud Functions), and deep integration with Azure services via bindings. Premium plan with VNet integration is unique — Lambda requires NAT Gateway for VPC access. Supports running on Kubernetes via KEDA for hybrid scenarios.
- AWS Lambda strengths: Largest ecosystem, most third-party integrations, SnapStart for Java cold start reduction, Lambda@Edge for CDN-level execution, and Graviton2 (ARM) support for 20% cost savings. Most mature platform with the largest community.
- GCP Cloud Functions strengths: Best integration with BigQuery and Pub/Sub for data pipeline triggers. Cloud Functions v2 runs on Cloud Run, giving you the same container runtime with serverless scaling. Cheapest for low-volume workloads.
- My recommendation for multi-cloud: Do not standardize on one serverless platform. Instead, use containers (Docker) as the portability layer and deploy to each cloud’s container serverless offering: Azure Container Apps, AWS App Runner, GCP Cloud Run. These all run standard Docker containers and your code is not locked to any cloud’s function trigger model. Reserve cloud-specific Functions for event-driven glue code (S3 triggers, Blob Storage triggers) that is inherently cloud-specific anyway.