Compute Services

This chapter will teach you everything about running applications in Azure, starting from absolute basics. We’ll explain what compute actually means, why different options exist, and how to choose and use them confidently.

What You’ll Learn

By the end of this chapter, you’ll understand:

What “compute” means in cloud computing (explained from scratch)
Why you need compute resources and what they do
The differences between VMs, App Service, Functions, and Containers
How to create and configure each compute type step-by-step
When to use each option (with real-world decision criteria)
How to optimize cost and performance
Best practices for production deployments

What is “Compute”? (Start Here if You’re Completely New)

Let’s start with the absolute basics. What does “compute” even mean?

The Simple Explanation

Compute = The ability to run your code

That’s it. When you write a program (a website, an app, a script), it needs to run SOMEWHERE. That “somewhere” needs:

CPU (Central Processing Unit): The brain that executes your code
Memory (RAM): Temporary storage while your code runs
Storage (Disk): Where your code and data are stored permanently

Compute is just a fancy tech word for “a computer that runs your code.”

Real-World Analogy

Think about baking a cake:

Your recipe = Your code (the instructions)
The kitchen = Compute resources
- Oven = CPU (does the work)
- Counter space = RAM (workspace while cooking)
- Pantry = Storage (ingredients and supplies)

Without a kitchen, your recipe is useless. Without compute, your code can’t run.

Where Does Your Code Run?

Option 1: Your Laptop (Local)

Pros:
✅ Free (you already own it)
✅ Full control
✅ Easy to test

Cons:
❌ Only you can access it
❌ Goes offline when you close laptop
❌ Limited by your laptop's power
❌ If laptop dies, app goes down

Option 2: Your Company’s Server (On-Premises)

Pros:
✅ Your company controls it
✅ Can handle more traffic than a laptop
✅ Stays online 24/7 (if configured properly)

Cons:
❌ Expensive ($10,000+ upfront)
❌ Takes weeks to set up
❌ You manage everything (patches, hardware, backups)
❌ Can't easily scale (bought 1 server, stuck with 1 server)

Option 3: Azure Cloud (What We’re Learning)

Pros:
✅ No upfront cost (pay as you go)
✅ Deploy in minutes
✅ Scales automatically (1 server → 100 servers → 1 server)
✅ Microsoft manages hardware
✅ Available worldwide

Cons:
❌ Monthly cost (but often cheaper than on-premises)
❌ Need to learn Azure (that's why you're here!)

Why Azure Has Multiple “Compute” Services

Why not just one type of compute? Because different apps have different needs.

Analogy: Transportation

You wouldn’t use the same vehicle for every trip:

Going to grocery store → Walk or bike
Commute to work → Car or bus
Moving furniture → Truck
International trip → Airplane

Similarly, different apps need different compute:

Simple website → App Service (like a car: easy, sufficient for most)
Custom legacy app → Virtual Machine (like a truck: heavy-duty, more control)
Process one task → Azure Function (like an Uber: pay per ride)
Complex microservices → Kubernetes (like a fleet of vehicles: orchestrated system)

What is “Compute”?

In simple terms, compute is the processing power that runs your applications—the CPU, memory, and resources that execute your code. Think of it as the “brain” of your application that processes requests, runs algorithms, and serves data to users. Azure offers multiple compute options, each designed for different scenarios. Choosing the right one impacts cost, performance, and operational overhead. This chapter will teach you what each service is, why you’d choose it, and how to use it effectively—from complete beginner to production-ready deployments.

Understanding the Compute-to-Application Relationship

Let’s make this concrete with a real example.

Example: Building a Blog Website

Your Blog Application Needs:
1. Web Server
   - Receives HTTP requests (user visits http://yourblog.com)
   - Sends back HTML pages
   - Compute needed: CPU to process requests, RAM to hold data

2. Database
   - Stores your blog posts, comments, users
   - Compute needed: CPU to query data, RAM to cache queries

3. File Storage
   - Stores images, videos you upload
   - Compute needed: Minimal (just storage, not much processing)

Where does this run?
- Without Azure: You set up a server in your closet
- With Azure: You rent compute resources in Microsoft's datacenter

The Flow:

User visits yourblog.com
    ↓
Request goes to Azure datacenter
    ↓
Azure Compute (your rented "computer") receives request
    ↓
Your application code runs on that compute
    ↓
Code fetches blog post from database
    ↓
Code generates HTML page
    ↓
Compute sends HTML back to user
    ↓
User sees your blog

All of this happens on Azure Compute in milliseconds.
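
The flow above can be sketched in a few lines of Python. This is only an illustration, not how Azure itself is implemented: the `POSTS` dictionary stands in for the database, and the names `render_post`, `BlogHandler`, and `serve` are made up for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the database (a real blog would query a database service).
POSTS = {"hello-world": "Hello from my first post!"}


def render_post(slug: str) -> tuple[int, str]:
    """Fetch a post and generate the HTML page for it."""
    body = POSTS.get(slug)
    if body is None:
        return 404, "<h1>Post not found</h1>"
    return 200, f"<h1>{slug}</h1><p>{body}</p>"


class BlogHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # The request has reached the compute; now the application code runs.
        status, html = render_post(self.path.strip("/"))
        payload = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)  # The compute sends the HTML back to the user.


def serve(port: int = 8000) -> None:
    """Start the server locally (port 8000 is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), BlogHandler).serve_forever()
```

Calling `serve()` on your laptop and opening http://127.0.0.1:8000/hello-world walks through the same steps: receive the request, run the code, fetch the data, return HTML. On Azure, the only difference is whose computer runs it.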

Breaking Down “Compute Resources”

When you rent compute in Azure, you’re actually renting these components:

1. CPU (vCPU - Virtual CPU)

What it is: Processing power. Measured in “cores.”
What it does: Executes your code, one instruction at a time
Analogy: Workers in a kitchen
- 1 vCPU = 1 worker (handles 1 task at a time)
- 4 vCPUs = 4 workers (handles 4 tasks simultaneously)
Example: Blog with 10 visitors → 1 vCPU sufficient
Example: Blog with 10,000 visitors → 8 vCPUs needed

2. Memory (RAM)

What it is: Temporary storage while your app runs
What it does: Holds data currently being processed
Analogy: Counter space in a kitchen
- More RAM = More space to work with multiple things at once
- Less RAM = Must finish one task before starting another
Example: Blog loads 10 posts → Needs 100 MB RAM
Example: Blog loads 1000 posts → Needs 2 GB RAM

3. Storage (Disk)

What it is: Permanent storage for your code and data
What it does: Stores files even when compute is turned off
Analogy: Pantry or closet (long-term storage)
Example: Blog application code → 500 MB
Example: 1,000 blog posts with images → 10 GB

4. Network

What it is: Bandwidth for sending/receiving data
What it does: Transfers data between user and your app
Analogy: Internet connection speed
Example: Small blog → 1 Mbps sufficient
Example: Video streaming site → 1 Gbps+ needed

Understanding the Compute Spectrum

Before diving into specific services, let’s understand the fundamental question: What do you need to run?

The Evolution of Compute Needs

Traditional On-Premises (Before Cloud):

You buy physical servers:
- Pay upfront ($10,000+)
- Takes weeks to arrive
- Fixed capacity (can't scale)
- You manage everything (OS, patches, hardware)
- If server breaks, you're down until it's fixed

Cloud Computing (Azure):

You rent virtual servers:
- Pay per hour/second (no upfront cost)
- Deploy in minutes
- Scale up/down instantly
- Microsoft manages hardware
- Automatic redundancy (if one fails, another takes over)

Why Multiple Compute Options?

Different applications have different needs:

Application Type	Needs	Azure Service
Simple website	Just run code, don’t care about OS	App Service
Legacy application	Needs specific OS version, custom software	Virtual Machines
Microservices	Need to orchestrate many containers	Azure Kubernetes Service (AKS)
Event-driven	Run code only when triggered (e.g., file upload)	Azure Functions
Quick task	Run a container once, no orchestration	Container Instances

Real-World Analogy:

Virtual Machine = Renting an entire apartment (full control, more responsibility)
App Service = Renting a furnished room (less control, less responsibility, easier)
Azure Functions = Using a hotel room for one night (pay only when you use it)

[!TIP] Jargon Alert: Compute “Compute” is just a fancy word for “processing power” or “the ability to run code.” When someone says “compute resources,” they mean CPU, memory, and the servers that run your applications. Don’t let the word intimidate you—it’s just tech jargon for “the stuff that runs your code.”

[!WARNING] Gotcha: Choosing the Wrong Compute Service Many beginners choose VMs because they’re familiar, but VMs are often overkill. If you’re building a simple web app, use App Service. You’ll save time, money, and headaches. Only use VMs if you truly need full OS control.

Key Concepts You Must Understand

1. IaaS vs PaaS vs Serverless

These terms define how much Microsoft manages for you:

IaaS (Infrastructure as a Service)
PaaS (Platform as a Service)
Serverless (Function as a Service)

IaaS (Infrastructure as a Service)

You manage: OS, runtime, applications, data
Microsoft manages: Hardware, networking, datacenter
Example: Virtual Machines
Analogy: You rent a plot of land. You build the house, install plumbing, electricity, everything. The landlord just provides the land.

When to use:

Need specific OS version (Windows Server 2012 R2)
Legacy applications that can’t run on PaaS
Full control over the environment
Compliance requirements (need to manage security patches yourself)

Trade-off: More control = More responsibility (you patch OS, manage security, handle failures)

2. Stateless vs Stateful Applications

Stateless: Application doesn’t store session data on the server. Each request is independent. Example: A REST API that processes requests. If the server restarts, no data is lost because state is stored in a database.

# Stateless API
@app.route('/api/users/<user_id>')
def get_user(user_id):
    # No session data stored on server
    # Fetches from database each time
    return db.get_user(user_id)

Stateful: Application stores session data in memory. If server restarts, session is lost. Example: A game server that keeps player positions in memory.

# Stateful game server
player_positions = {}  # Stored in memory

def update_position(player_id, x, y):
    player_positions[player_id] = (x, y)  # Lost if server restarts

Why This Matters:

Stateless apps can scale horizontally easily (add more servers, load balance)
Stateful apps need sticky sessions or external state storage (Redis, database)

Best Practice: Make applications stateless. Store state in databases, Redis, or Cosmos DB.
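
To see why external state makes scaling easy, here is a small sketch. The `SessionStore` class is a hypothetical in-memory stand-in for Redis or a database, and `StatelessServer` is an invented name; the point is only that two servers sharing one store can each handle any request.

```python
class SessionStore:
    """Stand-in for an external store such as Redis or a database."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    def load(self, session_id: str) -> dict:
        return dict(self._data.get(session_id, {}))

    def save(self, session_id: str, session: dict) -> None:
        self._data[session_id] = dict(session)


class StatelessServer:
    """Keeps nothing between requests; all session state lives in the store."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def add_to_cart(self, session_id: str, item: str) -> list[str]:
        session = self.store.load(session_id)
        cart = session.get("cart", []) + [item]
        session["cart"] = cart
        self.store.save(session_id, session)
        return cart


store = SessionStore()
server_1 = StatelessServer("server-1", store)
server_2 = StatelessServer("server-2", store)

server_1.add_to_cart("user-a", "book")
cart = server_2.add_to_cart("user-a", "pen")  # a different server takes the next request
print(cart)  # ['book', 'pen']
```

Because neither server remembers anything itself, a load balancer can send each request to whichever one is free.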

[!TIP] Jargon Alert: Stateless vs Stateful Stateless: Like ordering at a fast-food restaurant. Each order is independent—the cashier doesn’t remember your last order. The app doesn’t store session data on the server. Stateful: Like a sit-down restaurant where the waiter remembers your preferences. The app stores session data in memory. If the server restarts, that memory is lost.

[!WARNING] Gotcha: Stateful Applications Don’t Scale If your app stores user sessions in memory, you can’t easily add more servers. User A’s session is on Server 1, but the load balancer might send them to Server 2, which doesn’t have their session. Always use external storage (database, Redis) for state.

3. Horizontal vs Vertical Scaling

Vertical Scaling (Scale Up):

Make the server bigger (more CPU, more RAM)
Example: Upgrade from 4 vCPU to 8 vCPU
Limitation: Can only scale to maximum VM size
Cost: More expensive per unit

Horizontal Scaling (Scale Out):

Add more servers (2 servers → 4 servers → 10 servers)
Example: Add more VM instances to handle traffic
Advantage: Can scale to hundreds/thousands of servers
Cost: More cost-effective at scale

Real-World Example:

Scenario: Your website gets 10x more traffic

Vertical Scaling:
- Upgrade VM from 4 vCPU to 16 vCPU
- Cost: $200/month → $800/month
- Limitation: Capped by the largest VM size available, and one VM is still a single point of failure

Horizontal Scaling:
- Add 3 more VMs (4 total, each with 4 vCPU)
- Cost: $200/month → $800/month (same cost, but 4x redundancy)
- Advantage: If one VM fails, 3 others still serve traffic

Best Practice: Design for horizontal scaling from day one. Use load balancers, stateless applications, and autoscaling.
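
The redundancy argument is easy to check with arithmetic. The sketch below uses the VM counts from the scenario above; the function name `surviving_capacity` is made up for this example.

```python
def surviving_capacity(vm_count: int, vcpus_per_vm: int, failed_vms: int = 1) -> int:
    """Total vCPUs still serving traffic after `failed_vms` instances go down."""
    healthy = max(vm_count - failed_vms, 0)
    return healthy * vcpus_per_vm


scale_up = surviving_capacity(vm_count=1, vcpus_per_vm=16)
scale_out = surviving_capacity(vm_count=4, vcpus_per_vm=4)

print(scale_up)   # 0  -> one big VM fails, the site is down
print(scale_out)  # 12 -> one of four VMs fails, 75% of capacity remains
```

Both designs provide 16 vCPUs when healthy, but only the scaled-out design keeps serving traffic after a single failure.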

1. Compute Decision Tree

[!WARNING] Gotcha: Spot VM Eviction Spot VMs offer huge discounts (up to 90%), but Azure can take them back with only a 30-second warning. Never use them for production databases or critical APIs—only for stateless batch jobs that can fail and restart.

[!TIP] Jargon Alert: SLA (Service Level Agreement) Microsoft’s financial guarantee of uptime (e.g., 99.9%). If they miss it, you get a bill credit. Note: Single VMs often have a lower SLA than multiple VMs deployed in an Availability Set or Zone.

2. Virtual Machines Deep Dive

What is a Virtual Machine?

A Virtual Machine (VM) is a software-based computer that runs on physical hardware. Think of it like this: Azure has massive physical servers in datacenters. They use virtualization technology to split one physical server into multiple “virtual” servers. Each VM gets its own CPU, RAM, and storage, isolated from other VMs.

Why Use VMs?

Full Control: You have root/admin access. Install any software, configure anything.
Legacy Applications: Run old applications that require specific OS versions or configurations.
Custom Requirements: Need specific drivers, software, or configurations that PaaS doesn’t support.
Compliance: Some regulations require you to manage the OS yourself.

Under the Hood: How Azure Compute Works

To a Principal Engineer, a “VM” isn’t just a virtual computer; it’s a slice of a massive, distributed system. Here is what’s happening behind the scenes.

1. The Fabric Controller (The Brain)

Azure doesn’t have humans plugging in servers when you click “Create”. It uses the Fabric Controller.

It maintains a map of every physical server in the datacenter.
It tracks CPU/RAM utilization and hardware health.
When you request a VM, it finds a physical host with enough “white space” (available resources) and sends a command to create your VM.
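
As a toy model of that placement step, the sketch below picks the first healthy host with enough free capacity. This is a simple first-fit strategy chosen for illustration; the real Fabric Controller's algorithm is not described here, and the `Host` class and `place_vm` function are invented names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    free_vcpus: int
    free_ram_gb: int
    healthy: bool = True


def place_vm(hosts: list[Host], vcpus: int, ram_gb: int) -> Optional[str]:
    """Reserve capacity on the first healthy host that fits (first-fit)."""
    for host in hosts:
        if host.healthy and host.free_vcpus >= vcpus and host.free_ram_gb >= ram_gb:
            host.free_vcpus -= vcpus
            host.free_ram_gb -= ram_gb
            return host.name
    return None  # no host has enough white space


hosts = [
    Host("host-a", free_vcpus=1, free_ram_gb=8),                   # too few vCPUs
    Host("host-b", free_vcpus=8, free_ram_gb=32, healthy=False),   # unhealthy
    Host("host-c", free_vcpus=8, free_ram_gb=32),
]
chosen = place_vm(hosts, vcpus=2, ram_gb=4)
print(chosen)  # host-c
```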

2. The Hypervisor (The Gatekeeper)

Every physical host runs a custom version of Hyper-V.

It provides strict isolation between VMs. VM-A cannot see VM-B’s memory, even though they sit on the same physical chip.
It manages the vCPU scheduling. If you have a 2-vCPU VM, the hypervisor ensures you get your fair share of time on the physical CPU cores.

Real-World Analogy: The hypervisor is like an apartment building manager. Each tenant (VM) has their own apartment with walls between them. The manager allocates a fair share of shared resources (elevator time, water pressure, parking) but ensures no tenant can peek into another’s apartment. If one tenant makes too much noise (consumes too much CPU), the manager throttles them rather than letting them disturb everyone else.

3. Service Healing (Self-Correcting Infrastructure)

What happens if the physical server hosting your VM catches fire?

The Fabric Controller detects the heartbeats are missing.
It immediately marks that host as “failed”.
It finds a new, healthy physical host in the same cluster.
It “respawns” your VM on the new host and re-attaches your Managed Disks.
Your VM reboots automatically. This is why “Managed Disks” are critical—they live on the storage network, not the physical host, so they can be moved instantly.
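
The healing steps above can be sketched as a toy simulation. The timeout value, host names, and the round-robin choice of a new host are all assumptions made for this example, not Azure's actual behavior.

```python
def find_failed_hosts(last_heartbeat: dict[str, float], now: float, timeout_s: float) -> list[str]:
    """Hosts whose most recent heartbeat is older than `timeout_s` seconds."""
    return sorted(h for h, seen in last_heartbeat.items() if now - seen > timeout_s)


def heal(placements: dict[str, str], failed: list[str], healthy: list[str]) -> dict[str, str]:
    """Move every VM on a failed host to a healthy host, round-robin.

    Disks are assumed to live on the storage network, so only the
    VM-to-host mapping has to change.
    """
    moved = dict(placements)
    targets = [h for h in healthy if h not in failed]
    if not targets:
        return moved
    next_target = 0
    for vm, host in placements.items():
        if host in failed:
            moved[vm] = targets[next_target % len(targets)]
            next_target += 1
    return moved


heartbeats = {"host-a": 100.0, "host-b": 70.0, "host-c": 98.0}
failed = find_failed_hosts(heartbeats, now=101.0, timeout_s=15.0)
placements = heal({"vm-1": "host-b", "vm-2": "host-a"}, failed, healthy=["host-a", "host-c"])

print(failed)      # ['host-b']
print(placements)  # {'vm-1': 'host-a', 'vm-2': 'host-a'}
```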

[!IMPORTANT] Pro Insight: Availability Sets vs. Zones

Availability Sets ensure your VMs are on different Racks (Power/Network) in the same building.

Availability Zones ensure your VMs are in different Buildings (miles apart). Always use Availability Zones for production to survive a complete datacenter power failure.

When NOT to Use VMs:

Simple web applications (use App Service instead)
You just want to deploy code quickly (use PaaS)
You don’t want to manage OS patches (use PaaS)

[!WARNING] Gotcha: VM Management Overhead VMs require ongoing maintenance: OS patches, security updates, monitoring, backups. If you’re not prepared to manage this, use PaaS (App Service, Azure Functions). Many teams choose VMs thinking they’ll have “more control,” but end up spending 50% of their time on maintenance instead of building features.

[!TIP] Jargon Alert: Virtual Machine (VM) A VM is a software-based computer running on physical hardware. Think of it like this: Azure has massive physical servers. They use virtualization technology to split one physical server into multiple “virtual” servers. Each VM thinks it’s a real computer with its own CPU, RAM, and storage.

Understanding VM Components

Before choosing a VM size, you need to understand what you’re buying:

vCPU (Virtual CPU)
Memory (RAM)
Storage (Disks)
Network

vCPU (Virtual CPU)

What it is: Processing power. More vCPUs = can handle more concurrent operations.

Real-World Analogy: Like having more workers. 1 worker can handle 1 task at a time. 4 workers can handle 4 tasks simultaneously.

How to Choose:

1-2 vCPU: Small websites, dev/test environments
4-8 vCPU: Medium web applications, small databases
16+ vCPU: Large databases, high-traffic applications, data processing

Common Mistake: Over-provisioning. If your app uses 20% CPU, you don’t need 16 vCPUs. Start small, monitor, then scale up.

[!WARNING] Gotcha: Over-Provisioning Costs Money Many beginners choose the biggest VM “to be safe.” A 16 vCPU VM costs $800/month. If you only use 20% CPU, you're wasting $640/month. Start with 2-4 vCPUs, monitor for a week, then scale up if needed. Azure makes it easy to resize VMs.
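
The waste figure is just the unused share of the bill. A quick sketch, using the example price from the warning above (the helper name `wasted_spend` is made up, and treating unused CPU as proportional waste is a simplification):

```python
def wasted_spend(monthly_cost: float, avg_utilization_pct: float) -> float:
    """Monthly spend on capacity that sits idle, given average CPU utilization."""
    if not 0 <= avg_utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100")
    return monthly_cost * (100 - avg_utilization_pct) / 100


print(wasted_spend(800, 20))  # 640.0
```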

VM Size Families

General Purpose
Compute Optimized
Memory Optimized
Storage Optimized
GPU

General Purpose: B, D, DC, DS series. Balanced CPU:Memory ratio (1:4)

Use cases:
- Web servers
- Small to medium databases
- Development/test environments
- Low to medium traffic apps

Examples:
- Standard_B2s: 2 vCPU, 4 GB RAM (Burstable)
- Standard_D4s_v5: 4 vCPU, 16 GB RAM

B-series (Burstable):

Accumulate CPU credits when idle
Burst to 100% when needed
Cost-effective for variable workloads
Perfect for dev/test

Practical Tip: B-series VMs are the secret weapon for learning Azure. A B1s (about $5/month) is included in the Azure free tier for 12 months and can run a small Linux web server comfortably. But never use B-series for production databases or latency-sensitive APIs: once you exhaust your CPU credits (a freshly created B1s can do this after roughly 30 minutes of sustained full load), the VM is throttled back to its baseline CPU (about 10% of a vCPU on a B1s), making your app feel like it is running on a calculator.
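
A rough way to estimate when a burstable VM runs out of credits: one credit lets one vCPU run at 100% for one minute, credits are earned at the baseline rate, and they are spent at the actual load. The numbers below (30 banked credits, 10% baseline) are illustrative assumptions for a small single-vCPU size, so check the published figures for the size you actually use.

```python
def minutes_until_throttled(banked_credits: float, baseline_pct: float, load_pct: float) -> float:
    """Minutes a single vCPU can run at `load_pct` before banked credits hit zero."""
    drain_per_minute = (load_pct - baseline_pct) / 100
    if drain_per_minute <= 0:
        return float("inf")  # at or below baseline, credits never run out
    return banked_credits / drain_per_minute


print(round(minutes_until_throttled(30, 10, 100), 1))  # 33.3
print(minutes_until_throttled(30, 10, 5))              # inf
```

At full load the VM spends credits much faster than it earns them, which is why sustained workloads do not belong on B-series.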

Compute Optimized: F, FX series. High CPU:Memory ratio (1:2)

Use cases:
- Application servers
- Batch processing
- Analytics
- Gaming servers

Examples:
- Standard_F8s_v2: 8 vCPU, 16 GB RAM
- High CPU performance
- Lower cost per vCPU

Memory Optimized: E, M series. High Memory:CPU ratio (8:1 or higher)

Use cases:
- Relational databases (SQL Server, Oracle)
- In-memory caches (Redis)
- Analytics (SAP HANA)

Examples:
- Standard_E8s_v5: 8 vCPU, 64 GB RAM
- Standard_M64s: 64 vCPU, 1 TB RAM

Storage Optimized: L series. High local disk throughput

Use cases:
- NoSQL databases (Cassandra, MongoDB)
- Data warehousing
- Big data applications

Examples:
- Standard_L8s_v3: 8 vCPU, 64 GB RAM, 1.92 TB NVMe

GPU: NC, ND, NV series. GPU-accelerated compute

Use cases:
- Machine learning training
- AI inference
- Graphics rendering
- Video encoding

Examples:
- Standard_NC6s_v3: 6 vCPU, 112 GB RAM, Tesla V100
- Standard_ND40rs_v2: 40 vCPU, 672 GB RAM, 8x V100

Step-by-Step: Creating Your First VM

Let’s create a VM from scratch, explaining every step and why we’re doing it:

Prerequisites

Before creating a VM, you need:

Azure Account: Sign up at portal.azure.com (free tier works)
Resource Group: A container for your resources (like a folder)
Virtual Network: A network for your VM to connect to (like a LAN)

Step 1: Create Resource Group

What is a Resource Group? Think of it as a folder that contains related resources. All resources in a group can be managed together (delete the group = delete all resources).

# Create resource group
az group create \
  --name rg-learn-vm \
  --location eastus

# What this does:
# --name: Name of the resource group (must be unique in your subscription)
# --location: Azure region where resources will be created
#   - eastus = East US (Virginia) - good for US East Coast
#   - westeurope = West Europe (Netherlands) - good for Europe
#   - southeastasia = Southeast Asia (Singapore) - good for Asia

Why eastus? It’s typically one of the cheaper regions and offers nearly every Azure service. For production, choose the region closest to your users.

Step 2: Create Virtual Network

What is a Virtual Network (VNet)? Think of it as your private network in Azure. VMs in the same VNet can communicate with each other privately (like computers on the same WiFi network).

# Create virtual network
az network vnet create \
  --resource-group rg-learn-vm \
  --name vnet-learn \
  --address-prefix 10.0.0.0/16 \
  --subnet-name default \
  --subnet-prefix 10.0.1.0/24

# What this does:
# --address-prefix 10.0.0.0/16: 
#   - Defines the network range (10.0.0.0 to 10.0.255.255)
#   - /16 means first 16 bits are network, last 16 bits are hosts
#   - Can have up to 65,536 IP addresses (2^16)
# --subnet-prefix 10.0.1.0/24:
#   - Subnet is a smaller network within the VNet
#   - /24 means first 24 bits are network, last 8 bits are hosts
#   - Contains 256 IP addresses (2^8); Azure reserves 5 in every subnet, leaving 251 usable
#   - VMs will get IPs like 10.0.1.4, 10.0.1.5, etc. (.0 through .3 are reserved)

Why 10.0.0.0/16? This is a private IP range (RFC 1918). It won’t conflict with public internet IPs. Common choices:

10.0.0.0/16 (10.0.0.0 - 10.0.255.255) - 65,536 IPs
172.16.0.0/12 (172.16.0.0 - 172.31.255.255) - 1 million IPs
192.168.0.0/16 (192.168.0.0 - 192.168.255.255) - 65,536 IPs
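
You can check these ranges yourself with Python's built-in `ipaddress` module. The sketch below uses the VNet and subnet prefixes from Step 2; the constant for Azure's five reserved addresses per subnet is spelled out as a named value.

```python
import ipaddress

vnet = ipaddress.ip_network("10.0.0.0/16")
subnet = ipaddress.ip_network("10.0.1.0/24")

print(vnet.num_addresses)      # 65536
print(subnet.num_addresses)    # 256
print(subnet.subnet_of(vnet))  # True
print(vnet.is_private)         # True

# Azure keeps the first four addresses and the last one in every subnet.
AZURE_RESERVED_PER_SUBNET = 5
usable = subnet.num_addresses - AZURE_RESERVED_PER_SUBNET
first_vm_ip = subnet.network_address + 4

print(usable)       # 251
print(first_vm_ip)  # 10.0.1.4
```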

Step 3: Create Network Security Group (NSG)

What is an NSG? A firewall that controls traffic to/from your VM. By default, an NSG blocks inbound traffic from the internet (traffic from inside the same VNet is allowed). You need to explicitly allow ports (like port 22 for SSH, port 3389 for RDP).

# Create NSG
az network nsg create \
  --resource-group rg-learn-vm \
  --name nsg-learn

# Allow SSH (port 22) from anywhere
# WARNING: In production, restrict to your IP only!
az network nsg rule create \
  --resource-group rg-learn-vm \
  --nsg-name nsg-learn \
  --name AllowSSH \
  --priority 1000 \
  --protocol Tcp \
  --direction Inbound \
  --source-address-prefixes '*' \
  --source-port-ranges '*' \
  --destination-address-prefixes '*' \
  --destination-port-ranges 22 \
  --access Allow

# What this does:
# --priority 1000: Lower number = higher priority (evaluated first)
# --protocol Tcp: Allow TCP protocol (SSH uses TCP)
# --direction Inbound: Rule applies to incoming traffic
# --source-address-prefixes '*': Allow from any IP (NOT secure for production!)
# --destination-port-ranges 22: Allow traffic to port 22 (SSH)
# --access Allow: Allow this traffic (vs Deny)

Security Best Practice: Instead of '*', use your IP:

--source-address-prefixes 'YOUR_IP_ADDRESS/32'
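
To build intuition for how priorities work, here is a simplified model of rule evaluation: rules are checked from the lowest priority number upward, the first match decides, and anything unmatched is denied. It only models inbound internet traffic on single ports, and the `Rule` class, `is_allowed` function, and example IP addresses are made up for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int
    source: str  # a CIDR range, or "*" for any source
    port: int
    allow: bool


def is_allowed(rules: list[Rule], source_ip: str, port: int) -> bool:
    """Evaluate rules in priority order; the first matching rule wins."""
    for rule in sorted(rules, key=lambda r: r.priority):
        source_ok = rule.source == "*" or (
            ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule.source)
        )
        if source_ok and rule.port == port:
            return rule.allow
    return False  # nothing matched: inbound internet traffic is denied


rules = [
    Rule("AllowSSH", 1000, "203.0.113.10/32", 22, True),  # SSH from one IP only
    Rule("AllowHTTP", 1001, "*", 80, True),               # HTTP from anywhere
]

print(is_allowed(rules, "203.0.113.10", 22))  # True
print(is_allowed(rules, "198.51.100.7", 22))  # False
print(is_allowed(rules, "198.51.100.7", 80))  # True
```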

Step 4: Create Public IP Address

What is a Public IP? An IP address accessible from the internet. Without this, you can’t connect to your VM from outside Azure.

# Create public IP
az network public-ip create \
  --resource-group rg-learn-vm \
  --name pip-learn-vm \
  --allocation-method Static \
  --sku Standard

# What this does:
# --allocation-method Static: IP address doesn't change (vs Dynamic)
# --sku Standard: Standard SKU (Standard public IPs always use static allocation)

Static vs Dynamic:

Static: IP address never changes (good for DNS records, firewall rules)
Dynamic: IP address can change when VM is stopped/started (cheaper, but less reliable)

Step 5: Create Network Interface (NIC)

What is a NIC? The network card that connects your VM to the network. It connects the VM to the VNet, NSG, and Public IP.

# Create NIC
az network nic create \
  --resource-group rg-learn-vm \
  --name nic-learn-vm \
  --vnet-name vnet-learn \
  --subnet default \
  --network-security-group nsg-learn \
  --public-ip-address pip-learn-vm

# What this does:
# --vnet-name: Connect to the VNet we created
# --subnet: Connect to the subnet (default)
# --network-security-group: Attach the NSG (firewall rules)
# --public-ip-address: Attach the public IP (for internet access)

Step 6: Create the Virtual Machine

Now we create the actual VM:

# Create VM
az vm create \
  --resource-group rg-learn-vm \
  --name vm-learn \
  --location eastus \
  --nics nic-learn-vm \
  --image Ubuntu2204 \
  --size Standard_B2s \
  --admin-username azureuser \
  --generate-ssh-keys \
  --authentication-type ssh

# What this does:
# --name: Name of the VM (must be unique in resource group)
# --nics: Attach the network interface we created
# --image Ubuntu2204: Use Ubuntu 22.04 LTS
#   (older CLI versions accepted the UbuntuLTS alias, which has since been removed;
#    run `az vm image list --output table` to see the aliases your CLI supports)
#   Alternatives (exact alias names vary by CLI version):
#     - Win2019Datacenter (Windows Server 2019)
#     - RHEL (Red Hat Enterprise Linux)
#     - CentOS
# --size Standard_B2s: VM size (2 vCPU, 4 GB RAM, Burstable)
# --admin-username: Username for SSH login
# --generate-ssh-keys: Automatically generate SSH key pair
#   - Creates ~/.ssh/id_rsa (private key) and ~/.ssh/id_rsa.pub (public key)
#   - Public key is added to VM for passwordless login
# --authentication-type ssh: Use SSH keys (more secure than passwords)

What happens during VM creation?

Azure allocates hardware in a datacenter
Creates the VM with specified CPU/RAM
Attaches the OS disk (contains Ubuntu)
Connects to the network (via NIC)
Boots the VM
Installs your SSH public key
VM is ready in 2-5 minutes

Step 7: Connect to Your VM

# Get the public IP address
az vm show \
  --resource-group rg-learn-vm \
  --name vm-learn \
  --show-details \
  --query publicIps \
  --output tsv

# Connect via SSH (replace with your IP)
ssh azureuser@<PUBLIC_IP>

# Example:
# ssh azureuser@20.123.45.67

First-time connection: You’ll see a message asking to verify the host. Type yes and press Enter.

Step 8: Verify VM is Working

Once connected, run these commands to verify everything works:

# Check OS version
cat /etc/os-release

# Check CPU and memory
free -h
nproc

# Check disk space
df -h

# Check network
ip addr show
ping -c 3 8.8.8.8  # Test internet connectivity

Step 9: Install Software (Example: Nginx Web Server)

# Update package list
sudo apt update

# Install Nginx
sudo apt install -y nginx

# Start Nginx
sudo systemctl start nginx

# Enable Nginx to start on boot
sudo systemctl enable nginx

# Check status
sudo systemctl status nginx

Open port 80 in NSG (to access web server):

az network nsg rule create \
  --resource-group rg-learn-vm \
  --nsg-name nsg-learn \
  --name AllowHTTP \
  --priority 1001 \
  --protocol Tcp \
  --direction Inbound \
  --source-address-prefixes '*' \
  --destination-port-ranges 80 \
  --access Allow

Now visit http://<PUBLIC_IP> in your browser. You should see the Nginx welcome page!

Step 10: Clean Up (Important!)

Always delete resources when done to avoid charges:

# Delete the entire resource group (deletes all resources)
az group delete \
  --name rg-learn-vm \
  --yes \
  --no-wait

# What this does:
# --yes: Don't ask for confirmation
# --no-wait: Don't wait for deletion to complete (runs in background)

Cost: A Standard_B2s VM costs ~$30/month if left running. Always delete when not in use!

[!WARNING] Gotcha: VM Costs Add Up Quickly A single VM might cost $30/month, but if you forget to delete 10 VMs, that's $300/month wasted. Always set up cost alerts and tag resources with “Owner” so you know who to contact. Use Azure Cost Management to find and delete unused resources.

[!TIP] Jargon Alert: Deallocate vs Stop Stop (in OS): Shuts down the operating system, but Azure still reserves the hardware. You’re still charged for compute! Deallocate: Releases the hardware back to Azure. No compute charges, only storage charges. Always deallocate VMs when not in use.

[!INFO] Aside: Azure Automation for Cost Savings Use Azure Automation to automatically stop (deallocate) dev/test VMs at 6 PM and start them at 8 AM. This saves roughly 60% of compute costs if the schedule runs every day, and about 70% if the VMs also stay off on weekends.
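
The savings from a start/stop schedule follow directly from the hours the VM is running. A minimal sketch, assuming compute is billed only while the VM is allocated (the function name `savings_pct` is made up):

```python
HOURS_PER_WEEK = 7 * 24  # 168


def savings_pct(running_hours_per_week: float) -> float:
    """Percent of compute cost saved compared with running 24/7."""
    return round(100 * (1 - running_hours_per_week / HOURS_PER_WEEK), 1)


every_day = savings_pct(7 * 10)      # 8 AM to 6 PM, seven days a week
weekdays_only = savings_pct(5 * 10)  # 8 AM to 6 PM, Monday to Friday

print(every_day)      # 58.3
print(weekdays_only)  # 70.2
```

Disk storage is still billed while the VM is deallocated, so the saving on the total bill is somewhat smaller.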

Understanding VM Creation Options

When creating a VM, you make several important decisions:

Image (Operating System)
Size (VM Configuration)
Authentication

Image (Operating System)

What it is: The OS and software pre-installed on the VM.

Types:

Marketplace Images: Pre-configured OS (Ubuntu, Windows Server, RHEL)
Custom Images: Your own OS image (for consistent deployments)
Shared Image Gallery: Images shared across subscriptions

Common Choices:

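Ubuntu2204: Ubuntu 22.04 LTS (most popular for Linux; replaces the retired UbuntuLTS alias)
Win2019Datacenter: Windows Server 2019
RHEL: Red Hat Enterprise Linux (enterprise)
CentOS: Community version of RHEL (CentOS Linux has reached end of life, so prefer a supported distribution for new VMs)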

How to Choose:

Linux: Cheaper (no Windows license), better for web servers, APIs
Windows: Required for .NET Framework apps, Windows-specific software

Size (VM Configuration)

What it is: The amount of CPU, RAM, and storage allocated to the VM.

Naming Convention:

Standard_B2s
         │││
         ││└─ s = Premium Storage capable
         │└── Number of vCPUs (2)
         └─── Series (B = Burstable)
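
If you want to pull these parts out of a size name in a script, a regular expression is enough for the common cases. This is a sketch that handles names like the ones used in this chapter; it does not cover every variant Azure publishes (for example, constrained-vCPU sizes), and `parse_vm_size` is a made-up helper name.

```python
import re

SIZE_PATTERN = re.compile(
    r"^(?:Standard_)?"
    r"(?P<series>[A-Z]+)"        # family letters, e.g. B, D, NC
    r"(?P<vcpus>\d+)"            # number of vCPUs
    r"(?P<features>[a-z]*)"      # feature suffix letters, e.g. s
    r"(?:_(?P<version>v\d+))?$"  # optional generation, e.g. v5
)


def parse_vm_size(name: str) -> dict:
    """Split a VM size name into series, vCPU count, features, and version."""
    match = SIZE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized VM size name: {name}")
    parts = match.groupdict()
    parts["vcpus"] = int(parts["vcpus"])
    return parts


print(parse_vm_size("Standard_B2s"))
# {'series': 'B', 'vcpus': 2, 'features': 's', 'version': None}
print(parse_vm_size("Standard_D4s_v5"))
# {'series': 'D', 'vcpus': 4, 'features': 's', 'version': 'v5'}
```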

Common Sizes:

B2s: 2 vCPU, 4 GB RAM - Dev/test, small websites ($30/month)
D2s_v3: 2 vCPU, 8 GB RAM - Small production apps ($100/month)
D4s_v3: 4 vCPU, 16 GB RAM - Medium production apps ($200/month)
E8s_v3: 8 vCPU, 64 GB RAM - Large databases ($400/month)

How to Choose:

Start with smallest size (B2s for dev, D2s_v3 for prod)
Monitor CPU and memory usage
Scale up if consistently >70% utilization
Use Azure Advisor for recommendations

VM Pricing Models

Pay-as-you-go

No commitment, highest cost

Billed per second
Deallocate VM = stop compute charges
Storage still charged

Use for: Short-term, unpredictable workloads

Reserved Instances

1 or 3-year commitment

30-50% discount (1-year)
50-70% discount (3-year)
Can exchange for different size

Use for: Stable, long-running workloads

Spot VMs

Up to 90% discount

Can be evicted anytime
30-second warning
No SLA

Use for: Batch jobs, testing, fault-tolerant apps

Azure Hybrid Benefit

Use existing Windows licenses

Up to 40% discount
Requires Software Assurance
Windows Server + SQL Server

Use for: Migrations from on-premises

Managed Disks

Disk Types
Disk Caching
Performance Tuning

Disk Type	IOPS	Throughput	Use Case
Standard HDD	500	60 MB/s	Backup, non-critical
Standard SSD	500-6,000	60-750 MB/s	Web servers, dev/test
Premium SSD	120-20,000	25-900 MB/s	Production databases
Ultra Disk	Up to 160,000	Up to 4,000 MB/s	SAP HANA, top-tier SQL

Premium SSD Sizes:

P4:  32 GB,   120 IOPS,  25 MB/s
P10: 128 GB,  500 IOPS,  100 MB/s
P30: 1 TB,    5,000 IOPS, 200 MB/s
P80: 32 TB,   20,000 IOPS, 900 MB/s

Cache Options:

ReadOnly: Cache reads (default for data disks)
ReadWrite: Cache reads + writes (default for OS disks)
None: No caching (for write-heavy workloads)

When to use:

OS Disk: ReadWrite (best performance)
Data Disk (read-heavy): ReadOnly
Data Disk (write-heavy): None
Temp Disk: Don't store data (ephemeral)

Performance Impact:

With cache: 20,000 IOPS (cache + disk)
Without cache: 5,000 IOPS (disk only)

Maximize IOPS:

# 1. Use Premium SSD
az disk create \
  --name disk-data \
  --resource-group rg-prod \
  --size-gb 1024 \
  --sku Premium_LRS

# 2. Enable on-demand disk bursting (Premium SSD larger than 512 GiB, i.e. P30 and up)
az disk update \
  --name disk-data \
  --resource-group rg-prod \
  --enable-bursting true

# 3. Stripe multiple disks (RAID 0)
# Windows: Storage Spaces
# Linux: LVM or mdadm

# 4. Use Ultra Disk for extreme performance
az disk create \
  --name disk-ultra \
  --resource-group rg-prod \
  --size-gb 1024 \
  --sku UltraSSD_LRS \
  --disk-iops-read-write 50000 \
  --disk-mbps-read-write 2000

VM High Availability

Availability Sets

Protect against planned maintenance and hardware failures

Fault Domains: 2-3 (different racks)
Update Domains: Up to 20 (staggered updates)

SLA: 99.95% (2+ VMs in availability set)

Use when: Regional deployment, no zone support

Availability Zones

Protect against datacenter failures

Deploy VMs across 3 zones:
- Zone 1: VM 1, 4, 7
- Zone 2: VM 2, 5, 8
- Zone 3: VM 3, 6, 9

SLA: 99.99% (2+ VMs across zones)

Use when: Maximum availability, region supports zones

VM Scale Sets

Autoscaling group of identical VMs

Features:
- Autoscale (CPU, memory, schedule)
- Load balancer integration
- Rolling upgrades
- Instance protection

SLA: 99.95% (availability set) or 99.99% (zones)

Use when: Scalable, stateless applications
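
SLA percentages are easier to compare once you convert them into allowed downtime. A quick sketch, assuming a 30-day month (the function name `allowed_downtime_minutes` is made up):

```python
def allowed_downtime_minutes(sla_pct: float, days: int = 30) -> float:
    """Minutes of downtime per period that still meet the SLA."""
    total_minutes = days * 24 * 60
    return round(total_minutes * (100 - sla_pct) / 100, 2)


for sla in (99.9, 99.95, 99.99):
    print(f"{sla}% -> {allowed_downtime_minutes(sla)} minutes per 30 days")
```

Moving from 99.95% (Availability Sets) to 99.99% (Availability Zones) cuts the allowed downtime from about 21.6 minutes to about 4.3 minutes per month.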

3. VM Scale Sets

VM Scale Sets (VMSS) automatically scale identical VMs based on demand.

VMSS Architecture

Create VM Scale Set

# Create VMSS with autoscaling
az vmss create \
  --name vmss-web \
  --resource-group rg-prod \
  --image Ubuntu2204 \
  --vm-sku Standard_D2s_v3 \
  --instance-count 2 \
  --zones 1 2 3 \
  --vnet-name vnet-prod \
  --subnet snet-web \
  --lb lb-web \
  --backend-pool-name pool-web \
  --admin-username azureuser \
  --generate-ssh-keys

# Configure autoscale
az monitor autoscale create \
  --resource-group rg-prod \
  --resource vmss-web \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name autoscale-web \
  --min-count 2 \
  --max-count 10 \
  --count 2

# Scale out rule (CPU > 75%)
az monitor autoscale rule create \
  --resource-group rg-prod \
  --autoscale-name autoscale-web \
  --condition "Percentage CPU > 75 avg 5m" \
  --scale out 1

# Scale in rule (CPU < 25%)
az monitor autoscale rule create \
  --resource-group rg-prod \
  --autoscale-name autoscale-web \
  --condition "Percentage CPU < 25 avg 5m" \
  --scale in 1
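The two rules above boil down to a simple decision. The sketch below models it in a deliberately simplified way: it ignores cooldown periods and the 5-minute averaging window, and the function name and option names are made up for illustration.

```javascript
// Simplified autoscale decision: add 1 above the upper threshold, remove 1 below the
// lower threshold, and always stay within [min, max].
function nextInstanceCount(
  current,
  avgCpu,
  { min = 2, max = 10, scaleOutAbove = 75, scaleInBelow = 25 } = {}
) {
  let next = current;
  if (avgCpu > scaleOutAbove) {
    next = current + 1;
  } else if (avgCpu < scaleInBelow) {
    next = current - 1;
  }
  return Math.min(max, Math.max(min, next));
}

console.log(nextInstanceCount(2, 80));  // 3
console.log(nextInstanceCount(10, 90)); // 10 (already at max)
console.log(nextInstanceCount(2, 10));  // 2 (already at min)
```

Keeping a wide gap between the scale-out and scale-in thresholds (75 vs 25 here) helps avoid flapping, where a scale-in immediately pushes CPU back over the scale-out line.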

VMSS Rolling Upgrades

# Update VMSS image
az vmss update \
  --name vmss-web \
  --resource-group rg-prod \
  --set virtualMachineProfile.storageProfile.imageReference.version=latest

# Perform rolling upgrade
az vmss rolling-upgrade start \
  --name vmss-web \
  --resource-group rg-prod

# Monitor upgrade
az vmss rolling-upgrade get-latest \
  --name vmss-web \
  --resource-group rg-prod

Upgrade Policy:

Manual: You control when to upgrade
Rolling: Upgrade in batches (recommended)
Automatic: Upgrade immediately (risky)
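To make "upgrade in batches" concrete, here is a rough model of how a batch percentage splits instances into groups. The rounding rule and the 20% default are assumptions for illustration; the real platform logic may differ.

```javascript
// Illustrative only: split instance IDs into upgrade batches of at most maxBatchPercent each.
function planBatches(instanceIds, maxBatchPercent = 20) {
  if (maxBatchPercent <= 0 || maxBatchPercent > 100) {
    throw new RangeError('maxBatchPercent must be in (0, 100]');
  }
  // Assumption: round down, but always upgrade at least one instance per batch.
  const batchSize = Math.max(1, Math.floor((instanceIds.length * maxBatchPercent) / 100));
  const batches = [];
  for (let i = 0; i < instanceIds.length; i += batchSize) {
    batches.push(instanceIds.slice(i, i + batchSize));
  }
  return batches;
}

const ids = ['vm0', 'vm1', 'vm2', 'vm3', 'vm4', 'vm5', 'vm6', 'vm7', 'vm8', 'vm9'];
console.log(planBatches(ids).length); // 5
```

With 10 instances and 20% batches, at most 2 instances are out of rotation at a time, which is the property that keeps the application serving traffic during the upgrade.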

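[!WARNING] Gotcha: VMSS Rolling Upgrades Can Cause Downtime A rolling upgrade decides whether each upgraded batch is healthy from a load balancer health probe or the Application Health extension. If those health signals are missing or misconfigured, the upgrade can move on to the next batch before the new instances are actually ready to serve traffic. Keep batches small with the rolling upgrade policy settings (for example maxBatchInstancePercent and maxUnhealthyInstancePercent), and use instance protection for instances that must not be touched. On AKS, the equivalent safeguard is minAvailable in a PodDisruptionBudget.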

[!TIP] Jargon Alert: VM Scale Set (VMSS) A VM Scale Set is a group of identical VMs that automatically scale based on demand. Think of it like a restaurant: when it’s busy (high CPU), you hire more waiters (add VMs). When it’s slow (low CPU), you send waiters home (remove VMs). All waiters are identical (same VM image), so they can handle any table (request).

4. Azure App Service

What is App Service? App Service is Azure’s Platform-as-a-Service (PaaS) offering for hosting web applications. Think of it as a “managed web server” where you just deploy your code, and Microsoft handles everything else: OS updates, scaling, load balancing, SSL certificates, and more.

Why Use App Service Instead of VMs?

Aspect	App Service	Virtual Machines
Setup Time	5 minutes	30+ minutes
OS Management	Microsoft handles	You manage
Scaling	Automatic (1-30 instances)	Manual or complex setup
SSL Certificates	Free (managed)	You install and renew
Deployment	Git push, ZIP, Docker	SSH, RDP, manual
Cost	$0-$700/month	$30-$2,000+/month
Control	Limited (can’t install custom software)	Full control

Real-World Analogy:

VM: Like renting an empty apartment. You furnish it, maintain it, fix everything yourself.
App Service: Like staying in a hotel. Everything is provided, you just check in and use it.

When to Use App Service:
✅ Modern web applications (Node.js, Python, .NET, PHP, Java)
✅ REST APIs
✅ Mobile app backends
✅ You want to focus on code, not infrastructure
✅ Need automatic scaling
✅ Want zero-downtime deployments

When NOT to Use App Service:
❌ Need to install custom software on the OS
❌ Need specific OS version (Windows Server 2012 R2)
❌ Legacy applications that require full VM control
❌ Need to run background services (use VMs or Container Instances)

Understanding App Service Architecture

Before diving in, let’s understand how App Service works:

Your Code (GitHub, Local)
    ↓
App Service (Azure)
    ├── Web Server (IIS for Windows, Nginx for Linux)
    ├── Runtime (Node.js, Python, .NET, etc.)
    ├── Auto-scaling (adds/removes instances)
    ├── Load Balancer (distributes traffic)
    └── SSL Termination (handles HTTPS)
        ↓
    Users (Internet)

The Pro’s View: What’s inside an App Service?

When you scale an App Service to “3 instances”, what actually happens?

The Front End (Load Balancer): This is a shared layer provided by Microsoft. It receives all traffic to *.azurewebsites.net. It terminates SSL and routes the request to your specific worker.
The Worker (The Compute): This is your instance. This is where your code runs. If you have “3 instances”, you have 3 separate worker VMs (though you don’t manage them).
The File Server (Shared Storage): This is the most important “secret”. Your code and files don’t live on the worker’s local disk; they live on a Managed Remote File Share.
- When you write a file to local storage in your code, it’s actually being written over the network to this share.
- All 3 instances see the exact same files. This is why you don’t have to sync files between instances!

[!WARNING] Performance Gotcha: The File System is a Network Because the file system is remote, reading/writing thousands of small files (like a massive node_modules folder or a Local SQLite DB) can be slow. Solution: Use WEBSITE_RUN_FROM_PACKAGE=1. This mounts your entire app as a read-only ZIP file, which is cached locally on the worker for blazing-fast startups and file access.

Key Concepts:

App Service Plan: The “hosting environment” that defines:
- How much CPU/RAM you get
- How many apps can run on it
- What features are available (slots, VNet, etc.)
- The cost
Web App: Your actual application running on the plan. You can have multiple web apps on one plan (to save money).
Deployment Slot: A separate instance of your app for testing. You can swap slots for zero-downtime deployments.

Step-by-Step: Creating Your First Web App

Let’s create a complete web application from scratch:

Step 1: Create App Service Plan

What is an App Service Plan? Think of it as the “hosting package” that defines the resources and features available.

# Create App Service Plan (Free tier for learning)
az appservice plan create \
  --name plan-learn \
  --resource-group rg-learn-app \
  --sku FREE \
  --location eastus

# What this does:
# --name: Name of the plan (must be unique within the resource group)
# --sku FREE: Pricing tier (FREE, B1, S1, P1V2, etc.)
#   - FREE: $0/month, 1 GB RAM, 60 minutes/day compute
#   - B1: $55/month, 1.75 GB RAM, always on
#   - S1: $100/month, 1.75 GB RAM, autoscaling, slots
#   - P1V2: $400/month, 3.5 GB RAM, better performance

Understanding SKUs:

SKU	Price	RAM	Always On	Slots	Autoscale	Use Case
FREE	$0	1 GB	❌	❌	❌	Learning only
B1	$55	1.75 GB	✅	❌	❌	Dev/test
S1	$100	1.75 GB	✅	✅ (5)	✅ (10)	Production
P1V2	$400	3.5 GB	✅	✅ (20)	✅ (30)	High traffic

Why create plan separately? You can host multiple web apps on one plan (saves money). Each app shares the plan’s resources.

Step 2: Create Web App

# Create Web App
az webapp create \
  --name mywebapp-learn-$(date +%s) \
  --resource-group rg-learn-app \
  --plan plan-learn \
  --runtime "NODE|18-lts"

# What this does:
# --name: Name of web app (must be globally unique, like a domain)
#   - Format: <name>.azurewebsites.net
#   - Example: mywebapp-learn-1234567890.azurewebsites.net
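# --runtime: Programming language and version
#   Run `az webapp list-runtimes` to see the values your CLI version accepts
#   (newer CLI versions may print them with a colon, e.g. "NODE:18-lts").
#   Options: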
#     - "NODE|18-lts" (Node.js 18 LTS)
#     - "PYTHON|3.11" (Python 3.11)
#     - "DOTNETCORE|7.0" (.NET 7)
#     - "PHP|8.2" (PHP 8.2)
#     - "JAVA|17" (Java 17)

Why the unique name? The web app name becomes part of the URL (mywebapp-learn-1234567890.azurewebsites.net). It must be globally unique across all Azure customers.
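Because the name ends up in a hostname, it helps to validate it before calling the CLI. The sketch below assumes the naming rules are 2-60 characters, letters, digits and hyphens only, with no leading or trailing hyphen; treat those rules as an assumption and check Azure's naming documentation. Both function names are hypothetical.

```javascript
// Assumed naming rules (verify against Azure's naming documentation):
// 2-60 characters, letters, digits and hyphens only, no leading or trailing hyphen.
const APP_NAME_PATTERN = /^[a-z0-9](?:[a-z0-9-]{0,58}[a-z0-9])$/i;

function isValidAppName(name) {
  return typeof name === 'string' && APP_NAME_PATTERN.test(name);
}

// Mirrors the shell snippet above: append the current Unix timestamp to reduce collisions.
function suggestAppName(prefix) {
  return `${prefix}-${Math.floor(Date.now() / 1000)}`;
}

const candidate = suggestAppName('mywebapp-learn');
console.log(candidate, isValidAppName(candidate));
```

A valid format does not guarantee the name is free. Only the create command (or an availability check in the portal) can tell you whether someone else already holds it.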

[!WARNING] Gotcha: App Service Name Cannot Be Changed Once you create an App Service, the name is permanent. You can’t rename it. If you need a different name, you must create a new app and migrate. Choose your name carefully!

[!TIP] Jargon Alert: App Service Plan An App Service Plan is like a “hosting package” that defines:
- How much CPU/RAM you get
- How many apps can run on it (you can host multiple apps on one plan)
- What features are available (slots, VNet, autoscaling)
- The cost

Think of it like a gym membership: the plan determines what equipment (features) you can use.

Step 3: Create a Simple Application

Let’s create a simple Node.js application:

# Create project directory
mkdir my-web-app
cd my-web-app

# Initialize Node.js project
npm init -y

# Install Express (web framework)
npm install express

# Create app.js
cat > app.js << 'EOF'
const express = require('express');
const app = express();
const port = process.env.PORT || 3000;

app.get('/', (req, res) => {
  res.send(`
    <h1>Hello from Azure App Service!</h1>
    <p>This is my first web app on Azure.</p>
    <p>Node.js version: ${process.version}</p>
    <p>Environment: ${process.env.WEBSITE_SITE_NAME || 'local'}</p>
  `);
});

app.get('/api/health', (req, res) => {
  res.json({ status: 'healthy', timestamp: new Date().toISOString() });
});

app.listen(port, () => {
  console.log(`Server running on port ${port}`);
});
EOF

# Overwrite the generated package.json with one that defines the start script
cat > package.json << 'EOF'
{
  "name": "my-web-app",
  "version": "1.0.0",
  "main": "app.js",
  "scripts": {
    "start": "node app.js"
  },
  "dependencies": {
    "express": "^4.18.0"
  }
}
EOF

What this code does:

Creates a simple Express.js web server
Responds to GET requests at / (homepage)
Has a health check endpoint at /api/health
Uses process.env.PORT (Azure sets this automatically)

Step 4: Deploy to App Service

Option A: Deploy from Local ZIP

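# Create ZIP file (node_modules is excluded, so App Service must run npm install)
zip -r app.zip . -x "*.git*" "node_modules/*"

# ZIP deployments skip the build step by default; enable it so dependencies get installed
az webapp config appsettings set \
  --resource-group rg-learn-app \
  --name mywebapp-learn-<YOUR_NUMBER> \
  --settings SCM_DO_BUILD_DURING_DEPLOYMENT=true

# Deploy to App Service
az webapp deployment source config-zip \
  --resource-group rg-learn-app \
  --name mywebapp-learn-<YOUR_NUMBER> \
  --src app.zip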

Option B: Deploy from GitHub (Recommended)

# First, push your code to GitHub
git init
git add .
git commit -m "Initial commit"
git remote add origin https://github.com/YOUR_USERNAME/my-web-app.git
git push -u origin main

# Configure App Service to deploy from GitHub
az webapp deployment source config \
  --name mywebapp-learn-<YOUR_NUMBER> \
  --resource-group rg-learn-app \
  --repo-url https://github.com/YOUR_USERNAME/my-web-app.git \
  --branch main \
  --manual-integration

What happens during deployment:

Azure downloads your code from GitHub
Runs npm install (installs dependencies)
Looks for package.json → scripts.start
Runs npm start (starts your app)
Your app is live at https://mywebapp-learn-<NUMBER>.azurewebsites.net

Step 5: Access Your Web App

# Get the URL
az webapp show \
  --name mywebapp-learn-<YOUR_NUMBER> \
  --resource-group rg-learn-app \
  --query defaultHostName \
  --output tsv

# Visit in browser:
# https://mywebapp-learn-<NUMBER>.azurewebsites.net

What you’ll see:

Homepage: “Hello from Azure App Service!”
Health check: https://<URL>/api/health returns JSON

Step 6: View Logs

Real-time logs (see what your app is doing):

az webapp log tail \
  --name mywebapp-learn-<YOUR_NUMBER> \
  --resource-group rg-learn-app

Download logs:

az webapp log download \
  --name mywebapp-learn-<YOUR_NUMBER> \
  --resource-group rg-learn-app \
  --log-file app-logs.zip

Step 7: Configure Environment Variables

What are environment variables? Configuration values that change between environments (dev, staging, production).

# Set environment variable
az webapp config appsettings set \
  --name mywebapp-learn-<YOUR_NUMBER> \
  --resource-group rg-learn-app \
  --settings \
    DATABASE_URL="postgresql://user:pass@host:5432/db" \
    API_KEY="secret-key-123" \
    NODE_ENV="production"

# View environment variables
az webapp config appsettings list \
  --name mywebapp-learn-<YOUR_NUMBER> \
  --resource-group rg-learn-app

In your code, access them:

const dbUrl = process.env.DATABASE_URL;
const apiKey = process.env.API_KEY;

Best Practice: Never commit secrets to Git. Use environment variables or Azure Key Vault.
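One way to make missing settings obvious is to validate them once at startup instead of discovering `undefined` deep inside a request handler. A minimal sketch, assuming the setting names from the example above; `loadConfig` is a hypothetical helper, not part of any Azure SDK.

```javascript
// Hypothetical startup check: fail fast when a required app setting is missing.
function loadConfig(env = process.env) {
  const required = ['DATABASE_URL', 'API_KEY'];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required settings: ${missing.join(', ')}`);
  }
  return {
    databaseUrl: env.DATABASE_URL,
    apiKey: env.API_KEY,
    nodeEnv: env.NODE_ENV || 'development',
  };
}

// Demonstration with a plain object instead of the real environment.
const demo = loadConfig({ DATABASE_URL: 'postgresql://example', API_KEY: 'placeholder' });
console.log(demo.nodeEnv); // development
```

In the real app you would call `loadConfig()` with no arguments at the top of app.js, so a misconfigured deployment crashes immediately and shows up in the log stream.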

Step 8: Enable Continuous Deployment

What is Continuous Deployment? Automatically deploy new code when you push to GitHub.

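# Enable continuous deployment
# (omit --manual-integration so App Service syncs on every push;
#  for GitHub, supply a personal access token via --git-token)
az webapp deployment source config \
  --name mywebapp-learn-<YOUR_NUMBER> \
  --resource-group rg-learn-app \
  --repo-url https://github.com/YOUR_USERNAME/my-web-app.git \
  --branch main \
  --git-token <YOUR_GITHUB_TOKEN>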

How it works:

You push code to GitHub
App Service detects the push
Automatically downloads and deploys new code
Your app updates in 1-2 minutes

Workflow:

# Make a change
echo "console.log('New version!');" >> app.js

# Commit and push
git add app.js
git commit -m "Add logging"
git push

# App Service automatically deploys (check logs to see it)

Understanding App Service Features

Deployment Slots (Zero-Downtime Deployments)

What it is: Separate instances of your app for testing before going live.

How it works:

Deploy new version to “staging” slot
Test it thoroughly
Swap staging ↔ production (instant, zero downtime)
If issues, swap back (instant rollback)

Example:

# Create staging slot
az webapp deployment slot create \
  --name mywebapp \
  --resource-group rg-prod \
  --slot staging

# Deploy to staging
az webapp deployment source config-zip \
  --name mywebapp \
  --resource-group rg-prod \
  --slot staging \
  --src app-v2.zip

# Test staging URL: mywebapp-staging.azurewebsites.net

# Swap to production (zero downtime)
az webapp deployment slot swap \
  --name mywebapp \
  --resource-group rg-prod \
  --slot staging \
  --target-slot production

Benefits:

Test in production-like environment
Zero downtime deployments
Instant rollback if issues
Warm up app before swap (no cold start)
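Conceptually, a swap just exchanges which version each slot serves, which is why rollback is the same operation run a second time. The toy model below illustrates only that idea; a real swap also handles warm-up and slot-specific settings, and `swapSlots` is a made-up name.

```javascript
// Toy model: a swap exchanges the versions served by two slots.
function swapSlots(slots, source = 'staging', target = 'production') {
  return { ...slots, [source]: slots[target], [target]: slots[source] };
}

const beforeSwap = { production: 'v1', staging: 'v2' };
const afterSwap = swapSlots(beforeSwap);
const afterRollback = swapSlots(afterSwap);

console.log(afterSwap);     // { production: 'v2', staging: 'v1' }
console.log(afterRollback); // { production: 'v1', staging: 'v2' }
```

Note that after the swap the previous production version is still sitting in staging, which is exactly what makes the rollback instant.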

Autoscaling

What it is: Automatically add/remove instances based on traffic.

How it works:

Monitor metrics (CPU, memory, requests)
When threshold exceeded → add instance
When traffic drops → remove instance

Example:

# Configure autoscaling (autoscale targets the App Service Plan, not the web app)
az monitor autoscale create \
  --resource-group rg-prod \
  --resource <APP_SERVICE_PLAN_NAME> \
  --resource-type Microsoft.Web/serverfarms \
  --name autoscale-app \
  --min-count 2 \
  --max-count 10 \
  --count 2

# Scale out when CPU > 70% (the plan-level CPU metric is CpuPercentage)
az monitor autoscale rule create \
  --resource-group rg-prod \
  --autoscale-name autoscale-app \
  --condition "CpuPercentage > 70 avg 5m" \
  --scale out 1

# Scale in when CPU < 30%
az monitor autoscale rule create \
  --resource-group rg-prod \
  --autoscale-name autoscale-app \
  --condition "CpuPercentage < 30 avg 5m" \
  --scale in 1

Cost Impact:

2 instances: $200/month
10 instances (peak): $1,000/month
Auto-scales down at night → saves money
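The numbers above imply roughly $100 per instance per month. The following sketch estimates a blended bill for a day split between peak and off-peak instance counts. The schedule is invented for illustration, and real billing is per hour (or finer), so treat the result as an approximation.

```javascript
// Estimate monthly cost from a daily schedule of { hours, instances } entries.
function monthlyAutoscaleCost(pricePerInstanceMonth, dailySchedule) {
  const totalHours = dailySchedule.reduce((sum, slot) => sum + slot.hours, 0);
  if (totalHours !== 24) {
    throw new RangeError(`Schedule must cover 24 hours, got ${totalHours}`);
  }
  const instanceHours = dailySchedule.reduce(
    (sum, slot) => sum + slot.hours * slot.instances,
    0
  );
  // Average instance count over the day, times the monthly price of one instance.
  return (pricePerInstanceMonth * instanceHours) / 24;
}

const alwaysMin = monthlyAutoscaleCost(100, [{ hours: 24, instances: 2 }]);
const alwaysPeak = monthlyAutoscaleCost(100, [{ hours: 24, instances: 10 }]);
const blended = monthlyAutoscaleCost(100, [
  { hours: 8, instances: 10 }, // hypothetical business-hours peak
  { hours: 16, instances: 2 }, // nights and quiet periods
]);

console.log(alwaysMin, alwaysPeak, blended.toFixed(2));
```

Under these assumptions the blended bill lands well below the always-at-peak figure, which is the saving the "scales down at night" line refers to.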

Custom Domains & SSL

What it is: Use your own domain (example.com) instead of azurewebsites.net.

Steps:

Buy domain (GoDaddy, Namecheap, etc.)
Add CNAME record pointing to your app
Add custom domain in App Service
Enable SSL (free managed certificate)

Example:

# Add custom domain
az webapp config hostname add \
  --webapp-name mywebapp \
  --resource-group rg-prod \
  --hostname www.example.com

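# Create a free managed certificate for the hostname
az webapp config ssl create \
  --name mywebapp \
  --resource-group rg-prod \
  --hostname www.example.com

# Bind it (use the thumbprint from the output of the previous command)
az webapp config ssl bind \
  --name mywebapp \
  --resource-group rg-prod \
  --certificate-thumbprint <THUMBPRINT> \
  --ssl-type SNI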

Result: Your app is accessible at https://www.example.com with a valid SSL certificate (green padlock in browser).

[!WARNING] Gotcha: SSL Certificate Propagation After adding a custom domain, DNS changes can take up to 24-48 hours to propagate, depending on the record’s TTL, and the certificate will not work until DNS resolves to your app. Don’t panic if it doesn’t work immediately. Use dig or nslookup to verify DNS is pointing to your app.

[!INFO] Aside: App Service Free Tier Limitations The FREE tier is great for learning, but has serious limitations:
- Apps “sleep” after 20 minutes of inactivity (takes 30+ seconds to wake up)
- No custom domains
- No SSL certificates
- No deployment slots
- No autoscaling

For production, use at least the Basic tier ($55/month). The FREE tier is only for learning/testing.

[!TIP] Jargon Alert: Deployment Slot A deployment slot is a separate instance of your app. Think of it like having two identical apartments—you can test new furniture (code) in one apartment before moving it to your main apartment. Slots enable zero-downtime deployments: deploy to staging slot, test it, then swap it with production instantly.

App Service Plans

Pricing Tiers

Tier	Price	Features	Use Case
Free	$0	1 GB RAM, 60 min/day	Learning
Shared	$10/month	1 GB RAM, custom domain	Hobby projects
Basic	$55/month	1.75 GB RAM, SSD	Dev/test
Standard	$100/month	Autoscale, slots, VNet	Production
Premium	$400/month	More scale, better perf	High-traffic
Isolated	$700/month	Dedicated VNet (ASE)	Enterprise

Features by Tier

Free/Shared:
❌ No custom SSL
❌ No autoscaling
❌ No deployment slots
❌ No VNet integration

Basic:
✅ Custom SSL
❌ No autoscaling
❌ No deployment slots
❌ No VNet integration

Standard:
✅ Custom SSL
✅ Autoscaling (up to 10 instances)
✅ Deployment slots (5)
✅ VNet integration
✅ Backup/restore

Premium:
✅ All Standard features
✅ Autoscaling (up to 30 instances)
✅ Deployment slots (20)
✅ Better performance
✅ Zone redundancy

Isolated (ASE):
✅ All Premium features
✅ Fully isolated network
✅ Internal load balancer
✅ Compliance (PCI, HIPAA)

Deployment Slots

Deployment Slots enable zero-downtime deployments.

# Create deployment slot
az webapp deployment slot create \
  --name mywebapp \
  --resource-group rg-prod \
  --slot staging

# Deploy to staging
az webapp deployment source config \
  --name mywebapp \
  --resource-group rg-prod \
  --slot staging \
  --repo-url https://github.com/user/repo \
  --branch main

# Swap slots (staging → production)
az webapp deployment slot swap \
  --name mywebapp \
  --resource-group rg-prod \
  --slot staging \
  --target-slot production

App Service Best Practices

1. Use Deployment Slots

Deploy to staging, test, then swap to production. Instant rollback if issues.

2. Enable Always On

Prevents app from unloading after idle time. Critical for production.

az webapp config set \
  --name mywebapp \
  --resource-group rg-prod \
  --always-on true

3. Use VNet Integration

Connect to private resources (databases, storage) without public endpoints.

az webapp vnet-integration add \
  --name mywebapp \
  --resource-group rg-prod \
  --vnet vnet-prod \
  --subnet snet-app

4. Configure Autoscale

Scale based on CPU, memory, or custom metrics.

# Autoscale targets the App Service Plan, not the web app
az monitor autoscale create \
  --resource-group rg-prod \
  --resource <APP_SERVICE_PLAN_NAME> \
  --resource-type Microsoft.Web/serverfarms \
  --name autoscale-app \
  --min-count 2 \
  --max-count 10 \
  --count 2

5. Use Managed Identity

No secrets in code. Authenticate to Azure services automatically.

# Enable managed identity
az webapp identity assign \
  --name mywebapp \
  --resource-group rg-prod

# Grant access to Key Vault
az keyvault set-policy \
  --name myvault \
  --object-id <identity-id> \
  --secret-permissions get list

5. Azure Container Instances (ACI)

ACI runs containers without managing VMs or orchestrators.

When to Use ACI

✅ Use ACI For

Quick container execution
CI/CD build agents
Batch jobs
Event-driven tasks
Dev/test environments

❌ Don't Use ACI For

Multi-container orchestration
Service discovery
Load balancing
Health checks

→ Use AKS instead

Deploy Container

# Deploy single container
az container create \
  --name aci-demo \
  --resource-group rg-demo \
  --image mcr.microsoft.com/azuredocs/aci-helloworld \
  --cpu 1 \
  --memory 1 \
  --ip-address Public \
  --dns-name-label aci-demo-unique \
  --ports 80

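# Deploy a container with environment variables
# (true multi-container groups, such as the sidecar pattern, need a YAML or ARM template passed with --file)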
az container create \
  --resource-group rg-demo \
  --name multi-container \
  --image nginx \
  --cpu 1 \
  --memory 1 \
  --ports 80 \
  --environment-variables LOG_LEVEL=debug

# Get logs
az container logs \
  --name aci-demo \
  --resource-group rg-demo

# Execute command in container
az container exec \
  --name aci-demo \
  --resource-group rg-demo \
  --exec-command "/bin/bash"

6. Interview Questions

Beginner

Q1: When would you choose App Service over VMs?

App Service (PaaS):

Less management (Microsoft handles OS, patching)
Built-in autoscaling, deployment slots
Faster time-to-market
Cost-effective for web apps

Virtual Machines (IaaS):

Full control over OS and software
Custom configurations
Legacy applications
Specific compliance requirements

Decision: Use App Service unless you need full OS control.

Q2: Explain availability sets vs availability zones

Availability Sets:

Protect against hardware failures within a datacenter
Fault domains (different racks) + Update domains (staggered updates)
SLA: 99.95%

Availability Zones:

Protect against entire datacenter failures
Physically separate datacenters (separate power, cooling, network)
SLA: 99.99%

Best Practice: Use availability zones for production workloads.
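To put the two SLA figures in perspective, this small calculation converts an SLA percentage into the downtime it permits per month. It is plain arithmetic; the 30-day month is an assumption.

```javascript
// Convert an SLA percentage into allowed downtime per month, in minutes.
function allowedDowntimeMinutes(slaPercent, daysInMonth = 30) {
  const totalMinutes = daysInMonth * 24 * 60;
  return (totalMinutes * (100 - slaPercent)) / 100;
}

console.log(allowedDowntimeMinutes(99.95).toFixed(1)); // 21.6
console.log(allowedDowntimeMinutes(99.99).toFixed(1)); // 4.3
```

Moving from availability sets to zones therefore cuts the permitted downtime from about 22 minutes to about 4 minutes per month.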

Intermediate

Q3: Design a scalable web application architecture

Architecture:

Frontend:
- Azure Front Door (global load balancing, WAF)
- App Service (autoscale 2-20 instances)
- Deployment slots (blue-green deployments)

Backend:
- VMSS or AKS (for microservices)
- Autoscaling based on CPU/memory
- Load balancer (internal)

Data:
- Azure SQL (zone-redundant)
- Redis Cache (session management)
- Blob Storage (static assets)

Monitoring:
- Application Insights (APM)
- Log Analytics (centralized logs)
- Autoscale based on custom metrics

CI/CD:
- GitHub Actions or Azure DevOps
- Deploy to staging slot → test → swap

Cost Optimization:
- Use B-series VMs for dev/test
- Reserved Instances for production
- Autoscale to match demand

Q4: How do you minimize VM costs?

Strategies:

1. Right-size VMs:
   - Monitor CPU/memory usage
   - Downsize underutilized VMs
   - Use Azure Advisor recommendations

2. Use Reserved Instances:
   - 1-year: 30-50% savings
   - 3-year: 50-70% savings
   - For stable, long-running workloads

3. Spot VMs:
   - Up to 90% discount
   - For fault-tolerant workloads (batch, testing)

4. Stop VMs when not in use:
   - Dev/test: Stop nights and weekends
   - Use Azure Automation for scheduling

5. Use B-series (Burstable):
   - For variable workloads
   - Accumulate credits when idle

6. Azure Hybrid Benefit:
   - Use existing Windows licenses
   - Up to 40% savings

7. Delete unused resources:
   - Unattached disks
   - Old snapshots
   - Orphaned NICs and public IPs

8. Use autoscaling:
   - Scale down during low traffic
   - Scale up during high traffic
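For strategy 4, the saving is easy to estimate. The sketch below assumes compute is billed only while the VM is running and the VM is deallocated the rest of the time; disks keep billing either way, so this covers the compute portion only. The 12-hour, 5-day schedule is a made-up example.

```javascript
// Fraction of compute cost saved by running a VM only part of the week.
function scheduledStopSavings(hoursPerDay, daysPerWeek) {
  if (hoursPerDay < 0 || hoursPerDay > 24 || daysPerWeek < 0 || daysPerWeek > 7) {
    throw new RangeError('hoursPerDay must be 0-24 and daysPerWeek 0-7');
  }
  const runningHours = hoursPerDay * daysPerWeek;
  const hoursInWeek = 24 * 7;
  return 1 - runningHours / hoursInWeek;
}

const saving = scheduledStopSavings(12, 5);
console.log(`${(saving * 100).toFixed(0)}% of compute cost saved`); // 64% of compute cost saved
```

A dev box that runs 12 hours on weekdays uses 60 of the week's 168 hours, so roughly two thirds of its compute bill disappears.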

Advanced

Q5: Implement blue-green deployment with zero downtime

Blue-Green Deployment with App Service:

1. Setup:
   Production slot (blue): Currently serving traffic
   Staging slot (green): New version

2. Deploy to Green:
   az webapp deployment source config \
     --name mywebapp \
     --resource-group rg-prod \
     --slot staging \
     --repo-url https://github.com/user/repo \
     --branch release/v2.0

3. Test Green:
   - Access staging URL: mywebapp-staging.azurewebsites.net
   - Run smoke tests, integration tests
   - Verify database migrations

4. Warm Up Green:
   az webapp deployment slot swap \
     --name mywebapp \
     --resource-group rg-prod \
     --slot staging \
     --target-slot production \
     --action preview

   # App Service warms up staging before swap

5. Swap (Zero Downtime):
   az webapp deployment slot swap \
     --name mywebapp \
     --resource-group rg-prod \
     --slot staging \
     --target-slot production

   # Traffic instantly switches to green
   # No connection drops

6. Rollback (if needed):
   az webapp deployment slot swap \
     --name mywebapp \
     --resource-group rg-prod \
     --slot production \
     --target-slot staging

   # Instant rollback (just swap again)

Benefits:
✅ Zero downtime
✅ Instant rollback
✅ Test in production-like environment
✅ No infrastructure changes

Q6: Optimize VM performance for database workload

SQL Server on Azure VM Optimization:

1. Choose Right VM Size:
   - Memory-optimized: E-series (8:1 memory:CPU)
   - Example: Standard_E16s_v5 (16 vCPU, 128 GB RAM)

2. Storage Configuration:
   - OS Disk: Premium SSD P30 (ReadWrite cache)
   - Data Files: Premium SSD P40+ (ReadOnly cache)
   - Log Files: Premium SSD P30 (None cache)
   - TempDB: Local NVMe SSD

3. Disk Striping:
   # Windows Storage Spaces (RAID 0)
   - Stripe 4x P30 disks → 20,000 IOPS
   - Better than 1x P80 (same IOPS, more expensive)

4. SQL Server Configuration:
   - Max Server Memory: 80% of VM RAM
   - TempDB on local SSD (D: drive)
   - Multiple data files (8 files for TempDB)
   - Instant File Initialization: Enabled

5. Network Optimization:
   - Enable Accelerated Networking
   - Private Endpoint for Azure SQL connectivity
   - No public IPs

6. Backup Strategy:
   - Azure Backup (application-consistent)
   - Backup to Blob Storage (cool tier)
   - Retention: 7 days (daily), 4 weeks (weekly)

7. Monitoring:
   - Azure Monitor for VMs
   - SQL Insights (database metrics)
   - Alert on CPU > 80%, Memory > 85%

Result:
- 20,000+ IOPS
- <1 ms latency (local SSD for TempDB)
- 99.99% availability (with 2+ VMs across availability zones)

Troubleshooting: When Compute Fails

Production environments aren’t perfect. Here is how to debug the two most common compute services.

1. Virtual Machine: “VM Not Responding”

If you can’t SSH/RDP into your VM, follow this triage:

Resource Health: Check “Resource Health” in the portal. If it says “Platform Initiated”, Microsoft is currently moving your VM due to hardware failure. Wait 5 minutes.
Serial Console: Use the Serial Console tool. This gives you a direct command-line view of the VM’s boot process, even if the network is down.
Boot Diagnostics: Check the screenshot in “Boot Diagnostics”. See an “Update” screen or a “Blue Screen of Death” (BSOD)?
Redeploy: As a last resort, click Redeploy. This forces the Fabric Controller to move your VM to a completely different physical host.

2. App Service: “503 Service Unavailable”

If your website is down:

Diagnose and Solve Problems: Use this built-in tool in the App Service portal. It’s surprisingly good at detecting things like “High Memory Usage” or “IP Restrictions”.
Log Stream: Check the Live Log Stream. Are you seeing “Out of Memory” (OOM) errors?
Kudu Console: Go to https://<appname>.scm.azurewebsites.net. This is the “Kudu” management site. You can browse files, check processes, and run commands directly on the worker.
Restart (Advanced): Don’t just restart the App. Restart the App Service Plan. This recycles all workers and can clear “Zombie Processes” that a simple app restart misses.

7. Key Takeaways

Choose the Right Compute

VMs for control, App Service for simplicity, AKS for microservices, Functions for events.

Use Availability Zones

Deploy across zones for 99.99% SLA. Critical for production.

Autoscaling is Essential

Scale based on demand. Save money during low traffic, handle spikes automatically.

Managed Identities

No secrets in code. Every compute service supports managed identity.

Cost Optimization

Right-size, use reserved instances, stop when not needed, leverage spot VMs.

Deployment Slots

Zero-downtime deployments with instant rollback. Use for all production apps.

Interview Deep-Dive

When would you choose App Service over AKS for a production web application, and when would that choice be wrong?

Strong Candidate Answer:

Choose App Service when: You have a standard web application with a team of fewer than 10 engineers, predictable traffic, and no need for sidecar containers. App Service P1v3 at $74/month gives you auto-scaling, deployment slots, managed SSL, and zero Kubernetes overhead. A team of 3 developers can deploy 10 times a day without a platform engineer.
Choose AKS when: You have 5+ microservices scaling independently, need service mesh for inter-service communication, or require multi-cloud portability with Helm charts. AKS wins for GPU workloads, custom admission controllers, and CronJobs.
Where App Service fails: 20+ microservices create management sprawl — each is a separate App Service plan with no built-in service discovery. At that scale, AKS with a single cluster is operationally simpler.
Where AKS is wrong: A startup with 2 engineers choosing AKS for one REST API is over-engineering. The Kubernetes learning curve costs 2-4 weeks. Minimum viable node pool (2x Standard_D2s_v5) is $140/month plus Container Registry, monitoring, and 20% of an engineer’s time on platform work.

Follow-up: The CTO insists on migrating everything from App Service to AKS because “Kubernetes is the future.” How do you push back? Quantify the cost: 3-6 months of engineering work, a platform engineer ($150K+/year), and 40-60% higher infrastructure costs. Present migration as risk-reward: what specific problem does AKS solve that App Service cannot? If the answer is “nothing right now,” adopt AKS for new services and leave working App Service workloads alone.

Your VMSS auto-scales at 70% CPU but new instances take 5 minutes to be healthy. Users see errors during scale-out. How do you fix this?

Strong Candidate Answer:

Root cause: New VMs must boot OS (60-90s), run script extensions to install dependencies (2-3 min), register with load balancer, and pass health probes. During this window, existing instances are overloaded.
Fix 1 — Lower threshold: Scale at 50% CPU instead of 70%. Triggers 3-5 minutes earlier, before saturation. Cost of 1-2 extra instances during normal traffic is negligible versus user errors.
Fix 2 — Custom VM images: Bake all dependencies into a Packer image. Boot time drops from 5 minutes to 60-90 seconds. This is the single biggest improvement.
Fix 3 — Predictive autoscaling: For predictable patterns (9 AM peaks), use scheduled scaling to pre-provision 15 minutes early.
Fix 4 — Consider containers: Container startup is 2-5 seconds vs 60-300 for VMs. AKS with Horizontal Pod Autoscaler eliminates the VM boot problem entirely.

Follow-up: After optimization, instances start in 90 seconds but existing ones still hit 95% CPU during the gap. What else? Deploy a minimum instance count at 120% of average traffic. The extra 20% costs $50/month but absorbs the burst. For mission-critical systems, combine pre-provisioned headroom with container-based scaling.

Compare IaaS, PaaS, and Serverless for a real-time pipeline ingesting 10 million events per day. Which do you choose?

Strong Candidate Answer:

IaaS (VMs + Kafka/Flink): Maximum control. 4x D8s_v5 VMs at $2,500/month plus one full-time engineer at $12K/month = $14,500/month total cost.
PaaS (Event Hub + Stream Analytics): Event Hub at $500/month + Stream Analytics at $450/month = $950/month infra, plus 10% engineer time. Total: $2,150/month.
Serverless (Event Hub + Functions): Event Hub $500/month + Functions $5/month = $505/month total. But windowed aggregations in Functions require external state management, which adds complexity.
My recommendation: PaaS for core pipeline. Stream Analytics SQL makes windowed aggregations trivial. Functions for ancillary processing (dead-letter handling, alerting). I would choose IaaS only if processing requires custom GPU models or volume exceeds 1 billion events/day where PaaS pricing becomes prohibitive.
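The IaaS and PaaS totals above follow from one formula: infrastructure cost plus the share of an engineer's time. A quick sketch reproduces them, assuming the $12,000/month engineer cost stated in the answer; `monthlyTco` is a hypothetical helper.

```javascript
// Total cost of ownership = infrastructure + share of one engineer's monthly cost.
const ENGINEER_MONTHLY_COST = 12000; // figure taken from the answer above

function monthlyTco(infraCost, engineerPercent) {
  return infraCost + (ENGINEER_MONTHLY_COST * engineerPercent) / 100;
}

const iaas = monthlyTco(2500, 100);     // VMs + one full-time engineer
const paas = monthlyTco(500 + 450, 10); // Event Hub + Stream Analytics + 10% engineer time

console.log(iaas); // 14500
console.log(paas); // 2150
```

The infrastructure gap between the two options is modest. The people cost dominates, which is why managed services usually win at this scale.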

Follow-up: At 1 billion events/day, does PaaS still make sense? At 1B events/day, PaaS costs ~$10,500/month. Comparable IaaS is ~$8,000/month in VMs, but add $12,000/month engineering overhead. PaaS remains cheaper TCO until approximately 5B events/day, when dedicated infrastructure with a platform team becomes more economical.

Next Steps

Continue to Chapter 5

Master Azure Storage: Blob, Files, Disks, and data management strategies


2. Stateless vs Stateful Applications

3. Horizontal vs Vertical Scaling

1. Compute Decision Tree

2. Virtual Machines Deep Dive

Under the Hood: How Azure Compute Works

1. The Fabric Controller (The Brain)

2. The Hypervisor (The Gatekeeper)

3. Service Healing (Self-Correcting Infrastructure)

Understanding VM Components

VM Size Families

Step-by-Step: Creating Your First VM

Prerequisites

Step 1: Create Resource Group

Step 2: Create Virtual Network

Step 3: Create Network Security Group (NSG)

Step 4: Create Public IP Address

Step 5: Create Network Interface (NIC)

Step 6: Create the Virtual Machine

Step 7: Connect to Your VM

Step 8: Verify VM is Working

Step 9: Install Software (Example: Nginx Web Server)

Step 10: Clean Up (Important!)

Understanding VM Creation Options

VM Pricing Models

Managed Disks

VM High Availability

3. VM Scale Sets

VMSS Architecture

Create VM Scale Set

VMSS Rolling Upgrades

4. Azure App Service

Understanding App Service Architecture

The Pro’s View: What’s inside an App Service?

Step-by-Step: Creating Your First Web App

Step 1: Create App Service Plan

Step 2: Create Web App

Step 3: Create a Simple Application

Step 4: Deploy to App Service

Step 5: Access Your Web App

Step 6: View Logs

Step 7: Configure Environment Variables

Step 8: Enable Continuous Deployment

Understanding App Service Features

App Service Plans

Deployment Slots

App Service Best Practices

5. Azure Container Instances (ACI)

When to Use ACI

Deploy Container

6. Interview Questions

Beginner

Intermediate

Advanced

Troubleshooting: When Compute Fails

1. Virtual Machine: “VM Not Responding”

2. App Service: “503 Service Unavailable”

7. Key Takeaways

Interview Deep-Dive

Next Steps