1. Virtual Machine Availability and Downtime
Azure VMs can be affected by:
- Unplanned Hardware Maintenance
  - Triggered when Azure predicts a hardware/platform failure.
  - Uses Live Migration to move VMs to healthy hardware, causing minimal downtime.
- Unexpected Downtime
  - Caused by hardware/network failures.
  - Azure automatically migrates the VM to healthy hardware, but a reboot may occur, and temporary disk data can be lost.
- Planned Maintenance
  - Routine updates to the underlying Azure platform.
  - Usually no impact on VMs.
  - Microsoft does not update your VM OS or software; this is the administrator's responsibility.
2. Availability Sets
Purpose: Minimize impact of downtime by avoiding single points of failure.
- VMs in an availability set are distributed across:
  - Fault Domains (FDs): Physical separation (racks, power, networking). Up to 3 FDs per set; the maximum depends on the region.
  - Update Domains (UDs): Logical groupings for rolling updates. By default 5 UDs, can configure up to 20.
- Best Practices:
  - Place two or more VMs in an Availability Set for redundancy.
  - Separate application tiers into different Availability Sets.
  - Combine with a Load Balancer and Managed Disks.
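The way fault and update domains limit the blast radius can be sketched in a few lines of Python. This is only an illustration: the round-robin assignment, the function names, and the VM names are assumptions made for the example, not Azure's actual placement logic.

```python
def assign_domains(vm_names, fault_domains=2, update_domains=5):
    """Assign each VM a (fault domain, update domain) pair, round-robin.

    Round-robin is an assumption for illustration only.
    """
    placement = {}
    for index, name in enumerate(vm_names):
        placement[name] = (index % fault_domains, index % update_domains)
    return placement


def survivors(placement, failed_fd=None, rebooting_ud=None):
    """Return the VMs still running when one FD fails or one UD reboots."""
    return sorted(
        name
        for name, (fd, ud) in placement.items()
        if fd != failed_fd and ud != rebooting_ud
    )


# Hypothetical four-VM web tier in one availability set.
placement = assign_domains([f"web-{i}" for i in range(4)])
print(placement)
print(survivors(placement, failed_fd=0))     # a rack failure takes out FD 0
print(survivors(placement, rebooting_ud=1))  # a rolling update reboots UD 1
```

In this sketch a failure of fault domain 0 leaves web-1 and web-3 running, and an update to update domain 1 takes down only web-1, which is the reason for putting at least two VMs in the set.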
SLA Guarantees:
- Two or more VMs across two or more Availability Zones: 99.99% connectivity.
- Two or more VMs in the same Availability Set: 99.95% connectivity.
- Single VM using Premium Storage for all disks: 99.9% connectivity.
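To make the SLA percentages concrete, they can be converted into a downtime budget. The sketch below assumes a 30-day month (43,200 minutes) as the measurement period; the function name is illustrative.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes (assumed period)


def allowed_downtime_minutes(sla_percent, period_minutes=MINUTES_PER_30_DAY_MONTH):
    """Downtime budget implied by an SLA percentage over the given period."""
    return period_minutes * (1 - sla_percent / 100)


tiers = [
    ("Availability Zones", 99.99),
    ("Availability Set", 99.95),
    ("Single VM, Premium Storage", 99.9),
]
for label, sla in tiers:
    print(f"{label}: {allowed_downtime_minutes(sla):.2f} min per 30-day month")
```

Under that assumption the three tiers allow roughly 4.3, 21.6, and 43.2 minutes of lost connectivity per month.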
3. Availability Zones
- Provide high availability across physically separate datacenters within a region.
- Each zone has independent power, cooling, and networking.
- Regions that support Availability Zones have a minimum of 3 zones.
- Use zonal services (resources pinned to a specific zone) or zone-redundant services (replicated across zones automatically).
- SLA: 99.99% VM connectivity when two or more VMs are deployed across two or more zones.
4. Scaling Concepts
Vertical Scaling (Scale Up/Down)
- Change the size of a VM (CPU, memory).
- Useful for under-utilized VMs (scale down) or resource-intensive workloads (scale up).
- Limited by hardware availability; usually requires a VM restart.
Horizontal Scaling (Scale Out/In)
- Change the number of VM instances.
- More flexible in cloud environments.
- Supports autoscale to dynamically adjust based on workload.
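The difference between the two approaches can be sketched as two small functions. The size catalog below is hypothetical (the names and specs are not real Azure VM sizes), and sizing on vCPUs alone is a simplifying assumption.

```python
import math

# Hypothetical size catalog: (name, vCPUs, memory in GiB), smallest first.
SIZES = [("small", 2, 8), ("medium", 4, 16), ("large", 8, 32)]


def scale_up(required_vcpus, required_memory_gib, sizes=SIZES):
    """Vertical scaling: smallest single size that fits, or None if none does."""
    for name, vcpus, memory_gib in sizes:
        if vcpus >= required_vcpus and memory_gib >= required_memory_gib:
            return name
    return None  # demand exceeds the largest available size


def scale_out(required_vcpus, vcpus_per_instance):
    """Horizontal scaling: number of identical instances that cover the demand."""
    return max(1, math.ceil(required_vcpus / vcpus_per_instance))


print(scale_up(6, 12))   # fits on one larger VM
print(scale_up(16, 32))  # no single size is big enough
print(scale_out(16, 4))  # the same demand met with more instances
```

The second call returns None because vertical scaling stops at the largest size on offer, while horizontal scaling can keep adding instances.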
5. Virtual Machine Scale Sets
Purpose: Manage a group of identical VMs with auto-scaling capability.
Benefits:
- Simplifies management of hundreds of VMs.
- Supports Azure Load Balancer (Layer 4) and Application Gateway (Layer 7).
- Autoscale adjusts the VM count dynamically to meet demand.
- Supports up to 1,000 VMs (600 when using custom VM images).
Configuration Considerations:
- Initial instance count, VM size, managed disks, Azure Spot instances.
- Spreading algorithm (max spreading recommended).
- Scaling beyond 100 instances may require multiple placement groups.
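As a rough planning aid, the 100-instance placement-group figure and the 1,000-instance ceiling noted above can be turned into a quick check. This is a sketch under the assumption that instances pack evenly into groups, so the result is a lower bound, not a statement of how Azure allocates them; the names are illustrative.

```python
import math

MAX_PER_PLACEMENT_GROUP = 100  # per the notes above
MAX_INSTANCES = 1000           # per the notes above


def placement_groups_needed(instance_count):
    """Minimum number of placement groups for a scale set of this size."""
    if not 0 <= instance_count <= MAX_INSTANCES:
        raise ValueError(f"instance_count must be between 0 and {MAX_INSTANCES}")
    return math.ceil(instance_count / MAX_PER_PLACEMENT_GROUP)


for count in (40, 100, 101, 1000):
    print(count, "instances ->", placement_groups_needed(count), "placement group(s)")
```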
6. Autoscale
Features:
- Automatically adjusts VM count based on performance metrics.
- Scale Out: Increase VM instances when demand rises.
- Scale In: Decrease VM instances when demand falls.
- Supports scheduled scaling (fixed time scaling).
Configuration Parameters:
- Minimum/Maximum number of VMs
- Default number of VMs
- CPU thresholds for scale-out and scale-in
- Number of VMs to add/remove per scaling event
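How these parameters interact can be sketched as a single evaluation step. This is a simplified model, not the Azure autoscale engine: the class, field names, and threshold values are assumptions for illustration, it treats the default count as the fallback when no metric is available, and it omits the evaluation time windows and cool-down periods a real rule would use.

```python
from dataclasses import dataclass


@dataclass
class AutoscaleProfile:
    """Hypothetical container for the parameters listed above."""
    minimum: int = 2
    maximum: int = 10
    default: int = 2             # assumed fallback when metrics are unavailable
    scale_out_cpu: float = 75.0  # scale out above this average CPU %
    scale_in_cpu: float = 25.0   # scale in below this average CPU %
    step: int = 1                # VMs added or removed per scaling event


def next_instance_count(profile, current, avg_cpu):
    """Instance count after one evaluation; avg_cpu=None means no metric data."""
    if avg_cpu is None:
        return profile.default
    if avg_cpu > profile.scale_out_cpu:
        target = current + profile.step
    elif avg_cpu < profile.scale_in_cpu:
        target = current - profile.step
    else:
        target = current
    # Never leave the configured minimum/maximum range.
    return max(profile.minimum, min(profile.maximum, target))


profile = AutoscaleProfile()
print(next_instance_count(profile, current=2, avg_cpu=90.0))   # scale out
print(next_instance_count(profile, current=10, avg_cpu=90.0))  # capped at maximum
print(next_instance_count(profile, current=2, avg_cpu=10.0))   # held at minimum
```

Keeping a gap between the scale-out and scale-in thresholds, as in the defaults here, helps avoid the count flapping up and down when CPU hovers near a single threshold.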
Benefits: Reduces cost and management overhead while ensuring application performance.