Building a Production-Ready Azure VM Terraform Module

When I sit down to craft a Terraform module, I ask myself how future me—and the teams inheriting my code—will reason about every decision. I remind myself to start with clarity, keep security opinionated but flexible, and prove the workflow end to end before anyone else runs terraform apply. I literally keep a checklist on my desk that asks, “Have I explained why this variable exists? Have I enforced the right guardrails? Have I shown someone how to run this without me?” and I do not move on until each answer is yes. With that inner dialogue guiding the work, this walkthrough captures the blueprint I followed to deliver the hardened Azure VM module documented in this article so you can mirror the approach on your own projects.


1. Blueprint Overview

  • Start by mapping stakeholder requirements (naming, security, monitoring, backup) and translating them into Terraform objectives.
  • Design the module skeleton—modules/azure_vm/main.tf, variables.tf, locals.tf, outputs.tf, versions.tf, and the module README.md—so every concern lands in a predictable file.
  • Keep validation and testing in scope from day one by running terraform fmt, terraform validate, and terraform plan inside examples/basic/ after each major change.
  • Document expectations as you go; the README is the contract others will read first (modules/azure_vm/README.md).

2. Define Platform Constraints

  • Pin a provider version that supports every required feature while remaining widely available (modules/azure_vm/versions.tf).
  • Declare the AzureRM, Random, and TLS providers: AzureRM for the VM and supporting resources, Random for unique name suffixes, and TLS for generated SSH keys.
  • Avoid Terraform experiments for portability; we fell back to core language features once the experimental optional-attribute syntax proved fragile.

3. Model Inputs Intentionally

  • Accept flexible maps for complex objects (see structured objects in modules/azure_vm/variables.tf), layering validations so callers still receive helpful errors.
  • Distill raw inputs into normalized locals (modules/azure_vm/locals.tf), guaranteeing defaults even when fields are missing or null.
  • Enforce basic hygiene (non-empty location short names, mandatory VM size and admin account, at least one image source) with explicit validation rules in variables.tf.
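
As a point of comparison with the flexible maps used in this module, here is a minimal sketch of a typed variant of the vm input. It is an assumption, not the module's actual schema: it covers only a handful of fields and relies on optional() defaults, which need Terraform 1.3 or later.

```hcl
# Hypothetical typed alternative to `type = any` for the vm input.
# Only a subset of the module's fields is modelled here.
variable "vm" {
  description = "Virtual machine compute configuration."
  type = object({
    size                            = string
    admin_username                  = string
    disable_password_authentication = optional(bool, true)
    source_image_id                 = optional(string)
    zone                            = optional(string)
  })

  validation {
    # try() turns a null value into a validation failure instead of a function error.
    condition = alltrue([
      try(length(trimspace(var.vm.size)) > 0, false),
      try(length(trimspace(var.vm.admin_username)) > 0, false),
    ])
    error_message = "vm.size and vm.admin_username must be non-empty strings."
  }
}
```

The trade-off is strictness: a typed object rejects unknown or mistyped keys at plan time, while the map-based approach tolerates partial input and leaves the defaults to locals.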

4. Standardize Naming & Tagging

  • Generate sanitized tokens for location, service, environment, and customer with helper locals in locals.tf (no experimental functions).
  • Assemble canonical resource names (resource group, network, Key Vault, VM, diagnostics, etc.) and store them once in locals for reuse.
  • Merge tag layers—module defaults, input tags, ad-hoc extras—then apply local.combined_tags everywhere.
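
To make the naming and tag layering concrete, the sketch below runs the same expressions as locals.tf against hard-coded sample values. The inputs ("WEU", "Billing_API", and so on) are hypothetical; only the expressions come from the module.

```hcl
locals {
  # Sanitize each token: lowercase, then keep only the allowed characters.
  example_tokens = {
    location    = join("", regexall("[0-9a-z]", lower("WEU")))          # "weu"
    service     = join("", regexall("[0-9a-z-]", lower("Billing_API"))) # "billingapi"
    environment = join("", regexall("[0-9a-z-]", lower("Prod")))        # "prod"
    customer    = join("", regexall("[0-9a-z-]", lower("Contoso Ltd"))) # "contosoltd"
  }

  example_prefix  = join("-", compact([local.example_tokens.location, local.example_tokens.service]))
  example_postfix = join("-", compact([local.example_tokens.environment, local.example_tokens.customer]))

  # "weu-billingapi-vm-prod-contosoltd"
  example_vm_name = join("-", compact([local.example_prefix, "vm", local.example_postfix]))

  # Later arguments win, so extra tags override input tags, which override defaults.
  example_tags = merge(
    { Environment = "Prod", Service = "Billing_API" }, # module defaults
    { CostCenter = "1234" },                           # global_settings.tags
    { Environment = "Production" }                     # extra_tags
  )
  # => { CostCenter = "1234", Environment = "Production", Service = "Billing_API" }
}
```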

5. Build Networking Foundations

  • Accommodate existing and new VNets/subnets: check for IDs, create only when requested, and use Azure data sources for lookups.
  • Default posture: no public IP and no inbound NSG rules. Prefer Azure Bastion or JIT for admin access. If SSH must be exposed, restrict 22/TCP to known corporate egress ranges.
  • Wire NSG rules, optional public IP, and NIC associations using normalized locals to keep logic concise.
  • Honour the AzureRM provider’s current schema: using the exact argument and block names it expects (for example dns_servers, ip_configuration, and service_endpoints) avoids plan-time errors.
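
The default-closed posture with a single restricted SSH rule could be expressed as the input below. This is a sketch that assumes the root configuration forwards a network variable to the module; the address ranges are placeholders (203.0.113.0/24 is a documentation range standing in for a corporate egress range).

```hcl
# terraform.tfvars fragment with hypothetical values.
network = {
  virtual_network = {
    create        = true
    address_space = ["10.20.0.0/16"]
  }

  subnet = {
    create           = true
    address_prefixes = ["10.20.1.0/24"]
  }

  network_security_group = {
    create = true
    rules = [
      {
        name                       = "allow-ssh-corp"
        priority                   = 100
        direction                  = "Inbound"
        access                     = "Allow"
        protocol                   = "Tcp"
        source_port_range          = "*"
        destination_port_range     = "22"
        source_address_prefix      = "203.0.113.0/24" # placeholder corporate range
        destination_address_prefix = "*"
      }
    ]
  }

  # No public IP unless a caller explicitly asks for one.
  public_ip = {
    create = false
  }
}
```

Leaving rules empty keeps the NSG at Azure's built-in defaults, which matches the "no inbound rules" baseline described above.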

6. Embed Security & Compliance

  • Provision a Key Vault with configurable access policies, optional private endpoints, and customer-managed keys (CMK).
  • Enable Trusted Launch by default (secure_boot_enabled = true, vtpm_enabled = true). This requires Gen2 images—validate the selected image accordingly. Optionally offer Confidential VM for supported SKUs and attestation requirements.
  • Generate SSH keys only when needed, storing secrets in Key Vault when allowed.
  • Apply Disk Encryption Sets, identity defaults, and guardrails (no SSH without keys; Windows uses passwords).
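
For readers who want to see the Trusted Launch settings in isolation, here is a minimal sketch of a Linux VM with both flags enabled and a Gen2 image. The resource name, the two input variables, the region, and the VM size are illustrative assumptions rather than the module's own definitions.

```hcl
# Hypothetical inputs for this standalone sketch.
variable "network_interface_id" {
  type = string
}

variable "admin_public_key" {
  type = string
}

resource "azurerm_linux_virtual_machine" "example" {
  name                  = "example-vm"
  resource_group_name   = "example-rg" # placeholder
  location              = "westeurope" # placeholder
  size                  = "Standard_D2s_v5"
  admin_username        = "azureuser"
  network_interface_ids = [var.network_interface_id]

  # Trusted Launch: both settings require a Generation 2 image.
  secure_boot_enabled = true
  vtpm_enabled        = true

  admin_ssh_key {
    username   = "azureuser"
    public_key = var.admin_public_key
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Premium_LRS"
  }

  # A Gen2 marketplace image (note the -gen2 suffix in the SKU).
  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts-gen2"
    version   = "latest"
  }
}
```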

7. Cover Operations & Observability

  • Prefer Azure Monitor Agent (AMA) with Data Collection Rules (DCR) and VM Insights for guest telemetry. Keep azurerm_monitor_diagnostic_setting for platform resource logs where needed (e.g., NIC/PIP).
  • Add baseline alerts (CPU, memory, disk, VM deallocation, NIC/NSG changes) to make signals actionable.
  • Surface outputs other teams rely on (VM ID, NIC, network IDs, Key Vault, DES) while keeping secrets out of stdout (modules/azure_vm/outputs.tf).
  • Provide turnkey examples so consumers can initialize Terraform in a dedicated directory and see successful plan output right away.
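
A sketch of what such outputs might look like, using resource addresses and locals that appear in the listings in this article. The output names themselves are assumptions; the module's real outputs.tf may differ.

```hcl
# Fragment intended to live inside the module, next to main.tf and locals.tf.
output "network_interface_id" {
  description = "ID of the primary network interface."
  value       = azurerm_network_interface.primary.id
}

output "key_vault_id" {
  description = "ID of the Key Vault holding the CMK and optional SSH secrets."
  value       = azurerm_key_vault.this.id
}

output "disk_encryption_set_id" {
  description = "ID of the disk encryption set applied to VM disks."
  value       = azurerm_disk_encryption_set.this.id
}

output "admin_private_key" {
  description = "Generated admin SSH private key, or null when the caller supplied a key."
  value       = local.admin_private_key
  sensitive   = true # redacted in CLI output, but still stored in state
}
```

Marking an output sensitive only hides it from plan and apply output; the value is still written to state, which is one more reason to harden the backend.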

8. Test, Iterate, Document

  • Format and validate after every major change; we fixed numerous plan-time errors this way by repeatedly running terraform fmt, terraform validate, and terraform plan within examples/basic/.
  • Capture lessons learned—API quirks, authentication expectations, provider differences—in the README and commit notes.
  • Remind users that Terraform still needs Azure credentials; module quality can’t solve missing logins.

9. Governance, Backup, and Update Management

  • Azure Policy: assign initiatives to disallow public IP on NICs, require Trusted Launch, enforce required tags/allowed locations, and require AMA on VMs.
  • Backup/DR: protect the VM with a Recovery Services Vault and a backup policy for daily snapshots and retention; document restore testing and RPO/RTO expectations.
  • Updates: expose maintenance configurations or schedules with Azure Update Manager; set patch_mode and patch_assessment_mode to align with organizational SLAs.
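
The backup bullet could be wired up roughly as follows. This is a sketch under stated assumptions: the names, region, schedule, and retention are placeholders, and vm_id is a hypothetical input fed from the module's VM ID output.

```hcl
# Hypothetical input: the ID of the VM to protect.
variable "vm_id" {
  type = string
}

resource "azurerm_recovery_services_vault" "example" {
  name                = "example-rsv"
  location            = "westeurope" # placeholder
  resource_group_name = "example-rg" # placeholder
  sku                 = "Standard"
}

resource "azurerm_backup_policy_vm" "daily" {
  name                = "daily-30d"
  resource_group_name = "example-rg"
  recovery_vault_name = azurerm_recovery_services_vault.example.name

  # Enhanced policy. Assumption: Azure Backup expects this for Trusted Launch VMs;
  # verify against current Azure documentation before relying on it.
  policy_type = "V2"

  backup {
    frequency = "Daily"
    time      = "23:00"
  }

  retention_daily {
    count = 30
  }
}

resource "azurerm_backup_protected_vm" "example" {
  resource_group_name = "example-rg"
  recovery_vault_name = azurerm_recovery_services_vault.example.name
  source_vm_id        = var.vm_id
  backup_policy_id    = azurerm_backup_policy_vm.daily.id
}
```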

10. CI/CD and IaC Security

  • Pipeline checks: run terraform fmt -check, terraform validate, tflint, and tfsec or checkov; gate merges on a clean report.
  • Policy-as-code: add OPA/Conftest to block plans that violate security controls (e.g., public IPs or missing AMA).
  • Remote state: use the azurerm backend with private endpoints, CMK, and RBAC; restrict write access and enable soft delete/lock.
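
A minimal backend sketch for the remote state bullet. The resource group, storage account, container, and key are placeholder names; the storage account itself (private endpoints, CMK, soft delete) is assumed to be provisioned separately.

```hcl
# Goes in the root configuration (for example examples/basic), not in the module.
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"       # placeholder
    storage_account_name = "sttfstateexample" # placeholder
    container_name       = "tfstate"
    key                  = "azure-vm/basic.tfstate"

    # Authenticate to the blob data plane with Microsoft Entra ID and RBAC
    # instead of storage account access keys.
    use_azuread_auth = true
  }
}
```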

11. Quickstart

  • Prereqs: Terraform >= 1.4, Azure CLI or Az PowerShell, access to a subscription, and an existing Log Analytics workspace if you plan to associate a DCR.
  • Login: az login and set the subscription with az account set --subscription <SUB_ID>.
  • Initialize an example: cd examples/basic && terraform init.
  • Plan and apply: terraform plan -out tfplan then terraform apply tfplan.
  • Review outputs, then destroy if this is a sandbox: terraform destroy.
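
For orientation, an examples/basic/main.tf along these lines would exercise the inputs defined in variables.tf. It is a sketch: the relative source path assumes examples/ sits beside modules/ at the repository root, and all values are placeholders.

```hcl
# The module listing in this article configures the azurerm provider itself;
# add a provider block here if you move that configuration to the root.
module "vm" {
  source = "../../modules/azure_vm" # assumed repository layout

  global_settings = {
    location            = "westeurope"
    location_short_name = "weu"
    service_name        = "demo"
    environment         = "dev"
    customer            = "contoso"
  }

  resource_group = {
    create = true
  }

  vm = {
    size           = "Standard_D2s_v5"
    admin_username = "azureuser"
    source_image_reference = {
      publisher = "Canonical"
      offer     = "0001-com-ubuntu-server-jammy"
      sku       = "22_04-lts-gen2"
      version   = "latest"
    }
  }

  network = {
    virtual_network = { create = true }
    subnet          = { create = true }
  }
}
```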

12. Module Template Checklist

  • Inputs typed with validation; avoid any for public modules; provide helpful error messages.
  • Deterministic locals for names/tags; avoid repeating name logic in resources.
  • Outputs are stable and minimal; mark secrets sensitive = true.
  • At least one runnable example; README usage matches example code.
  • Auto-generated docs via terraform-docs; keep README concise and consistent.
  • SemVer and CHANGELOG; tag releases; optionally publish to Terraform Registry.
  • CI: fmt/validate/tflint/security scans; policy gates for public IPs, missing tags, missing AMA.
  • Backend hardened: storage with private endpoints, CMK, RBAC, soft delete, immutability (where applicable).
  • Pre-commit hooks mirroring CI checks.

13. Testing Strategies

  • Plan checks in CI: terraform plan -detailed-exitcode to detect drift or unintended changes.
  • Unit-ish tests with Terratest for naming, outputs, input validation, and creation toggles; run in a sandbox subscription.
  • Ephemeral environments: prefix resources with a unique token to allow parallel test runs; always destroy.
  • Negative tests: ensure invalid inputs fail fast with your validation blocks.
  • Extension health: verify VM extension provisioning succeeded (Azure Portal -> VM -> Extensions) and capture logs on failure.
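
Alongside Terratest, negative tests can also be written in Terraform's native test framework. The sketch below assumes Terraform 1.6 or later (newer than the module's stated minimum) and a hypothetical file at modules/azure_vm/tests/validation.tftest.hcl; depending on your setup, the run may still need provider initialization and Azure credentials.

```hcl
# Shared inputs for every run block in this file (placeholder values).
variables {
  global_settings = {
    location            = "westeurope"
    location_short_name = "weu"
    service_name        = "demo"
    environment         = "test"
    customer            = "contoso"
  }
}

run "rejects_vm_without_image_source" {
  command = plan

  variables {
    # size and admin_username are present, but no image source is given,
    # so the second validation block on var.vm should fail.
    vm = {
      size           = "Standard_D2s_v5"
      admin_username = "azureuser"
    }
  }

  # The test passes only if validation on var.vm fails.
  expect_failures = [var.vm]
}
```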

14. Troubleshooting

  • Auth problems: ensure az login and correct subscription; service principals need roles on the RG/subscription.
  • Provider mismatch: align with required_providers in versions.tf; run terraform init -upgrade if needed.
  • Trusted Launch errors: verify Gen2 image and region support; disable secure_boot_enabled/vtpm_enabled only if necessary.
  • AMA/DCR: association requires Reader on the DCR and permission to write association on the VM; if the agent fails, check the VM’s Extension logs.
  • Key Vault RBAC vs access policies: in RBAC mode you won’t see access policies; use role assignments for DES identity and any MI that needs secrets.

Lessons for DevSecOps Engineers

  • Sanitizing inputs and deferring to locals keeps modules resilient to partial configuration and future schema changes.
  • Azure provider behaviour evolves; verifying attribute names against the latest docs prevents frustrating plan errors.
  • Security “defaults on” (encryption, identities, diagnostics) aligns with governance without sacrificing flexibility.
  • Plan-time validation matters—friendly errors beat debugging a failed apply.
  • Well-documented examples turn a complex module into plug-and-play infrastructure.

Next Steps for Readers

  1. Copy the module definitions from this article into your own workspace, run terraform init, then run terraform fmt and terraform validate to confirm the configuration parses and your environment is set up.
  2. Review the embedded modules/azure_vm/README.md excerpt below for usage patterns, required inputs, and output references.
  3. Adapt the module with organization-specific policies (e.g., Defender settings, diagnostics sinks, tag taxonomies).
  4. Share feedback and enhancements with your DevSecOps peers—great modules evolve through collaboration.

Azure VM Module Reference

  • main.tf orchestrates resource creation, covering resource groups, networking, the virtual machine, Key Vault, disk encryption set, diagnostic settings, and optional Azure AD SSH configuration.
  • variables.tf defines the input schema, validation rules, and helpful defaults for networking, security, and compute objects.
  • locals.tf normalizes caller input into deterministic names, tags, and helper structures consumed across resources.
  • outputs.tf exposes identifiers for the VM, network interface, network resources, Key Vault artifacts, and generated credentials.
  • versions.tf sets minimum Terraform and provider versions; commit the .terraform.lock.hcl file in calling configurations to keep plans reproducible.
  • README.md documents expectations, usage scenarios, and operational guidance for consumers.

Full Module Source Listings

The following code blocks reproduce the module so you can reference or inline it directly from this article. Start with the README excerpt to understand intent and usage, then copy the Terraform files as needed.

modules/azure_vm/versions.tf

terraform {
  required_version = ">= 1.4.0"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.100.0"
    }
    random = {
      source  = "hashicorp/random"
      version = ">= 3.5.1"
    }
    tls = {
      source  = "hashicorp/tls"
      version = ">= 4.0.4"
    }
  }
}

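# Kept here so the listing runs standalone. Reusable modules normally leave provider
# configuration to the root module; a module with its own provider block cannot be
# called with count, for_each, or depends_on.
provider "azurerm" {
  features {}
}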

modules/azure_vm/variables.tf

variable "global_settings" {
  description = "Global naming and tagging context for the deployment."
  type        = any

  validation {
    condition = alltrue([
      for key in ["location", "location_short_name", "service_name", "environment", "customer"] :
      contains(keys(var.global_settings), key) && try(length(trimspace(var.global_settings[key])), 0) > 0
    ])
    error_message = "global_settings must include non-empty values for location, location_short_name, service_name, environment, and customer."
  }

  validation {
    condition     = !contains(keys(var.global_settings), "tags") || can(merge({}, var.global_settings["tags"]))
    error_message = "global_settings.tags must be a map of strings when provided."
  }
}

variable "resource_group" {
  description = "Resource group configuration."
  type        = any
  default     = {}
}

variable "vm" {
  description = "Virtual machine compute configuration."
  type        = any
  default     = {}

  validation {
    condition     = contains(keys(var.vm), "size") && contains(keys(var.vm), "admin_username")
    error_message = "vm.size and vm.admin_username must be provided."
  }

  validation {
    condition = (
      contains(keys(var.vm), "source_image_id") && try(length(trimspace(var.vm["source_image_id"])), 0) > 0
      ) || (
      contains(keys(var.vm), "shared_gallery_image_id") && try(length(trimspace(var.vm["shared_gallery_image_id"])), 0) > 0
      ) || (
      contains(keys(var.vm), "source_image_reference") && var.vm["source_image_reference"] != null
    )
    error_message = "Provide vm.source_image_id, vm.shared_gallery_image_id, or vm.source_image_reference."
  }
}

variable "network" {
  description = "Networking configuration supporting existing or new deployments."
  type        = any
  default     = {}
}

variable "security" {
  description = "Security configuration including Key Vault, disk encryption, and related controls."
  type        = any
  default     = {}
}

variable "rbac_login" {
  description = "Controls for installing Azure AD SSH login extension."
  type        = any
  default     = {}
}

variable "extra_tags" {
  description = "Additional tags to merge with computed tags."
  type        = map(string)
  default     = {}
}

modules/azure_vm/locals.tf

locals {
  global_input = {
    location            = lookup(var.global_settings, "location", null)
    location_short_name = lookup(var.global_settings, "location_short_name", null)
    service_name        = lookup(var.global_settings, "service_name", null)
    environment         = lookup(var.global_settings, "environment", null)
    customer            = lookup(var.global_settings, "customer", null)
    tags                = lookup(var.global_settings, "tags", {})
  }

  resource_group_input = {
    create        = lookup(var.resource_group, "create", false)
    name_override = lookup(var.resource_group, "name_override", "")
    tags          = lookup(var.resource_group, "tags", {})
  }

  network_input = {
    virtual_network        = lookup(var.network, "virtual_network", {})
    subnet                 = lookup(var.network, "subnet", {})
    network_security_group = lookup(var.network, "network_security_group", {})
    public_ip              = lookup(var.network, "public_ip", {})
    network_interface      = lookup(var.network, "network_interface", {})
    dns_zone_id            = lookup(var.network, "dns_zone_id", null)
  }

  vm_input = {
    size                            = lookup(var.vm, "size", null)
    admin_username                  = lookup(var.vm, "admin_username", null)
    disable_password_authentication = lookup(var.vm, "disable_password_authentication", true)
    source_image_reference          = lookup(var.vm, "source_image_reference", null)
    source_image_id                 = lookup(var.vm, "source_image_id", null)
    shared_gallery_image_id         = lookup(var.vm, "shared_gallery_image_id", null)
    plan                            = lookup(var.vm, "plan", null)
    os_disk                         = lookup(var.vm, "os_disk", {})
    data_disks                      = lookup(var.vm, "data_disks", [])
    zone                            = lookup(var.vm, "zone", null)
    proximity_placement_group_id    = lookup(var.vm, "proximity_placement_group_id", null)
    patch_mode                      = lookup(var.vm, "patch_mode", null)
    reboot_setting                  = lookup(var.vm, "reboot_setting", null)
    enable_automatic_updates        = lookup(var.vm, "enable_automatic_updates", null)
    enable_agent                    = lookup(var.vm, "enable_agent", null)
    identity                        = lookup(var.vm, "identity", {})
    admin_ssh_key                   = lookup(var.vm, "admin_ssh_key", {})
    custom_data                     = lookup(var.vm, "custom_data", null)
    boot_diagnostics                = lookup(var.vm, "boot_diagnostics", {})
    diagnostics_settings            = lookup(var.vm, "diagnostics_settings", null)
    availability_set_id             = lookup(var.vm, "availability_set_id", null)
  }

  security_input = {
    key_vault                    = lookup(var.security, "key_vault", {})
    key                          = lookup(var.security, "key", {})
    store_admin_key_in_key_vault = lookup(var.security, "store_admin_key_in_key_vault", true)
    enable_azure_defender        = lookup(var.security, "enable_azure_defender", false)
  }

  rbac_login_input = {
    enable              = lookup(var.rbac_login, "enable", true)
    extension_version   = lookup(var.rbac_login, "extension_version", "1.5")
    managed_identity_id = lookup(var.rbac_login, "managed_identity_id", null)
  }
}

locals {
  sanitized_tokens = {
    location    = join("", regexall("[0-9a-z]", lower(local.global_input.location_short_name)))
    service     = join("", regexall("[0-9a-z-]", lower(local.global_input.service_name)))
    environment = join("", regexall("[0-9a-z-]", lower(local.global_input.environment)))
    customer    = join("", regexall("[0-9a-z-]", lower(local.global_input.customer)))
  }
  name_prefix  = join("-", compact([local.sanitized_tokens.location, local.sanitized_tokens.service]))
  name_postfix = join("-", compact([local.sanitized_tokens.environment, local.sanitized_tokens.customer]))
  vm_name      = join("-", compact([local.name_prefix, "vm", local.name_postfix]))

  default_tags = {
    Environment = local.global_input.environment
    Service     = local.global_input.service_name
    Customer    = local.global_input.customer
    Location    = local.global_input.location
  }
  combined_tags = merge(local.default_tags, local.global_input.tags, var.extra_tags)

  network_virtual_network    = local.network_input.virtual_network
  network_subnet             = local.network_input.subnet
  network_public_ip          = local.network_input.public_ip
  network_security_group     = local.network_input.network_security_group
  network_interface_settings = local.network_input.network_interface
  key_vault_settings         = local.security_input.key_vault
  key_settings               = local.security_input.key
  rbac_login_settings        = local.rbac_login_input
}

locals {
  resource_group_name_override_raw = trimspace(local.resource_group_input.name_override)
  resource_group_name_override     = local.resource_group_name_override_raw != "" ? local.resource_group_name_override_raw : null
  resource_group_name_base         = join("-", compact([local.sanitized_tokens.location, "rg", local.name_postfix]))
  resource_group_name              = coalesce(local.resource_group_name_override, local.resource_group_name_base)
  resource_group_location          = local.global_input.location
}

locals {
  base_resource_token = join("", [
    local.sanitized_tokens.location,
    local.sanitized_tokens.service,
    local.sanitized_tokens.environment,
    local.sanitized_tokens.customer
  ])
  key_vault_base = substr("kv${local.base_resource_token}", 0, 20)
  key_vault_name = lower(substr("${local.key_vault_base}${random_string.unique.result}", 0, 24))

  resource_names = {
    resource_group           = local.resource_group_name
    virtual_network          = coalesce(try(local.network_virtual_network.name, null), format("%s-vnet-%s", local.name_prefix, local.name_postfix))
    subnet                   = coalesce(try(local.network_subnet.name, null), format("%s-snet-%s", local.name_prefix, local.name_postfix))
    network_security_group   = coalesce(try(local.network_security_group.name, null), format("%s-nsg-%s", local.name_prefix, local.name_postfix))
    network_interface        = format("%s-nic-%s", local.name_prefix, local.name_postfix)
    public_ip                = format("%s-pip-%s", local.name_prefix, local.name_postfix)
    disk_encryption_set      = format("%s-des-%s", local.name_prefix, local.name_postfix)
    key_vault                = coalesce(try(local.key_vault_settings.name, null), local.key_vault_name)
    key_vault_secret_private = format("%s-admin-ssh-private", replace(local.vm_name, "-", ""))
    key_vault_secret_public  = format("%s-admin-ssh-public", replace(local.vm_name, "-", ""))
    os_disk                  = format("%s-osdisk-%s", local.name_prefix, local.name_postfix)
  }
}

locals {
  virtual_network_id_input      = try(local.network_virtual_network.id, null)
  virtual_network_rg_name       = coalesce(try(local.network_virtual_network.resource_group_name, null), local.resource_group_name)
  create_virtual_network        = local.virtual_network_id_input == null && try(local.network_virtual_network.create, false)
  need_existing_virtual_network = local.virtual_network_id_input == null && !local.create_virtual_network
}

locals {
  virtual_network_id = coalesce(
    local.virtual_network_id_input,
    local.create_virtual_network ? azurerm_virtual_network.this[0].id : null,
    local.need_existing_virtual_network ? data.azurerm_virtual_network.this[0].id : null
  )
}

locals {
  subnet_id_input      = try(local.network_subnet.id, null)
  create_subnet        = local.subnet_id_input == null && try(local.network_subnet.create, false)
  need_existing_subnet = local.subnet_id_input == null && !local.create_subnet
}

locals {
  subnet_id = coalesce(
    local.subnet_id_input,
    local.create_subnet ? azurerm_subnet.this[0].id : null,
    local.need_existing_subnet ? data.azurerm_subnet.this[0].id : null
  )
}

locals {
  network_security_group_id_input = try(local.network_security_group.id, null)
  create_network_security_group   = local.network_security_group_id_input == null && try(local.network_security_group.create, false)
}

locals {
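  # coalesce() errors when every argument is null, so use a conditional that can
  # yield null when no NSG is supplied or created.
  network_security_group_id = local.create_network_security_group ? azurerm_network_security_group.this[0].id : local.network_security_group_id_input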
}

locals {
  public_ip_create     = try(local.network_public_ip.create, false)
  public_ip_zones      = try(local.network_public_ip.zones, null)
  public_ip_label_base = lower(replace(local.vm_name, "-", ""))
  # Terraform has no nullif function; coalesce() already skips null and empty strings.
  public_ip_domain_label = local.public_ip_create ? coalesce(
    try(local.network_public_ip.domain_name_label, null),
    format("%s%s", substr(local.public_ip_label_base, 0, max(0, 59 - length(random_string.unique.result))), random_string.unique.result)
  ) : null
}

locals {
  public_ip_id = local.public_ip_create ? azurerm_public_ip.this[0].id : null
}

locals {
  key_vault_enable_private_endpoint = try(local.key_vault_settings.enable_private_endpoint, false)
  key_vault_network_acls            = try(local.key_vault_settings.network_acls, {})
  key_vault_effective_access_policies = try(local.key_vault_settings.enable_rbac_authorization, false) ? [] : concat([
    {
      tenant_id = data.azurerm_client_config.current.tenant_id
      object_id = data.azurerm_client_config.current.object_id
      permissions = {
        keys         = ["Get", "List", "Create", "Update", "Delete", "Recover", "Backup", "Restore", "Import", "Purge"]
        secrets      = ["Get", "List", "Set", "Delete", "Recover", "Backup", "Restore"]
        certificates = ["Get", "List", "Create", "Update", "Import", "Delete", "Recover", "Backup", "Restore"]
        storage      = []
      }
    }
  ], try(local.key_vault_settings.access_policies, []))
}

locals {
  vm_disable_password_authentication = local.vm_input.disable_password_authentication
  provided_admin_public_key          = trimspace(lookup(local.vm_input.admin_ssh_key, "public_key", ""))
  use_generated_admin_key            = local.vm_disable_password_authentication && length(local.provided_admin_public_key) == 0
}

locals {
  admin_public_key  = local.use_generated_admin_key ? tls_private_key.admin[0].public_key_openssh : local.provided_admin_public_key
  admin_private_key = local.use_generated_admin_key ? tls_private_key.admin[0].private_key_pem : null
}

locals {
  identity_config = {
    type         = coalesce(try(local.vm_input.identity.type, null), "SystemAssigned")
    identity_ids = distinct(try(local.vm_input.identity.identity_ids, []))
  }
  vm_zone_value             = local.vm_input.zone
  vm_zone                   = local.vm_zone_value == null ? null : tostring(local.vm_zone_value)
  vm_source_image_id        = local.vm_input.source_image_id
  vm_source_image_reference = local.vm_input.source_image_reference
}

locals {
  vm_diagnostics_enabled  = local.vm_input.diagnostics_settings != null
  vm_diagnostics_settings = local.vm_input.diagnostics_settings != null ? local.vm_input.diagnostics_settings : {}
  vm_data_disks           = local.vm_input.data_disks != null ? local.vm_input.data_disks : []
}

locals {
  data_disks = { for disk in local.vm_data_disks : tostring(disk.lun) => disk }
}

modules/azure_vm/main.tf

data "azurerm_client_config" "current" {}

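# Other resources in this listing take the group name from local.resource_group_name (a plain string),
# so they carry no implicit dependency on this resource; add depends_on where create = true.
resource "azurerm_resource_group" "this" {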
  count    = local.resource_group_input.create ? 1 : 0
  name     = local.resource_group_name
  location = local.resource_group_location
  tags     = merge(local.combined_tags, local.resource_group_input.tags)
}

data "azurerm_resource_group" "this" {
  count = local.resource_group_input.create ? 0 : 1
  name  = local.resource_group_name
}

resource "random_string" "unique" {
  length  = 4
  upper   = false
  special = false
  numeric = true
}

resource "azurerm_virtual_network" "this" {
  count               = local.create_virtual_network ? 1 : 0
  name                = local.resource_names.virtual_network
  location            = local.resource_group_location
  resource_group_name = local.virtual_network_rg_name
  address_space       = try(local.network_virtual_network.address_space, ["10.0.0.0/16"])
  dns_servers         = try(local.network_virtual_network.dns_servers, [])
  tags                = local.combined_tags
}

data "azurerm_virtual_network" "this" {
  count               = local.need_existing_virtual_network ? 1 : 0
  name                = coalesce(try(local.network_virtual_network.name, null), local.resource_names.virtual_network)
  resource_group_name = local.virtual_network_rg_name
}

resource "azurerm_subnet" "this" {
  count                                         = local.create_subnet ? 1 : 0
  name                                          = local.resource_names.subnet
  resource_group_name                           = local.virtual_network_rg_name
  # Reference the managed VNet when this module creates it so the subnet waits for it;
  # otherwise use the supplied or generated name without indexing a zero-count resource.
  virtual_network_name                          = local.create_virtual_network ? azurerm_virtual_network.this[0].name : local.resource_names.virtual_network
  address_prefixes                              = try(local.network_subnet.address_prefixes, ["10.0.1.0/24"])
  service_endpoints                             = try(local.network_subnet.service_endpoints, [])
  private_endpoint_network_policies             = try(local.network_subnet.private_endpoint_network_policies, "Disabled")
  private_link_service_network_policies_enabled = try(local.network_subnet.private_link_service_network_policies_enabled, true)

  dynamic "delegation" {
    for_each = try(local.network_subnet.delegations, [])
    content {
      name = delegation.value.name
      service_delegation {
        name    = delegation.value.service_delegation.name
        actions = delegation.value.service_delegation.actions
      }
    }
  }
}

data "azurerm_subnet" "this" {
  count                = local.need_existing_subnet ? 1 : 0
  name                 = coalesce(try(local.network_subnet.name, null), local.resource_names.subnet)
  resource_group_name  = local.virtual_network_rg_name
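  # Avoid indexing the VNet data source when it has count = 0 (for example when a VNet ID is supplied).
  virtual_network_name = local.create_virtual_network ? azurerm_virtual_network.this[0].name : local.resource_names.virtual_network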
}

resource "azurerm_network_security_group" "this" {
  count               = local.create_network_security_group ? 1 : 0
  name                = local.resource_names.network_security_group
  location            = local.resource_group_location
  resource_group_name = local.resource_group_name
  tags                = local.combined_tags

  dynamic "security_rule" {
    for_each = try(local.network_security_group.rules, [])
    content {
      name                         = security_rule.value.name
      priority                     = security_rule.value.priority
      direction                    = security_rule.value.direction
      access                       = security_rule.value.access
      protocol                     = security_rule.value.protocol
      source_port_range            = try(security_rule.value.source_port_range, null)
      source_port_ranges           = try(security_rule.value.source_port_ranges, null)
      destination_port_range       = try(security_rule.value.destination_port_range, null)
      destination_port_ranges      = try(security_rule.value.destination_port_ranges, null)
      source_address_prefix        = try(security_rule.value.source_address_prefix, null)
      source_address_prefixes      = try(security_rule.value.source_address_prefixes, null)
      destination_address_prefix   = try(security_rule.value.destination_address_prefix, null)
      destination_address_prefixes = try(security_rule.value.destination_address_prefixes, null)
      description                  = try(security_rule.value.description, null)
    }
  }
}

resource "azurerm_public_ip" "this" {
  count                   = local.public_ip_create ? 1 : 0
  name                    = local.resource_names.public_ip
  resource_group_name     = local.resource_group_name
  location                = local.resource_group_location
  allocation_method       = coalesce(try(local.network_public_ip.allocation_method, null), "Static")
  sku                     = coalesce(try(local.network_public_ip.sku, null), "Standard")
  idle_timeout_in_minutes = try(local.network_public_ip.idle_timeout_in_minutes, 4)
  tags                    = local.combined_tags

  domain_name_label = local.public_ip_domain_label
  zones             = local.public_ip_zones
}

resource "azurerm_network_interface" "primary" {
  name                           = local.resource_names.network_interface
  location                       = local.resource_group_location
  resource_group_name            = local.resource_group_name
  accelerated_networking_enabled = try(local.network_interface_settings.accelerated_networking, false)
  ip_forwarding_enabled          = try(local.network_interface_settings.enable_ip_forwarding, false)
  tags                           = local.combined_tags

  ip_configuration {
    name                          = "primary"
    subnet_id                     = local.subnet_id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = local.public_ip_id
  }
}

resource "azurerm_network_interface_security_group_association" "primary" {
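  # Decide from inputs known at plan time; the ID of a newly created NSG is unknown until apply.
  count                     = (local.network_security_group_id_input != null || local.create_network_security_group) ? 1 : 0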
  network_interface_id      = azurerm_network_interface.primary.id
  network_security_group_id = local.network_security_group_id
}

resource "azurerm_key_vault" "this" {
  name                       = local.resource_names.key_vault
  location                   = local.resource_group_location
  resource_group_name        = local.resource_group_name
  tenant_id                  = data.azurerm_client_config.current.tenant_id
  # The provider expects the lowercase values "standard" or "premium".
  sku_name                   = lower(try(local.key_vault_settings.sku_name, "standard"))
  soft_delete_retention_days = try(local.key_vault_settings.soft_delete_retention_days, 90)
  purge_protection_enabled   = try(local.key_vault_settings.purge_protection_enabled, true)
  enable_rbac_authorization  = try(local.key_vault_settings.enable_rbac_authorization, false)
  tags                       = local.combined_tags

  # Access policies are managed by the standalone azurerm_key_vault_access_policy resources below.
  # The provider documentation warns against mixing them with inline access_policy blocks on the same vault.

  dynamic "network_acls" {
    for_each = length(keys(local.key_vault_network_acls)) == 0 ? [] : [local.key_vault_network_acls]
    content {
      bypass                     = try(network_acls.value.bypass, "AzureServices")
      default_action             = try(network_acls.value.default_action, "Deny")
      ip_rules                   = try(network_acls.value.ip_rules, [])
      virtual_network_subnet_ids = try(network_acls.value.virtual_network_subnet_ids, [])
    }
  }
}

resource "azurerm_key_vault_access_policy" "configured" {
  # Index keys keep for_each known at plan time even when object IDs are computed.
  for_each = { for idx, policy in local.key_vault_effective_access_policies : tostring(idx) => policy }

  key_vault_id            = azurerm_key_vault.this.id
  tenant_id               = each.value.tenant_id
  object_id               = each.value.object_id
  key_permissions         = each.value.permissions.keys
  secret_permissions      = each.value.permissions.secrets
  certificate_permissions = each.value.permissions.certificates
  storage_permissions     = each.value.permissions.storage
}

resource "azurerm_key_vault_key" "cmk" {
  name         = try(local.key_settings.name, "vm-cmk")
  key_vault_id = azurerm_key_vault.this.id
  key_type     = try(local.key_settings.key_type, "RSA")
  key_size     = try(local.key_settings.key_size, 4096)
  key_opts     = try(local.key_settings.key_opts, ["decrypt", "encrypt", "sign", "verify", "wrapKey", "unwrapKey"])

  # The deployer's access policy must exist before the key can be created.
  depends_on = [azurerm_key_vault_access_policy.configured]
}

resource "azurerm_disk_encryption_set" "this" {
  name                = local.resource_names.disk_encryption_set
  location            = local.resource_group_location
  resource_group_name = local.resource_group_name
  key_vault_key_id    = azurerm_key_vault_key.cmk.id
  tags                = local.combined_tags

  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_key_vault_access_policy" "disk_encryption" {
  count        = try(local.key_vault_settings.enable_rbac_authorization, false) ? 0 : 1
  key_vault_id = azurerm_key_vault.this.id
  tenant_id    = data.azurerm_client_config.current.tenant_id
  object_id    = azurerm_disk_encryption_set.this.identity[0].principal_id

  key_permissions = ["Get", "WrapKey", "UnwrapKey"]
}

resource "azurerm_private_endpoint" "key_vault" {
  count               = local.key_vault_enable_private_endpoint ? 1 : 0
  name                = format("%s-pe-kv", local.vm_name)
  location            = local.resource_group_location
  resource_group_name = local.resource_group_name
  subnet_id           = try(local.key_vault_settings.private_endpoint_subnet_id, null)
  tags                = local.combined_tags

  private_service_connection {
    name                           = format("%s-kv-psc", replace(local.vm_name, "-", ""))
    private_connection_resource_id = azurerm_key_vault.this.id
    subresource_names              = ["vault"]
    is_manual_connection           = false
  }

  dynamic "private_dns_zone_group" {
    for_each = local.network_input.dns_zone_id == null ? [] : [local.network_input.dns_zone_id]
    content {
      name                 = format("%s-kv-dns", replace(local.vm_name, "-", ""))
      private_dns_zone_ids = [private_dns_zone_group.value]
    }
  }
}

resource "tls_private_key" "admin" {
  count     = local.use_generated_admin_key ? 1 : 0
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "azurerm_key_vault_secret" "admin_private" {
  count        = local.admin_private_key == null || !local.security_input.store_admin_key_in_key_vault ? 0 : 1
  name         = local.resource_names.key_vault_secret_private
  value        = local.admin_private_key
  key_vault_id = azurerm_key_vault.this.id
  content_type = "application/x-pem-file"
  tags         = local.combined_tags
}

resource "azurerm_key_vault_secret" "admin_public" {
  count        = local.admin_public_key == null || !local.security_input.store_admin_key_in_key_vault ? 0 : 1
  name         = local.resource_names.key_vault_secret_public
  value        = local.admin_public_key
  key_vault_id = azurerm_key_vault.this.id
  content_type = "application/x-pem-file"
  tags         = local.combined_tags
}

resource "azurerm_linux_virtual_machine" "this" {
  name                            = local.vm_name
  resource_group_name             = local.resource_group_name
  location                        = local.resource_group_location
  size                            = local.vm_input.size
  network_interface_ids           = [azurerm_network_interface.primary.id]
  admin_username                  = local.vm_input.admin_username
  disable_password_authentication = local.vm_disable_password_authentication
  computer_name                   = substr(replace(local.vm_name, "-", ""), 0, 15)
  provision_vm_agent              = coalesce(local.vm_input.enable_agent, true)
  patch_mode                      = coalesce(local.vm_input.patch_mode, "AutomaticByPlatform")
  reboot_setting                  = coalesce(local.vm_input.reboot_setting, "IfRequired")
  allow_extension_operations      = true
  encryption_at_host_enabled      = true
  zone                            = local.vm_zone
  availability_set_id             = local.vm_input.availability_set_id
  proximity_placement_group_id    = local.vm_input.proximity_placement_group_id
  custom_data                     = local.vm_input.custom_data == null ? null : base64encode(local.vm_input.custom_data)
  source_image_id                 = local.vm_source_image_id

  dynamic "source_image_reference" {
    for_each = local.vm_source_image_id == null ? [local.vm_source_image_reference] : []
    content {
      publisher = source_image_reference.value.publisher
      offer     = source_image_reference.value.offer
      sku       = source_image_reference.value.sku
      version   = source_image_reference.value.version
    }
  }

  dynamic "plan" {
    for_each = local.vm_input.plan == null ? [] : [local.vm_input.plan]
    content {
      name      = plan.value.name
      product   = plan.value.product
      publisher = plan.value.publisher
    }
  }

  os_disk {
    name                      = local.resource_names.os_disk
    caching                   = lookup(local.vm_input.os_disk, "caching", "ReadWrite")
    storage_account_type      = lookup(local.vm_input.os_disk, "storage_account_type", "Premium_LRS")
    disk_size_gb              = lookup(local.vm_input.os_disk, "disk_size_gb", null)
    write_accelerator_enabled = lookup(local.vm_input.os_disk, "write_accelerator_enabled", false)
    disk_encryption_set_id    = azurerm_disk_encryption_set.this.id
  }

  dynamic "admin_ssh_key" {
    for_each = local.vm_disable_password_authentication && local.admin_public_key != null && local.admin_public_key != "" ? [local.admin_public_key] : []
    content {
      username   = local.vm_input.admin_username
      public_key = admin_ssh_key.value
    }
  }

  dynamic "boot_diagnostics" {
    for_each = lookup(local.vm_input.boot_diagnostics, "enable", true) ? [1] : []
    content {
      storage_account_uri = lookup(local.vm_input.boot_diagnostics, "storage_account_uri", null)
    }
  }

  identity {
    type         = local.identity_config.type
    identity_ids = contains(["UserAssigned", "SystemAssigned,UserAssigned", "SystemAssigned, UserAssigned"], local.identity_config.type) ? local.identity_config.identity_ids : []
  }

  lifecycle {
    ignore_changes = [
      # Ignore image-reference and cloud-init drift so a newer image version or
      # edited custom_data does not force the VM to be replaced
      source_image_reference,
      custom_data
    ]
  }

  tags = local.combined_tags
}

resource "azurerm_managed_disk" "data" {
  for_each               = local.data_disks
  name                   = coalesce(each.value.name, format("%s-datadisk-%02d", local.vm_name, tonumber(each.key)))
  location               = local.resource_group_location
  resource_group_name    = local.resource_group_name
  storage_account_type   = try(each.value.storage_account_type, "Premium_LRS")
  create_option          = "Empty"
  disk_size_gb           = each.value.disk_size_gb
  disk_encryption_set_id = azurerm_disk_encryption_set.this.id
  tags                   = local.combined_tags
}

resource "azurerm_virtual_machine_data_disk_attachment" "data" {
  for_each           = local.data_disks
  managed_disk_id    = azurerm_managed_disk.data[each.key].id
  virtual_machine_id = azurerm_linux_virtual_machine.this.id
  lun                = tonumber(each.key)
  caching            = try(each.value.caching, "ReadOnly")
}

resource "azurerm_virtual_machine_extension" "aad_ssh" {
  count                      = try(local.rbac_login_settings.enable, true) ? 1 : 0
  name                       = format("%s-aadssh", local.vm_name)
  virtual_machine_id         = azurerm_linux_virtual_machine.this.id
  publisher                  = "Microsoft.Azure.ActiveDirectory"
  type                       = "AADSSHLoginForLinux"
  type_handler_version       = try(local.rbac_login_settings.extension_version, "1.0")
  auto_upgrade_minor_version = true
  settings                   = jsonencode({})
}

resource "azurerm_monitor_diagnostic_setting" "vm" {
  count                          = local.vm_diagnostics_enabled ? 1 : 0
  name                           = coalesce(lookup(local.vm_diagnostics_settings, "name", null), format("%s-diag", local.vm_name))
  target_resource_id             = azurerm_linux_virtual_machine.this.id
  log_analytics_workspace_id     = lookup(local.vm_diagnostics_settings, "destination_resource_id", null)
  log_analytics_destination_type = lookup(local.vm_diagnostics_settings, "log_analytics_destination_type", null)

  dynamic "enabled_metric" {
    for_each = lookup(local.vm_diagnostics_settings, "metrics", [])
    content {
      category = enabled_metric.value.category
    }
  }

  dynamic "enabled_log" {
    for_each = lookup(local.vm_diagnostics_settings, "logs", [])
    content {
      category       = try(enabled_log.value.category, null)
      category_group = try(enabled_log.value.category_group, null)

      # Retention is managed on the destination (workspace retention or a storage
      # lifecycle policy); the nested retention_policy block is deprecated and
      # rejected by azurerm 4.x.
    }
  }
}
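
One gap worth noting in the listing above: azurerm_key_vault_access_policy.disk_encryption is skipped when enable_rbac_authorization is true, so the Disk Encryption Set identity would need a role assignment instead. The following is a minimal sketch, assuming the module does not already handle this elsewhere. The resource label disk_encryption_rbac is hypothetical; Key Vault Crypto Service Encryption User is the Azure built-in role intended for this purpose.

```hcl
# Hypothetical companion to azurerm_key_vault_access_policy.disk_encryption:
# created only when the vault uses Azure RBAC instead of access policies.
resource "azurerm_role_assignment" "disk_encryption_rbac" {
  count                = try(local.key_vault_settings.enable_rbac_authorization, false) ? 1 : 0
  scope                = azurerm_key_vault.this.id
  role_definition_name = "Key Vault Crypto Service Encryption User"
  principal_id         = azurerm_disk_encryption_set.this.identity[0].principal_id
}
```

The count expression mirrors the access policy's condition with the branches swapped, so exactly one of the two grants exists for any given configuration.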

modules/azure_vm/outputs.tf

output "vm_id" {
  description = "ID of the Linux virtual machine."
  value       = azurerm_linux_virtual_machine.this.id
}

output "vm_name" {
  description = "Name of the Linux virtual machine."
  value       = azurerm_linux_virtual_machine.this.name
}

output "network_interface_id" {
  description = "Primary network interface ID."
  value       = azurerm_network_interface.primary.id
}

output "subnet_id" {
  description = "Subnet ID attached to the VM."
  value       = local.subnet_id
}

output "virtual_network_id" {
  description = "Virtual network ID in use."
  value       = local.virtual_network_id
}

output "public_ip_address" {
  description = "Public IP address assigned to the VM (if created)."
  value       = local.public_ip_create ? azurerm_public_ip.this[0].ip_address : null
}

output "admin_username" {
  description = "Administrator username for the VM."
  value       = var.vm.admin_username
}

output "admin_public_key" {
  description = "SSH public key used for the VM."
  value       = local.admin_public_key
  sensitive   = true
}

output "admin_private_key_secret_id" {
  description = "Key Vault secret ID containing the admin private key (null when the key is not stored in Key Vault)."
  value       = length(azurerm_key_vault_secret.admin_private) == 0 ? null : azurerm_key_vault_secret.admin_private[0].id
  sensitive   = true
}

output "key_vault_id" {
  description = "ID of the managed Key Vault for disk encryption and secrets."
  value       = azurerm_key_vault.this.id
}

output "disk_encryption_set_id" {
  description = "Disk Encryption Set ID assigned to the VM disks."
  value       = azurerm_disk_encryption_set.this.id
}
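
A caller would typically re-export or consume these values from the root configuration. The sketch below assumes the module is instantiated as module "vm", as in the examples that follow; the root-level output names are illustrative only.

```hcl
# Root-module outputs that surface values from the VM module.
output "vm_id" {
  description = "ID of the VM created by the module."
  value       = module.vm.vm_id
}

output "ssh_private_key_secret_id" {
  description = "Key Vault secret holding the admin SSH private key, if stored."
  value       = module.vm.admin_private_key_secret_id
  sensitive   = true # the module marks this output sensitive, so the caller must as well
}
```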

Example: examples/basic/main.tf

This is the runnable baseline from the repository. It shows the minimum inputs required to stand up a sandbox VM while exercising the module’s default security posture.

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.80.0"
    }
  }
}

provider "azurerm" {
  features {}
}

module "vm" {
  source = "../../modules/azure_vm"

  global_settings = {
    location            = "eastus"
    location_short_name = "eus"
    service_name        = "compute"
    environment         = "sandbox"
    customer            = "internal"
  }

  resource_group = {
    create = true
  }

  vm = {
    size           = "Standard_DS2_v2"
    admin_username = "azureuser"
    source_image_reference = {
      publisher = "Canonical"
      offer     = "0001-com-ubuntu-server-jammy"
      sku       = "22_04-lts-gen2"
      version   = "latest"
    }
  }

  network = {
    virtual_network = {
      create        = true
      address_space = ["10.20.0.0/16"]
    }
    subnet = {
      create            = true
      address_prefixes  = ["10.20.10.0/24"]
      service_endpoints = ["Microsoft.Storage"]
    }
    network_security_group = {
      create = true
      rules = [
        {
          name                   = "allow-ssh"
          priority               = 100
          direction              = "Inbound"
          access                 = "Allow"
          protocol               = "Tcp"
          source_port_range      = "*"
          destination_port_range = "22"
          source_address_prefix  = "*"
          description            = "Remote SSH"
        }
      ]
    }
    public_ip = {
      create = false
    }
  }

  extra_tags = {
    workload = "basic-demo"
  }
}

Production note: swap the marketplace image reference for a hardened image that your organization publishes (for example, through Azure Compute Gallery, formerly Shared Image Gallery, or a preapproved managed image) to satisfy security baselines.
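
As a rough sketch of that swap, a gallery image version can be looked up with the azurerm_shared_image_version data source and handed to the module through vm.source_image_id, the same input the WordPress example uses. The gallery, image, and resource group names below are placeholders, not values from the repository.

```hcl
# Placeholder names: replace with your organization's gallery details.
data "azurerm_shared_image_version" "hardened" {
  name                = "latest" # or pin an explicit version such as "1.0.0"
  image_name          = "ubuntu-2204-hardened"
  gallery_name        = "gal_platform_images"
  resource_group_name = "rg-images"
}

# Inside the module "vm" block, the vm object would then reference the lookup
# instead of source_image_reference:
#
#   vm = {
#     size            = "Standard_DS2_v2"
#     admin_username  = "azureuser"
#     source_image_id = data.azurerm_shared_image_version.hardened.id
#   }
```

Pinning an explicit version rather than "latest" keeps plans reproducible, at the cost of a manual bump whenever a new image is published.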

Running terraform init, terraform fmt, terraform validate, and terraform plan inside examples/basic/ proves the wiring before you adapt the module for more advanced workloads.
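
On Terraform 1.6 or later, the same check can be captured as a native test so it runs repeatably. This is a sketch under assumptions: the file name examples/basic/tests/plan.tftest.hcl is hypothetical, and planning still requires valid Azure credentials. It asserts on admin_username because that output comes straight from an input and is therefore known at plan time.

```hcl
# examples/basic/tests/plan.tftest.hcl (hypothetical file name)
# Executed with `terraform test` from examples/basic/.
run "plan_basic_example" {
  command = plan

  assert {
    condition     = module.vm.admin_username == "azureuser"
    error_message = "The basic example should pass admin_username through unchanged."
  }
}
```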

Example: WordPress VM with Diagnostics and Data Disks

This scenario extends the basics with hardened storage, diagnostic settings, and a public entry point suitable for a WordPress deployment. Pair it with Log Analytics and a hardened custom image before production rollout.

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.100.0"
    }
  }
}

provider "azurerm" {
  features {}
}

data "azurerm_log_analytics_workspace" "main" {
  name                = "law-shared-observability"
  resource_group_name = "rg-shared-ops"
}

data "azurerm_subnet" "app" {
  name                 = "snet-app"
  virtual_network_name = "vnet-shared"
  resource_group_name  = "rg-shared-network"
}

module "vm" {
  source = "../../modules/azure_vm"

  global_settings = {
    location            = "eastus2"
    location_short_name = "eus2"
    service_name        = "wordpress"
    environment         = "prod"
    customer            = "marketing"
    tags = {
      application = "cms"
      owner       = "digital-experience"
    }
  }

  resource_group = {
    create = true
  }

  vm = {
    size            = "Standard_D4s_v5"
    admin_username  = "wpadmin"
    source_image_id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-images/providers/Microsoft.Compute/images/wp-hardened-ubuntu"
    data_disks = [
      {
        lun                  = 0
        disk_size_gb         = 256
        caching              = "ReadWrite"
        storage_account_type = "Premium_LRS"
      }
    ]
    diagnostics_settings = {
      destination_resource_id = data.azurerm_log_analytics_workspace.main.id
      metrics = [
        { category = "AllMetrics" }
      ]
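      # VM diagnostic settings expose platform metrics only. Guest syslog is collected
      # through the Azure Monitor Agent and a data collection rule, not a log category here.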
    }
    boot_diagnostics = {
      storage_account_uri = "https://stshareddiag.blob.core.windows.net/"
    }
    custom_data = file("${path.module}/cloud-init/wordpress.yaml") # pass plain text; the module applies base64encode itself
  }

  network = {
    virtual_network = {
      id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-shared-network/providers/Microsoft.Network/virtualNetworks/vnet-shared"
    }
    subnet = {
      id = data.azurerm_subnet.app.id
    }
    network_security_group = {
      create = true
      rules = [
        {
          name                       = "allow-http"
          priority                   = 100
          direction                  = "Inbound"
          access                     = "Allow"
          protocol                   = "Tcp"
          source_port_range          = "*"
          destination_port_range     = "80"
          source_address_prefix      = "*"
          destination_address_prefix = "*"
          description                = "Public HTTP"
        },
        {
          name                   = "allow-ssh"
          priority               = 110
          direction              = "Inbound"
          access                 = "Allow"
          protocol               = "Tcp"
          source_port_range      = "*"
          destination_port_range = "22"
          source_address_prefix  = "203.0.113.0/24" # restrict to your corporate egress range or omit and use Bastion/JIT
          description            = "SSH from corporate egress"
        }
      ]
    }
    public_ip = {
      create = true
      sku    = "Standard"
    }
  }

  security = {
    key_vault = {
      enable_private_endpoint    = true
      private_endpoint_subnet_id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-shared-network/providers/Microsoft.Network/virtualNetworks/vnet-shared/subnets/snet-private"
    }
  }

  rbac_login = {
    enable = true
  }

  extra_tags = {
    workload = "wordpress"
    tier     = "web"
  }
}

This blueprint keeps observability and security front and center: it streams platform metrics into Log Analytics, stores boot diagnostics for troubleshooting, and boots from a hardened custom image instead of a marketplace default. Guest syslog still needs the Azure Monitor Agent and a data collection rule alongside the module.
