Tutorial 22: Built-in Functions and Expressions

Learning Objectives

By the end of this tutorial, you will be able to:

Use Terraform's built-in functions for data manipulation
Implement complex expressions and transformations
Apply string, numeric, collection, and date functions effectively
Use type conversion and validation functions
Create reusable expression patterns for common operations

Prerequisites

Understanding of HCL syntax and expressions
Completed Tutorial 17: Local Values and Computed Values
Basic knowledge of data types and structures

Introduction

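Terraform ships with more than 100 built-in functions for manipulating strings, numbers, collections, dates, and types. You call them inside expressions, for example in locals, resource arguments, and outputs, to derive values from your inputs instead of hard-coding them. The configuration language itself does not let you define your own functions, so combining the built-in ones is the main way to express data transformations.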
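
As a quick illustration of the call syntax, the sketch below nests two function calls inside an output. The output name and the literal string are made up for this example and are not part of the tutorial configuration.

```hcl
# Hypothetical output used only to show how function calls nest
output "function_call_demo" {
  # Arguments are evaluated first, so lower() runs before replace()
  value = replace(lower("My Awesome Project"), " ", "-")
  # => "my-awesome-project"
}
```

Typing the same expression into terraform console is a convenient way to experiment with a function before using it in a configuration.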

String Functions

Basic String Manipulation

# variables.tf
variable "project_name" {
  description = "Project name"
  type        = string
  default     = "my-awesome-project"
}

variable "environment" {
  description = "Environment name"
  type        = string
  default     = "development"
}

variable "user_list" {
  description = "Comma-separated list of users"
  type        = string
  default     = "alice,bob,charlie,dave"
}

# main.tf
locals {
  # String formatting and manipulation
  resource_prefix = upper(replace(var.project_name, "-", "_"))
  env_short      = substr(var.environment, 0, 3)
  
  # String concatenation and formatting
  full_name = format("%s-%s", var.project_name, var.environment)
  
  # String case conversion
  project_upper = upper(var.project_name)
  project_lower = lower(var.project_name)
  project_title = title(replace(var.project_name, "-", " "))
  
  # String splitting and joining
  users = split(",", var.user_list)
  user_emails = [
    for user in local.users :
    format("%s@%s.com", user, replace(var.project_name, "-", ""))
  ]
  
  # String trimming and padding
  cleaned_name = trimspace(var.project_name)
  padded_env   = format("%04s", local.env_short)
  
  # String searching and replacing
  sanitized_name = replace(replace(var.project_name, " ", "-"), "_", "-")
  
  # Regular expressions
  is_valid_name = can(regex("^[a-z][a-z0-9-]*[a-z0-9]$", var.project_name))
  extracted_version = regex("v([0-9]+\\.[0-9]+\\.[0-9]+)", "app-v1.2.3-release")[0]
}

# Using string functions in resources
resource "aws_s3_bucket" "app" {
  bucket = "${local.sanitized_name}-${local.env_short}-${random_id.suffix.hex}"
  
  tags = {
    Name        = local.project_title
    Environment = title(var.environment)
    Prefix      = local.resource_prefix
  }
}

resource "random_id" "suffix" {
  byte_length = 4
}

# outputs.tf
output "string_examples" {
  description = "Examples of string function usage"
  value = {
    original_name    = var.project_name
    resource_prefix  = local.resource_prefix
    full_name       = local.full_name
    users           = local.users
    user_emails     = local.user_emails
    is_valid_name   = local.is_valid_name
    sanitized_name  = local.sanitized_name
  }
}
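
The user_emails local above builds each address with a for expression. When every element gets the same format string, formatlist may be a shorter alternative. The sketch below uses hypothetical demo_* names and an example.com domain purely for illustration.

```hcl
locals {
  # Hypothetical inputs for illustration
  demo_users  = ["alice", "bob"]
  demo_domain = "example.com"

  # formatlist applies one format string to every element of a list;
  # a non-list argument is repeated for each element
  demo_emails = formatlist("%s@%s", local.demo_users, local.demo_domain)
  # => ["alice@example.com", "bob@example.com"]

  # join is the inverse of split
  demo_csv = join(",", local.demo_emails)
  # => "alice@example.com,bob@example.com"
}
```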

Advanced String Operations

# Template and formatting functions
locals {
  # Template rendering
  user_data_script = templatefile("${path.module}/user_data.tpl", {
    project_name = var.project_name
    environment  = var.environment
    users        = local.users
    config_data  = jsonencode({
      app_name = var.project_name
      env      = var.environment
      debug    = var.environment != "prod"
    })
  })
  
  # Advanced formatting
  instance_names = [
    for i in range(var.instance_count) :
    format("%s-%s-%02d", var.project_name, var.environment, i + 1)
  ]
  
  # String interpolation with conditionals
  bucket_policy = templatefile("${path.module}/bucket_policy.json", {
    bucket_name = aws_s3_bucket.app.bucket
    # Parentheses are needed for a conditional that spans several lines here
    principals  = (var.environment == "prod" ?
      ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"] :
      ["*"])
    readonly_access = var.environment != "prod"
  })
  
  # URL and path manipulation
  api_endpoint = format("https://%s.%s", 
    var.subdomain, 
    trimsuffix(var.domain_name, ".")
  )
  
  # Base64 encoding/decoding
  encoded_config = base64encode(jsonencode({
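    # Use .address rather than .endpoint: the endpoint already includes ":port".
    # The scheme matches the mysql engine of aws_db_instance.main.
    database_url = format("mysql://%s:%s@%s:%d/%s",
      var.db_username,
      var.db_password,
      aws_db_instance.main.address,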
      aws_db_instance.main.port,
      var.db_name
    )
  }))
}

# user_data.tpl template file example:
# #!/bin/bash
# echo "Setting up ${project_name} in ${environment} environment"
# 
# # Configure users
# %{ for user in users ~}
# useradd -m ${user}
# %{ endfor ~}
# 
# # Write config
# cat > /etc/app/config.json << 'EOF'
# ${config_data}
# EOF
# 
# # Start services
# systemctl enable app
# systemctl start app
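
The same template directives also work inside an inline heredoc, which can be handy for short snippets that do not justify a separate file. This is a minimal sketch with hypothetical demo_* values; it is not the tutorial's user_data.tpl.

```hcl
locals {
  demo_env   = "dev"            # hypothetical value
  demo_users = ["alice", "bob"] # hypothetical value

  # <<-EOT strips the common leading indentation from the body
  demo_script = <<-EOT
    #!/bin/bash
    %{ for user in local.demo_users ~}
    useradd -m ${user}
    %{ endfor ~}
    %{ if local.demo_env != "prod" ~}
    echo "debug logging enabled"
    %{ endif ~}
  EOT
}
```

The ~ before the closing brace is a strip marker: it removes the whitespace and newline that follow the directive, so the directive lines themselves do not leave blank lines in the rendered script.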

Numeric Functions

# variables.tf
variable "instance_counts" {
  description = "Instance counts per environment"
  type        = map(number)
  default = {
    dev     = 1
    staging = 2
    prod    = 5
  }
}

variable "cpu_limits" {
  description = "CPU limits in millicores"
  type        = list(number)
  default     = [100, 250, 500, 1000]
}

variable "memory_sizes" {
  description = "Memory sizes in MB"
  type        = list(number)
  default     = [128, 256, 512, 1024, 2048]
}

# main.tf
locals {
  # Basic arithmetic
  total_instances = sum(values(var.instance_counts))
  max_instances   = max(values(var.instance_counts)...)
  min_instances   = min(values(var.instance_counts)...)
  avg_instances   = local.total_instances / length(var.instance_counts)
  
  # Rounding and ceiling/floor
  cpu_cores = [
    for cpu_milli in var.cpu_limits :
    ceil(cpu_milli / 1000)
  ]
  
  memory_gb = [
    for memory_mb in var.memory_sizes :
    floor(memory_mb / 1024)
  ]
  
  # Mathematical operations
  fibonacci_sizes = [1, 1, 2, 3, 5, 8, 13, 21]
  scaled_sizes = [
    for size in local.fibonacci_sizes :
    pow(2, size)
  ]
  
  # Absolute values and sign
  cost_difference = abs(var.budget_limit - var.current_cost)
  
  # Logarithmic functions
  log_based_scaling = [
    for i in range(1, 10) :
    floor(log(i, 2))
  ]
  
  # Modulo operations for distribution
  az_distribution = [
    for i in range(local.total_instances) :
    data.aws_availability_zones.available.names[i % length(data.aws_availability_zones.available.names)]
  ]
}

# Resource sizing based on calculations
resource "aws_instance" "app" {
  # Fall back to 1 when var.environment (default "development") has no entry in the map
  count = lookup(var.instance_counts, var.environment, 1)
  
  ami           = data.aws_ami.ubuntu.id
  instance_type = local.cpu_cores[count.index % length(local.cpu_cores)] <= 1 ? "t3.micro" : "t3.small"
  
  availability_zone = local.az_distribution[count.index]
  
  tags = {
    Name      = "${var.project_name}-${count.index + 1}"
    CPUCores  = local.cpu_cores[count.index % length(local.cpu_cores)]
    MemoryGB  = local.memory_gb[count.index % length(local.memory_gb)]
  }
}

# Dynamic scaling calculations
locals {
  # Calculate target group weights based on instance counts
  environment_weights = {
    # "instances" avoids shadowing the count object used by the count meta-argument
    for env, instances in var.instance_counts :
    env => floor((instances / local.total_instances) * 100)
  }
  
  # Calculate storage requirements
  storage_per_instance = 20  # GB
  total_storage = local.total_instances * local.storage_per_instance
  
  # Cost estimation
  hourly_cost_per_instance = 0.0104  # approximate t3.micro on-demand rate in us-east-1; check current pricing
  monthly_cost = local.total_instances * local.hourly_cost_per_instance * 24 * 30
}

data "aws_availability_zones" "available" {
  state = "available"
}

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]
  
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}
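
One detail worth knowing about the environment_weights calculation above: division in Terraform produces fractional results, and flooring each share can leave the total short of 100. The sketch below reproduces the numbers with the default counts and shows one possible way to hand the leftover points to the largest environment. All demo_* names are hypothetical.

```hcl
locals {
  demo_counts = { dev = 1, staging = 2, prod = 5 } # same values as the var.instance_counts default
  demo_total  = sum(values(local.demo_counts))     # => 8

  # 1 / 8 * 100 is 12.5, which floor() turns into 12
  demo_floored = {
    for env, n in local.demo_counts :
    env => floor(n / local.demo_total * 100)
  }
  # => { dev = 12, prod = 62, staging = 25 }, which sums to 99

  # Percentage points lost to rounding
  demo_remainder = 100 - sum(values(local.demo_floored)) # => 1

  # Environment with the most instances
  demo_largest = [
    for env, n in local.demo_counts :
    env if n == max(values(local.demo_counts)...)
  ][0]
  # => "prod"

  # Parentheses around the key make Terraform evaluate it as an expression
  demo_weights = merge(local.demo_floored, {
    (local.demo_largest) = local.demo_floored[local.demo_largest] + local.demo_remainder
  })
  # => { dev = 12, prod = 63, staging = 25 }, which sums to 100
}
```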

Collection Functions

List Operations

# variables.tf
variable "allowed_cidrs" {
  description = "List of allowed CIDR blocks"
  type        = list(string)
  default     = ["10.0.0.0/16", "172.16.0.0/12", "192.168.0.0/16"]
}

variable "environments" {
  description = "Environment configurations"
  type = list(object({
    name         = string
    instance_type = string
    replicas     = number
    enabled      = bool
  }))
  default = [
    {
      name         = "dev"
      instance_type = "t3.micro"
      replicas     = 1
      enabled      = true
    },
    {
      name         = "staging"
      instance_type = "t3.small"
      replicas     = 2
      enabled      = true
    },
    {
      name         = "prod"
      instance_type = "t3.medium"
      replicas     = 3
      enabled      = false
    }
  ]
}

# main.tf
locals {
  # List length and element access
  cidr_count = length(var.allowed_cidrs)
  first_cidr = element(var.allowed_cidrs, 0)
  last_cidr  = element(var.allowed_cidrs, length(var.allowed_cidrs) - 1)
  
  # List filtering and transformation
  enabled_environments = [
    for env in var.environments :
    env if env.enabled
  ]
  
  production_environments = [
    for env in var.environments :
    env if env.name == "prod"
  ]
  
  # List slicing and chunking
  first_two_cidrs = slice(var.allowed_cidrs, 0, 2)
  
  # List concatenation and flattening
  all_cidrs = concat(var.allowed_cidrs, ["203.0.113.0/24"])
  
  nested_lists = [
    ["web1", "web2"],
    ["api1", "api2", "api3"],
    ["db1"]
  ]
  all_services = flatten(local.nested_lists)
  
  # List sorting and reversing
  sorted_cidrs = sort(var.allowed_cidrs)
  reversed_envs = reverse([for env in var.environments : env.name])
  
  # List deduplication
  unique_instance_types = distinct([
    for env in var.environments : env.instance_type
  ])
  
  # List indexing and contains
  has_dev_env = contains([for env in var.environments : env.name], "dev")
  dev_env_index = index([for env in var.environments : env.name], "dev")
  
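  # List chunking for subnet distribution
  # chunklist's second argument is the chunk size, not the number of chunks,
  # so divide the total by the AZ count to get one chunk per AZ
  az_count = length(data.aws_availability_zones.available.names)
  subnets_per_az = chunklist(range(24), ceil(24 / local.az_count))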
}

# Create subnets using list functions
resource "aws_subnet" "app" {
  count = length(local.enabled_environments) * local.az_count
  
  vpc_id            = aws_vpc.main.id
  availability_zone = element(data.aws_availability_zones.available.names, count.index % local.az_count)
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index + 1)
  
  tags = {
    Name         = "${var.project_name}-subnet-${count.index + 1}"
    Environment  = element([for env in local.enabled_environments : env.name], floor(count.index / local.az_count))
    AZ           = element(data.aws_availability_zones.available.names, count.index % local.az_count)
  }
}
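
A few of the list functions used above have behavior that is easy to misread. The sketch below, using a hypothetical demo_azs list, shows the wrap-around in element, the chunk-size argument of chunklist, and the half-open range of slice.

```hcl
locals {
  demo_azs = ["a", "b", "c"] # hypothetical zone suffixes

  # element() wraps around when the index exceeds the list length
  demo_wrapped = element(local.demo_azs, 4) # => "b" (4 % 3 = 1)

  # Square-bracket indexing does not wrap; local.demo_azs[4] would be an error

  # chunklist() takes a chunk size, not a chunk count
  demo_chunks = chunklist(range(6), 2) # => [[0, 1], [2, 3], [4, 5]]

  # slice() includes the start index and excludes the end index
  demo_slice = slice(local.demo_azs, 0, 2) # => ["a", "b"]
}
```

The wrap-around is why the subnet resource above can pass count.index % local.az_count or a larger index to element without going out of range.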

Map Operations

# variables.tf
variable "service_configs" {
  description = "Service configuration map"
  type = map(object({
    port         = number
    protocol     = string
    health_path  = string
    replicas     = number
    cpu_request  = string
    memory_request = string
  }))
  default = {
    frontend = {
      port           = 80
      protocol       = "HTTP"
      health_path    = "/health"
      replicas       = 2
      cpu_request    = "100m"
      memory_request = "128Mi"
    }
    backend = {
      port           = 8080
      protocol       = "HTTP"
      health_path    = "/api/health"
      replicas       = 3
      cpu_request    = "200m"
      memory_request = "256Mi"
    }
    database = {
      port           = 5432
      protocol       = "TCP"
      health_path    = ""
      replicas       = 1
      cpu_request    = "500m"
      memory_request = "1Gi"
    }
  }
}

# main.tf
locals {
  # Map keys and values
  service_names = keys(var.service_configs)
  service_configs_list = values(var.service_configs)
  
  # Map lookup with defaults
  frontend_config = lookup(var.service_configs, "frontend", {
    port = 80
    protocol = "HTTP"
    health_path = "/"
    replicas = 1
    cpu_request = "100m"
    memory_request = "128Mi"
  })
  
  # Map merging
  default_config = {
    timeout = 30
    retries = 3
    enabled = true
  }
  
  enhanced_configs = {
    for name, config in var.service_configs :
    name => merge(local.default_config, config, {
      full_name = "${var.project_name}-${name}"
      port_name = "${name}-port"
    })
  }
  
  # Map filtering and transformation
  http_services = {
    for name, config in var.service_configs :
    name => config if config.protocol == "HTTP"
  }
  
  high_replica_services = {
    for name, config in var.service_configs :
    name => config if config.replicas > 2
  }
  
  # Map to list transformations
  service_ports = [
    for name, config in var.service_configs :
    {
      service = name
      port    = config.port
      protocol = config.protocol
    }
  ]
  
  # Nested map operations
  service_env_matrix = {
    for service_name in local.service_names :
    service_name => {
      for env in ["dev", "staging", "prod"] :
      env => merge(var.service_configs[service_name], {
        replicas = env == "prod" ? var.service_configs[service_name].replicas : 1
        resources = {
          cpu = env == "prod" ? "500m" : var.service_configs[service_name].cpu_request
          memory = env == "prod" ? "512Mi" : var.service_configs[service_name].memory_request
        }
      })
    }
  }
}

# Create target groups using map functions
resource "aws_lb_target_group" "services" {
  for_each = local.http_services
  
  name     = "${var.project_name}-${each.key}-tg"
  port     = each.value.port
  protocol = each.value.protocol
  vpc_id   = aws_vpc.main.id
  
  health_check {
    enabled             = true
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 5
    interval            = 30
    path                = each.value.health_path
    matcher             = "200"
  }
  
  tags = {
    Name    = "${var.project_name}-${each.key}"
    Service = each.key
    Port    = each.value.port
  }
}
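
The enhanced_configs local above relies on the argument order of merge. As a small sketch with hypothetical demo_* values, the example below shows which side wins on a key collision and how lookup falls back to its default.

```hcl
locals {
  demo_defaults = { timeout = 30, retries = 3 }
  demo_override = { retries = 5 }

  # Later arguments win when keys collide
  demo_merged = merge(local.demo_defaults, local.demo_override)
  # => { retries = 5, timeout = 30 }

  # lookup() returns the third argument when the key is missing
  demo_port = lookup({ http = 80 }, "https", 443) # => 443
}
```

Note that merge is shallow: if two arguments share a key whose value is itself a map or object, the later value replaces the earlier one entirely rather than being merged into it.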

Date and Time Functions

# Date and time functions
locals {
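  # Current timestamp (UTC). timestamp() is re-evaluated on every run, so any
  # resource argument derived from it shows up as a change in each plan.
  deployment_time = timestamp()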
  
  # Formatted timestamps
  deployment_date = formatdate("YYYY-MM-DD", local.deployment_time)
  deployment_hour = formatdate("hh", local.deployment_time)
  
  # Time-based logic (hours are in UTC; parentheses allow the expression to span lines)
  is_business_hours = (tonumber(formatdate("hh", local.deployment_time)) >= 9 &&
                       tonumber(formatdate("hh", local.deployment_time)) < 17)
  
  is_weekend = contains(["Saturday", "Sunday"], formatdate("EEEE", local.deployment_time))
  
  # Date arithmetic for retention
  backup_retention_date = timeadd(local.deployment_time, "-${var.backup_retention_days * 24}h")
  
  # Time-based resource naming
  timestamped_name = "${var.project_name}-${formatdate("YYYY-MM-DD-hhmm", local.deployment_time)}"
  
  # Scheduled operations
  maintenance_window = var.environment == "prod" ? "sun:03:00-sun:04:00" : "sat:02:00-sat:03:00"
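  # One-hour window in hh:mm-hh:mm form, starting two hours after deployment
  backup_window = "${formatdate("hh:mm", timeadd(local.deployment_time, "2h"))}-${formatdate("hh:mm", timeadd(local.deployment_time, "3h"))}"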
}

# Resources with time-based configuration
resource "aws_db_instance" "main" {
  identifier = "${var.project_name}-db"
  
  engine         = "mysql"
  engine_version = "8.0"
  instance_class = "db.t3.micro"
  
  allocated_storage = 20
  
  db_name  = var.db_name
  username = var.db_username
  password = var.db_password
  
  backup_retention_period = var.backup_retention_days
  backup_window          = "${formatdate("hh:mm", timeadd(local.deployment_time, "3h"))}-${formatdate("hh:mm", timeadd(local.deployment_time, "4h"))}"
  maintenance_window     = local.maintenance_window
  
  # Time-based final snapshot naming
  skip_final_snapshot       = false
  final_snapshot_identifier = "${var.project_name}-final-${formatdate("YYYY-MM-DD-hhmm", local.deployment_time)}"
  
  tags = {
    Name         = "${var.project_name}-database"
    CreatedAt    = local.deployment_date
    CreatedHour  = local.deployment_hour
    BusinessHours = local.is_business_hours
  }
}
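
Because timestamp() returns a new value on every run, tags and identifiers built from it will change in each plan. One common way to get a timestamp that is recorded once and then kept is the time_static resource from the hashicorp/time provider. The sketch below assumes adding that provider is acceptable in your project; the resource and local names are hypothetical.

```hcl
terraform {
  required_providers {
    time = {
      source = "hashicorp/time"
    }
  }
}

# Records the time once, when the resource is created, and keeps it in state
resource "time_static" "deployed" {}

locals {
  # Stable across runs, unlike a value derived from timestamp()
  stable_deployment_date = formatdate("YYYY-MM-DD", time_static.deployed.rfc3339)
}
```

Another option, if you want to keep timestamp(), is to list the affected arguments under ignore_changes in the resource's lifecycle block.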

Type Conversion and Validation Functions

# variables.tf
variable "mixed_config" {
  description = "Mixed configuration values"
  type = map(any)
  default = {
    instance_count = "3"
    enable_monitoring = "true"
    cpu_threshold = "80.5"
    tags = "Name=test,Environment=dev"
    ports = "80,443,8080"
  }
}

# main.tf
locals {
  # Type conversion functions
  instance_count = tonumber(var.mixed_config.instance_count)
  enable_monitoring = tobool(var.mixed_config.enable_monitoring)
  cpu_threshold = tonumber(var.mixed_config.cpu_threshold)
  
  # String to list conversion
  port_list = [
    for port in split(",", var.mixed_config.ports) :
    tonumber(port)
  ]
  
  # String to map conversion
  tag_pairs = split(",", var.mixed_config.tags)
  parsed_tags = {
    for pair in local.tag_pairs :
    split("=", pair)[0] => split("=", pair)[1]
  }
  
  # Type validation and conversion
  validated_config = {
    # Each multi-line conditional is wrapped in parentheses so it parses as one expression
    instance_count = (can(tonumber(var.mixed_config.instance_count)) ?
      max(1, min(10, tonumber(var.mixed_config.instance_count))) : 1)

    enable_monitoring = (can(tobool(var.mixed_config.enable_monitoring)) ?
      tobool(var.mixed_config.enable_monitoring) : false)

    cpu_threshold = (can(tonumber(var.mixed_config.cpu_threshold)) ?
      max(0, min(100, tonumber(var.mixed_config.cpu_threshold))) : 80)
  }
  
  # Complex type conversions
  service_list = tolist([
    {
      name = "web"
      port = 80
    },
    {
      name = "api"
      port = 8080
    }
  ])
  
  service_map = {
    for service in local.service_list :
    service.name => service
  }
  
  # JSON encoding/decoding
  config_json = jsonencode({
    instance_count = local.validated_config.instance_count
    monitoring = local.validated_config.enable_monitoring
    threshold = local.validated_config.cpu_threshold
    ports = local.port_list
    tags = local.parsed_tags
  })
  
  parsed_config = jsondecode(local.config_json)
  
  # YAML encoding (yamlencode is built in, like jsonencode)
  config_yaml = yamlencode({
    services = local.service_map
    config = local.validated_config
  })
}

# Validation functions
locals {
  # Input validation
  valid_cidr = can(cidrhost(var.vpc_cidr, 0))
  valid_email = can(regex("^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$", var.admin_email))
  valid_instance_type = contains(["t3.micro", "t3.small", "t3.medium"], var.instance_type)
  
  # Conditional values based on validation
  final_vpc_cidr = local.valid_cidr ? var.vpc_cidr : "10.0.0.0/16"
  final_instance_type = local.valid_instance_type ? var.instance_type : "t3.micro"
  
  # Try/catch equivalent using can()
  safe_division = can(var.numerator / var.denominator) ? var.numerator / var.denominator : 0
  
  # Complex validation patterns
  validation_results = {
    cidr_valid = local.valid_cidr
    email_valid = local.valid_email
    instance_type_valid = local.valid_instance_type
    all_valid = local.valid_cidr && local.valid_email && local.valid_instance_type
  }
}
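
The locals above silently fall back to a default when an input is invalid. If you would rather reject bad input, the same can() expression can go into a variable validation block. The sketch below assumes a vpc_cidr variable declared like this; the tutorial references var.vpc_cidr but does not show its declaration.

```hcl
# Assumed declaration for the var.vpc_cidr referenced above
variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string

  validation {
    # cidrhost() fails on malformed input, and can() turns that failure into false
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "The vpc_cidr value must be a valid CIDR block such as 10.0.0.0/16."
  }
}
```

With this in place, an invalid value stops the run with the error message instead of being replaced by 10.0.0.0/16 without notice.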

Advanced Function Combinations

# Complex data transformations
locals {
  # Multi-step data processing
  raw_user_data = "alice:admin,bob:user,charlie:admin,dave:user"
  
  # Step 1: Split into pairs
  user_pairs = split(",", local.raw_user_data)
  
  # Step 2: Parse each pair
  user_roles = {
    for pair in local.user_pairs :
    split(":", pair)[0] => split(":", pair)[1]
  }
  
  # Step 3: Group by role
  users_by_role = {
    for role in distinct(values(local.user_roles)) :
    role => [
      for user, user_role in local.user_roles :
      user if user_role == role
    ]
  }
  
  # Step 4: Generate policies
  role_policies = {
    for role, users in local.users_by_role :
    role => templatefile("${path.module}/policies/${role}_policy.json", {
      users = users
      resources = role == "admin" ? ["*"] : ["arn:aws:s3:::${var.project_name}-*"]
    })
  }
  
  # Complex subnet calculation
  vpc_cidr = "10.0.0.0/16"
  az_count = length(data.aws_availability_zones.available.names)
  
  # Calculate subnet distribution
  subnet_configs = flatten([
    for tier_index, tier in ["public", "private", "database"] : [
      for az_index in range(local.az_count) : {
        name = "${tier}-${az_index + 1}"
        cidr = cidrsubnet(
          local.vpc_cidr,
          8,  # Additional bits for subnet
          tier_index * local.az_count + az_index + 1
        )
        availability_zone = data.aws_availability_zones.available.names[az_index]
        tier = tier
        index = az_index
        route_table = tier == "public" ? "public" : "private"
      }
    ]
  ])
  
  # Convert to map for for_each
  subnets = {
    for subnet in local.subnet_configs :
    subnet.name => subnet
  }
  
  # Generate security group rules dynamically
  application_ports = [80, 443, 8080, 8443]
  database_ports = [3306, 5432, 27017]
  
  # A single list constructor cannot hold two for expressions, so build one
  # list per tier and combine them with concat()
  security_rules = concat(
    # Application tier rules
    [
      for port in local.application_ports : {
        type        = "ingress"
        from_port   = port
        to_port     = port
        protocol    = "tcp"
        cidr_blocks = ["0.0.0.0/0"]
        description = "Allow ${port} from anywhere"
        tier        = "application"
      }
    ],
    # Database tier rules
    [
      for port in local.database_ports : {
        type        = "ingress"
        from_port   = port
        to_port     = port
        protocol    = "tcp"
        cidr_blocks = [local.vpc_cidr]
        description = "Allow ${port} from VPC"
        tier        = "database"
      }
    ]
  )
  
  # Group rules by tier
  rules_by_tier = {
    for tier in ["application", "database"] :
    tier => [
      for rule in local.security_rules :
      rule if rule.tier == tier
    ]
  }
}

# Create resources using complex functions
resource "aws_subnet" "main" {
  for_each = local.subnets
  
  vpc_id            = aws_vpc.main.id
  cidr_block        = each.value.cidr
  availability_zone = each.value.availability_zone
  
  map_public_ip_on_launch = each.value.tier == "public"
  
  tags = {
    Name = "${var.project_name}-${each.key}"
    Tier = each.value.tier
    AZ   = each.value.availability_zone
  }
}
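
To make the subnet arithmetic in subnet_configs concrete, here is a small worked sketch of cidrsubnet with the same /16 block and 8 additional bits. The demo_* names are hypothetical.

```hcl
locals {
  demo_vpc_cidr = "10.0.0.0/16"

  # 16 prefix bits + 8 new bits = /24; the third argument selects the subnet number
  demo_first = cidrsubnet(local.demo_vpc_cidr, 8, 1) # => "10.0.1.0/24"
  demo_ninth = cidrsubnet(local.demo_vpc_cidr, 8, 9) # => "10.0.9.0/24"

  # With 3 availability zones, tier_index * az_count + az_index + 1 yields 1 through 9:
  # public uses 1-3, private uses 4-6, database uses 7-9
}
```

If the number of availability zones changes, the subnet numbers for the private and database tiers shift as well, which would force those subnets to be replaced.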

# IAM policies using complex functions
resource "aws_iam_policy" "role_policies" {
  for_each = local.role_policies
  
  name        = "${var.project_name}-${each.key}-policy"
  description = "Policy for ${each.key} role"
  policy      = each.value
  
  tags = {
    Role = each.key
    Users = join(",", local.users_by_role[each.key])
  }
}
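
The group-by in Step 3 above can also be written with the grouping mode of for expressions, where a trailing ... after the value collects all values that share a key. This is an alternative sketch, not the tutorial's original approach, and the demo_* names are hypothetical.

```hcl
locals {
  demo_user_roles = {
    alice   = "admin"
    bob     = "user"
    charlie = "admin"
    dave    = "user"
  }

  # The trailing "..." groups values with the same key into a list
  demo_users_by_role = {
    for user, role in local.demo_user_roles :
    role => user...
  }
  # => { admin = ["alice", "charlie"], user = ["bob", "dave"] }
}
```

Without the ..., Terraform reports an error as soon as two users map to the same role, because object keys must be unique.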

Performance and Optimization

# Optimized function usage
locals {
  # Avoid repeated expensive calculations
  availability_zones = data.aws_availability_zones.available.names
  az_count = length(local.availability_zones)
  
  # Pre-calculate commonly used values
  environment_multipliers = {
    dev     = 1
    staging = 2
    prod    = 3
  }
  
  base_instance_count = 2
  # lookup() falls back to 1 when var.environment is not one of dev, staging, or prod
  scaled_instance_count = local.base_instance_count * lookup(local.environment_multipliers, var.environment, 1)
  
  # Efficient list processing
  instance_configs = [
    for i in range(local.scaled_instance_count) : {
      name = format("%s-%s-%02d", var.project_name, var.environment, i + 1)
      az   = local.availability_zones[i % local.az_count]
      size = i < 2 ? "small" : "medium"  # First 2 are small, rest are medium
    }
  ]
  
  # Batch operations
  all_tags = merge(
    var.common_tags,
    {
      Environment = var.environment
      Project     = var.project_name
      ManagedBy   = "terraform"
      # timestamp() is re-evaluated on every run, so this tag changes whenever the date does
      CreatedAt   = formatdate("YYYY-MM-DD", timestamp())
    }
  )
}

Key Takeaways

String Manipulation: Use string functions for formatting, validation, and transformation
Numeric Calculations: Apply math functions for scaling, distribution, and resource sizing
Collection Operations: Leverage list and map functions for data transformation
Type Safety: Use type conversion and validation functions for robust configurations
Performance: Pre-calculate expensive operations and reuse results
Complex Logic: Combine multiple functions for sophisticated data processing

Next Steps

Tutorial 23: Learn about conditional expressions and dynamic blocks
Practice combining multiple functions for complex transformations
Experiment with template functions for configuration generation
Review the Terraform Functions Documentation
