Auto Scaling Group with Rolling Deployment Module

This Terraform Module creates an Auto Scaling Group (ASG) that can do a zero-downtime rolling deployment. That means every time you update your app (e.g. publish a new AMI), all you have to do is run terraform apply and the new version of your app will automatically roll out across your Auto Scaling Group. Note that this module only creates the ASG and it's up to you to create all the other related resources, such as the launch template, ELB, and security groups.

Note: This module previously used launch configurations and has been updated to use launch templates. AWS has recommended launch templates for some time, and launch configurations are scheduled to be deprecated entirely on December 31, 2023.

What's an Auto Scaling Group?

An Auto Scaling Group (ASG) is used to manage a cluster of EC2 Instances. It can enforce pre-defined rules about how many instances to run in the cluster, scale the number of instances up or down depending on traffic, and automatically restart instances if they go down.

How does rolling deployment work?

Since Terraform does not have rolling deployment built in (see https://github.com/hashicorp/terraform/issues/1552), we emulate it using the create_before_destroy lifecycle property. This approach is based on the rolling deploy strategy used by HashiCorp itself, as described by Paul Hinze. As a result, every time you update your launch template (e.g. by specifying a new AMI to deploy), Terraform will:

1. Create a new ASG with the new launch template version.
2. Wait for the new ASG to deploy successfully and for the instances to register with the load balancer (if you associated an ELB or ALB with this ASG).
3. Destroy the old ASG.

Since the old ASG is only removed once the new ASG's instances are registered with the load balancer and serving traffic, there is no downtime. Moreover, if anything goes wrong while rolling out the new ASG, it is marked as tainted (i.e. marked for deletion on the next run) and the original ASG is left unchanged, so again, there is no downtime.
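
The create-then-destroy sequence above can be sketched in plain Terraform roughly as follows. This illustrates the general technique and is not this module's actual source: the resource name, the naming scheme, the sizes, and all four variables are assumptions made for the example.

```hcl
variable "launch_template_id" {
  type = string
}

variable "launch_template_version" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

variable "target_group_arns" {
  type = list(string)
}

resource "aws_autoscaling_group" "rolling" {
  # Hypothetical naming scheme: embedding the launch template version in the
  # name forces Terraform to replace the ASG whenever the version changes.
  name = "example-asg-v${var.launch_template_version}"

  min_size            = 2
  max_size            = 4
  vpc_zone_identifier = var.subnet_ids
  target_group_arns   = var.target_group_arns

  launch_template {
    id      = var.launch_template_id
    version = var.launch_template_version
  }

  # Block until this many instances are healthy in the load balancer, so the
  # old ASG is only destroyed once the new one is serving traffic.
  min_elb_capacity          = 2
  wait_for_capacity_timeout = "10m"

  lifecycle {
    # Create the replacement ASG first, then destroy the old one.
    create_before_destroy = true
  }
}
```

If the new ASG does not reach min_elb_capacity within wait_for_capacity_timeout, the apply fails and the old ASG keeps serving traffic.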

Note that if all we did was use create_before_destroy, on each redeploy, our ASG would reset to its hard-coded desired_capacity, losing the capacity changes from auto scaling policies. We solve this problem by using an external data source that runs the Python script get-desired-capacity.py to fetch the latest value of the desired_capacity parameter:

- If the script finds a value from an already-existing ASG, we use it, to ensure that the changes from auto scaling events are not lost.
- If the script doesn't find an already-existing ASG, that means this is the first deploy, and we fall back to the hard-coded desired_capacity value.
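
The fallback logic above could be wired up along these lines. This is only a sketch: the interpreter name, the query keys, and the result key are assumptions, and the real interface of get-desired-capacity.py may differ.

```hcl
variable "desired_capacity" {
  type    = number
  default = 3
}

data "external" "desired_capacity" {
  # Hypothetical invocation; the real script's arguments may differ.
  program = ["python3", "${path.module}/get-desired-capacity.py"]

  # Hypothetical query keys. The external data source passes this map to the
  # program as a JSON object on stdin.
  query = {
    asg_id_tag_key   = "AsgId"
    asg_id_tag_value = "example-asg"
  }
}

locals {
  # Assumed result key. tonumber() fails on a missing or empty value, in which
  # case try() falls back to the hard-coded input (the first-deploy case).
  desired_capacity = try(
    tonumber(data.external.desired_capacity.result["desired_capacity"]),
    var.desired_capacity,
  )
}
```

The external data source only exchanges string values, which is why the sketch converts the result with tonumber() before using it.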

Reference

Inputs
Outputs

Required

desired_capacity (number, required)

The desired number of EC2 Instances to run in the ASG initially. Note that auto scaling policies may change this value. If you're using auto scaling policies to dynamically resize the cluster, you should actually leave this value as null.

launch_template (object(…), required)

The ID and version of the Launch Template to use for each EC2 instance in this ASG. The version value MUST be an output of the Launch Template resource itself. This ensures that a new ASG is created every time a new Launch Template version is created.

Type Details

object({
  id      = string
  name    = string
  version = string
})

max_size (number, required)

The maximum number of EC2 Instances to run in the ASG.

min_size (number, required)

The minimum number of EC2 Instances to run in the ASG.

vpc_subnet_ids (list(string), required)

A list of subnet IDs in the VPC where the EC2 Instances should be deployed.
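
As a rough usage sketch, the required inputs could be supplied like this. The module source path, the launch template settings, and the two variables are placeholders invented for the example; only the input names come from the reference above.

```hcl
variable "ami_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

resource "aws_launch_template" "app" {
  name_prefix   = "example-app-"
  image_id      = var.ami_id
  instance_type = "t3.micro"
}

module "asg" {
  # Hypothetical path; point this at wherever the module actually lives.
  source = "./modules/asg-rolling-deploy"

  launch_template = {
    id   = aws_launch_template.app.id
    name = aws_launch_template.app.name
    # Taken from the resource itself, so a new template version (e.g. a new
    # AMI) changes this value and triggers a new ASG.
    version = aws_launch_template.app.latest_version
  }

  min_size         = 2
  max_size         = 4
  desired_capacity = 2
  vpc_subnet_ids   = var.subnet_ids
}
```

With a setup like this, changing var.ami_id and running terraform apply should create a new launch template version and, in turn, roll out a new ASG.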

Optional

custom_tags (list(object(…)), optional)

A list of custom tags to apply to the EC2 Instances in this ASG. Each item in this list should be an object with the attributes key, value, and propagate_at_launch.

Type Details

list(object({
  key                 = string
  value               = string
  propagate_at_launch = bool
}))

Default: []

Example

default = [
  {
    key                 = "foo"
    value               = "bar"
    propagate_at_launch = true
  },
  {
    key                 = "baz"
    value               = "blah"
    propagate_at_launch = true
  }
]

deletion_timeout (string, optional)

Timeout value for deletion operations on Auto Scaling Groups.

Default: "10m"

enabled_metrics (list(string), optional)

A list of metrics the ASG should enable for monitoring all instances in a group. The allowed values are GroupMinSize, GroupMaxSize, GroupDesiredCapacity, GroupInServiceInstances, GroupPendingInstances, GroupStandbyInstances, GroupTerminatingInstances, GroupTotalInstances.

Default: []

Example

enabled_metrics = [
  "GroupDesiredCapacity",
  "GroupInServiceInstances",
  "GroupMaxSize",
  "GroupMinSize",
  "GroupPendingInstances",
  "GroupStandbyInstances",
  "GroupTerminatingInstances",
  "GroupTotalInstances"
]

health_check_grace_period (number, optional)

Time, in seconds, after an EC2 Instance comes into service before checking health.

Default: 300

load_balancers (list(string), optional)

A list of Elastic Load Balancer (ELB) names to associate with this ASG. If you're using the Application Load Balancer (ALB), see target_group_arns.

Default: []

max_instance_lifetime (number, optional)

The maximum amount of time, in seconds, that an instance inside an ASG can be in service. The value must be either 0 or between 604800 and 31536000 seconds (7 to 365 days).

Default: null
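
Because the value is expressed in seconds, it can be easier to derive it from days. A small sketch, where the local names are made up for illustration:

```hcl
locals {
  seconds_per_day = 24 * 60 * 60 # 86400

  # 30 days = 2592000 seconds, which falls inside the allowed range of
  # 604800 (7 days) to 31536000 (365 days).
  max_instance_lifetime = 30 * local.seconds_per_day
}
```

The result can then be passed to the module as max_instance_lifetime = local.max_instance_lifetime.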

min_elb_capacity (number, optional)

Wait for this number of EC2 Instances to show up healthy in the load balancer on creation.

Default: 0

tag_asg_id_key (string, optional)

The key for the tag that will be used to associate a unique identifier with this ASG. This identifier will persist between redeploys of the ASG, even though the underlying ASG is being deleted and replaced with a different one.

Default: "AsgId"

target_group_arns (list(string), optional)

A list of Application Load Balancer (ALB) target group ARNs to associate with this ASG. If you're using the Elastic Load Balancer (ELB), see load_balancers.

Default: []

termination_policies (list(string), optional)

A list of policies to decide how the instances in the Auto Scaling Group should be terminated. The allowed values are OldestInstance, NewestInstance, OldestLaunchTemplate, AllocationStrategy, ClosestToNextInstanceHour, Default.

Default: []

use_elb_health_checks (bool, optional)

Whether or not ELB or ALB health checks should be enabled. If set to true, the load_balancers or target_group_arns variable should be set, depending on the load balancer type you are using. Setting this to false is useful for testing connectivity before health check endpoints are available.

Default: true
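
The load balancer inputs tend to be used together. The sketch below shows one plausible combination for an ALB; the module source path and both variables are placeholders assumed for the example, and the specific numbers are arbitrary.

```hcl
variable "launch_template" {
  type = object({
    id      = string
    name    = string
    version = string
  })
}

variable "subnet_ids" {
  type = list(string)
}

variable "target_group_arn" {
  type = string
}

module "asg" {
  # Hypothetical path; point this at wherever the module actually lives.
  source = "./modules/asg-rolling-deploy"

  launch_template  = var.launch_template
  min_size         = 2
  max_size         = 4
  desired_capacity = 2
  vpc_subnet_ids   = var.subnet_ids

  # ALB: attach target groups rather than setting load_balancers.
  target_group_arns     = [var.target_group_arn]
  use_elb_health_checks = true

  # Wait for two healthy instances in the target group before the
  # deployment is considered successful.
  min_elb_capacity          = 2
  health_check_grace_period = 300
  wait_for_capacity_timeout = "10m"
}
```

For a classic ELB, the same sketch would set load_balancers to a list of ELB names and leave target_group_arns at its default.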

wait_for_capacity_timeout (string, optional)

The maximum duration that Terraform should wait for the EC2 Instances to be healthy before timing out.

Default: "10m"

Outputs

asg_arn

asg_name

asg_unique_id
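
The outputs carry no descriptions here, so the following is only a guess at typical usage: it assumes asg_name is the name of the currently deployed ASG and that the module is instantiated as module.asg, and it attaches a target-tracking scaling policy to it. The policy name and target value are arbitrary.

```hcl
resource "aws_autoscaling_policy" "cpu" {
  name = "example-cpu-target-tracking"

  # Assumes the module instance is called "asg" and that asg_name is the
  # name of the ASG it created.
  autoscaling_group_name = module.asg.asg_name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }

    # Keep average CPU utilization across the group near 50 percent.
    target_value = 50
  }
}
```

When a policy like this manages the group size, the desired_capacity input can be left as null, as noted in its description.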