Bringing Order to the Cloud: Day 2 Operations in AWS with Ansible

July 14, 2020 by Ashton Davis

Cloud environments do not lend themselves to manual management; they thrive on automation. Many cloud environments are created and deployed from a known definition or template, but what do you do on day 2? In this blog post, we will cover some of the top day 2 operations use cases available through our Red Hat Certified Ansible Content Collection for AWS (requires a Red Hat Ansible Automation Platform subscription) or from Ansible Galaxy (community supported).

Let’s manage some clouds!

No matter the road that led you to managing a cloud environment, you’ve likely run into the ever-scaling challenge of maintaining cloud-based services over time. Cloud environments do not operate the same way old datacenter-based infrastructures did. Couple that with how easy it is for just about anyone to deploy services, and you have a recipe for years of maintenance headaches.

The good news is that there is one way to bring order to all the cloud-based chaos: Ansible. In this blog post we will explore common day 2 operations use cases for Amazon Web Services using the Ansible Certified Content Collection. For more information on how to use Ansible Content Collections, check out our blog post for Getting Started With Ansible Content Collections.



Snapshots

Snapshotting is a common operation during maintenance windows. The example below demonstrates a simple snapshot action on an EC2 instance using the ec2_snapshot module from the Collection.

- name: take a snapshot of the instance to create an image
  amazon.aws.ec2_snapshot:
    instance_id: "{{ instance_id }}"
    device_name: /dev/xvda
    state: present
  register: setup_snapshot

In this example, all we need to know is the instance ID and the device name we want to snapshot. Note the register line (we’ll come back to that). “That’s great,” I can hear you saying. “Excellent, a bunch of code that does what I can do with the click of a button.” Indeed, so let’s explore an application for this.

Say you have an instance that needs patching during a maintenance window. Each instance should be snapshotted, patched, verified; then the snapshot should be cleared - standard fare. Next, imagine that you actually have over a hundred EC2 instances in need of patching. Now imagine that a few Ansible tasks were able to accomplish that entire procedure, including clearing the snapshot. Now we’re talking!
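A sketch of that procedure might look like the following. The host group, variables, and the reduction of "patch and verify" to a single package update are all illustrative; a real play would add your own verification steps between patching and cleanup.

```yaml
- name: Snapshot, patch, and clean up
  hosts: ec2_instances
  tasks:
    - name: Snapshot the root volume before patching
      amazon.aws.ec2_snapshot:
        instance_id: "{{ instance_id }}"
        device_name: /dev/xvda
        state: present
      delegate_to: localhost
      register: patch_snapshot

    - name: Apply all available updates
      ansible.builtin.yum:
        name: "*"
        state: latest
      become: true

    - name: Clear the snapshot once patching is verified
      amazon.aws.ec2_snapshot:
        snapshot_id: "{{ patch_snapshot.snapshot_id }}"
        state: absent
      delegate_to: localhost
```

Because the snapshot ID is registered from the first task, the cleanup task can target exactly the snapshot it created, no matter how many hosts the play runs against.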

You may not love the ever-growing number of EC2 instances out there, but at least you can rest assured that your patching operations can be scaled to match. Next, let’s explore another use for the ec2_snapshot module.


AMI Creation

AMI Management can become quite a challenge without automated workflows to assist, especially when managing otherwise identical AMIs in multiple regions. Let’s look at a few different AMI-related tasks that may spark ideas for how to simplify your daily image maintenance, starting with generating a new AMI from an EC2 instance snapshot.

- name: AMIs
  block:
    - name: take a snapshot of the instance to create an image
      amazon.aws.ec2_snapshot:
        instance_id: "{{ instance_id }}"
        device_name: /dev/xvda
        state: present
      register: setup_snapshot

    - name: create an image from the instance
      amazon.aws.ec2_ami:
        instance_id: "{{ instance_id }}"
        state: present
        name: "acme_{{ ec2_ami_name }}_ami"
        description: "{{ ec2_ami_description }}"
        tags:
          Name: "acme_{{ ec2_ami_name }}_ami"
        wait: yes
        root_device_name: /dev/xvda

In fewer than 20 lines, you too can automate AMI creation! How else can this apply? It has long been standard practice to create virtual machine templates from a gold standard. If you maintain your baseline configurations with Ansible Tower, you can add this step to a Workflow Job Template and set it on a schedule. This process ensures you have up-to-date AMIs to deploy instances from as often as that scheduled workflow runs.


AMI Lookup

If you’ve ever tried to deploy an instance using an automation tool, you’ve probably found yourself hopelessly wading through the sea of available AMIs to find “the right one,” only to find out that there’s a different AMI ID for every single global region. If this sounds like you, you’re not alone. Also, good news - Ansible can help with this too. Let’s start with the following code snippet.

    - name: Get a list of our AMIs
      amazon.aws.ec2_ami_info:
        filters:
          architecture: x86_64
          virtualization-type: hvm
          root-device-type: ebs
          name: "acme_*_ami"
      register: amis

    - name: Pick the first AMI ID returned in the previous step
      ansible.builtin.set_fact:
        image_id: "{{ (amis.images | first).image_id }}"

This pulls a list of the AMIs we created that match the virtualization and root device types, using search criteria that match our AMI naming scheme, and sets the AMI ID of the first image in the list as a fact. But Amazon may not always return the list in a consistent order - so what should we do?

    - name: Get a list of Amazon HVM AMIs
      amazon.aws.ec2_ami_info:
        filters:
          architecture: x86_64
          virtualization-type: hvm
          root-device-type: ebs
          name: "acme_*_ami"
      register: amis

    - name: And select the most recent one
      ansible.builtin.set_fact:
        image_id: "{{ (amis.images | sort(attribute='creation_date') | last).image_id }}"

In this example, we sort the results by creation_date and set the fact to the most recent (the last in the list). This is a much more useful example in the real world. Let’s tie this back to our previous two examples. In conjunction with using Ansible to deploy instances, you can reasonably set up a system that will always have fresh AMIs ready for provisioning, and a provisioning workflow that will always take the most recent AMI. 
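A provisioning task can then consume that fact directly. This sketch assumes the ec2_instance module from the Collection; the instance name and type are illustrative:

```yaml
- name: Deploy an instance from the most recent AMI
  amazon.aws.ec2_instance:
    name: "acme-app-server"
    image_id: "{{ image_id }}"
    instance_type: t3.micro
    state: present
```

Run on a schedule alongside the AMI creation workflow, every newly provisioned instance starts from your freshest gold image with no manual AMI lookups at all.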


Elastic Load Balancers

There are a host of ways to use and configure ELBs. For the sake of demonstrating what is possible with Ansible, let’s take a fairly simple action: adding a listener.

    - name: Deploy listeners with health checks on 80 and 443
      community.aws.elb_classic_lb:
        name: "{{ lb_name }}"
        state: present
        zones: "{{ ec2_zones }}"
        listeners:
          - protocol: http
            load_balancer_port: 80
            instance_port: 80
          - protocol: https
            load_balancer_port: 443
            instance_port: 443
        health_check:
          ping_protocol: http
          ping_port: 80
          ping_path: "/healthcheck.html"
          response_timeout: 5
          interval: 30
          unhealthy_threshold: 2
          healthy_threshold: 10

“But that’s a day 1 task!” While true, it is also important that services like ELBs have standardized and consistent configurations. A listener definition like the above can be paired with application deployment workflows to ensure that the load balancer configuration will always stay up to date with each passing release, and can be kept consistent with all other load balancer configurations. From day 2 onward, you will be able to query your load balancers by their definitions and easily (and with much lower risk) deploy changes.
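As a simple illustration of that querying, a task like the following uses the elb_classic_lb_info module to gather a load balancer's live configuration for comparison against its definition (the variable name is illustrative):

```yaml
- name: Query the current load balancer configuration
  community.aws.elb_classic_lb_info:
    names:
      - "{{ lb_name }}"
  register: elb_info

- name: Show what is actually deployed
  ansible.builtin.debug:
    var: elb_info
```

Pairing a query like this with the listener definition above gives you a repeatable way to detect drift before you push a change.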


VPC Management

Our final day 2 management example focuses on VPCs. As with the ELB example, it is imperative that VPCs are deployed and maintained from a definition, and that definition is kept up to date. While there are multitudes of reasons for this, a good one is that you can do useful things like this:

    - name: Add IPv6 CIDR to existing VPC
      amazon.aws.ec2_vpc_net:
        state: present
        cidr_block: "{{ vpc_cidr }}"
        name: "{{ vpc_name }}"
        ipv6_cidr: true
      register: vpc_info

Now, would you needlessly start adding IPv6 to your network definitions just because? Of course not! But what’s important to understand is that from day 2 onward, you have the capability to make incremental, even large changes with simple updates to your cloud infrastructure definitions. After executing the above, there would be a host of options available to you, many of which would require little more than minor code changes to existing definitions.
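As one illustration, the registered vpc_info result can feed follow-on changes, such as carving a new subnet out of the same VPC (the subnet_cidr variable is illustrative):

```yaml
- name: Add a subnet to the VPC we just updated
  amazon.aws.ec2_vpc_subnet:
    vpc_id: "{{ vpc_info.vpc.id }}"
    cidr: "{{ subnet_cidr }}"
    state: present
```

Because the VPC ID comes from the registered result rather than a hardcoded value, the same task works against any VPC your definition describes.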


Takeaways and where to go next

In this blog, we covered some typical day 2 cloud management operations with Ansible, ranging from AMI creation to full VPC management. We hope you found it useful! More importantly, we hope it inspired you to start thinking about your cloud management in a different way. Check out the getting started guide when you’re ready to dive in!


*This blog was co-written by Jill Rouleau, Sr Software Engineer on the Ansible Cloud Engineering team




Ashton Davis

Ashton Davis is an open source fanatic and life-long automator (read: he’s never met a task he liked enough to do twice). Currently Ashton is a practice lead covering management and automation at Red Hat.

