In today's environment, mastering the hybrid cloud has become a key factor in IT transformation and business innovation. Network complexity can be a nightmare in this context, especially as organizations expand their infrastructure and embrace hybrid cloud and multi-cloud strategies. Without automation, it is difficult to monitor and control network routing, infrastructure, and security across a hybrid and multi-cloud environment. Identifying and resolving network performance issues in these infrastructures is just as challenging.
In one of the previous blogs, titled “Crank up your automation with Ansible validated content”, Nuno Martins highlighted the Ansible validated content included in Red Hat Ansible Automation Platform 2.3.
In this blog post, we will show you how to leverage the cloud.aws_troubleshooting Collection for hybrid cloud to troubleshoot network performance issues and maximize your hybrid cloud mastery. In particular, we’ll use the aws_troubleshooting.connectivity_troubleshooter role.
First, let’s take a look at the cloud.aws_troubleshooting Collection.
Deep dive on cloud.aws_troubleshooting
The cloud.aws_troubleshooting Collection includes a variety of Ansible Roles to help troubleshoot AWS resources, including the following:
- cloud.aws_troubleshooting.troubleshoot_rds_connectivity - A role to troubleshoot RDS connectivity from an EC2 instance.
- cloud.aws_troubleshooting.aws_setup_credentials - A role to define credentials for AWS modules.
- cloud.aws_troubleshooting.connectivity_troubleshooter - A role to troubleshoot connectivity issues.
Specifically, the aws_troubleshooting.connectivity_troubleshooter role helps to troubleshoot connectivity issues in four different scenarios:
- AWS resources in different Amazon Virtual Private Clouds (VPCs) within the same AWS region that are connected using VPC peering.
- AWS resources in an Amazon VPC and an internet resource using an internet gateway.
- AWS resources within an Amazon VPC.
- AWS resources in an Amazon VPC and an internet resource using a network address translation (NAT) gateway.
The aws_troubleshooting.connectivity_troubleshooter role wraps a set of roles, one for each of the scenarios listed above, and automatically selects which one to execute based on the configuration the user asks it to test. You can also run the specific roles individually if desired. Each role provides a detailed description that guides you through the process (e.g., how to use cloud.aws_troubleshooting.connectivity_troubleshooter_peering on its own).
Next, let's put the aws_troubleshooting.connectivity_troubleshooter role to the test using a typical cloud scenario. Here, we will deal with the first two scenarios.
In one of our upcoming blog posts, we will show how to use the cloud.aws_troubleshooting.troubleshoot_rds_connectivity role. Stay tuned!
Cloud scenario
Perhaps applications or workloads in different IT departments or VPCs need to access data or centralized resources in another VPC. This is where VPC peering connections come into play. They are a simple and straightforward way to enable communication between two VPCs, whether those VPCs are in the same AWS account and region or in different ones.
In this scenario, centralized resources can be configured in a VPC with multiple VPC peering connections to all other IT departments and their VPCs.
In general, VPC peering can be used to enable data sharing, to allow EC2 instances to communicate with each other, to make Amazon RDS databases available to both VPCs, or to make Lambda functions available through the peering connection.
Set up automation
For the sake of simplicity, we will set up a VPC peering connection between two VPCs (VPC #1 and VPC #2) in the same account and within the same region. Each of the two IT departments or VPC networks has an EC2 instance and the two EC2 instances need to be able to communicate with each other.
Therefore, we set up two VPCs, each one with a subnet as follows:
- name: Create VPC #1
  amazon.aws.ec2_vpc_net:
    name: "{{ vpc_name }}-1"
    cidr_block: "{{ vpc_1_cidr }}"
  register: __create_vpc_1

- name: Create Subnet #1
  amazon.aws.ec2_vpc_subnet:
    cidr: "{{ vpc_1_subnet_cidr_1 }}"
    vpc_id: "{{ __create_vpc_1.vpc.id }}"
    state: present
  register: __create_vpc_subnet_1

- name: Create VPC #2
  amazon.aws.ec2_vpc_net:
    name: "{{ vpc_name }}-2"
    cidr_block: "{{ vpc_2_cidr }}"
  register: __create_vpc_2

- name: Create Subnet #2
  amazon.aws.ec2_vpc_subnet:
    cidr: "{{ vpc_2_subnet_cidr }}"
    vpc_id: "{{ __create_vpc_2.vpc.id }}"
    state: present
    map_public: true
  register: __create_vpc_subnet_2
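The tasks above rely on variables that this post does not define. A minimal sketch of a vars file might look like the following. Every value here is an assumption chosen for illustration: the 192.168.0.0/24 range is picked only because it is consistent with the source address that appears in the error message later in this post, and the two VPC CIDR blocks are deliberately non-overlapping. The image_id variable used later is region-specific, so it is left for you to supply.

```yaml
# vars/main.yml (hypothetical file name; all values are illustrative assumptions)
vpc_name: troubleshooting-demo
instance_name: troubleshooting-demo

# VPC #1 and its subnet
vpc_1_cidr: 192.168.0.0/24
vpc_1_subnet_cidr_1: 192.168.0.0/28

# VPC #2 and its subnet (must not overlap with VPC #1)
vpc_2_cidr: 192.168.1.0/24
vpc_2_subnet_cidr: 192.168.1.0/28

# image_id: set this to an AMI ID that is valid in your region
```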
Next, we must set up the VPC peering connection. One of the VPCs (the requester) must initiate the VPC peering process, while the other VPC (the accepter) must accept the request for the peering connection to be established. Traffic will only flow via the VPC peering connection if the appropriate routes are added to the peers' route tables.
- name: Create VPC Peering
  community.aws.ec2_vpc_peer:
    vpc_id: "{{ __create_vpc_1.vpc.id }}"
    peer_vpc_id: "{{ __create_vpc_2.vpc.id }}"
    state: present
  register: __create_vpc_peering

- name: Accept VPC Peering
  community.aws.ec2_vpc_peer:
    peering_id: "{{ __create_vpc_peering.peering_id }}"
    state: accept
  register: __accept_vpc_peering
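Before adding routes, it can be worth confirming that the request really was accepted. The following is a minimal sketch, assuming the community.aws.ec2_vpc_peering_info module is available and returns a vpc_peering_connections list with a status.code field, as described in its documentation. The __peering_info variable name is introduced here only for illustration.

```yaml
- name: Look up the peering connection
  community.aws.ec2_vpc_peering_info:
    peer_connection_ids:
      - "{{ __create_vpc_peering.peering_id }}"
  register: __peering_info   # hypothetical variable name

- name: Check that the peering connection is active
  ansible.builtin.assert:
    that:
      # Exactly one connection should match the ID we asked for
      - __peering_info.vpc_peering_connections | length == 1
      # 'active' means the accepter has accepted the request
      - __peering_info.vpc_peering_connections[0].status.code == 'active'
    fail_msg: "VPC peering connection is not active yet"
```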
Next, routes must be configured in the route tables of both VPCs to allow traffic to flow through the new VPC peering connection. Routing between peered VPCs only works if their CIDR blocks do not overlap. For each peered subnet, the route table needs a static route whose destination is the other VPC's CIDR and whose target is the VPC peering connection. For testing purposes, we intentionally omit the route that targets the VPC peering connection for Subnet #2.
- name: Gather Route Tables
  amazon.aws.ec2_vpc_route_table_info:
    filters:
      vpc-id: "{{ __create_vpc_1.vpc.id }}"
  register: __route_table_info

- name: Set Route
  amazon.aws.ec2_vpc_route_table:
    vpc_id: "{{ __create_vpc_1.vpc.id }}"
    route_table_id: "{{ __route_table_info.route_tables[0].id }}"
    tags:
      Name: "{{ vpc_name }}-1"
    subnets:
      - "{{ __create_vpc_subnet_1.subnet.id }}"
    routes:
      - dest: "{{ vpc_2_subnet_cidr }}"
        vpc_peering_connection_id: "{{ __create_vpc_peering.peering_id }}"
  register: __route_table
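The non-overlap requirement mentioned above can be checked up front, before any VPC is created. The sketch below assumes the ansible.utils Collection (and the netaddr Python library it depends on) is installed. Two CIDR blocks can only overlap when one contains the other, so testing containment in both directions is enough.

```yaml
- name: Check that the two VPC CIDR blocks do not overlap
  ansible.builtin.assert:
    that:
      # True when vpc_2_cidr lies inside vpc_1_cidr, so we negate it
      - not (vpc_1_cidr | ansible.utils.network_in_network(vpc_2_cidr))
      # Same test in the other direction
      - not (vpc_2_cidr | ansible.utils.network_in_network(vpc_1_cidr))
    fail_msg: "vpc_1_cidr and vpc_2_cidr overlap; VPC peering requires non-overlapping CIDR blocks"
```

Placing this task at the top of the playbook makes a bad variable choice fail fast instead of surfacing later as a routing problem.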
Next, a security group is created for each VPC. For example, suppose the EC2 instances host web servers. Then we should allow port 22 for remote (SSH) connections and port 80 for web server access.
- name: Security Group #1
  amazon.aws.ec2_security_group:
    name: "{{ vpc_name }}-1-secgroup"
    vpc_id: "{{ __create_vpc_1.vpc.id }}"
    purge_rules: true
    description: Ansible-Generated internal rule
    rules:
      - proto: tcp
        from_port: 22
        to_port: 22
        cidr_ip: 0.0.0.0/0
      - proto: tcp
        from_port: 80
        to_port: 80
        cidr_ip: 0.0.0.0/0
  register: __security_group_1

- name: Security Group #2
  amazon.aws.ec2_security_group:
    name: "{{ vpc_name }}-2-secgroup"
    vpc_id: "{{ __create_vpc_2.vpc.id }}"
    purge_rules: true
    description: Ansible-Generated internal rule
    rules:
      - proto: tcp
        from_port: 22
        to_port: 22
        cidr_ip: 0.0.0.0/0
      - proto: tcp
        from_port: 80
        to_port: 80
        cidr_ip: 0.0.0.0/0
  register: __security_group_2
Now we want to create and launch an EC2 instance in each of the VPCs we created.
- name: Deploy EC2 instance #1
  amazon.aws.ec2_instance:
    instance_type: t2.micro
    image_id: "{{ image_id }}"
    wait: true
    network:
      assign_public_ip: false
    vpc_subnet_id: "{{ __create_vpc_subnet_1.subnet.id }}"
    # Use the security group registered above for VPC #1
    security_groups:
      - "{{ __security_group_1.group_id }}"
    tags:
      Name: "{{ instance_name }}-1"
  register: __create_ec2_instance_1

- name: Deploy EC2 instance #2
  amazon.aws.ec2_instance:
    instance_type: t2.micro
    image_id: "{{ image_id }}"
    wait: true
    network:
      assign_public_ip: true
    vpc_subnet_id: "{{ __create_vpc_subnet_2.subnet.id }}"
    # Use the security group registered above for VPC #2
    security_groups:
      - "{{ __security_group_2.group_id }}"
    tags:
      Name: "{{ instance_name }}-2"
  register: __create_ec2_instance_2

- name: Set 'ip_instance_1' and 'ip_instance_2' variables
  ansible.builtin.set_fact:
    ip_instance_1: "{{ __create_ec2_instance_1.instances.0.private_ip_address }}"
    ip_instance_2: "{{ __create_ec2_instance_2.instances.0.private_ip_address }}"
We would like to test reachability from EC2 instance #1 to EC2 instance #2 on port 80.
- name: Include 'connectivity_troubleshooter' role
  ansible.builtin.include_role:
    name: cloud.aws_troubleshooting.connectivity_troubleshooter
  vars:
    connectivity_troubleshooter_destination_ip: "{{ ip_instance_2 }}"
    connectivity_troubleshooter_destination_port: 80
    connectivity_troubleshooter_source_ip: "{{ ip_instance_1 }}"
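If the check is one step in a longer playbook, you may not want a failed check to stop the whole play. The following sketch wraps the same role call in a block with a rescue section. It assumes the role reports problems through a failed task, which the error output in this post suggests but which is not confirmed here.

```yaml
- name: Run the connectivity check without aborting the play
  block:
    - name: Include 'connectivity_troubleshooter' role
      ansible.builtin.include_role:
        name: cloud.aws_troubleshooting.connectivity_troubleshooter
      vars:
        connectivity_troubleshooter_destination_ip: "{{ ip_instance_2 }}"
        connectivity_troubleshooter_destination_port: 80
        connectivity_troubleshooter_source_ip: "{{ ip_instance_1 }}"
  rescue:
    # ansible_failed_result holds the result of the task that failed in the block
    - name: Report why the check failed
      ansible.builtin.debug:
        msg: "Connectivity check failed: {{ ansible_failed_result.msg | default('no message returned') }}"
```

With or without the rescue section, running the check against the setup built so far surfaces the problem.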
The following error message is displayed:
"msg": "Destination Subnet route table does not contain a valid peering route for source: 192.168.0.6"
This error occurs because we did not add the route that targets the VPC peering connection to the route table of Subnet #2. Because routing is challenging and complex, oversights like this can occur. We solve the problem by adding the missing route to the route table of Subnet #2.
- name: Set Route
  amazon.aws.ec2_vpc_route_table:
    vpc_id: "{{ __create_vpc_2.vpc.id }}"
    tags:
      Name: "{{ vpc_name }}-2"
    subnets:
      - "{{ __create_vpc_subnet_2.subnet.id }}"
    routes:
      - dest: "{{ vpc_1_subnet_cidr_1 }}"
        vpc_peering_connection_id: "{{ __create_vpc_peering.peering_id }}"
  register: __route_table
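To confirm that the route for Subnet #2 is now in place, the route table associated with that subnet can be inspected directly. This is only a sketch: the __subnet_2_route_tables variable name is hypothetical, and the task relies on the association.subnet-id filter of the EC2 DescribeRouteTables API.

```yaml
- name: Gather the route table associated with Subnet #2
  amazon.aws.ec2_vpc_route_table_info:
    filters:
      association.subnet-id: "{{ __create_vpc_subnet_2.subnet.id }}"
  register: __subnet_2_route_tables   # hypothetical variable name

- name: Show the routes now in place for Subnet #2
  ansible.builtin.debug:
    # Print only the routes of each matching route table
    var: __subnet_2_route_tables.route_tables | map(attribute='routes') | list
```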
We run the aws_troubleshooting.connectivity_troubleshooter role again. This time a success message is displayed, as shown below.
"result": "VPC peering evaluation successful"
We can also use the aws_troubleshooting.connectivity_troubleshooter role to test whether EC2 instance #2, which sits in the public subnet of VPC #2, reaches the Internet. You need to use the role as follows:
- name: Include 'connectivity_troubleshooter' role
  ansible.builtin.include_role:
    name: cloud.aws_troubleshooting.connectivity_troubleshooter
  vars:
    connectivity_troubleshooter_destination_ip: 8.8.8.8
    connectivity_troubleshooter_destination_port: 80
    connectivity_troubleshooter_source_ip: "{{ ip_instance_2 }}"
As expected, EC2 instance #2 does not have outbound connectivity to the Internet.
"msg": "No route found for destination: 8.8.8.8"
Therefore, it is necessary to set up an Internet gateway and add a route for it.
- name: Create Internet Gateway
  amazon.aws.ec2_vpc_igw:
    vpc_id: "{{ __create_vpc_2.vpc.id }}"
    state: present
  register: __create_igw

# Gather the route tables of VPC #2, the VPC the gateway is attached to
- name: Gather Route Tables
  amazon.aws.ec2_vpc_route_table_info:
    filters:
      vpc-id: "{{ __create_vpc_2.vpc.id }}"
  register: __route_table_info

- name: Set Route
  amazon.aws.ec2_vpc_route_table:
    vpc_id: "{{ __create_vpc_2.vpc.id }}"
    tags:
      Name: "{{ vpc_name }}-2"
    subnets:
      - "{{ __create_vpc_subnet_2.subnet.id }}"
    purge_routes: true
    routes:
      # Keep the peering route and add a default route through the gateway
      - dest: "{{ vpc_1_subnet_cidr_1 }}"
        vpc_peering_connection_id: "{{ __create_vpc_peering.peering_id }}"
      - dest: 0.0.0.0/0
        gateway_id: "{{ __create_igw.gateway_id }}"
  register: __route_table
We run the aws_troubleshooting.connectivity_troubleshooter role again. Voila! This time a success message is displayed, as shown below.
"result": "Source evaluation successful"
Thanks to the Ansible validated content in cloud.aws_troubleshooting, we easily demystified connectivity issues, limited unwanted headaches, and maximized our mastery of the hybrid cloud.
Where to go next
- Come visit us at AnsibleFest, now a part of Red Hat Summit 2023.
- Missed out on AnsibleFest 2022? Check out the Best of AnsibleFest 2022.
- Self-paced lab exercises - We have interactive, in-browser exercises to help you get started with Ansible Automation Platform.
- Try Ansible Automation Platform free for 60 days.