For a system administrator, a perfect world would consist of just one type of server to support and just one tool to do that work. Unfortunately, we don’t live in an ideal world. Many system admins are required to manage the day-to-day operations of very different servers running different operating systems. The complexity is magnified when you start looking for tools to manage these distinct systems, and looking at how to automate them could lead you down a path of one automation tool per OS type. But why do that when you can have one central automation platform for all of your servers? In this example, we are going to look at managing Red Hat Enterprise Linux (RHEL) and Windows servers in one data center by the same group of system administrators. While we cover the use case of managing web servers on both RHEL and Windows in some technical detail, be aware that this method can be used for almost any typical operational task.
Scenario: Managing the web service on RHEL and Windows
In this scenario, we have a system administrator who is tired of getting calls from the network operations center (NOC) to stop, start, and even restart Internet Information Services (IIS) web services. Many of these calls come during the workday, but plenty come in at night or on weekends. While the task itself isn’t difficult, it takes up valuable family time and disrupts sleep. For security reasons, he prefers not to give NOC personnel admin rights on these Windows servers. To make matters worse, half the time they also want him to manage web services such as Apache on RHEL servers, but he doesn’t have access to those servers. He then has to turn around and wake up another administrator who has those privileges, all for the same mundane tasks. How can this administrator set up a secure, automated method for this work to be done when needed? Just as important, how can he do this on both OS platforms?
Solution: Red Hat Ansible Automation Platform
The solution involves writing a vendor-agnostic playbook to manage both web services. This playbook will utilize the service module for RHEL and the win_service module for Windows. This playbook is then set up in a job template in Red Hat Ansible Tower with a simple survey to make this task much easier for the NOC personnel. We will take a look at each of these components in greater detail.
First, let’s take a look at the playbook itself. How do we make it vendor-agnostic if modules are built specifically for each type of OS? That specialization is by design: Windows modules for Windows and Linux modules for Linux ensure that each module works exactly as needed on its target hosts. What this Ansible Playbook does is bring both well-designed modules together under one Ansible Automation Platform.
Let’s start with the first part of the playbook, beginning with connection variables. In the play itself you only see local, but the true values used for the connection type live in the inventory file. For the Windows hosts, for example, we use a connection variable such as ansible_connection: winrm. This is set in the inventory and is available every time you run this or any other playbook against that server. Take a look at the image for an example:
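As a minimal sketch of what such an inventory could look like, here is a YAML inventory. Only ansible_connection: winrm comes from the text above; the file name, group names, host names, port, and transport are illustrative assumptions and will differ in your environment.

```yaml
# inventory.yml (hypothetical file name, groups, and hosts)
all:
  children:
    windows:
      hosts:
        iis01.example.com:
      vars:
        ansible_connection: winrm       # from the article
        ansible_port: 5986              # assumption: WinRM over HTTPS
        ansible_winrm_transport: ntlm   # assumption: depends on your auth setup
    rhel:
      hosts:
        web01.example.com:
      vars:
        ansible_connection: ssh         # assumption: standard SSH access
```

Because these are connection variables set in the inventory, they apply per host and take precedence over the connection keyword written in the play.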
For more details on how to set up Windows hosts with WinRM, take a look at the “Connecting to a Windows host” blog.
The first task you can ignore, but I left it in to show that you can use it to see all the facts that are gathered when you enable “gather_facts: True”. This will come in handy in the future. Today, we will focus on the second task, which shows the one variable we will use in the rest of the playbook to make sure we use the right module for each OS. This variable is the gathered fact “ansible_distribution.” All this task does is print it to the screen so you can see the OS type. Showing it is simply for demo purposes; the value is held in memory as a fact and you will see how we use it, but as you experiment with Ansible and troubleshoot, you may want to view it as well. Next, we will see how this one variable determines which tasks to run on a Windows host and which to run on a RHEL host.
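The opening of such a playbook might look like the following sketch. The play name, the hosts pattern, and the task names are assumptions; the two debug tasks correspond to the "show all facts" and "show ansible_distribution" tasks described above.

```yaml
---
- name: Manage web services on RHEL and Windows   # hypothetical play name
  hosts: all                                      # assumption: limit as needed
  connection: local    # placeholder; each host's ansible_connection from inventory takes precedence
  gather_facts: true
  tasks:
    - name: Show all gathered facts (optional, for exploration)
      debug:
        var: ansible_facts

    - name: Show the OS distribution fact
      debug:
        var: ansible_distribution
```

On a RHEL host the second task typically prints RedHat, while on a Windows host it prints a longer string that includes the word Windows, which is why the later conditions test whether the value contains a substring.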
As our playbook hits each host in the data center, we can now see if it's a Windows or RHEL server and can use this valuable info to set module-specific variable values. The next four tasks as shown above aren’t making any changes to the server, but instead are assigning values to variables that in turn will be used in the two tasks after this. The trick here is that each of the modules for Windows and RHEL will need slightly different values. This is where we use the value of “ansible_distribution” to determine if we set that variable to a value that will be used in a Windows module or a RHEL module.
As an example, the name of the web service itself is not the same on a Windows IIS server and a RHEL Apache server. The first two tasks take the generic “web service” value that was passed in from the Ansible Tower survey and translate it into either the IIS service or the httpd (Apache) service. They do this by using the value of the “ansible_distribution” fact. Take a look above at the “when” statements on the first two tasks in that section. The first one checks whether ansible_distribution contains “Windows” AND whether service_survey is equal to “web service.” If both are true, it sets the value of the “service” variable to “W3Svc” using the set_fact module. On a Windows IIS server the value is set, but on a RHEL instance this “when” statement is false and the task is simply skipped. The second task in this section is the opposite. It uses the set_fact module to set a value for “service” if ansible_distribution contains “RedHat.” Once again, on any Windows server that task is simply skipped.
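The two tasks described above could be sketched as follows. This is a reconstruction from the description, not the original playbook: the task names are hypothetical, and the check on service_survey in the second task is an assumption mirrored from the first.

```yaml
# Task-list fragment; these entries belong under the play's "tasks:" key.
- name: Set service name for Windows IIS          # hypothetical task name
  set_fact:
    service: W3Svc
  when:
    - "'Windows' in ansible_distribution"         # case-sensitive substring test
    - service_survey == 'web service'

- name: Set service name for RHEL Apache          # hypothetical task name
  set_fact:
    service: httpd
  when:
    - "'RedHat' in ansible_distribution"
    - service_survey == 'web service'             # assumption: mirrors the first task
```

Listing several conditions under when combines them with a logical AND, so each task runs only on the matching OS and only for the matching survey answer.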
Now, let's look at the last two tasks in this playbook. These are actually the ones that will change the state of that service on the servers. Now that we know the type of OS and have assigned the right values to the variables we need in our modules, we can use the appropriate module to change the state of the service. In the first task in this section, we see it uses the win_service module. As you can infer, this will only work on Windows servers, so once again you will see we have a “when” statement that checks for the value of ansible_distribution. In this case, it’s checking to make sure “ansible_distribution” contains “Windows” in the value and, if so, it will execute this task. In the next task, you can see it uses the “service” module and again with a “when” statement but this time checking that “ansible_distribution” contains “RedHat”. As you can see, in the case of a Windows IIS server the first task will be executed while the second will be skipped. In the case of a RHEL Apache server, the first task will be skipped and the second will be executed.
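A sketch of these last two tasks, under stated assumptions, might look like this. The variable names state_survey and enabled_survey are hypothetical stand-ins for the survey answers, and mapping the "start after reboot" answer inline is a simplification; the original playbook prepares such values in earlier set_fact tasks.

```yaml
# Task-list fragment; these entries belong under the play's "tasks:" key.
- name: Change the web service state on Windows   # hypothetical task name
  win_service:
    name: "{{ service }}"
    state: "{{ state_survey }}"                   # e.g. started, stopped, restarted
    start_mode: "{{ 'auto' if (enabled_survey | bool) else 'manual' }}"
  when: "'Windows' in ansible_distribution"

- name: Change the web service state on RHEL      # hypothetical task name
  service:
    name: "{{ service }}"
    state: "{{ state_survey }}"
    enabled: "{{ enabled_survey | bool }}"
  when: "'RedHat' in ansible_distribution"
```

Note that the two modules express "start at boot" differently: win_service takes a start_mode such as auto or manual, while service takes a boolean enabled. This is exactly the kind of module-specific difference the earlier variable-setting tasks exist to absorb.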
To put it all together, let’s take a look at the survey you will use in Red Hat Ansible Tower to solicit the info from the NOC personnel when they need to manage those web services.
Here you can see that, in very plain English, we ask the NOC personnel to enter some generic information that our playbook will use to manage the service. The first question asks which service to manage. We have been talking about web services, so we make that selection. The second question asks what state we want the service in. Here we want to restart the service, for example during an outage where we suspect the service has stalled. Finally, we can designate whether the service should start automatically after a reboot. In this case we do, so we select “yes”. Keep in mind that in most cases a NOC technician will not use this option, but it is there if needed, for example in a change management situation. At this point, the NOC technician simply clicks “NEXT” and gets one more chance to review the values of these variables before launching the job.
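Survey answers reach the playbook as extra variables. For the selections described above, the job might receive something like the following; service_survey is the name used in the playbook discussion, while state_survey and enabled_survey are hypothetical names chosen for illustration.

```yaml
# Extra variables passed to the job from the survey (illustrative values)
service_survey: web service
state_survey: restarted       # hypothetical variable name
enabled_survey: "yes"         # hypothetical variable name; quoted so it stays a string
```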
Going back to our scenario, we can see that the NOC technicians will be able to launch a job template, fill out the short and simple survey, and manage that web service regardless of whether it is IIS or httpd. The job and underlying playbook handle the work on both Windows and RHEL servers without the technicians needing either individual rights on the servers or specific expertise in those OS types. This concept can be expanded to other services or even to other functions that apply to both kinds of server. You can set these jobs up in Red Hat Ansible Tower with easy-to-understand surveys, allowing anyone to self-serve. This allows for quicker response to outages and more efficiency, and it frees up your administrators to focus on innovation and bigger projects without burning them out.
If you would like to see a video demo of the vendor-agnostic service management playbook, you can view it on the Ansible Automation Platform YouTube channel, which contains many other videos as well. Hope you found this helpful and keep automating!