AnsibleFest has just wrapped up, with a whole track dedicated to security automation, our answer to the lack of integration across the IT security industry. If you’re looking for a use case to start with, our investigation enrichment blog post gives yet another example of how Ansible can address the typical operational challenges security practitioners face.

Ansible security automation is about integrating various security technologies with each other. One part of this challenge is the technical complexity: different products, interfaces, workflows, etc. But another equally important part is getting the processes of different teams in the security organization aligned. After all, one sign of successful automation is the deployment across team boundaries.

This is especially true with threat hunting activities: when security analysts suspect a malicious activity or want to prove a hypothesis, they need to work with rules and policies to fine tune the detection and identification. This involves changes and configurations on various target systems managed by different teams.

In this blog post, we will start with a typical day-to-day security operations challenge and walk through some example threat hunting steps - adding more teams and products over the course to finally show how Red Hat Ansible Automation Platform can bring together the separated processes of various teams into a single streamlined one.

Simple Demo Setup

To better showcase these challenges and how Ansible answers them, we use the same simplified setup already deployed during our blog post about investigation enrichment: a Check Point Next Generation Firewall (NGFW), a Snort Intrusion Detection System (IDS) and an IBM QRadar Security Information and Event Management (SIEM) solution gathering and analyzing the data. We also have an “attacker” simulating an attack pattern against the target running the IDS.

Of course, this setup is a simplified example; real world security automation solutions will be more complex and could feature other vendors.

threat hunt blog 1

First contact: the firewall team

We are security operators in charge of the firewall. Now, imagine that we just found out that a policy in our environment was violated:

threat hunt blog 2

Note the dropped packets here. Of course, we should at least investigate what is going on - so let’s forward these logs to the SIEM team. As shown in our previous blog post, we can use Ansible Playbooks for this. But that example playbook covered multiple endpoints and required various credentials all in one go. In a larger enterprise organization, such deep cooperation might not be in place yet - or might not even be desired. Instead, as the firewall team, we can only control our part of the security ecosystem with Ansible. Nevertheless, if we are part of a larger team, maybe someone has already written such playbooks. If so, where are they stored? Which credentials are necessary? How do we collaborate on security automation even inside our own team?

The answer here is Red Hat Ansible Tower: this can act as an automation library, providing pre-approved automation workflows at a central place, to be consumed and executed by others.

Let’s say that in our example, as the firewall team, we have three automation tasks we frequently need: blocking an IP address (e.g. of an attacker), allowing traffic from an IP address, and sending policy logs to the SIEM. The playbooks for these are stored in a Git repository. Now we can take everything that is needed for execution - the playbooks, but also the credentials, variables, inventory information, and so on - and put it into Ansible Tower:
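As a rough sketch, a playbook for the first of those tasks - blocking an IP on the Check Point firewall - could look like the following. The module names come from the check_point.mgmt collection; the inventory group, object names and variables are made up for illustration:

```yaml
---
# Hypothetical sketch: block a suspicious IP on a Check Point NGFW.
# Modules are from the check_point.mgmt collection; all object and
# host names here are illustrative.
- name: Block attacker IP on the firewall
  hosts: checkpoint
  connection: httpapi
  tasks:
    - name: Create a host object for the suspicious IP
      check_point.mgmt.cp_mgmt_host:
        name: "blocklist-{{ attacker_ip }}"
        ip_address: "{{ attacker_ip }}"
        state: present

    - name: Add a drop rule referencing that host object
      check_point.mgmt.cp_mgmt_access_rule:
        layer: Network
        name: "Drop traffic from {{ attacker_ip }}"
        position: top
        source: "blocklist-{{ attacker_ip }}"
        action: Drop
        state: present

    - name: Publish the changes on the management server
      check_point.mgmt.cp_mgmt_publish: {}
```

The other two playbooks - allowing traffic and forwarding logs - would follow the same pattern, and all three end up as job templates in Ansible Tower.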

threat hunt blog 3

This automation content can be consumed by the entire firewall team whenever it’s needed. In this case, we want to send our logs to our SIEM, and can initiate this right away by executing the above job.

And Ansible Tower can do more: it can simplify auditing! All data concerning the automation execution is stored: when it was triggered, by whom, what Git revision the content had, the output of the automation execution, etc.

threat hunt blog 4

This data is stored long term and can be looked up whenever there are questions about a past automation execution, or when auditing requests come in. And of course, this data is not only available via the web UI, but also via the included REST API:
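For example, the details of a past job run can be fetched programmatically. Here is a sketch using Ansible’s uri module against the Tower REST API; the host name, job id and the exact response fields shown are illustrative assumptions:

```yaml
---
# Sketch: read the record of a past job run via the Tower REST API.
# tower.example.com, the job id and the queried response fields are
# placeholders for illustration.
- name: Query a past automation run for auditing
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Fetch job details from the Tower API
      ansible.builtin.uri:
        url: "https://tower.example.com/api/v2/jobs/42/"
        user: "{{ tower_user }}"
        password: "{{ tower_password }}"
        force_basic_auth: true
        return_content: true
      register: job_info

    - name: Show who launched the job and which Git revision was used
      ansible.builtin.debug:
        msg: >-
          Launched by
          {{ job_info.json.summary_fields.created_by.username }},
          revision {{ job_info.json.scm_revision }}
```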

threat hunt blog 5

From now on, the firewall logs will be sent to the SIEM.

Investigation: the SOC team

We are now analysts on the SOC (Security Operations Center) team. We use a SIEM for our day-to-day activities, and it is now showing a new alert generated by the logs the firewall team started sending. Again we can use automation content - and again we can use pre-approved content stored in Ansible Tower as a central library.

Usually the SOC team has its own view of the automation. To give them a separate perspective, we use the role-based access control capabilities of Red Hat Ansible Automation Platform: we define multiple separate teams, each with their own users and each with different, domain-specific automation content. The SOC team, for example, has content to reconfigure their SIEM to accept logs from various sources:
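This team setup can itself be automated. A minimal sketch using the awx.awx collection follows; the organization, team and job template names are assumptions for this example:

```yaml
---
# Sketch: define domain-specific teams in Ansible Tower via the
# awx.awx collection. Organization, team and template names are
# illustrative.
- name: Set up one team per security domain
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Create the teams
      awx.awx.team:
        name: "{{ item }}"
        organization: Security
        state: present
      loop:
        - Firewall
        - SOC
        - IDPS

    - name: Let the SOC team execute its SIEM job templates
      awx.awx.role:
        team: SOC
        role: execute
        job_templates:
          - Accept logs in QRadar
        state: present
```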

threat hunt blog 6

This separation of automation access and execution ensures that domain experts retain sovereignty over their assets while still being able to use automation for their daily tasks.

In this case, we launch the playbook to accept logs from the firewall in QRadar. Moments later, the logs start showing up in the right queue. As the analysts, seeing these logs, we decide to take a closer look at the machine being targeted here. We want the IDS team to deploy a new rule.
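The playbook behind that job template could be sketched with the ibm.qradar collection roughly as follows; the log source name, type and identifier are illustrative placeholders:

```yaml
---
# Sketch: tell QRadar to accept logs from the firewall, using the
# ibm.qradar collection. The log source name, type and identifier
# are illustrative.
- name: Accept firewall logs in QRadar
  hosts: qradar
  connection: httpapi
  tasks:
    - name: Create a log source for the Check Point firewall
      ibm.qradar.log_source_management:
        name: "Check Point NGFW logs"
        type_name: "Check Point FireWall-1"
        description: "Firewall logs for the policy violation case"
        identifier: "192.0.2.1"
        state: present
```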

Hunting: IDPS rules and their deployment

Let’s switch roles again - now we are part of the operations team responsible for the IDPS. We receive the analysts’ request to deploy a new rule that looks for a certain pattern matching the behavior seen at the firewall. Like the other teams, we use Ansible Tower as a central library for all our automation content. But something is different this time: the pre-approved automation content for deploying a new IDPS rule allows us to deploy any kind of rule. The rule itself has to be provided the moment the automation content is executed:

threat hunt blog 7

The query for the rule seen above is realized via the self-service (survey) capabilities of Ansible Tower. This allows security automation content to be written and provided in a more dynamic and flexible way. And this extends far beyond the IDPS example: think of IPs or networks for firewall policies, host names for endpoint protection, or user names for privileged access management.
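A playbook backing such a survey could look like the sketch below. It uses the ids_rule role from the ansible_security project; the survey variable name and the rules file path are assumptions for this example:

```yaml
---
# Sketch: deploy a Snort rule handed over at launch time, e.g. via a
# Tower survey. The ids_rule role is from the ansible_security
# project; the survey variable name and file path are illustrative.
- name: Deploy a new IDPS rule provided by the analysts
  hosts: snort
  vars:
    ids_provider: snort
  tasks:
    - name: Add the rule passed in via the survey variable
      ansible.builtin.include_role:
        name: ids_rule
      vars:
        ids_rule: "{{ survey_rule }}"
        ids_rules_file: /etc/snort/rules/local.rules
        ids_rule_state: present
```

Whatever the analysts type into the survey field ends up in the survey variable and is deployed as-is, which is exactly why this template is restricted to pre-approved consumers.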

As soon as the new rule has been deployed, the next step is to configure our IDPS to send its logs to the SIEM, just like we did as the firewall operators above. But this time we use another feature of Ansible Tower to showcase how security teams can collaborate in a much more aligned way: we as the IDPS operators write the automation content, provide it in Ansible Tower, and enter all the credentials and so on - but we leave it to the analysts to execute it as they see fit.

threat hunt blog 8

In the above screenshot, you see how we as the user “opsids” have admin rights on the automation content, while the user “analyst” only has the right to execute it.

Back to central: SIEM takes control

Let’s head back to our role as a member of the analyst team. We can see the automation content in our Ansible Tower. If you pay close attention to the bottom of the listing of available automation content, you see that the analyst team can only execute this content, not delete or otherwise edit it:

threat hunt blog 9

This separation of the right to execute from the right to access and modify can act as a bridge between teams: automation content can be created, certified and then safely shared with other teams. Credentials and overall control remain fully with the domain team, but other teams - in this case the analysts - can reuse the content, enabling more frictionless collaboration with less need for interaction between the teams.

However, having to click through different automation components is still not optimal. How do we keep an overview when a process contains many individual pieces? We merge them into workflows!

In our use case example, adding logs to a SIEM is a two-step process: we first have to configure the IDS to send the logs, and then tell the SIEM to accept them. We can merge both steps into a single workflow, making everything available in one single execution. And of course workflows are not limited to two actions - we can bring together all the steps mentioned above into a single workflow spanning multiple vendors and teams:
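Such a two-step workflow can also be created via automation. Here is a sketch using the awx.awx collection; all template, node and job names are assumptions for this example:

```yaml
---
# Sketch: chain the IDS and SIEM configuration steps into one Tower
# workflow via the awx.awx collection. All names are illustrative.
- name: Create a workflow chaining IDS and SIEM configuration
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Create the workflow job template
      awx.awx.workflow_job_template:
        name: "Send IDS logs to SIEM"
        organization: Security
        state: present

    - name: "Step 1: configure the IDS to forward its logs"
      awx.awx.workflow_job_template_node:
        workflow_job_template: "Send IDS logs to SIEM"
        identifier: configure-ids
        unified_job_template: "Forward Snort logs"

    - name: "Step 2: accept the logs in QRadar"
      awx.awx.workflow_job_template_node:
        workflow_job_template: "Send IDS logs to SIEM"
        identifier: accept-in-qradar
        unified_job_template: "Accept logs in QRadar"

    - name: Run step 2 only after step 1 succeeds
      awx.awx.workflow_job_template_node:
        workflow_job_template: "Send IDS logs to SIEM"
        identifier: configure-ids
        success_nodes:
          - accept-in-qradar
```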

threat hunt blog 10

As you see, workflows bring together security automation content from different teams, enabling streamlined automation processes.

In our case, we now have all logs from the IDS in our SIEM, in parallel to the already existing logs from the firewall. As the analysts, we can now dig deeper into the subject, performing additional steps to analyze the machine’s behavior and exclude the possibility that it has been compromised.

For the sake of this simple demo, let’s assume that we do not see any hits from the IDS, and thus conclude that there is no widespread attack going on, but only a single IP knocking on the firewall. Let’s say that, based on this, we are able to figure out that this access was indeed previously authorized, but the policy was misconfigured. We now remediate, allowing that IP access to the target host. We mark the case as resolved and, as a final step, roll back all changes so that the logs are no longer forwarded to the SIEM. Remember, as we already identified in the investigation enrichment blog post:

Why don’t we add those logs to QRadar permanently? This could create alert fatigue, where too much data in the system generates too many events, and analysts might miss the crucial events. Additionally, sending all logs from all systems easily consumes a huge amount of cloud resources and network bandwidth.

From teams to organizations: Extending security automation further

Above, we shared how Ansible Automation Platform can be used in threat hunting while at the same time bringing together siloed teams. But large enterprises often face additional challenges: they have more than one automation resource, often running multiple clusters at different locations, with different content. How can we bring those together? And how do we ensure automation is properly governed when a large number of users need access to it?

Here the Software-as-a-Service offerings of the platform come in: with the help of automation services catalog, we can bring together security workflows from different clusters under one umbrella. Additionally, self-service capabilities and the possibility to add multi-level approval processes further ensure that security automation is governed across large organizations in a secure way, even stretching out to line-of-business functions.

threat hunt blog 11

Additionally, Automation Analytics offers a deeper understanding of what is happening across the enterprise organization and which security automation code is executed most often. This gives higher-level stakeholders a better understanding of where security automation is used, and where there might be room for improvement.

threat hunt blog 12

Takeaways and where to go next

As we have shown above, Ansible security automation can play a crucial role in threat hunting by integrating different products with each other. At the same time, the people side of the processes can be greatly enhanced by the central automation tooling provided by Ansible Automation Platform, enabling various teams to collaborate on security processes in an automated fashion.

As next steps, there are plenty of resources to follow up on the topic: