Red Hat Ansible Automation Platform 2 is the next generation automation platform from Red Hat’s trusted enterprise technology experts. We are excited to announce that the Ansible Automation Platform 2.3 release includes automation controller 4.3.
In the previous blog, we saw that automation controller 4.1 provides significant performance improvements compared to Red Hat Ansible Tower 3.8. Automation controller 4.3 takes that one step further. In this post, we describe an important change to callback receiver workers in automation controller 4.3 and how it can affect performance.
Callback Receiver
The callback receiver is the process in charge of transforming the standard output of Ansible into serialized objects in the automation controller database. This enables reviewing and querying results from across all your infrastructure and automation. This process is I/O and CPU intensive and requires performance considerations.
Every control node in automation controller has a callback receiver process. It receives job events that result from Ansible jobs. Job events are JSON structures, created when Ansible calls the runner callback plugin hooks. This enables Ansible to capture the result of a playbook run. The job event data structures contain data from the parameters of the callback plugin hooks plus unique IDs that reference other job events. The following is an example job event:
"event": "playbook_on_play_start",
"counter": 2,
"event_display": "Play Started (all)",
"event_data": {
    "playbook": "chatty_tasks.yml",
    "playbook_uuid": "aca1b0da-f29c-4fcf-be35-1aa59a30a4e0",
    "play": "all",
    "play_uuid": "faacc0d4-457c-ac33-a7f4-00000000006a",
    "play_pattern": "all",
    "name": "all",
    "pattern": "all",
    "uuid": "faacc0d4-457c-ac33-a7f4-00000000006a",
    "guid": "a70eb73c9c2241e0995963a6dcd4b89b"
},
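To make the structure concrete, here is a minimal Python sketch that parses an abbreviated copy of the event above and pulls out a few fields. The enclosing braces and the `summarize_event` helper are assumptions added for illustration; they are not part of automation controller.

```python
import json

# Abbreviated copy of the example job event. The outer braces are added
# here (an assumption) so the excerpt parses as one complete JSON object.
raw_event = """
{
  "event": "playbook_on_play_start",
  "counter": 2,
  "event_display": "Play Started (all)",
  "event_data": {
    "playbook": "chatty_tasks.yml",
    "play": "all",
    "play_uuid": "faacc0d4-457c-ac33-a7f4-00000000006a",
    "uuid": "faacc0d4-457c-ac33-a7f4-00000000006a"
  }
}
"""


def summarize_event(raw: str) -> dict:
    """Hypothetical helper: reduce a job event to a few fields of interest."""
    event = json.loads(raw)
    data = event.get("event_data", {})
    return {
        "type": event["event"],
        "counter": event["counter"],
        "playbook": data.get("playbook"),
        "play_uuid": data.get("play_uuid"),
    }


summary = summarize_event(raw_event)
print(summary)
```

The `counter` field orders events within a job, and the UUIDs under `event_data` are the references that link one event to another.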
These job events are pushed to a Redis queue and processed by the callback receiver. Each callback receiver has workers that process these job events and save them in the database. Prior to automation controller 4.3, each callback receiver had four workers by default, regardless of the size of the control node. For customers who vertically scale their control nodes, this could cause performance issues because the number of callback receiver workers did not scale with the capacity of the control node(s).
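The queue-and-workers pattern described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the controller's implementation: an in-process `queue.Queue` stands in for the Redis queue, a list stands in for the database, threads stand in for the worker processes, and `run_callback_receiver` is a hypothetical name.

```python
import queue
import threading


def run_callback_receiver(events, num_workers=4):
    """Drain `events` with `num_workers` threads and return the saved rows."""
    pending = queue.Queue()   # stand-in for the Redis queue
    saved = []                # stand-in for the controller database
    lock = threading.Lock()   # protects `saved` across worker threads

    def worker():
        while True:
            event = pending.get()
            if event is None:  # sentinel: no more work for this worker
                break
            with lock:
                saved.append(event)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for event in events:
        pending.put(event)
    for _ in threads:
        pending.put(None)      # one sentinel per worker
    for t in threads:
        t.join()
    return saved


events = [{"counter": i, "event": "runner_on_ok"} for i in range(1, 101)]
saved = run_callback_receiver(events, num_workers=4)
print(len(saved))
```

With a fixed `num_workers`, throughput is capped no matter how many CPUs the host has, which is the limitation the rest of this post addresses.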
Performance Issues
Large Ansible Automation Platform clusters generate a huge volume of job events when running many jobs at their maximum capacity (max allowed forks). Running job templates at higher verbosity generates even more job events. During our performance analysis, we noticed that when the volume of job events exceeded what the default four callback receiver workers could handle, the events queued up in Redis, waiting to be processed. Because Redis is an in-memory database, the growing queue eventually caused the underlying control node to run out of memory (OOM), and the Redis processes were killed.
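A back-of-the-envelope calculation shows why the queue grows without bound once events arrive faster than the workers can save them. The sketch below is illustrative only: the arrival rate, per-worker throughput, and event size are made-up numbers, not measurements from our analysis, and `backlog_after` is a hypothetical helper.

```python
def backlog_after(seconds, arrival_rate, workers, per_worker_rate, event_bytes):
    """Return (queued_events, queued_bytes) after `seconds` of sustained load.

    Rates are in events per second. All inputs are hypothetical.
    """
    drain_rate = workers * per_worker_rate
    growth = max(0, arrival_rate - drain_rate)  # the queue cannot go negative
    queued = growth * seconds
    return queued, queued * event_bytes


# Hypothetical load: 5,000 events/s arriving, each worker saving 500 events/s,
# 2 KiB per event, sustained for 10 minutes.
four = backlog_after(600, 5000, workers=4, per_worker_rate=500, event_bytes=2048)
sixteen = backlog_after(600, 5000, workers=16, per_worker_rate=500, event_bytes=2048)
print(four)     # (1800000, 3686400000), about 3.7 GB held in memory
print(sixteen)  # (0, 0), the workers keep up
```

Under these assumed numbers, four workers leave 1.8 million events in memory after ten minutes, while sixteen workers drain the queue as fast as it fills.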
Solution
Versions of automation controller prior to 4.3 already allowed administrators to raise the number of callback receiver workers above the default of four by modifying the JOB_EVENT_WORKERS setting, but it was not a well-known administrative setting. Now, in automation controller 4.3, vertically scaling control nodes not only increases the capacity to run jobs (which generate events), it also proportionally scales the number of callback receiver workers to better handle the output from those jobs and to make use of the host resources available to automation controller.
This is accomplished by enhancements to the traditional installer and the Red Hat OpenShift operator. For virtual machine and bare metal installations, the 4.3 installer sets the number of callback receiver workers equal to the number of CPUs. For example, if a VM control node has eight CPUs, the installer sets the number of callback receiver workers to eight.
For Red Hat OpenShift operator-based installs, the number of callback receiver workers is set to the CPU limit of the task container if that limit is greater than four. Additionally, administrators may set the number of callback receiver workers manually by setting the JOB_EVENT_WORKERS property in a custom settings file. For more information on making this modification manually, visit the performance tuning guide.
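The sizing rules above can be summarized in a short Python sketch. It is a reading of the rules as described in this post, not code from the installer or the operator. The function name, the `install_type` values, the fallback to four workers when the OpenShift CPU limit is four or less, and the assumption that a manual value takes precedence are all illustrative.

```python
DEFAULT_WORKERS = 4  # the default number of workers before 4.3


def callback_worker_count(install_type, cpus, manual_override=None):
    """Hypothetical helper: how many callback receiver workers to run.

    install_type: "vm" (VM or bare metal) or "openshift" (operator install).
    cpus: CPU count of the node, or the task container's CPU limit.
    manual_override: value an administrator set via JOB_EVENT_WORKERS.
    """
    if manual_override is not None:
        return manual_override        # assumed: a manual setting wins
    if install_type == "vm":
        return cpus                   # workers equal the number of CPUs
    if install_type == "openshift":
        # Use the CPU limit only when it exceeds four; otherwise we assume
        # the previous default of four stays in effect.
        return cpus if cpus > DEFAULT_WORKERS else DEFAULT_WORKERS
    raise ValueError(f"unknown install type: {install_type!r}")


print(callback_worker_count("vm", 8))                             # 8
print(callback_worker_count("openshift", 2))                      # 4
print(callback_worker_count("openshift", 6))                      # 6
print(callback_worker_count("openshift", 6, manual_override=12))  # 12
```

A manual override in a custom settings file would presumably be a single assignment such as `JOB_EVENT_WORKERS = 8`; the file's location and the supported procedure are covered in the performance tuning guide.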
Takeaways & where to go next
Because callback receiver workers now scale with the size of the control node, the risk of running into OOM issues is reduced and the overall performance of automation controller improves. In the next blog, we compare some of the results of this change in two different automation controller clusters.
If you're interested in detailed information on automation controller, then the automation controller documentation is a must-read. To download and install the latest version, please visit the automation controller installation guide. To view the release notes of recent automation controller releases, please visit release notes 4.3. If you are interested in more details about Ansible Automation Platform, be sure to check out our e-books.