Organizations running workloads on Microsoft Azure use Azure Service Bus as their enterprise messaging backbone – routing events, alerts, and telemetry between services, monitoring tools, and operational systems. When Azure Monitor, Defender for Cloud, or a custom application publishes a critical event to a Service Bus queue or topic, the response today is manual: an engineer reads the message, investigates the affected resource, and remediates by hand.
This guide demonstrates how to connect Azure Service Bus Queues directly to Event-Driven Ansible (EDA), creating a real-time pipeline that consumes Azure events, enriches them with AI-driven root cause analysis, and triggers automated remediation – turning Azure’s messaging layer into the trigger for a closed-loop AIOps workflow.
This guide builds on the AIOps reference architecture.
For the full end-to-end AIOps pipeline – including AI inference, Lightspeed playbook generation, and the Crawl/Walk/Run maturity model – see AIOps automation with Ansible. This guide focuses specifically on using Azure Service Bus as the event transport layer.
Azure Service Bus is a fully managed enterprise messaging service from Microsoft that supports message queuing and publish-subscribe patterns. It decouples event producers (monitoring tools, applications, Azure services) from consumers (automation systems, dashboards, alerting tools), providing reliable message delivery with features like dead-letter queues, sessions, and scheduled delivery.
Azure Service Bus Messaging – microsoft.com
In an AIOps context, Azure Service Bus acts as the event transport layer – sitting between Azure’s monitoring and alerting tools (Azure Monitor, Azure Alerts, Defender for Cloud) and Ansible’s automation engine. Events are published to a Service Bus queue or topic, and Event-Driven Ansible subscribes to consume them in real time.
This architecture is particularly valuable for organizations that already use Azure Service Bus as their event backbone. Instead of building custom integrations for each alert type, you route all events through Service Bus and let a single EDA rulebook handle the dispatch to enrichment and remediation workflows.
What makes up the solution?
EDA is part of Ansible Automation Platform.
EDA uses rulebooks to monitor event sources and run the specified job templates or workflows when an event matches. Think of it as inputs and outputs: EDA is the automated input into Ansible Automation Platform, and automation controller produces the output by running a job template or workflow.
| Persona | Challenge | What They Gain |
|---|---|---|
| | Monitoring Azure Service Bus queues for critical events and manually investigating each one across hybrid Azure and on-prem infrastructure | Azure events automatically trigger enrichment and remediation workflows – the fix starts before the engineer even sees the alert |
| | Integrating Azure’s messaging layer with Ansible requires understanding Service Bus authentication, EDA event source plugins, and credential management | A reference architecture with production-ready EDA rulebooks, Azure Service Bus configuration, and tested collection usage |
| | Azure event volume is growing but response times aren’t improving; hybrid cloud operations span Azure and on-prem with no unified automation trigger | A single event pipeline that bridges Azure monitoring with Ansible remediation across hybrid environments – measurable MTTR reduction |
| Collection | Type | Purpose |
|---|---|---|
| ansible.eda | Certified | EDA event sources and filters – includes the azure_service_bus event source plugin |
| azure.azcollection | Certified | Manage Azure resources (VMs, networking, storage, etc.) for remediation tasks |
| ansible.controller | Certified | Automation Controller configuration as code (job templates, workflows, surveys) |
| redhat.ai | Certified | AI model inference using the OpenAI-compatible API via InstructLab |
| ansible.scm | Certified | Git operations (commit and push generated playbooks) |
| System | Required | Notes |
|---|---|---|
| Azure Service Bus namespace | Yes | With at least one queue or topic/subscription configured |
| Azure subscription | Yes | With permissions to create Service Bus resources and configure Shared Access Policies |
| AI inference endpoint | Yes | Red Hat AI (RHEL AI + InstructLab) or any OpenAI-compatible API |
| Ansible Lightspeed | Recommended | For dynamic playbook generation at the Run maturity level |
| Git repository | Yes | GitHub, GitLab, or Gitea for storing generated playbooks |
| Chat or ITSM tool | Recommended | Slack, Mattermost, or ServiceNow for human-in-the-loop notifications |
The workflow has four stages, matching the AIOps reference architecture:
| Stage | Operational Impact | Why |
|---|---|---|
| 1. Azure Event → EDA | None | Read-only – EDA reads a message from the Service Bus queue. No changes to systems. |
| 2. Enrichment Workflow | Low | Collects logs and system info, calls an AI API, posts to chat/ITSM. No infrastructure changes. |
| 3. Remediation Workflow | Low | Generates a playbook, commits to Git, creates a Job Template. Prepares the fix but does not touch production. |
| 4. Execute Remediation | High | Modifies production infrastructure. Should go through a change window or approval gate. |
Azure Monitor/Alert → Service Bus Queue → EDA Rulebook → Enrichment Workflow → AI Analysis → Remediation Workflow → Execute Fix
Azure Service Bus is the event transport layer.
In the AIOps reference architecture, Kafka or a direct webhook handles event transport. When using Azure Service Bus, it replaces that layer: Azure services publish events to a queue, and EDA subscribes directly using the built-in ansible.eda.azure_service_bus plugin.
Operational Impact: None (Azure configuration only)
Create a Service Bus namespace and queue in Azure, then configure a Shared Access Policy that EDA will use to authenticate.
Using the Azure CLI:
# Create the Service Bus namespace
az servicebus namespace create \
  --name aiops-events \
  --resource-group rg-aiops \
  --location eastus \
  --sku Standard

# Create the queue that will carry the alerts
az servicebus queue create \
  --name infrastructure-alerts \
  --namespace-name aiops-events \
  --resource-group rg-aiops

# Create a listen-only Shared Access Policy for EDA
az servicebus namespace authorization-rule create \
  --name eda-listener \
  --namespace-name aiops-events \
  --resource-group rg-aiops \
  --rights Listen
Or automate the same setup with the azure.azcollection:
- name: Create Azure Service Bus queue for AIOps events
  azure.azcollection.azure_rm_servicebusqueue:
    resource_group: rg-aiops
    namespace: aiops-events
    name: infrastructure-alerts
    max_size_in_mb: 1024
    # The module takes the TTL in seconds (3600 = 1 hour), not an ISO 8601 duration
    default_message_time_to_live_seconds: 3600
  delegate_to: localhost
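The namespace and the listen-only access policy from the CLI commands can be sketched in the same style. This is a minimal sketch that assumes the `azure_rm_servicebus` and `azure_rm_servicebussaspolicy` modules from azure.azcollection and reuses the names from the CLI example; check the parameter names against the collection version you have installed.

```yaml
# Sketch only: module parameters are assumed from azure.azcollection; verify before use.
- name: Create the Service Bus namespace
  azure.azcollection.azure_rm_servicebus:
    resource_group: rg-aiops
    name: aiops-events
    location: eastus
    sku: standard
  delegate_to: localhost

- name: Create a listen-only Shared Access Policy for EDA
  azure.azcollection.azure_rm_servicebussaspolicy:
    resource_group: rg-aiops
    namespace: aiops-events
    name: eda-listener
    rights: listen
  delegate_to: localhost
```

Granting only the listen right keeps the EDA credential from being able to publish or manage messages.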
Route Azure Monitor alerts to Service Bus.
In the Azure Portal, navigate to Monitor → Alerts → Action Groups and create an action group with a Service Bus Queue action type. This routes Azure Monitor alerts directly into your queue where EDA can consume them.
Operational Impact: None
The EDA rulebook subscribes to the Azure Service Bus queue using the built-in ansible.eda.azure_service_bus event source plugin. When a message arrives, the rulebook evaluates conditions and triggers the enrichment workflow.
---
- name: Azure Service Bus AIOps response
  hosts: all
  sources:
    - ansible.eda.azure_service_bus:
        conn_str: "{{ azure_service_bus_connection_string }}"
        queue_name: infrastructure-alerts
        logging_enable: true
  rules:
    - name: Azure infrastructure alert detected
      condition: event.body is defined
      action:
        run_workflow_template:
          organization: "Default"
          name: "Azure Alert Enrichment Workflow"
          # Extra vars for the launch are nested under job_args
          job_args:
            extra_vars:
              alert_name: "{{ event.body.data.essentials.alertRule | default('Unknown Alert') }}"
              severity: "{{ event.body.data.essentials.severity | default('Sev3') }}"
              affected_resource: "{{ event.body.data.essentials.alertTargetIDs[0] | default('') }}"
              description: "{{ event.body.data.essentials.description | default('') }}"
              fired_time: "{{ event.body.data.essentials.firedDateTime | default('') }}"
The rulebook extracts key fields from the Azure Monitor alert schema – the alert rule name, severity, affected Azure resource ID, description, and timestamp – and passes them to the enrichment workflow.
Azure Monitor Common Alert Schema.
Azure Monitor alerts follow a Common Alert Schema with essentials (severity, alert rule, target resource) and alertContext (metric/log details). Your rulebook conditions can filter on any of these fields, for example triggering automation only on Sev0 or Sev1 alerts.
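As an illustration of filtering on the schema, the rule below sketches a condition that matches only Sev0 and Sev1 alerts. It assumes the message body is the JSON alert payload with the `data.essentials.severity` field used elsewhere in this guide; the rule name is illustrative.

```yaml
rules:
  - name: Act only on Sev0 and Sev1 alerts
    # Folded scalar: the two comparisons are joined into one condition string
    condition: >-
      event.body.data.essentials.severity == "Sev0" or
      event.body.data.essentials.severity == "Sev1"
    action:
      run_workflow_template:
        organization: "Default"
        name: "Azure Alert Enrichment Workflow"
```

Lower-severity alerts stay in the event stream and can be handled by a separate rule, for example one that only posts a notification.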
Operational Impact: Low
Once EDA triggers the enrichment workflow, Ansible gathers additional context from the affected Azure resource and sends everything to Red Hat AI for root cause analysis. This is the same pattern described in the AIOps reference architecture – Log Enrichment and Prompt Generation Workflow.
Step 3a: Gather context from the affected Azure VM
- name: Gather Azure VM context for alert
  hosts: localhost
  tasks:
    - name: Get Azure VM details
      azure.azcollection.azure_rm_virtualmachine_info:
        resource_group: "{{ resource_group }}"
        name: "{{ vm_name }}"
      register: vm_info
    - name: Get recent Azure activity log entries
      azure.azcollection.azure_rm_resource_info:
        url: "/subscriptions/{{ subscription_id }}/providers/Microsoft.Insights/eventtypes/management/values"
        api_version: "2015-04-01"
      register: activity_log
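The play above uses `resource_group`, `vm_name`, and `subscription_id`, while the rulebook passes only the full resource ID as `affected_resource`. One way to derive them, assuming the alert target is a virtual machine and the ID follows the standard ARM layout `/subscriptions/<id>/resourceGroups/<group>/providers/<provider>/<type>/<name>`, is a small preliminary play such as this sketch:

```yaml
- name: Derive Azure identifiers from the alert target ID
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Split the resource ID passed in by the rulebook
      ansible.builtin.set_fact:
        # Index 0 is the empty string before the leading slash
        subscription_id: "{{ affected_resource.split('/')[2] }}"
        resource_group: "{{ affected_resource.split('/')[4] }}"
        vm_name: "{{ affected_resource.split('/') | last }}"
```

Mapping `vm_name` to an inventory host for the host-level diagnostics depends on how your inventory names Azure VMs, so that step is left to your environment.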
Step 3b: Gather diagnostics from the host itself
- name: Gather host diagnostics
  hosts: "{{ affected_host }}"
  become: true
  tasks:
    - name: Check for failed systemd services
      # No shell features are needed, so command is sufficient
      ansible.builtin.command:
        cmd: "systemctl list-units --state=failed --no-pager"
      register: failed_services
      changed_when: false
    - name: Collect recent error logs
      ansible.builtin.command:
        cmd: "journalctl --since '2 hours ago' --priority=err --no-pager"
      register: recent_errors
      changed_when: false
    - name: Check disk and memory usage
      ansible.builtin.setup:
        filter:
          - ansible_memtotal_mb
          - ansible_memfree_mb
          - ansible_mounts
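The gathered mount facts can be turned into a short, readable signal before they reach the prompt. The task below is a sketch that would follow the setup task in the same play; the 90% threshold is an arbitrary assumption, and it relies on each mount entry exposing `size_total` and `size_available`.

```yaml
- name: Report mounts that are more than 90% full
  ansible.builtin.debug:
    msg: >-
      {{ item.mount }} is
      {{ (100 * (item.size_total - item.size_available) / item.size_total) | round(1) }}% full
  loop: "{{ ansible_facts.mounts }}"
  when:
    # Skip pseudo file systems that report a zero size
    - item.size_total > 0
    - (item.size_total - item.size_available) / item.size_total > 0.9
```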
Step 3c: Analyze with Red Hat AI
- name: Analyze Azure alert with Red Hat AI
  hosts: localhost
  tasks:
    - name: Build enriched prompt
      ansible.builtin.set_fact:
        enriched_prompt: |
          An Azure Monitor alert fired: {{ alert_name }}
          Severity: {{ severity }}
          Affected resource: {{ affected_resource }}
          Description: {{ description }}
          VM status: {{ vm_info.vms[0].power_state | default('unknown') }}
          Failed services: {{ hostvars[affected_host].failed_services.stdout | default('none') }}
          Recent errors: {{ hostvars[affected_host].recent_errors.stdout | default('none') | truncate(2000) }}
          Diagnose the root cause and recommend a remediation step.
    - name: Send to Red Hat AI for analysis
      redhat.ai.completion:
        base_url: "http://{{ rhelai_server }}:{{ rhelai_port }}"
        token: "{{ rhelai_token }}"
        prompt: "{{ enriched_prompt }}"
        model_path: "/root/.cache/instructlab/models/granite-8b-lab-v1"
      delegate_to: localhost
      register: ai_response
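Because later steps post `ai_response.choice_0_text` to chat and ITSM, it can help to stop the workflow when the model returns nothing. A minimal sketch of such a guard, placed right after the analysis task (the failure message wording is illustrative):

```yaml
- name: Stop if the model returned no usable text
  ansible.builtin.assert:
    that:
      - ai_response.choice_0_text is defined
      - ai_response.choice_0_text | trim | length > 0
    fail_msg: "AI endpoint returned an empty response; check the server and the prompt."
```

Failing here keeps an empty diagnosis from reaching Slack or ServiceNow and makes the failed node visible in the workflow.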
Operational Impact: Low (notification) → High (remediation execution)
The enrichment workflow posts the AI analysis to your team’s communication channel and – depending on your maturity level – either queues a remediation playbook for human approval or executes it automatically.
- name: Post AI diagnosis to Slack
  ansible.builtin.uri:
    url: "{{ slack_webhook_url }}"
    method: POST
    body_format: json
    body:
      text: |
        :rotating_light: *Azure Alert:* {{ alert_name }}
        *Severity:* {{ severity }}
        *Resource:* {{ affected_resource }}
        *AI Diagnosis:* {{ ai_response.choice_0_text }}
- name: Create ServiceNow incident with enriched context
  servicenow.itsm.incident:
    instance:
      host: "{{ snow_instance }}"
      username: "{{ snow_username }}"
      password: "{{ snow_password }}"
    short_description: "Azure Alert: {{ alert_name }}"
    description: "AI Analysis: {{ ai_response.choice_0_text }}"
    # The module accepts impact and urgency; ServiceNow derives priority from them
    impact: "{{ 'high' if severity == 'Sev0' else 'medium' if severity == 'Sev1' else 'low' }}"
    urgency: "{{ 'high' if severity == 'Sev0' else 'medium' if severity == 'Sev1' else 'low' }}"
  when: snow_instance is defined
From here, the remediation follows the same pattern as the AIOps Remediation Workflow – Lightspeed generates a playbook, it gets committed to Git, and a Job Template is created for execution.
| Stage | What to Verify | Success Indicator |
|---|---|---|
| 1. Azure Event → EDA | Message arrives in queue and EDA consumes it | EDA Controller shows the rulebook activation as Running; event log shows the Azure alert payload |
| 2. Enrichment Workflow | AI analyzed the alert and notifications were sent | Workflow Visualizer shows all nodes green; Slack/ITSM received the AI diagnosis |
| 3. Remediation Workflow | Playbook was generated and committed | New playbook file exists in the Git repository; Job Template was created |
| 4. Execute Remediation | The fix was applied | Azure resource returns to healthy state; Azure Monitor alert auto-resolves |
Quick validation test – Send a test message to the Service Bus queue using the Azure CLI:
az servicebus queue send \
  --namespace-name aiops-events \
  --queue-name infrastructure-alerts \
  --resource-group rg-aiops \
  --body '{
    "data": {
      "essentials": {
        "alertRule": "Test High CPU Alert",
        "severity": "Sev2",
        "alertTargetIDs": ["/subscriptions/xxx/resourceGroups/rg-aiops/providers/Microsoft.Compute/virtualMachines/web01"],
        "description": "CPU usage exceeded 95% for 5 minutes",
        "firedDateTime": "2026-02-18T14:30:00Z"
      }
    }
  }'
| Symptom | Likely Cause | Fix |
|---|---|---|
| EDA rulebook is active but no events are received | Connection string is wrong or queue name doesn’t match | Verify conn_str has Listen permissions; check queue name matches exactly |
| EDA receives event but condition doesn’t match | Azure alert payload structure differs from expected schema | Use debug action in the rulebook to print event.body and compare against your conditions |
| Enrichment workflow runs but AI response is empty | Prompt is too vague or AI endpoint is unreachable | Verify the Red Hat AI server is running; test with a simple prompt first |
| Azure VM info retrieval fails | Azure credentials in AAP are expired or lack permissions | Verify the Azure Service Principal credential; test with azure.azcollection.azure_rm_virtualmachine_info |
| Messages pile up in the dead-letter queue | EDA consumer is crashing or message format is unexpected | Check EDA Controller logs; inspect dead-letter messages in Azure Portal for error details |
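When a condition does not match, it helps to see exactly what EDA received. The temporary rule below is a sketch that uses `print_event`, a built-in rulebook action that writes the matched event to the activation log; the rule name is illustrative, and the rule should be removed once the payload structure is confirmed.

```yaml
rules:
  - name: Print every Service Bus message for inspection
    condition: event.body is defined
    action:
      print_event:
        pretty: true
```

Compare the printed structure with the `event.body.data.essentials` paths used in your conditions and extra vars.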
| Maturity | Approach | How It Works | AI Role |
|---|---|---|---|
| Crawl | Alert Enrichment | Azure alert → Service Bus → EDA → AI diagnoses → enriched context posted to Slack/ITSM → human investigates and remediates | Read-only: AI interprets, humans act |
| Walk | Curated Remediation | Azure alert → Service Bus → EDA → AI diagnoses → AI selects a pre-approved playbook → human approves → playbook executes | AI selects from existing automation |
| Run | Auto-Remediation | Azure alert → Service Bus → EDA → AI diagnoses → Lightspeed generates a remediation playbook → policy engine validates → playbook executes | AI generates new automation within policy boundaries |
Start with Crawl.
Route Azure Monitor alerts to Service Bus and deploy the EDA rulebook and enrichment workflow first. Let your team see AI-enriched Azure alerts in Slack for a few weeks before adding automated remediation. This builds confidence in the pipeline and catches edge cases early.
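One way to keep remediation switched off during the Crawl phase is to gate the launch on a variable. In the sketch below, `aiops_maturity` and the template name "Azure Alert Remediation" are hypothetical; the task uses `ansible.controller.job_launch` and runs only once the variable is set to `run`.

```yaml
- name: Launch remediation only at the Run maturity level
  ansible.controller.job_launch:
    job_template: "Azure Alert Remediation"  # hypothetical template name
    extra_vars:
      affected_resource: "{{ affected_resource }}"
  # aiops_maturity is a hypothetical variable; it defaults to the safest level
  when: aiops_maturity | default('crawl') == 'run'
```

Defaulting to `crawl` means a missing or mistyped setting leaves the pipeline in notification-only mode.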
With Azure Service Bus connected to Event-Driven Ansible, your Azure monitoring alerts become the trigger for automated enrichment and remediation. Instead of engineers manually triaging every Azure Monitor alert, events flow through Service Bus into an AIOps pipeline that diagnoses root cause with AI and remediates with Ansible – reducing MTTR and bridging the gap between Azure’s cloud monitoring and your operational automation, whether the affected systems are in Azure, on-prem, or hybrid.
