Manual configuration changes are a leading cause of network downtime. Once you are managing more than 250 network nodes, ranging from Cisco Catalyst campus switches to Juniper MX edge routers, hand-applied changes quickly become unsustainable and error-prone.
To enforce standardization, we built an automation fabric using NetBox as our Single Source of Truth (SSOT) and Ansible/AWX to automate configuration validation and deployments.
The Architecture: NetBox as the SSOT
Before writing automation scripts, you need a structured database representing what your network should look like. We populated NetBox with all physical assets, virtual interfaces, VLAN assignments, IP addresses, and BGP peering details.
Ansible playbooks query the NetBox REST API (using the netbox.netbox Ansible collection) to retrieve the target state, rather than relying on local text inventories or static variable files.
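As a rough illustration of what such a lookup involves, the sketch below pages through NetBox's device list endpoint in plain Python. The /api/dcim/devices/ path, the "Token" authorization header, and the results/next pagination fields follow NetBox's REST conventions. The helper names (auth_headers, http_fetch, collect_devices) and the injectable fetch argument are hypothetical, added so the logic can be exercised without a live server. This is not how the netbox.netbox collection itself is implemented.

```python
import json
import urllib.parse
import urllib.request


def auth_headers(token):
    """Headers NetBox expects for token-authenticated JSON requests."""
    return {"Authorization": f"Token {token}", "Accept": "application/json"}


def http_fetch(request):
    """Default fetcher: perform the HTTP call and decode the JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def collect_devices(base_url, token, filters, fetch=http_fetch):
    """Read every page of the device list by following the 'next' links."""
    query = urllib.parse.urlencode(filters)
    url = f"{base_url.rstrip('/')}/api/dcim/devices/?{query}"
    devices = []
    while url:
        request = urllib.request.Request(url, headers=auth_headers(token))
        page = fetch(request)
        devices.extend(page["results"])
        url = page.get("next")  # None on the last page, which ends the loop
    return devices
```

Calling collect_devices("https://netbox.local.net", token, {"status": "active"}) would return the same set of active devices that the playbooks treat as the target state.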
Step 1: Querying NetBox for Dynamic Inventories
Instead of maintaining static host lists, we use the NetBox dynamic inventory plugin. Whenever a router is added or modified in NetBox, Ansible picks up its IP, platform type, and status the next time the inventory is loaded:
# netbox_inventory.yml
plugin: netbox.netbox.nb_inventory
api_endpoint: https://netbox.local.net
token: "{{ lookup('env', 'NETBOX_API_TOKEN') }}"
validate_certs: false  # Disables TLS verification; set to true once NetBox has a trusted certificate
group_by:
  - device_roles
  - platforms
query_filters:
  - status: "active"
Step 2: Generating Configuration Templates
We use Jinja2 templates to compile configuration blocks. For instance, configuring BGP involves looping over every peer defined in NetBox for that router:
{# BGP configuration template for Cisco IOS-XE #}
router bgp {{ bgp_asn }}
 bgp log-neighbor-changes
{% for peer in bgp_peers %}
 neighbor {{ peer.ip_address }} remote-as {{ peer.remote_asn }}
 neighbor {{ peer.ip_address }} description {{ peer.description }}
 neighbor {{ peer.ip_address }} activate
{% endfor %}
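One detail worth handling before rendering: NetBox stores IP addresses with their prefix length (for example 192.0.2.2/30), while a neighbor statement takes a bare address. The sketch below shows one way the bgp_peers list could be normalized before it reaches the template. The input field names (address, remote_asn, description) and the helper names are hypothetical, since the article does not show how peer records are pulled from NetBox.

```python
import ipaddress


def normalize_peer(raw):
    """Convert one hypothetical NetBox-derived record into template variables."""
    # ip_interface accepts both "192.0.2.2/30" and a bare "192.0.2.2".
    address = ipaddress.ip_interface(raw["address"]).ip
    remote_asn = int(raw["remote_asn"])
    if not 1 <= remote_asn <= 4294967295:
        raise ValueError(f"ASN out of range: {remote_asn}")
    # Collapse whitespace so a multi-line description cannot add extra CLI lines.
    description = " ".join(str(raw.get("description", "")).split())
    return {
        "ip_address": str(address),
        "remote_asn": remote_asn,
        "description": description or "undocumented peer",
    }


def peer_sort_key(peer):
    """Order by IP version, then numerically, so rendered output is stable."""
    ip = ipaddress.ip_address(peer["ip_address"])
    return (ip.version, int(ip))


def normalize_peers(raw_peers):
    """Normalize every record and return the peers in a deterministic order."""
    return sorted((normalize_peer(raw) for raw in raw_peers), key=peer_sort_key)
```

Sorting the peers keeps the rendered configuration identical between runs, which makes later diffs against the running configuration easier to read.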
Step 3: Dry-Runs and Pre-Flight Validation
A core rule of our automation is to never push a configuration without validating it first. Every playbook runs in a dry-run mode (Ansible's --check flag or a device-side commit check), and on platforms that support commit-confirm the change rolls back automatically unless it is confirmed. The pre-flight checks are:
- Syntax Check: Ensures generated CLI syntax is valid for the target OS (Cisco IOS-XE, Juniper Junos, or Nokia SR OS).
- Schema Audit: Cross-references active routes against NetBox IPAM allocations to catch rogue IP assignments.
- State Verification: Pings each peer address from the device before BGP sessions are configured, to confirm IP reachability.
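The Schema Audit check can be sketched with Python's standard ipaddress module. This is a minimal illustration, assuming the active routes and the IPAM allocations have already been collected as lists of prefix strings. The function name find_unallocated_routes is hypothetical and does not come from the pipeline described here.

```python
import ipaddress


def find_unallocated_routes(active_routes, ipam_prefixes):
    """Return the routes that are not covered by any IPAM allocation."""
    allocations = [ipaddress.ip_network(prefix) for prefix in ipam_prefixes]
    rogue = []
    for text in active_routes:
        # strict=False tolerates host bits, e.g. "10.10.1.5/24".
        route = ipaddress.ip_network(text, strict=False)
        covered = any(
            # subnet_of raises TypeError across IP versions, so compare versions first.
            route.version == alloc.version and route.subnet_of(alloc)
            for alloc in allocations
        )
        if not covered:
            rogue.append(str(route))
    return rogue
```

An empty result means every observed route falls inside an allocated prefix. Anything returned is a candidate rogue assignment to investigate before the change proceeds.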
# Ansible playbook snippet for config dry-run
- name: Validate generated configuration (commit check only)
  juniper.device.config:
    load: "merge"
    format: "set"
    src: "/tmp/generated_configs/{{ inventory_hostname }}.conf"
    check: true    # Run a commit check on the candidate configuration
    commit: false  # Skip the commit; without this, a loaded config is committed by default
  register: check_result
  ignore_errors: true  # Let the assert below report the failure

- name: Assert Configuration Integrity
  ansible.builtin.assert:
    that:
      - check_result is not failed
    fail_msg: "Pre-flight validation failed for {{ inventory_hostname }}!"
Operational Impact
By tying Ansible/AWX execution to git pushes and pulling live parameters from NetBox, we achieved:
- Zero Syntax Failures: Production configurations pushed via pipeline have had a 100% successful commit rate.
- Onboarding Time: New device provisioning dropped from 2 hours to less than 5 minutes.
- Audit Compliance: Automatic detection of manual "out-of-band" config changes, signaling configuration drift via Zabbix alarms.
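The drift detection behind the Audit Compliance result can be approximated by diffing the rendered (intended) configuration against the running one. The sketch below uses Python's standard difflib. The normalization rules and function names are assumptions for illustration, and forwarding the result to an alerting system such as Zabbix is left out.

```python
import difflib


def normalize(config_text):
    """Drop blank lines, trailing whitespace, and '!' comment lines before comparing."""
    lines = []
    for line in config_text.splitlines():
        stripped = line.rstrip()
        if not stripped or stripped.lstrip().startswith("!"):
            continue
        lines.append(stripped)
    return lines


def config_drift(intended, running):
    """Return unified-diff lines; an empty list means no drift was detected."""
    return list(
        difflib.unified_diff(
            normalize(intended),
            normalize(running),
            fromfile="intended",
            tofile="running",
            lineterm="",
        )
    )
```

Lines prefixed with "+" exist only on the device, which is the signature of an out-of-band change. Lines prefixed with "-" are intended configuration that is missing from the device.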
Conclusion
Building a Single Source of Truth is the most critical step in network automation. Once NetBox holds your network's target model, tools like Ansible make rendering, validating, and applying configs a predictable, low-risk operation.