A pair of PA5050 firewalls sits at the edge of the network. Downstream of the PA5050 pair are a pair of Cisco Catalyst 6506 switches and a pair of Cisco Catalyst 4506 switches. The topology is illustrated in the diagram below.
The pair of Cisco Catalyst 6506 switches is configured as a virtual switching system (VSS), which unifies the pair into one logical layer 3 switch. Cisco non-stop forwarding (NSF) is used together with stateful switchover (SSO), so layer 3 routing redundancy is taken care of by Cisco redundancy technology 😉 HSRP version 2 is also used so that the downstream trusted hosts have gateway redundancy.
Layer2 link aggregation (802.3ad) is used for all downstream (towards Cisco Catalyst 4500) and upstream (towards PA5050) links.
Layer 3 is terminated at the pair of Cisco Catalyst 6506 switches. The PA5050 HA pair divides the network into trusted and untrusted zones, and the firewall policy will then be implemented on those zones.
Before the firewall policy is implemented, the resiliency of PA5050 active/active redundancy is in question. You are tasked to perform a proof of concept (PoC) to demonstrate the effectiveness of the pair of PAN firewalls.
Cisco Virtual Switching System VSS
The configuration is described in this post. The VSS configuration used for this scenario is identical to the one used in that post, so there is no need to repeat the steps here.
The concept of VSS is similar to StackWise technology. However, to build a VSS you need a 10 Gigabit Ethernet link for the virtual switch link (VSL). The VSS unifies the pair of Cisco Catalyst 6500 switches into one logical switch. One switch is in standby, where you cannot apply any configuration or issue any commands; the other is the active switch, where you issue commands to configure and verify.
The unification of the pair of Cisco Catalyst 6500 will look like this:
From the perspective of the PA5050 and the Cisco Catalyst 4500 there is only one switch. The Palo Alto Networks Design Guide Revision A calls this type of setup Active/Active Layer3 High Availability with Multi-chassis link aggregation topology.
High Availability links of PAN firewall in general
There are two built-in HA interfaces on the PA5050, namely HA1 and HA2. The physical HA interfaces are placed in such a way that their roles are easily understood at a glance.
The HA1 interface is grouped with the console and management interfaces, which tells you that HA1 is the control link.
The HA2 interface is grouped with the data interfaces, which tells you that HA2 is the data link.
HA1 and HA2 are sufficient for Active/Passive redundancy. You need HA3 if you want Active/Active redundancy.
Concise notes about Control Link HA1
1. This is the only layer 3 HA link; in other words, it is the only HA link that requires an IP address.
2. HA1 is for HA agents (PA5050 active/active firewalls) to communicate with each other.
3. HA1 acts as a keepalive between the HA agents; it senses a power cycle, reboot or power down of the peer HA agent.
4. TCP port 28769 is used for clear text communication.
5. TCP port 49969 is used for SSH encrypted communication. You need to import the public key manually to make encryption work.
6. Default monitor hold time is 3000ms.
7. HA1 is also used to monitor HA status, such as configuration synchronization (needed for active/passive redundancy, not for active/active) and management plane synchronization.
The conclusion is HA1 monitors the HA state.
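The hold time in note 6 can be pictured with a small Python sketch. This is only a simplified model of a hold timer, not PAN-OS code: the function name `peer_considered_down` and the millisecond timestamps are hypothetical, and only the 3000 ms default comes from the notes above.

```python
# Default HA1 monitor hold time taken from the notes above.
DEFAULT_MONITOR_HOLD_TIME_MS = 3000


def peer_considered_down(last_heard_ms: int, now_ms: int,
                         hold_time_ms: int = DEFAULT_MONITOR_HOLD_TIME_MS) -> bool:
    """Return True once the peer has been silent for longer than the hold time.

    Assumption: the peer is declared down only after the whole hold time
    has elapsed without anything being heard on the control link.
    """
    return (now_ms - last_heard_ms) > hold_time_ms


# Peer last heard at t=0: still inside the hold time at 2500 ms, outside at 3500 ms.
print(peer_considered_down(last_heard_ms=0, now_ms=2500))
print(peer_considered_down(last_heard_ms=0, now_ms=3500))
```

A shorter hold time reacts faster to a dead peer but is more likely to trigger on a brief hiccup of the control link.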
The HA configuration is located under the Device tab; select High Availability from the left side menu.
Concise notes about Data Link HA2
1. HA2 is a layer 2 link; in other words, no IP address is required, although you can also specify layer 3 information in the web GUI.
2. HA2 is used to synchronize HA states, routing information, IPsec security associations, the ARP table and traffic sessions.
3. If IP is chosen as the transport, HA2 uses IP protocol number 99; if UDP is chosen as the layer 4 transport, it uses UDP port 29281. If neither IP nor UDP is chosen, the default transport is Ethernet.
Concise notes about HA3
1. HA3 is a layer 2 link using MAC-in-MAC encapsulation.
2. You have to choose a data interface and set it to HA mode in order to be included in HA3.
3. HA3 is for packet forwarding between the session owner and the session setup firewall.
4. A link-aggregated data interface can be used for HA3 if its mode is configured as HA.
Session setup options
The session setup load can be shared between the HA agents only by using IP Modulo or IP Hash. With Primary Device, the active-primary agent always sets up the session, so the setup load is not distributed between the HA agents in the HA cluster.
IP Modulo – The session setup load is distributed between the HA agents in the HA cluster based on the parity of the source IP address.
IP Hash – The session setup load is distributed between the HA agents in the HA cluster based on a hash of either the source IP address or the combination of source and destination IP addresses.
Primary Device – The HA agent in the Active-Primary state always sets up the session.
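The two load sharing options can be sketched in Python as follows. This is only an illustration of the idea, not the algorithm PAN-OS runs: the function names are hypothetical, the mapping of even addresses to device-id 0 is an assumption, and CRC32 is a stand-in for whatever hash the firewall really uses.

```python
import ipaddress
import zlib
from typing import Optional


def setup_device_ip_modulo(src_ip: str) -> int:
    """Pick the session setup device from the parity of the source address.

    Assumption: even addresses map to device-id 0, odd ones to device-id 1.
    """
    return int(ipaddress.ip_address(src_ip)) % 2


def setup_device_ip_hash(src_ip: str, dst_ip: Optional[str] = None) -> int:
    """Pick the session setup device from a hash of the source address,
    or of source plus destination when dst_ip is given.

    CRC32 is only a stand-in hash chosen for this sketch.
    """
    key = ipaddress.ip_address(src_ip).packed
    if dst_ip is not None:
        key += ipaddress.ip_address(dst_ip).packed
    return zlib.crc32(key) % 2


print(setup_device_ip_modulo("192.168.50.10"))  # even source address
print(setup_device_ip_modulo("192.168.50.11"))  # odd source address
print(setup_device_ip_hash("192.168.50.10", "8.8.8.8"))
```

With IP Modulo, two neighbouring hosts always land on different devices, while IP Hash spreads the hosts without that fixed pattern.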
There are two session owner options.
Primary Device – Active-Primary HA agent will always be the session owner.
First Packet – The HA agent (either Active-Primary or Active-Secondary) that receives the first packet of a session becomes the session owner.
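To see why HA3 matters, here is a rough Python sketch that combines a session owner option with IP Modulo session setup. It is a simplified model with hypothetical function names; the even/odd mapping to device-id 0/1 and the assumption that device-id 0 is the active-primary are mine, not taken from PAN-OS.

```python
import ipaddress


def session_setup_device(src_ip: str) -> int:
    """IP Modulo: parity of the source address (even -> 0, odd -> 1 is an assumption)."""
    return int(ipaddress.ip_address(src_ip)) % 2


def session_owner(first_packet_device: int, selection: str,
                  active_primary: int = 0) -> int:
    """Return the device-id that owns the session for the chosen option."""
    if selection == "first-packet":
        return first_packet_device
    if selection == "primary-device":
        return active_primary
    raise ValueError("unknown session owner selection: " + selection)


def crosses_ha3(src_ip: str, first_packet_device: int,
                selection: str = "first-packet") -> bool:
    """True when owner and setup device differ, so the packet has to be
    handed over between the two firewalls (the job of HA3)."""
    return session_owner(first_packet_device, selection) != session_setup_device(src_ip)


# First packet handled by device 0, odd source address: setup happens on device 1.
print(crosses_ha3("10.1.1.5", first_packet_device=0))
# Even source address: owner and setup device are both device 0.
print(crosses_ha3("10.1.1.4", first_packet_device=0))
```

In this model roughly half of the flows arriving on one firewall are set up on the other, which is why the HA3 link should be sized like a data link.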
Monitoring High Availability
The default dashboard does not include the high availability widget; you have to add it yourself.
WARNING: You should not synchronize the running configuration between the HA agents for active/active redundancy; running configuration synchronization is only applicable to active/passive HA. In active/passive HA only one HA agent is active, and the standby HA agent does not do any data forwarding, routing or zoning. If you have used Cisco supervisor engine stateful switchover before, this concept is not hard to grasp. The running configuration is synchronized so that the standby agent can take over the role when the active agent goes down, which prevents a black hole or outage.
Active/Active HA configuration steps
This section shows how to configure Active-Primary HA agent in CLI and web GUI.
It is not obvious in the web GUI, but the device-id determines which HA agent is active-primary and which is active-secondary. If the device-id is 0 the agent is active-primary; if it is 1 the agent is active-secondary.
The first step is to configure the HA setup. The peer-ip is the address of the peer (active-secondary) HA agent.
The CLI example also sets Ethernet as the transport method for HA2. In the web UI there is no need to configure the transport method for HA2 if you do not use IP or UDP.
configure
edit deviceconfig high-availability
set group 1 peer-ip 172.16.0.2 mode active-active device-id 0 session-owner-selection first-packet session-setup ip-modulo
set group 1 mode active-active packet-forwarding yes network-configuration sync virtual-router yes qos no
set group 1 state-synchronization enabled yes transport ethernet
up
set high-availability enabled yes
top
From the web interface click on Device tab, select High Availability from the left side menu. Then from the Setup section click the icon button that looks like a gear.
Configure HA1 control link
configure
edit deviceconfig
set high-availability interface ha1 ip-address 172.16.0.1 netmask 255.255.255.252 link-speed 1000 link-duplex full monitor-hold-time 3000
Go to the Control Link (HA1) section then click on primary.
Assign one interface as HA then include this interface as HA3.
configure
set network interface ethernet ethernet1/13 ha
edit deviceconfig high-availability
set interface ha3 port ethernet1/13
From the Web interface click Network tab, click on the interface you want to assign as HA interface type.
Click on Device tab, select High Availability, select Active/Active Config tab.
You do not actually need a virtual address for this setup, because the Cisco Catalyst 6506 pair handles the layer 3 redundancy for the downstream trusted hosts using HSRP. The virtual address concept is very similar to VRRP and HSRP.
To configure a virtual address from the command line:
configure
set deviceconfig high-availability group 1 mode active-active virtual-address ae1.100 ip 192.168.50.1 floating device-priority device-0 1 device-1 10 failover-on-link-down yes
A device priority of 0 is the highest priority and 255 is the lowest. Confusing eh? Perhaps the problem is the way I articulate it 😉 I chose the link-aggregated interface ae1.100 because I want device-0 to be the primary router for VLAN 100; however, for this setup I have not connected the aggregated link.
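The priority rule for the floating virtual address can be sketched in Python. This is a simplified model under my own assumptions, not PAN-OS logic: the function name `floating_ip_owner` and the `link_up` map are hypothetical, and only the priority values (device-0 = 1, device-1 = 10) and the failover-on-link-down flag come from the command above.

```python
from typing import Dict, Optional


def floating_ip_owner(priorities: Dict[int, int], link_up: Dict[int, bool],
                      failover_on_link_down: bool = True) -> Optional[int]:
    """Return the device-id that should hold the floating address.

    The lowest numeric value is the highest priority. With
    failover_on_link_down set, a device whose link is down is skipped.
    """
    candidates = [
        dev for dev in priorities
        if link_up.get(dev, False) or not failover_on_link_down
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda dev: priorities[dev])


priorities = {0: 1, 1: 10}  # device-0 priority 1, device-1 priority 10
print(floating_ip_owner(priorities, {0: True, 1: True}))   # both links up
print(floating_ip_owner(priorities, {0: False, 1: True}))  # device-0 link down
```

Swapping the two priority values on a second floating address would let each firewall be the preferred gateway for a different VLAN.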
Configure Active-Secondary HA agent
The configuration steps repeat those for the Active-Primary HA agent; just take note that the peer IP address must be that of the active-primary HA agent.
If you have configured a virtual address for the HA3 link, the address is identical to the one in the active-primary HA agent's HA3 configuration.
Verifying active active HA states
About Election Settings
You DO NOT configure Election Settings for active/active HA, this is only for active/passive HA whereby the election is required to determine which HA agent is active and passive.
I was wrong: the preemptive option still influences the Active-Primary role election. When preemptive is enabled, the previously downed Active-Primary resumes the role of Active-Primary once it has finished initializing with its Active/Active peer. Both firewalls have to enable preemptive to make this work. Of course, if you do not want the previous Active-Primary firewall to resume its original role after it comes back up, you can ignore this option.
The promotion hold time ensures that the Active-Secondary remains Active-Secondary for the period of time specified by this option. However, if the Active-Primary goes down, the Active-Secondary takes over the Active-Primary role immediately, regardless of the promotion hold time you have configured.
The lowest device priority HA agent wins the active HA agent role! The default priority is 100.
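The preemptive behaviour described above can be summarized in a few lines of Python. This is only a model of what I observed, with a hypothetical function name and arguments; it says nothing about how PAN-OS implements the election internally.

```python
from typing import Dict


def active_primary_after_recovery(original_primary: int, current_primary: int,
                                  preemptive: Dict[int, bool]) -> int:
    """Return the device-id holding the Active-Primary role once the
    original Active-Primary has finished initializing again.

    preemptive maps device-id -> whether the preemptive option is enabled.
    The role moves back only when both firewalls have it enabled.
    """
    if all(preemptive.values()):
        return original_primary
    return current_primary


# Device 0 went down, device 1 took over, then device 0 came back.
print(active_primary_after_recovery(0, 1, {0: True, 1: True}))   # role moves back
print(active_primary_after_recovery(0, 1, {0: True, 1: False}))  # role stays put
```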
How fast is the recovery?
The downtime when a link breaks or when the active-primary PA firewall is rebooted is very good and acceptable for a data network, considering that the HA agent pair has to synchronize the routing table, ARP table and firewall state tables.
This setup uses OSPF as the dynamic routing protocol between the Cisco VSS and the PA5050 HA agents. Unfortunately the PA5050 does not support IS-IS, which is more scalable than OSPF in larger networks: IS-IS can support 100 routers within a single area, and it has inherent support for IPv6, so there is no need to define an extra IS-IS process for an IPv4 to IPv6 migration.
3 thoughts on “Palo Alto Networks: Active/Active High Availability”
I am a Korean PANW Engineer. Cool posting that helps me. Thanks!
The heartbeat backup option is very useful when installing HA. With heartbeat backup enabled, the MGT interface can act as an HA1 backup. I always enable heartbeat backup when both MGT interfaces are in the same subnet.
Can you tell me how HA works without an IP address?