Connecting the world…


Fiber optics and UDLD

UDLD (Unidirectional Link Detection) is a protocol that helps prevent forwarding loops in switched networks. A fiber cable is built from two separate fibers (transmit and receive). If one of the two fibers fails, the switch port is left able to send but not receive traffic, or the other way around. This scenario can cause serious problems.

The Problem

Spanning-Tree Protocol (STP) resolves a redundant physical topology into a loop-free, tree-like forwarding topology. It does this by blocking one or more ports, so that no loops remain in the forwarding topology. STP relies on the reception and transmission of Bridge Protocol Data Units (BPDUs). If the STP process that runs on the switch with a blocking port stops receiving BPDUs from its upstream (designated) switch on that port, STP eventually ages out the STP information for the port and moves it to the forwarding state. This creates a forwarding loop or STP loop.

Check the following two pictures:

[Figures: STP-A (left) and STP-B (right)]

The left picture shows a regular layer 2 network, where switch B is the designated switch for the B-C segment. The port of switch C on the B-C link is in the blocking state. In the right picture the fiber carrying traffic from B to C is broken (the Rx side of switch C): switch C no longer receives any BPDUs from switch B and ages out the information received with the last BPDU. Once the STP information is aged out on the port, that port transitions from the blocking state through listening and learning to the forwarding state. This creates a forwarding loop, as there is no blocking port left in the triangle A-B-C. Packets cycle along the path (B still receives packets from C), consuming additional bandwidth until the links are completely saturated. This brings the network down.
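To make the timing of this failure concrete, here is a minimal Python sketch of what happens to the blocking port on switch C once BPDUs stop arriving. It assumes the classic 802.1D default timers (max age 20 seconds, forward delay 15 seconds); both are configurable on real switches, and the function name `port_state` is only an illustration.

```python
# Classic 802.1D default timers in seconds (assumed; they are configurable).
MAX_AGE = 20
FORWARD_DELAY = 15


def port_state(seconds_since_last_bpdu: int) -> str:
    """STP state of a formerly blocking port after BPDUs stop arriving."""
    if seconds_since_last_bpdu < MAX_AGE:
        return "blocking"    # the stored BPDU information is still valid
    if seconds_since_last_bpdu < MAX_AGE + FORWARD_DELAY:
        return "listening"   # information aged out, port starts to transition
    if seconds_since_last_bpdu < MAX_AGE + 2 * FORWARD_DELAY:
        return "learning"
    return "forwarding"      # no blocking port left in the triangle: loop


for t in (0, 19, 20, 35, 50):
    print(t, port_state(t))
```

With these assumed defaults the loop forms roughly 50 seconds after the last BPDU was received.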

UDLD explained

UDLD is a Layer 2 (L2) protocol that works with the Layer 1 (L1) mechanisms to determine the physical status of a link. At Layer 1, auto-negotiation takes care of physical signaling and fault detection. UDLD performs tasks that auto-negotiation cannot perform, such as detecting the identities of neighbors and shutting down misconnected ports. When you enable both auto-negotiation and UDLD, Layer 1 and Layer 2 detections work together to prevent physical and logical unidirectional connections and the malfunctioning of other protocols.

UDLD works by exchanging protocol packets between the neighboring devices. In order for UDLD to work, both devices on the link must support UDLD and have it enabled on respective ports.

Each switch port configured for UDLD sends UDLD protocol packets that contain the port’s own device/port ID, and the neighbor’s device/port IDs seen by UDLD on that port. Neighboring ports should see their own device/port ID (echo) in the packets received from the other side.

If the port does not see its own device/port ID in the incoming UDLD packets for a specific duration of time, the link is considered unidirectional. Once the unidirectional link is detected by UDLD, the respective port is disabled and this message is printed on the console and written to the log:

UDLD-3-DISABLE: Unidirectional link detected on port 1/2. Port disabled
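The echo check can be sketched in a few lines of Python. This is a simplified model, not the real protocol: it ignores the UDLD timers and detection window, and the `UdldPacket` class and the port IDs `B:1/2` and `C:1/1` are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class UdldPacket:
    sender: str                               # device/port ID of the sending port
    echoes: set = field(default_factory=set)  # neighbor IDs the sender has heard


def link_is_bidirectional(own_id: str, received: list) -> bool:
    """True if at least one received packet echoes our own device/port ID."""
    return any(own_id in pkt.echoes for pkt in received)


# Healthy link: C hears B, so C's packets echo B's ID back to B.
healthy = [UdldPacket("C:1/1", {"B:1/2"})]
# C's Rx is broken: C never hears B, so its packets carry no echo of B.
broken = [UdldPacket("C:1/1")]

print(link_is_bidirectional("B:1/2", healthy))  # True
print(link_is_bidirectional("B:1/2", broken))   # False
```

In the second case switch B still receives packets from C, but never sees its own ID in them, which is exactly the condition that marks the link as unidirectional.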

UDLD can operate in two modes:

  1. Normal: if the link state of the port was determined to be bi-directional and the UDLD information times out, no action is taken by UDLD. The port state for UDLD is marked as undetermined. The port behaves according to its STP state.
  2. Aggressive: if the link state of the port is determined to be bi-directional and the UDLD information times out while the link on the port is still up, UDLD tries to re-establish the state of the port. If not successful, the port is put into the errdisable state.
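The difference between the two modes comes down to what happens when the UDLD information times out on a link that was previously bi-directional. A small Python sketch of that decision, assuming the link on the port is still up (the function name and the returned strings are only illustrative):

```python
def on_udld_timeout(mode: str, reestablished: bool) -> str:
    """Port outcome when UDLD information times out on a link that was
    bi-directional before and is still physically up (assumption)."""
    if mode == "normal":
        # No action is taken; the port simply follows its STP state.
        return "undetermined"
    if mode == "aggressive":
        # UDLD first tries to re-establish the state of the port.
        return "bidirectional" if reestablished else "errdisable"
    raise ValueError(f"unknown UDLD mode: {mode}")


print(on_udld_timeout("normal", reestablished=False))      # undetermined
print(on_udld_timeout("aggressive", reestablished=False))  # errdisable
```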

Depending on the fiber uplink (type of cable, length of the cable, age of the backbone and more) I use UDLD aggressive mode. Aggressive mode will put the port in errdisable, but in my opinion it is better to lose some switches than to flood the complete layer 2 network and disturb even more users.

PIX Failover not working

Today I was asked why a PIX failover configuration wasn’t working. The customer accidentally disconnected the power cable from the primary PIX firewall. The secondary PIX firewall became the active one, but multiple DMZ segments weren’t working anymore. After rebooting the primary PIX firewall and making it the active unit again, the DMZ segments were reachable again.

I started by looking at the failover state of the firewall. Entering the command show failover on the primary PIX resulted in the following output:

This host: Primary – Active
Active time: 5430735 (sec)
Interface outside (1.1.1.1): Normal
Interface inside (2.2.2.1): Normal
Interface DMZ01 (3.3.3.1): Normal
Interface DMZ02 (4.4.4.1): Normal (Waiting)
Interface DMZ03 (5.5.5.1): Normal (Waiting)
Other host: Secondary – Standby Ready
Active time: 5235 (sec)
Interface outside (1.1.1.2): Normal
Interface inside (2.2.2.2): Normal
Interface DMZ01 (3.3.3.2): Normal
Interface DMZ02 (4.4.4.2): Normal (Waiting)
Interface DMZ03 (5.5.5.2): Normal (Waiting)

Interfaces DMZ02 and DMZ03 were the problem interfaces.
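On a firewall with many interfaces it is easy to overlook a Waiting state. As a rough sketch, the Python below picks those interfaces out of saved show failover output. The regular expression is written only against the line format shown above and may need adjusting for other software versions; the sample text is a shortened copy of that output.

```python
import re

# Shortened copy of the output above; only the "Interface ..." lines matter.
SHOW_FAILOVER = """\
Active time: 5430735 (sec)
Interface outside (1.1.1.1): Normal
Interface DMZ01 (3.3.3.1): Normal
Interface DMZ02 (4.4.4.1): Normal (Waiting)
Interface DMZ03 (5.5.5.1): Normal (Waiting)
"""

# Matches e.g. "Interface DMZ02 (4.4.4.1): Normal (Waiting)".
LINE = re.compile(r"^\s*Interface (\S+) \(([\d.]+)\): (.+)$")


def waiting_interfaces(output: str) -> list:
    """Return (name, ip) pairs for interfaces whose status contains 'Waiting'."""
    result = []
    for line in output.splitlines():
        match = LINE.match(line)
        if match and "Waiting" in match.group(3):
            result.append((match.group(1), match.group(2)))
    return result


print(waiting_interfaces(SHOW_FAILOVER))
# [('DMZ02', '4.4.4.1'), ('DMZ03', '5.5.5.1')]
```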

After doing some simple ping tests, I noticed that the problem interfaces of the two units couldn’t ping each other. In the output above these interfaces are in the Normal (Waiting) state. Searching the internet turned up the following:

Failover does not start to monitor the network interfaces until it has heard the second “hello” packet from the other unit on that interface. This takes about 30 seconds. If the unit is attached to a network switch that runs Spanning Tree Protocol (STP), this takes twice the “forward delay” time configured in the switch (typically configured as 15 seconds), plus this 30 second delay. This is because at PIX bootup and immediately following a failover event, the network switch detects a temporary bridge loop. Upon detection of this loop, it stops forwarding packets on these interfaces for the “forward delay” time. It then enters the “listen” mode for an additional “forward delay” time, during which time the switch listens for bridge loops but not forwarding traffic (and thus not forwarding failover “hello” packets). After twice the forward delay time (30 seconds), traffic resumes flowing. Each PIX remains in “waiting” mode until it hears 30 seconds worth of “hello” packets from the other unit. During the time the PIX is passing traffic, it does not fail the other unit based on not hearing the “hello” packets. All other failover monitoring still occurs (that is, Power, Interface Loss of Link, and Failover Cable “hello”).
Source
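The arithmetic in that quote is easy to check. A minimal sketch, using the values the quote mentions (a forward delay of 15 seconds and a 30 second hello wait) as defaults; the function name is made up for the example:

```python
def failover_monitor_delay(forward_delay: int = 15, hello_wait: int = 30) -> int:
    """Seconds before failover starts monitoring an interface behind an
    STP switch: twice the forward delay plus the hello wait."""
    return 2 * forward_delay + hello_wait


print(failover_monitor_delay())  # 60
```

So with the typical values an interface should leave the Waiting state after about a minute. Interfaces that stay in Waiting much longer than that point to a different cause.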

The quoted text talks about the STP configuration. I couldn’t imagine that the problem lay in the STP configuration, so I started tracking the MAC addresses on the switches to check the physical patching.

The problem was found immediately. The patching of the secondary PIX wasn’t correct. The customer had swapped the two interfaces, which put them in the wrong VLANs. After swapping the patches the failover state is normal, as the output below shows:

This host: Primary – Active
Active time: 5430735 (sec)
Interface outside (1.1.1.1): Normal
Interface inside (2.2.2.1): Normal
Interface DMZ01 (3.3.3.1): Normal
Interface DMZ02 (4.4.4.1): Normal
Interface DMZ03 (5.5.5.1): Normal
Other host: Secondary – Standby Ready
Active time: 5235 (sec)
Interface outside (1.1.1.2): Normal
Interface inside (2.2.2.2): Normal
Interface DMZ01 (3.3.3.2): Normal
Interface DMZ02 (4.4.4.2): Normal
Interface DMZ03 (5.5.5.2): Normal

This proves again how important correct patching is!