Cisco Catalyst 2960X keeps crashing

Yesterday evening I had to troubleshoot a Cisco Catalyst 2960X switch stack, which didn’t return to normal after a reboot. The following error message was visible on the console:

Error: ASIC/PHY POST failed. Cannot continue.

%Software-forced reload

This error message is listed as a bug (CSCut90593) at Cisco.com. Cisco describes a very “good” workaround: the switch can boot up normally after this crash. I tried to reboot the switch several times, but it didn’t boot normally. The stack is running Cisco IOS version 15.2(2)E6 and the bug should be fixed in 15.2(2)E3.

It took me till midnight to recover the stack, so I am done troubleshooting and I will start the RMA procedure. I have to say that I am not really happy with the Cisco Catalyst 2960X. I have several customers running this type of switch and I had a lot more “troubles” with this switch, then I have/had with other Cisco switches. I hope stability will improve!!!!

ClearPass & MobileIron – Error: not well-formed (invalid token)

This post isn’t going to describe what HPE Aruba ClearPass or MobileIron is. And neither will it describe the configuration steps necessary to add MobileIron to ClearPass, but I will give a short summary:

  1. Add the MobileIron VSP to ClearPass as Endpoint Context Server (CPPM – Administration – External Servers);
  2. The account on MobileIron needs API rights to enable ClearPass to retrieve information from MobileIron;

This post tells a bit more about an error message I suddenly started to receive in the CPPM Eventy Viewer.

CPPM - MDM - invalid token

Error: not well-formed (invalid token)

I checked the internet, but I couldn’t find any useful information. I opened a TAC case to look into this error. The TAC engineer told me he had seen this error before, where MobileIron sends invalid token characters to ClearPass. He told me that CPPM does batch processing of the devices and the entire batch fails when CPPM doesn’t understand special characters. He also told me how to see which device is causing the problem.

You have to collect the CPPM logs (CPPM – Administration – Server Manager – Server Configuration – Collect Logs). After you untar the tar.gz file, you should look at the directory “strange string”\PolicyManagerLogs\mdm\MI\mdm-server and you should open the file 0.xml.bak.

Scroll down to the line mentioned in the error message and you will see something like below. I always use Notepad++ to open the file.

CPPM MDM - XML Error

CPPM doesn’t understand these special characters in the key. When you start scrolling up, you can determine which device in MobileIron triggers the error message in CPPM.

After I found the device in MobileIron I checked every setting on the device to find the special character, but I couldn’t find one. In the end there was only one solution for me: retire the device. This basically means remove the device from MobileIron and the user needs to reprovision the device in MobileIron. The sync between CPPM en MobileIron was successful again after I retired the device.

Tip of the week: I guess you aren’t always looking at the Event Viewer for errors, so maybe it is useful to configure ClearPass Insight to send a notification if a System Error Event occurs!!!

Cisco ASA: web interface not working

I had to troubleshoot a Cisco ASA today, where the client wasn’t able to connect to the management web interface anymore via https. The customer didn’t install ASDM locally, but always starts the Java-based version.

After upgrading the Cisco ASA to software version 8.2(1) and a reboot, the client wasn’t able to connect to the web interface anymore. I was able to connect to the firewall with my locally installed ASDM client, but I couldn’t access the web interface either.

While troubleshooting I first tried the basic settings, like management access-list, regenerate crypto keys and change the management port. All these options didn’t help, but the strange thing was that the web interface was working remotely.

While working with Mozilla I received the following error:

cannot communicate securely with peer: no common encryption algorithm(s).

In Google Chrome I receive the following error:

Error 113 (net::ERR_SSL_VERSION_OR_CIPHER_MISMATCH): Unknown error.

And of course Internet Explorer didn’t gave any usable information. I started looking at the supported encryption algorithms within the firewall with a show version. I noticed that VPN-3DES-AES was disabled. The next step was the enable the VPN-3DES-AES ciphers. The upgrade license for this feature is available for free at http://www.cisco.com/go/license.

I activated the VPN-3DES-AES feature, but still wasn’t able to connect to the firewall with the web interface. I checked the SSL encryption used by the firewall.

fw01# show ssl
Accept connections using SSLv2, SSLv3 or TLSv1 and negotiate to SSLv3 or TLSv1
Start connections using SSLv3 and negotiate to SSLv3 or TLSv1
Enabled cipher order: des-sha1
Disabled ciphers: 3des-sha1 rc4-md5 rc4-sha1 aes128-sha1 aes256-sha1 null-sha1
No SSL trust-points configured
Certificate authentication is not enabled

The firewall still didn’t enable the ciphers supported in my browser. If the VPN-3DES-AES license isn’t installed, only the cipher des-sha1 is enabled by default. I added the correct ciphers with the following command:

fw01(config)# ssl encryption aes256-sha1 aes128-sha1 3des-sha1

After adding the command I was able to connect to the ASA with both the web interface and the ASDM.

AutoQos error while generating commands

First of all, the post isn’t about explaining QoS. Configuring AutoQos on Cisco switches should be very easy. At least, that is what all the Cisco documentation tells you. I always thought that the statements about configuration AutoQos were true, but a few days ago I would disagree.

I was configuring multiple switches, Cisco Catalyst 2960-24PC-L switches to be more precise. The switches are configured with a default template, which is generated according the needs of the customer. I uploaded the most recent software into the switches, which is 12.2(55)SE. The customer is going to use an Avaya IP telephony environment, so I tried to apply the appropriate AutoQos command to an interface. After applying the command, I received the following error:

AutoQoS Error while generating commands on Gi0/2

The switches didn’t have any QoS configuration before applying the command. Searching multiple forums didn’t gave me a valid solution. I did my own research and found an incompatibility with the configuration mode exclusion command, like shown below:

configuration mode exclusive auto expire 500 lock-show config-wait 5 retry-wait 5

I removed the configuration mode exclusion command and the AutoQos commands can be implemented without errors. I tried to find why it isn’t possible to apply the AutoQos commands while the configuration mode exclusive command is enabled.

I guess the problem is that the configuration mode exclusive commands prevents other users or the auto-generation of commands to be executed. When applying the AutoQos commands, the commands are executed by the router / switch and not by the user who locked the cli.

Citrix NetScaler: Protocol Driver Error

Today I have been troubleshooting a Citrix NetScaler configuration, where some clients received the Protocol Driver Error message when executing a published application. This error message is mostly related to a wrong configuration of the Security Ticket Authorities (STA’s). I spent a lot of time troubleshooting this issue and focused on the STA configuration. I have been deleting, adding and modifying multiple STA’s to the configuration of the NetScaler without any luck. I checked the configuration of the firewall, but this wasn’t the problem either.

I decided to go back to basic and just added one STA to the Virtual Server and the STA service group used by the Web Interface. I was able to login with that STA server, but after a while I wasn’t able to login and received the Protocol Driver Error again.

While browsing through the NetScaler I noticed one thing. Every time I wasn’t able to login, there were 5 concurrent users already connected with the NetScaler. I did some more research on the internet and found the following article.

Reading the article, tells me that, by default, only 5 concurrent ICA sessions are possible through the NetScaler. I checked the license and the customer has a license for 50 SSL VPN connections, like shown below:

> show license
License status:
Web Logging: YES
Surge Protection: NO
Load Balancing: YES
Content Switching: YES
Cache Redirection: NO
Sure Connect: NO
Compression Control: NO
Delta Compression: NO
Priority Queuing: NO
SSL Offloading: YES
Global Server Load Balancing: NO
GSLB Proximity: NO
Http DoS Protection: NO
Dynamic Routing: YES
Content Filtering: YES
Integrated Caching: NO
SSL VPN: YES  (Maximum users = 50)
AAA: NO
OSPF Routing: NO
RIP Routing: NO
BGP Routing: NO
Rewrite: YES
IPv6 protocol translation: YES
Application Firewall: NO
Responder: YES
HTML Injection: NO
NetScaler Push: NO

I increased the default value to 50 with the following command:

> set aaa parameter -maxAAAUsers 50

From that point on I was able to start another ICA connection, while there were already 5 concurrent users connected. Now I have to wait and see if this actually solved the problem, but I guess it has.

I tried to increase the –maxAAAUsers value to a value higher than 50, but that isn’t possible as you can see.

> set aaa parameter -maxAAAUsers 100
ERROR: MaxAAAUsers value not allowed by license