Try to resolve the issue by performing the following procedure:
- Make sure that the Network Connect Server IP Address value under Network settings > Network Connect is not the same as the internal port's physical IP or the cluster VIP on either node.
If they are the same, change the Network Connect Server IP address back to its default value (10.200.200.200), reboot the cluster, and verify that the cluster comes back up by monitoring the cluster status.
- Check the link speed between each PCS node and the switch port connected to it. Under Network settings > Internal port settings, verify on both nodes that the link speed setting is the same and matches the link speed configured on the switch port connected to each node's internal port.
- Verify that there are no duplicate ARP entries for the cluster's internal VIP anywhere in the network.
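The duplicate-ARP check in the last step can be scripted from any Linux host on the same subnet as the VIP. The sketch below is an assumption-laden illustration, not a PCS tool: the interface name (eth0) and VIP address (10.64.10.50) are placeholders you must substitute, and the `count_arp_macs` helper is a name invented here.

```shell
# Count the distinct MAC addresses that answer ARP for a given IP, based
# on the "Unicast reply from <ip> [<mac>]" lines printed by arping.
count_arp_macs() {
  grep -oE '\[[0-9A-Fa-f:]+\]' | sort -u | wc -l
}

# Typical use (requires root for raw ARP sockets; eth0 and the VIP below
# are placeholders for your interface and the cluster's internal VIP):
#   arping -c 5 -I eth0 10.64.10.50 | count_arp_macs
# A result greater than 1 means more than one device is answering ARP
# for the VIP, i.e. a duplicate entry exists in the network.
```

A result of exactly 1 from the active node's MAC is the healthy case; anything higher points at the duplicate-ARP condition described above.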
If the above procedure does not resolve the issue, collect the following logs and open a Pulse Secure TAC case:
On both of the nodes:
- Under Troubleshooting, enable Monitoring > Node monitoring.
- Under Troubleshooting, enable Monitoring > Debug logging, set the log level to 15 and the log size to 50, and enter the event codes as DSUtil,-DSLog,-DSConfig,dsnetd::ipat,dsnetd::garpsweep (without any spaces).
Note: The event codes are case-sensitive and should be entered exactly as shown above.
- With the debugging options enabled, start a TCP dump on the internal port of both nodes, leave it running for 3-5 minutes, and then capture the TCP dump from both nodes.
- On both nodes, obtain an admin-generated snapshot with the Include system config and Include debug log checkboxes enabled. Once the snapshot has been taken, turn the two monitoring options above back off.
- Obtain the User access, event, and admin access logs from both nodes. These logs contain the timestamps of when the snapshot and TCP dump were captured.
- On both nodes, via the admin UI, take a complete screenshot of the Status overview graphs filtered to a week's data. You can filter the graphs under Status overview > Page settings.
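Once the TCP dumps have been downloaded, a useful first pass before opening the TAC case is to confirm that the capture contains the gratuitous ARPs the active node sends for the VIP (the same activity the dsnetd::garpsweep event code above logs). This is a generic sketch over tcpdump's text output, not a PCS utility; the pcap file name and VIP (10.64.10.50) are placeholders, and `garp_requests` is a helper name invented here.

```shell
# Filter gratuitous ARP requests for a given IP out of tcpdump text
# output. A gratuitous request appears as "who-has <ip> tell <ip>",
# i.e. the sender asks about its own address to refresh ARP caches.
garp_requests() {
  grep "who-has $1 tell $1"
}

# Typical use, reading a downloaded capture (file name and VIP are
# placeholders for your capture and the cluster's internal VIP):
#   tcpdump -nn -r internal-port.pcap arp | garp_requests 10.64.10.50
```

If one node's capture shows the gratuitous ARPs and the other's does not, that asymmetry is worth calling out in the TAC case.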
For the second scenario, a physical reboot of the node that went down restores the cluster in most cases. Open a case with Pulse Secure TAC to diagnose the issue.
The following logs should be collected for this scenario:
Before rebooting the node that went down, connect to it via the serial console and try to obtain the serial console output by performing the following procedure.
Note: If the console is not responding at all, skip step 1 and go to step 2. If the console displays errors, take a screenshot of the console screen.
- Kernel Trace/dump:
- Go to HyperTerminal menu > Transfer > Capture Text.
- Select the file to write to and click Start.
- To set the Kernel Logging to Level 9, use the key combination of Ctrl + 9.
- Leave it on for 15 minutes.
- The Ctrl+Break+T key sequence will output the kernel trace/dump to the console.
- Immediately after the problematic node has been rebooted and brought back up, obtain an admin-generated snapshot with the Include system config and Include debug log checkboxes enabled.
- If one of the nodes is accessible via the web when the issue occurs, obtain an admin-generated snapshot on that node with the Include system config and Include debug log checkboxes enabled.
- Obtain the user access, event, and admin access logs from both of the nodes.
- On both nodes, via the admin UI, under Status overview > Page settings > Advanced status, select the Check storage checkbox so that the storage percentage is displayed in the admin UI overview.
Obtain a complete screenshot of the Status overview graphs from both nodes, one filtered to a day's data and another filtered to a week's data. You can filter the graphs under Status overview > Page settings.
- Obtain any recently generated process snapshots from both nodes under Troubleshooting > System snapshot.