Preparing the device to log the Kernel Log/Trace:
Note: If jump boxes such as Avocent / Digi Passport are used, the admins need to allow the required access methods, such as Telnet/SSH, on them.
1. Download PuTTY: https://www.putty.org/.
2. Configure the PuTTY session logging as needed.
3. Connect to the Serial Console of the affected PCS/PPS device as shown below.
4. Click the top-left corner of the PuTTY session window as shown below.
5. Select “Special Command” and Select “Break”.
6. As soon as “Break” is selected, hit the 9 (Number Key).
7. This will enable the Kernel Tracing with output “LogLevel set to 9” (As shown below).
Note - This needs to be set on the affected device(s) well before the issue occurs, to maximize the chances of getting the Kernel Trace during the freeze.
8. If this output does not appear, repeat steps 4 to 6 a few times.
9. Leave the serial console in this state and keep capturing its contents to a local file. From time to time the device may log Kernel Messages that are important for determining the Root Cause of system issues.
10. When the system Freezes/Stalls or the Web/UI becomes inaccessible, click the top-left corner of the PuTTY session window as shown below, select “Special Command”, and then select “Break”.
11. As soon as “Break” is selected, hit the T key (Upper or Lower Case).
12. The console should output a list of entries similar to the screenshot below.
Disclaimer: Do not execute this command on a fully operational system, as this could disconnect user sessions.
13. Save the complete output or take a screenshot.
14. If the console still responds, navigate to Option # 7. System Maintenance on the Serial Console and generate a System snapshot.
15. Reboot the device and log in to the Admin UI.
16. Save the PCS logs (User Access/Admin/Events) by navigating to System -> Log/Monitoring -> Select “Save All Logs”.
17. Navigate to “Maintenance” > “Troubleshooting” > “System Snapshot”. Ensure that “Include Debug log” & “Include System config” options are selected.
18. Generate a system snapshot if you could not generate one via the Serial Console.
19. Gather the following details without fail and upload them to the Support case:
- Complete Serial Console Output.
- PCS Logs.
- System snapshots along with any Process Snapshots or Watchdogs that get generated under the System snapshot tab.
- Dashboard graphs for 1 day/week/month.
- Date/time of the occurrence along with the timezone.
- Response to Ping.
Note: If the device is already in the Stall/Hung state, execute step 6 and wait for 10-15 minutes without fail before proceeding to step 11. During these 10-15 minutes the device may print useful Kernel logs, so this waiting period is very important.
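Before relying on a long-running capture, it can help to confirm that the saved PuTTY session log really contains the log-level confirmation from step 7. The following is a minimal Python sketch, not part of the product tooling. The function names `find_confirmation_lines` and `check_log_file` and the file name `putty.log` are hypothetical, and the sketch assumes the session log is a plain-text file containing the “LogLevel set to 9” message described above.

```python
from pathlib import Path
from typing import List

# Confirmation text from step 7; matched case-insensitively because the exact
# capitalization printed by the device may differ.
CONFIRMATION = "loglevel set to 9"


def find_confirmation_lines(log_text: str) -> List[int]:
    """Return the 1-based line numbers that contain the confirmation text."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if CONFIRMATION in line.lower()
    ]


def check_log_file(path: str) -> bool:
    """Report whether the saved session log at `path` shows the confirmation."""
    # errors="replace" keeps the check working if the console emitted stray bytes.
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    hits = find_confirmation_lines(text)
    if hits:
        print(f"{path}: confirmation found on line(s) {hits}")
    else:
        print(f"{path}: confirmation not found; repeat the Break + 9 sequence")
    return bool(hits)
```

For example, `check_log_file("putty.log")` could be run against the file configured under PuTTY session logging. The check only confirms that the message was captured; it does not replace uploading the complete console output.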
Steps to collect Kernel Trace (Console Logs) for Azure Deployments:
1. Navigate to the Serial Console of the affected Azure Instance.
2. Click on the “Keyboard” icon on the Command Bar and select “Send SysRq Command” as shown below.
Note - A SysRq (System Request) is a sequence of keys understood by the Linux operating system kernel, which can trigger a set of pre-defined actions. These commands are often used when virtual machine troubleshooting or recovery can't be performed through traditional administration.
3. In the resulting Dialog box, select “Enter Key or Key sequence to send below:”, type “9”, and click on the “Send SysRq” button (As shown below). This should set the log level to verbose.
Note - This needs to be set on the affected device(s) well before the issue occurs, to maximize the chances of getting the Kernel Trace during the freeze.
4. ONLY when the VM crashes/Freezes, please send lower case t as a SysRq command as shown below (Please DO NOT execute this command on a LIVE/PRODUCTION system). This will generate the stack trace (Console output) on the Serial Console (Wait up to 10-15 minutes if the Console Output takes time).
To download the complete Serial Console logs, navigate to Boot Diagnostics -> Serial Log and click on “Download Serial Log” as shown in the screenshot below.
Gather these console logs, along with the other logs described above for hardware devices, and upload them to the case for Engineering analysis.
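Once the serial log has been downloaded, the section produced by the t SysRq command can be isolated for a quick look before the full file is uploaded. The sketch below is a hedged Python example, not an official tool. It assumes the kernel prints a line containing “Show State” when the command is handled (the usual Linux SysRq banner, which may differ on a given build), and the function name `extract_last_task_dump` is hypothetical.

```python
def extract_last_task_dump(log_text: str, marker: str = "Show State") -> str:
    """Return the log from the last line containing `marker` to the end.

    Returns an empty string when the marker never appears.
    The marker default is an assumption about the kernel's SysRq banner.
    """
    lines = log_text.splitlines()
    start = None
    for index, line in enumerate(lines):
        if marker.lower() in line.lower():
            start = index  # keep overwriting so the last occurrence wins
    if start is None:
        return ""
    return "\n".join(lines[start:])
```

An empty result suggests the trace was not captured or that the banner text differs on this build, in which case the `marker` argument can be adjusted. The complete serial log should still be uploaded to the case either way.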