ESX Server 3 vmkernel.log Lists

VERSION 2 Published

Created on: Apr 10, 2009 5:14 PM by Steve Chambers - Last Modified:  Apr 10, 2009 5:30 PM by Steve Chambers

Introduction

 

This document provides the Red (critical: immediate action), Orange (warning: track this alert) and Black (ignore, but trend on this message) lists for the vmkernel.log of ESX Server 3.

 

Intended Audience

 

VMware Certified Professionals (VCPs) and systems management professionals who are implementing log management for VI3.

 

Outline

 

  1. Introduction to vmkernel.log

  2. vmkernel.log Red List

  3. vmkernel.log Orange List

  4. vmkernel.log Black List

 

Introduction to vmkernel.log

 

All the hypervisor/vmkernel messages are posted in the Console OS log:

 

/var/log/vmkernel.log

 

Good ways to access this log:

 

  1. Centralize it (doc TBD) via syslog, and use a tool like Splunk!

  2. Use less or more to view it on the live host

  3. tail -f /var/log/vmkernel.log

 

vmkernel.log Red List

 

Messages on the Red List should be picked up by the log monitoring automation and alerted as a high priority outage for immediate investigation.
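
As a rough illustration of how log monitoring automation might sort messages into the three lists, here is a minimal Python sketch. The names RED_PATTERNS, ORANGE_PATTERNS and classify are hypothetical, the only pattern is taken from the AsyncIO timeout entry in this document, and treating every unmatched line as Black is an assumption rather than a stated rule.

```python
import re

# Hypothetical pattern lists. Only the AsyncIO timeout pattern comes from
# this document's Red List; extend these as entries are added.
RED_PATTERNS = [re.compile(r"AsyncIO timeout \(\d+\); aborting cmd")]
ORANGE_PATTERNS = []  # the Orange List has no entries yet


def classify(line):
    """Return "red", "orange" or "black" for one vmkernel.log line."""
    if any(p.search(line) for p in RED_PATTERNS):
        return "red"
    if any(p.search(line) for p in ORANGE_PATTERNS):
        return "orange"
    # Assumption: anything not on the Red or Orange list is only trended.
    return "black"


red_sample = (
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.325 cpu3:1032)SCSI: 3753: "
    "AsyncIO timeout (5000); aborting cmd w/ sn 941463, handle 1472/0x40211a28"
)
other_sample = (
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.326 cpu3:1032)LinSCSI: 2604: "
    "Forcing host status from 2 to SCSI_HOST_OK"
)

for line in (red_sample, other_sample):
    print(classify(line), "|", line[:60])
```

In practice the same patterns would more likely be configured as saved searches or alerts in the central log tool; the sketch only shows the matching logic.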

 

1. KB 1003615 - Failures on hosts attached to shared storage

 

Message


Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.325 cpu3:1032)SCSI: 3753: AsyncIO timeout (5000); aborting cmd w/ sn 941463, handle 1472/0x40211a28
Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.325 cpu3:1032)LinSCSI: 3616: Aborting cmds with world 1024, originHandle 0x40211a28, originSN 941463 from vmhba0:0:5
Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.325 cpu3:1032)<6>qla24xx_abort_command(0): handle to abort=857
Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.326 cpu3:1032)LinSCSI: 2604: Forcing host status from 2 to SCSI_HOST_OK
Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.326 cpu3:1032)LinSCSI: 2606: Forcing device status from SDSTAT_GOOD to SDSTAT_BUSY
Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.326 cpu3:1032)SCSI: 3753: AsyncIO timeout (5000); aborting cmd w/ sn 1073299, handle 2415/0x4020c038
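
To make lines like these easier to filter and alert on, they can be split into fields. The sketch below is one possible Python parser; the field names (timestamp, host, uptime, cpu, world, message) are assumptions about the layout based on the sample above, not an official format definition.

```python
import re

# Assumed layout: syslog timestamp, host name, "vmkernel:", an uptime stamp
# (days:hours:minutes:seconds.milliseconds), "cpuN:WORLD)" and the message.
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+vmkernel:\s+"
    r"(?P<uptime>\d+:\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"cpu(?P<cpu>\d+):(?P<world>\d+)\)"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Split one vmkernel.log line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["cpu"] = int(fields["cpu"])
    fields["world"] = int(fields["world"])
    return fields


sample = (
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.325 cpu3:1032)SCSI: 3753: "
    "AsyncIO timeout (5000); aborting cmd w/ sn 941463, handle 1472/0x40211a28"
)
fields = parse_line(sample)
print(fields["host"], fields["cpu"], fields["world"], fields["message"])
```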

 

Impact

A storage event has caused an outage on every host accessing the shared storage.

 

Action

Review the storage controller or storage processor logs on the array for any events or messages logged around the time the incident occurred. There are a number of possible causes for this kind of outage, such as a controller problem, a failing hard drive, or a SAN Copy operation being initiated. Also review the switch logs for the same time frame to see whether the switches were a factor in the outage.
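
When correlating with the array and switch logs, it helps to know which hosts and storage paths reported the timeouts and over what time window. The following Python sketch collects that from a set of log lines; summarize_timeouts and its result keys are hypothetical names, and the path pattern assumes the vmhbaN:N:N form seen in the sample message.

```python
import re

TIMESTAMP_HOST_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+vmkernel:")
PATH_RE = re.compile(r"\bvmhba\d+:\d+:\d+\b")  # assumed path form, e.g. vmhba0:0:5


def summarize_timeouts(lines):
    """Collect hosts, storage paths and the time window of timeout/abort lines."""
    summary = {"hosts": set(), "paths": set(), "first": None, "last": None, "timeouts": 0}
    for line in lines:
        if "AsyncIO timeout" not in line and "Aborting cmds" not in line:
            continue
        match = TIMESTAMP_HOST_RE.match(line)
        if match is None:
            continue
        timestamp, host = match.groups()
        summary["hosts"].add(host)
        summary["paths"].update(PATH_RE.findall(line))
        if "AsyncIO timeout" in line:
            summary["timeouts"] += 1
        # Lines are assumed to be in log order, so the first and last seen
        # timestamps bound the window to check in the array and switch logs.
        if summary["first"] is None:
            summary["first"] = timestamp
        summary["last"] = timestamp
    return summary


sample_lines = [
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.325 cpu3:1032)SCSI: 3753: AsyncIO timeout (5000); aborting cmd w/ sn 941463, handle 1472/0x40211a28",
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.325 cpu3:1032)LinSCSI: 3616: Aborting cmds with world 1024, originHandle 0x40211a28, originSN 941463 from vmhba0:0:5",
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.326 cpu3:1032)SCSI: 3753: AsyncIO timeout (5000); aborting cmd w/ sn 1073299, handle 2415/0x4020c038",
]
summary = summarize_timeouts(sample_lines)
print(summary)
```

If lines from several hosts are fed in together, more than one entry in the hosts set for the same window would be consistent with a shared storage event rather than a single-host fault.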

 

vmkernel.log Orange List

 

Messages on the Orange List should be picked up by the log monitoring automation and alerted as a warning for review.

 

vmkernel.log Black List

 

Messages on the Black List should be collected by the log monitoring automation and trended for future analysis.
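
One simple way to trend messages is to count them per day, host and component. The Python sketch below does this under stated assumptions: trend_counts is a hypothetical helper, and the "component" is taken to be the text between the "cpuN:WORLD)" marker and the next colon (for example SCSI or LinSCSI), with any leading "<N>" level tag dropped.

```python
import re
from collections import Counter

# Captures: day ("Dec  8"), host, and the assumed component name.
TREND_RE = re.compile(
    r"^(\w{3}\s+\d+)\s+[\d:]+\s+(\S+)\s+vmkernel:\s+\S+\s+"
    r"cpu\d+:\d+\)(?:<\d+>)?([^:]+):"
)


def trend_counts(lines):
    """Count messages keyed by (day, host, component)."""
    counts = Counter()
    for line in lines:
        match = TREND_RE.match(line)
        if match is None:
            continue
        day, host, component = match.groups()
        day = " ".join(day.split())  # normalize "Dec  8" to "Dec 8"
        counts[(day, host, component)] += 1
    return counts


sample_lines = [
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.325 cpu3:1032)SCSI: 3753: AsyncIO timeout (5000); aborting cmd w/ sn 941463, handle 1472/0x40211a28",
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.325 cpu3:1032)<6>qla24xx_abort_command(0): handle to abort=857",
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.326 cpu3:1032)LinSCSI: 2604: Forcing host status from 2 to SCSI_HOST_OK",
    "Dec  8 20:06:01 esx013 vmkernel: 29:16:44:10.326 cpu3:1032)LinSCSI: 2606: Forcing device status from SDSTAT_GOOD to SDSTAT_BUSY",
]
counts = trend_counts(sample_lines)
for key, count in sorted(counts.items()):
    print(key, count)
```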

 

Resources

 

Authors

 

 

 

 

 

 

 
