Proven Practice: Capacity Management Reporting for VMware Clusters with Metron Athene

VERSION 3 Published

Created on: Sep 8, 2008 7:12 AM by Nick Varley - Last Modified:  Sep 8, 2008 8:25 AM by Nick Varley

Introduction

 

This document explains the features and techniques within Athene v8.20 that can be used to generate cluster-level reports using standard Athene facilities.

 

Athene provides capture, reporting and model-building with performance and capacity data retrieved from VMware hosts and guests. Analyst and Define A Report can be used to examine this data by hand, and it is relatively simple to include Athene Updateable Charts into standard Advisor bulletins.

 

Many customers group their VMware hosts into clusters to allow for easier management, perhaps aligned with various business or service lines. At present, Athene does not capture all the information needed to determine automatically which cluster a given host belongs to. However, by using the methodology shown below, it is possible to report at the VMware cluster level.

 

Intended Audience

 

This proven practice is suitable for any organization that uses VMware Clusters to organize collections of ESX Servers into one homogeneous group, but is particularly useful for large enterprise organizations with many clusters to manage.

 

Targeted at Capacity Management and Service Management professionals, this will also be of interest to VCPs.

 

Outline

 

  1. Why can't you use Athene 'Groups'?

  2. Define a Report

  3. Define a Bulletin

  4. Cluster-level Metric Considerations

  5. Summary

A. Appendix: List and Description of Cluster Reports

 

Author

 

Metron is a privately owned limited company which was founded in 1986. Metron-Athene Inc is a wholly owned subsidiary of Metron Technology Ltd. The company is Europe's foremost Capacity Planning and Systems Performance Management specialist. Metron's flagship product, Athene, provides fully integrated ITIL-compliant capacity management, automatic performance analysis and reporting for UNIX, Linux, Windows and Mainframe Servers.

Find out more about Metron

 

Nick Varley

 

Resources

 

- You can find this document on the VIOPS portal at http://viops.vmware.com/home/docs/DOC-1147

 

Disclaimer

 

You use this proven practice at your discretion. VMware and the author do not guarantee any results from the use of this proven practice. This proven practice is provided on an as-is basis and is for demonstration purposes only.

 


 

1. Why can't you use Athene 'Groups'?

 

Athene 8 introduced the “Groups” facility, which allows the selection of targets by comprehensive SQL-like selection criteria. Why can these not be used for generating cluster reports?

 

The words “selection of targets” in the first sentence of this section explain why – a group is there to return a list of targets to Athene components, including System Manager, Data Management and Advisor. Advisor cannot currently be told to include all the targets in a group on a single chart – it will generate a separate chart for each target. In other circumstances this is a useful and powerful technique, as simply making a target “belong” to a group provides a dynamic reporting capability. In this case, it is not what is required – we want to include all members of a group on one chart.

 

Achieving this means stepping outside the common mode of using Advisor and making use of the “Platform=None” facility.

 

When you add a user-defined chart to a bulletin, you are presented with a small dialog to select the chart name, the updateable platform and the Advisor metric. By default the platform value will be derived from the type of data being charted, as in this example:

add_report.png

 

The updateable platform determines the type of data that Advisor can incorporate into the chart. Remember that you can build a chart for any target and then use it in an Advisor bulletin for any other target of the same platform, but you cannot push data for a Windows target into a chart built to represent a UNIX machine, or vice versa.

 

If you pull down the handle on the right of the Updateable Platform box, the alternative is 'None'. Selecting this option fixes the context of the chart as it enters a bulletin: you cannot run this chart for any set of targets other than this one. The chart can still pick up data from different days; otherwise this facility would be pointless, as the same data would be generated over and over.

 

This is the basis of cluster-level reporting in Athene – you create charts that reflect the hosts that belong to a given cluster and include them individually in your bulletins.

 

 

2. Define A Report

 

This component of Athene is used to generate the template charts for Advisor. Metron will be happy to provide a selection of charts to facilitate the implementation of VMware reporting. These charts should be run through the PDB Path Editor utility (AXM623) to convert their PDB names to one you wish to use. You can then open and modify them.

 

A subfolder is provided called Clusters which can be used as a basis for cluster reporting.

Charts in this folder are called ‘VCxxxn.auc’, where xxx represents the contents of the chart and n represents a cluster number – see Appendix A for details of each report.

 

You must update the context of each chart to refer to the correct grouping of hosts.

 

Using the standard aggregation feature of Define A Report, you can join several series of data into one to represent a cluster-level view of a metric. However, some metrics do not lend themselves to such aggregation; please see section 4 for details of what not to aggregate (at least not without understanding what it means to do so).

 

If it is appropriate to aggregate metrics into a single series, for example the total amount of memory used across all hosts in a cluster, then set the instance aggregation to “All VMware Hosts” to generate a single line on the chart. You can then set a Description of the name of the cluster, and use this in the Updateable Text function for the chart heading. These examples show this being done:

define_report.png

 

Here you can see the selection of three hosts, “instance aggregated” together. In addition the description was set as follows:

 

selection.png

 

The heading for the chart was changed with the Updateable Text dialog box to contain the words ‘Memory usage for cluster Context’. The Context variable will be set from the description in this case.

 

The final chart then looked like this:

 

final_chart.png
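
The effect of combining time aggregation (1 hour averages) with instance aggregation (“All VMware Hosts”) can be pictured with a small sketch. This is a conceptual illustration only, not Athene's internal algorithm; the host names, timestamps and values are hypothetical, and it assumes a metric such as memory used, where summing across hosts is meaningful.

```python
from collections import defaultdict
from datetime import datetime


def hourly_cluster_total(samples, cluster_hosts):
    """Return {hour_start: sum over hosts of each host's hourly average}.

    samples: iterable of (host_name, timestamp, value) tuples.
    cluster_hosts: set of host names belonging to the cluster.
    """
    # Time aggregation: bucket each host's samples by the hour they fall in.
    buckets = defaultdict(list)
    for host, timestamp, value in samples:
        if host in cluster_hosts:
            hour = timestamp.replace(minute=0, second=0, microsecond=0)
            buckets[(host, hour)].append(value)

    # Instance aggregation: add the per-host hourly averages into one series.
    totals = defaultdict(float)
    for (_host, hour), values in buckets.items():
        totals[hour] += sum(values) / len(values)
    return dict(sorted(totals.items()))


# Hypothetical memory-used samples in GB for a three-host cluster.
samples = [
    ("esx01", datetime(2008, 9, 8, 9, 0), 10.0),
    ("esx01", datetime(2008, 9, 8, 9, 30), 12.0),
    ("esx02", datetime(2008, 9, 8, 9, 15), 20.0),
    ("esx03", datetime(2008, 9, 8, 9, 45), 8.0),
    ("esx09", datetime(2008, 9, 8, 9, 10), 99.0),  # not in the cluster, ignored
    ("esx01", datetime(2008, 9, 8, 10, 5), 14.0),
]
cluster = {"esx01", "esx02", "esx03"}
series = hourly_cluster_total(samples, cluster)
for hour, total in series.items():
    print(hour, total)
```

The result is a single line per cluster, which is what the chart above plots; the set of hosts plays the role of the chart's fixed context.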

Using the description field is generally a good idea only when there is one series being plotted on the chart. The description is a Selection level item, so if you select four metrics to plot and provide a description, that description will be appended to each of the four metric names.

 

This chart also reminds us that if you want to have threshold lines on a user-defined chart in a bulletin, you must put them there. The actual values used for the thresholds are set in the properties of a report in a bulletin. You do not need to define threshold lines in your chart for the Advisor analysis to take place, but if you want them to look like “regular” Advisor charts, then you can add them in yourself.

 

3. Define A Bulletin

 

So, we need a set of charts to include in a bulletin that contain the hosts belonging to a given cluster, and (at the moment) we need to know which hosts they are. Once the charts look the way you want, they can be added to an Advisor Bulletin, being sure to set Updateable Platform = None.

 

Once you have added a chart to a bulletin like this the remainder of the definitions you would expect to make are unchanged, with one exception. You will see that the Target Type box is greyed out and the Name field is blank – a consequence of setting Updateable Platform = None. This allows you to enter any name of your choosing in the dialog box, as in the example here:

 

report_properties.png

 

Here we have asked Athene to generate a name of ‘QA Test Cluster’ for Advisor and also set a more sensible name for the report. This report will go into the CPU column in the bulletin.

 

By repeating this short process for each of the supplied charts (or of course creating your own), you can build up a set of reports for a given cluster.

Once this is done, you can repeat the whole process for another cluster. You must first copy the charts you used to a new set of names (hence the cluster number in the supplied samples), editing each one to set the correct context (the selection of hosts to include) and adding this set to the same bulletin or a new one.
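
When there are many clusters, working out the new chart names by hand becomes error-prone. The following is a minimal sketch of deriving the name for another cluster from the ‘VCxxxn.auc’ convention. The helper `chart_name_for_cluster` is hypothetical and not part of Athene; it assumes that xxx is always three upper-case letters and that n may run to more than one digit. It only computes names, and each copied chart still needs its context edited in Define A Report.

```python
import re

# Assumed pattern for the supplied cluster charts: VC + 3 letters + number + .auc
CHART_NAME = re.compile(r"^VC([A-Z]{3})(\d+)\.auc$")


def chart_name_for_cluster(name, cluster_number):
    """Return the chart file name for another cluster number."""
    match = CHART_NAME.match(name)
    if match is None:
        raise ValueError(f"not a cluster chart name: {name!r}")
    return f"VC{match.group(1)}{cluster_number}.auc"


# Names for cluster 2, derived from three of the supplied cluster 1 charts.
supplied = ["VCACP1.auc", "VCAMP1.auc", "VCICP1.auc"]
cluster_two = [chart_name_for_cluster(name, 2) for name in supplied]
print(cluster_two)
```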

 

4. Cluster-level Metric Considerations

 

Reporting at the cluster level requires a little thought as to what it is you are aiming to achieve. Some metrics can be plotted from several hosts on the same graph and stacked up to represent a cluster total – memory used is a good example, or disk activity. Some are better displayed side-by-side to show the utilization of the cluster members separately, for example CPU busy or memory used %. Some are fine to display as an overall average. The danger of stacking or aggregating metrics like these is the assumption that the underlying machines are all the same. CPU busy is a notoriously poor indicator of how much “power” a machine is supplying - there is a considerable difference between the work being done by a 3 GHz quad-core Intel Xeon monster machine running at 25% busy as compared to a Pentium III 1.3 GHz machine reporting the same utilization.

 

In general, counts and rates can be stacked, but utilizations should not, unless you are absolutely sure that the underlying hardware is similar enough to make it a valid proposition.
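
To see why averaging utilizations across dissimilar hosts can mislead, consider a rough sketch with two hypothetical hosts. It uses cores multiplied by clock speed as a stand-in for capacity, which is an assumption made purely for illustration (clock speed is itself a crude measure of “power”), and the function names are not Athene facilities.

```python
def simple_average_busy(hosts):
    """Plain mean of CPU busy %, treating every host as equal."""
    return sum(h["busy_pct"] for h in hosts) / len(hosts)


def capacity_weighted_busy(hosts):
    """CPU busy % weighted by an assumed capacity of cores * GHz."""
    total_capacity = sum(h["cores"] * h["ghz"] for h in hosts)
    used = sum(h["cores"] * h["ghz"] * h["busy_pct"] / 100.0 for h in hosts)
    return 100.0 * used / total_capacity


# Hypothetical cluster: one large host lightly loaded, one small host heavily loaded.
hosts = [
    {"name": "big", "cores": 4, "ghz": 3.0, "busy_pct": 25.0},
    {"name": "small", "cores": 1, "ghz": 1.3, "busy_pct": 85.0},
]
print(simple_average_busy(hosts))
print(round(capacity_weighted_busy(hosts), 1))
```

The plain average suggests the cluster is more than half busy, while the weighted figure shows that most of the assumed capacity is idle. Neither number is “right” without knowing the hardware, which is why side-by-side display is often the safer choice for utilizations.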

 

Some of the supplied charts use user-defined metrics to augment or to modify the ones stored by Athene. These are created using the Reporting Configuration Manager (RCM) application, and two examples are shown below.

 

Count of guest systems

 

A useful measure of how much consolidation or virtualization of systems has been done is simply the count of the number of guest systems running on a host. The way to do this with Athene is to open RCM and navigate to the level in the metrics tree that contains the items you wish to count, then create a new metric with a value of ‘1’, defined as a count, like this:

 

new_metric.png

 

This metric is then available in Define A Report at the same level in the metrics tree as you defined it.
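
The idea behind a constant metric of ‘1’ is that summing it over the guests on a host yields the guest count. A minimal sketch of that arithmetic, using hypothetical host and guest names and a helper that is not part of RCM:

```python
from collections import Counter


def guest_count_per_host(guests):
    """Count guests per host by adding a constant 1 for each guest."""
    counts = Counter()
    for host, _guest_name in guests:
        counts[host] += 1  # the user-defined metric contributes 1 per guest
    return dict(counts)


# Hypothetical (host, guest) pairs.
guests = [
    ("esx01", "web01"),
    ("esx01", "web02"),
    ("esx02", "db01"),
]
per_host = guest_count_per_host(guests)
cluster_total = sum(per_host.values())
print(per_host, cluster_total)
```

Because the result is a count, it can safely be stacked or summed across hosts to give a cluster total.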

Converting metrics to more useful units

 

Another use for a user-defined metric is where Athene stores data in one unit but you wish to display it in another. The example used here is VMware host memory. This is provided by Virtual Center as a value in KB (kilobytes), but server memory is purchased in GB (gigabytes). You can take this metric and, using RCM, create a new one based on it, as in the following example:

 

avg_metric.png

Note that RCM only takes into account the name of the metric and not its units. For example, you could not create a metric called ‘Average Memory used by the VMs’ with a unit type of GB, as this will be a duplicate of the one you are trying to base your new metric on.
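
The arithmetic behind such a conversion is a single division. The sketch below assumes binary units (1 GB = 1024 × 1024 KB); the source does not state which divisor the supplied metric uses, so check the RCM definition before relying on this.

```python
# Assumption: binary units, so 1 GB = 1024 * 1024 KB.
KB_PER_GB = 1024 * 1024


def kb_to_gb(kilobytes):
    """Convert a memory value in KB to GB."""
    return kilobytes / KB_PER_GB


# A hypothetical host reporting 16,777,216 KB of memory used.
print(kb_to_gb(16_777_216))
```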

 

 

5. Summary

 

This document has described how to carry out VMware cluster-level reporting using the standard features and facilities of Athene.

 

Future planned enhancements to Athene will further extend this capability and allow automatic inclusion of hosts by their VMware-defined cluster name.

Appendix A. List and Description of Cluster Reports

 

The reports that are available cover the following metrics:

 

| Report name | “Nice name” | Metrics (advisor metric in bold) | Aggregation | Comments |
| --- | --- | --- | --- | --- |
| VCACP1 | Average CPU across all members of a cluster | Total VM CPU% utilization | Instance, all VMware hosts, time, 1 hour averages | Description set to cluster name, heading uses this as Context |
| VCAMP1 | Average cluster memory usage % | Avg memory usage of total available memory % | Instance, all VMware hosts, time, 1 hour averages | Description set to cluster name, heading uses this as Context |
| VCICP1 | CPU of individual members of a cluster | Total VM CPU% utilization | Time, 1 hour averages | |
| VCIGC1 | Guest systems per host | Guest count | Time aggregation to 1 hour averages | User-defined metric with value of ‘1’ at the VMware Guest Machine level |
| VCODK1 | Total disk activity of cluster | Total data rate KB/sec; Number of reads/writes /sec | Instance, all VMware hosts, time, 1 hour averages | Description set to cluster name, heading uses this as Context |
| VCOMG1 | Total cluster used memory in GB | Avg memory used GB | Instance, all VMware hosts, time, 1 hour averages | User-defined metric to convert PDB metric in KB to GB |
| VCTCP1 | Average CPU busy trend | Total VM CPU% utilization | Instance, all VMware hosts, time, 1 hour averages | Description set to cluster name, heading uses this as Context |
| VCTDS1 | Average datastore free space trend | Free space % | All instances, time aggregation to 6 hour average | User-defined metric of Free Space MB / Capacity MB * 100. Description set to cluster name, heading uses this as Context |
| VCTGC1 | Guest count trend | Guest count | All instances, time aggregation to 6 hour averages | User-defined metric with value of ‘1’ at the VMware Guest Machine level |
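
The VCTDS1 report relies on a user-defined metric of Free Space MB / Capacity MB * 100. A minimal sketch of that calculation follows, with hypothetical datastore sizes. It also shows that the average of per-datastore percentages can differ from the percentage computed over pooled capacity, so it is worth being clear about which figure a chart represents.

```python
def free_space_pct(free_mb, capacity_mb):
    """Free space as a percentage of capacity."""
    if capacity_mb <= 0:
        raise ValueError("capacity must be positive")
    return free_mb / capacity_mb * 100


# Hypothetical datastores as (free MB, capacity MB).
datastores = [(50_000, 200_000), (300_000, 400_000)]

per_datastore = [free_space_pct(free, cap) for free, cap in datastores]
average_of_pcts = sum(per_datastore) / len(per_datastore)
pooled_pct = free_space_pct(
    sum(free for free, _cap in datastores),
    sum(cap for _free, cap in datastores),
)
print(per_datastore, average_of_pcts, round(pooled_pct, 1))
```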

 

Important note: Before they can be used, these charts will need to be run through the PDB Path Editor (AXM623) to alter references to the Metron PDB to the user’s own, then each will need editing with Define A Report to set the context correctly for the hosts that belong to a given cluster.

 

If these supplied charts do not provide a sufficient variety of reports it is a simple task to take one and use the format to build a new chart. If the metric(s) you wish to use are at the same “level” in the metrics tree, you can simply change them with a right-click, take Selections then Edit Selections, and finally Edit Metrics to select the new one you want to replace the current one. Be sure to save the updated chart as a new item unless you mean to overwrite the supplied one.

 

If you wish to plot metric(s) from somewhere else in the metrics tree, then in the open report, right-click, take Selections, then Add Selection… From here you can go through the process of adding metrics to your chart. Before pressing the Finish button, delete the first selection, so you are only left with your new one. This way you will retain the style of the chart so it fits in with other Advisor reports.

 

 

 

 

 
