This document explains the features and techniques within Athene v8.20 that can be used to generate cluster-level reports using standard Athene facilities.
Athene provides capture, reporting and model-building with performance and capacity data retrieved from VMware hosts and guests. Analyst and Define A Report can be used to examine this data by hand, and it is relatively simple to include Athene Updateable Charts into standard Advisor bulletins.
Many customers group their VMware hosts into clusters, to allow for easier management, perhaps aligned with various business or service lines. At present, Athene does not capture all the information needed to determine automatically which cluster a given host belongs to. However, by using the methodology shown below, it is possible to report at the VMware cluster level.
This proven practice is suitable for any organization that uses VMware Clusters to organize collections of ESX Servers into one homogeneous group, but is particularly useful for large enterprise organizations with many clusters to manage.
Targeted at Capacity Management and Service Management professionals, this will also be of interest to VCPs.
Why can't you use Athene 'Groups'?
Define a Report
Define a Bulletin
Cluster-level Metric Considerations
Summary
A. Appendix: List and Description of Cluster Reports
Metron is a privately owned limited company which was founded in 1986. Metron-Athene Inc is a wholly owned subsidiary of Metron Technology Ltd. The company is Europe's foremost Capacity Planning and Systems Performance Management specialist. Metron's flagship product, Athene, provides fully integrated ITIL-compliant capacity management, automatic performance analysis and reporting for UNIX, Linux, Windows and Mainframe Servers.
Nick Varley
- You can find this document on the VIOPS portal at http://viops.vmware.com/home/docs/DOC-1147
You use this proven practice at your discretion. VMware and the author do not guarantee any results from the use of this proven practice. This proven practice is provided on an as-is basis and is for demonstration purposes only.
Athene 8 introduced the “Groups” facility, which allows the selection of targets by comprehensive SQL-like selection criteria. Why can these not be used for generating cluster reports?
The words “selection of targets” in the first sentence of this section explain why – a group is there to return a list of targets to Athene components, including System Manager, Data Management and Advisor. Advisor cannot currently be told to include all the targets in a group on a single chart – it will generate a separate chart for each target. In other circumstances this is a useful and powerful technique, as simply making a target “belong” to a group provides a dynamic reporting capability. In this case, it is not what is required – we want to include all members of a group on one chart.
Achieving this means stepping outside the usual way of using Advisor and making use of the “Platform=None” facility.
When you add a user-defined chart to a bulletin, you are presented with a small dialog to select the chart name, the updateable platform and the Advisor metric. By default the platform value will be derived from the type of data being charted, as in this example:
The updateable platform relates to the type of data that can be incorporated into the chart by Advisor. Remember that you can build a chart for any target and then use it in an Advisor bulletin for any other target, but only one of the same platform: you could not push data for a Windows target into a chart built to represent a UNIX machine, or vice versa.
If you pull down the handle on the right of the Updateable Platform box, the alternative is 'None'. Selecting this option fixes the context of the chart as it enters a bulletin – you cannot run this chart for any set of targets other than this one. It will still pick up data from different days; otherwise this facility would be pointless, as the same data would be generated over and over.
This is the basis of cluster-level reporting in Athene – you create charts that reflect the hosts that belong to a given cluster and include them individually in your bulletins.
This component of Athene is used to generate the template charts for Advisor. Metron will be happy to provide a selection of charts to facilitate the implementation of VMware reporting. These charts should be run through the PDB Path Editor utility (AXM623) to convert their PDB names to one you wish to use. You can then open and modify them.
A subfolder is provided called Clusters which can be used as a basis for cluster reporting.
Charts in this folder are called ‘VCxxxn.auc’, where xxx represents the contents of the chart and n represents a cluster number – see Appendix A for details of each report.
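The naming convention above can be handled mechanically when preparing chart names for further clusters. The following is a minimal sketch only, not part of Athene: the function names are hypothetical, and it assumes that xxx is always three letters (as in the Appendix A names) and that n is one or more digits.

```python
import re

# Assumed pattern: "VC" + three-letter content code + cluster number + ".auc".
CHART_NAME = re.compile(r"^VC([A-Z]{3})(\d+)\.auc$", re.IGNORECASE)


def parse_chart_name(filename):
    """Split a supplied chart filename into (content code, cluster number)."""
    match = CHART_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a cluster chart name: {filename!r}")
    return match.group(1).upper(), int(match.group(2))


def rename_for_cluster(filename, cluster_number):
    """Return the filename the same chart would carry for another cluster."""
    content, _ = parse_chart_name(filename)
    return f"VC{content}{cluster_number}.auc"


# Hypothetical usage: derive the name for cluster 2 from the cluster 1 sample.
print(parse_chart_name("VCACP1.auc"))
print(rename_for_cluster("VCACP1.auc", 2))
```

This only derives names. Each copied chart would still need its context edited in Define A Report so that it refers to the right hosts.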
You must update the context of each chart to refer to the correct grouping of hosts.
Using the standard aggregation feature of Define A Report, the user can join several series of data into one to represent a cluster-level view of a metric. However, some metrics do not lend themselves to such aggregation – please see the section below for details of what not to aggregate (at least not without understanding what it means to do so).
If it is appropriate to aggregate metrics into a single series, for example the total amount of memory used across all hosts in a cluster, then set the instance aggregation to “All VMware Hosts” to generate a single line on the chart. You can then set the Description to the name of the cluster, and use this in the Updateable Text function for the chart heading. These examples show this being done:
Here you can see the selection of three hosts, “instance aggregated” together. In addition, the description was set as follows:
The heading for the chart was changed with the Updateable Text dialog box to contain the words ‘Memory usage for cluster Context’ – the Context variable will be set from the description in this case.
The final chart then looked like this:
Use of the description field is generally a good idea only when there is one series being plotted on the chart. The description is a Selection level item, so if you select four metrics to plot and provide a description, that description will be appended to each of the four metric names.
This chart also reminds us that if you want to have threshold lines on a user-defined chart in a bulletin, you must put them there. The actual values used for the thresholds are set in the properties of a report in a bulletin. You do not need to define threshold lines in your chart for the Advisor analysis to take place, but if you want them to look like “regular” Advisor charts, then you can add them in yourself.
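To make the idea of instance aggregation concrete, the sketch below shows the arithmetic behind joining several host series into one cluster series. It illustrates the concept and is not how Athene implements it. The host names and sample values are invented, and a sum is assumed because the example is total memory used.

```python
from collections import defaultdict


def aggregate_instances(samples):
    """Sum a metric across hosts for each timestamp.

    samples: iterable of (timestamp, host, value) tuples.
    Returns a dict mapping timestamp -> cluster total.
    """
    totals = defaultdict(float)
    for timestamp, _host, value in samples:
        totals[timestamp] += value
    return dict(totals)


# Hypothetical memory-used samples (GB) for three hosts in one cluster.
samples = [
    ("09:00", "esx01", 12.0), ("09:00", "esx02", 8.0), ("09:00", "esx03", 10.0),
    ("10:00", "esx01", 14.0), ("10:00", "esx02", 9.0), ("10:00", "esx03", 11.0),
]
cluster_memory = aggregate_instances(samples)
print(cluster_memory)
```

Three host series become a single series, which is what produces the single line on the chart.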
So, we need to have a set of charts to include in a bulletin that contain the hosts that belong to a given cluster, and (at the moment) we need to know which hosts they are. Once the charts look the way you need them to, they can be added to an Advisor Bulletin, being sure to set Updateable Platform = None.
Once you have added a chart to a bulletin like this, the remainder of the definitions you would expect to make are unchanged, with one exception. You will see that the Target Type box is greyed out and the Name field is blank – a consequence of setting Updateable Platform = None. This allows you to enter any name of your choosing in the dialog box, as in the example here:
Here we have asked Athene to generate a name of ‘QA Test Cluster’ for Advisor and also set a more sensible name for the report. This report will go into the CPU column in the bulletin.
By repeating this short process for each of the supplied charts (or of course creating your own), you can build up a set of reports for a given cluster.
Once this is done, you can repeat the whole process for another cluster. You must first copy the charts you used to a new set of names (hence the cluster number in the supplied samples), editing each one to set the correct context – the selection of hosts to include – and adding this set to the same bulletin or a new one.
Reporting at the cluster level requires a little thought as to what it is you are aiming to achieve. Some metrics can be plotted from several hosts on the same graph and stacked up to represent a cluster total – memory used is a good example, or disk activity. Some are better displayed side-by-side to show the utilization of the cluster members separately, for example CPU busy or memory used %. Some are fine to display as an overall average. The danger of stacking or aggregating metrics like these is the assumption that the underlying machines are all the same. CPU busy is a notoriously poor indicator of how much “power” a machine is supplying - there is a considerable difference between the work being done by a 3 GHz quad-core Intel Xeon monster machine running at 25% busy as compared to a Pentium III 1.3 GHz machine reporting the same utilization.
In general, counts and rates can be stacked, but utilizations should not, unless you are absolutely sure that the underlying hardware is similar enough to make it a valid proposition.
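The point about utilizations can be shown with some simple arithmetic. The sketch below is an illustration only: it assumes that clock speed multiplied by core count is an acceptable proxy for capacity, which is itself crude across different processor generations, and it treats the Pentium III as a single-core machine. The function names are hypothetical.

```python
def ghz_used(clock_ghz, cores, busy_pct):
    """Rough CPU work delivered in GHz, assuming clock x cores approximates capacity."""
    return clock_ghz * cores * busy_pct / 100.0


def weighted_busy_pct(hosts):
    """Capacity-weighted CPU busy % for hosts given as (clock_ghz, cores, busy_pct)."""
    capacity = sum(clock * cores for clock, cores, _busy in hosts)
    used = sum(ghz_used(clock, cores, busy) for clock, cores, busy in hosts)
    return 100.0 * used / capacity


def simple_average_busy_pct(hosts):
    """Unweighted average of CPU busy %, which treats all hosts as identical."""
    return sum(busy for _clock, _cores, busy in hosts) / len(hosts)


# The two machines from the text, both reporting 25% busy.
xeon = (3.0, 4, 25.0)      # 3 GHz quad-core
pentium3 = (1.3, 1, 25.0)  # 1.3 GHz, single core assumed
print(ghz_used(*xeon), ghz_used(*pentium3))

# Hypothetical mixed cluster: the large host is lightly loaded, the small one busy.
mixed = [(3.0, 4, 20.0), (1.3, 1, 80.0)]
print(simple_average_busy_pct(mixed), weighted_busy_pct(mixed))
```

At the same 25% busy, the larger machine delivers roughly nine times the work under this proxy. In the mixed cluster, the simple average gives 50% busy while the capacity-weighted figure is close to 26%, which is why averaging utilizations across dissimilar hardware can mislead.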
Some of the supplied charts use user-defined metrics to augment or to modify the ones stored by Athene. These are created using the Reporting Configuration Manager (RCM) application, and two examples are shown below.
A useful measure of how much consolidation or virtualization of systems has been done is simply the count of the number of guest systems running on a host. The way to do this with Athene is to open RCM and navigate to the level in the metrics tree that contains the items you wish to count, then create a new metric with a value of ‘1’, defined as a count, like this:
This metric is then available in Define A Report at the same level in the metrics tree as you defined it.
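The reason a constant value of ‘1’ works as a guest count is that summing it over the guests on a host yields the number of guests. A minimal sketch of that arithmetic follows, with invented host and guest names. It illustrates the idea and does not represent how RCM stores or evaluates metrics.

```python
from collections import Counter


def guest_count_per_host(guest_rows):
    """Sum a constant metric of 1 over guests, grouped by host.

    guest_rows: iterable of (host, guest) pairs, one per running guest.
    """
    counts = Counter()
    for host, _guest in guest_rows:
        counts[host] += 1  # the user-defined metric: every guest contributes 1
    return dict(counts)


# Hypothetical inventory of guests and the hosts they run on.
rows = [("esx01", "vm-a"), ("esx01", "vm-b"), ("esx02", "vm-c")]
print(guest_count_per_host(rows))
```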
Converting metrics to more useful units
Another use for a user-defined metric is where Athene stores data in one unit but you wish to display it in another. The example used here is VMware host memory. This is provided by Virtual Center as a value in KB (kilobytes), but server memory is purchased in GB (gigabytes). Using RCM, you can take this metric and create a new one based on it, as in the following example:
Note that RCM only takes into account the name of the metric and not its units. For example, you could not create a metric called ‘Average Memory used by the VMs’ with a unit type of GB, as this will be a duplicate of the one you are trying to base your new metric on.
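The conversion itself is a single division. The sketch below assumes binary units (1 GB = 1024 × 1024 KB). If your reporting convention uses decimal units the divisor would differ, so treat the constant as an assumption rather than as the value used in the supplied charts.

```python
KB_PER_GB = 1024 * 1024  # assumption: binary units (1 GB = 1,048,576 KB)


def kb_to_gb(kilobytes):
    """Convert a memory value in KB to GB."""
    return kilobytes / KB_PER_GB


# Hypothetical host reporting 33,554,432 KB of memory used.
print(kb_to_gb(33554432))
```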
This document has described how to carry out VMware cluster-level reporting using the standard features and facilities of Athene.
Future planned enhancements to Athene will further extend this capability and allow automatic inclusion of hosts by their VMware-defined cluster name.
The reports that are available cover the following metrics:
| Report name | “Nice name” | Metrics (advisor metric in bold) | Aggregation | Comments |
|---|---|---|---|---|
| VCACP1 | Average CPU across all members of a cluster | Total VM CPU% utilization | Instance, all VMware hosts, time, 1 hour averages | Description set to cluster name, heading uses this as Context |
| VCAMP1 | Average cluster memory usage % | Avg memory usage of total available memory % | Instance, all VMware hosts, time, 1 hour averages | Description set to cluster name, heading uses this as Context |
| VCICP1 | CPU of individual members of a cluster | Total VM CPU% utilization | Time, 1 hour averages | |
| VCIGC1 | Guest systems per host | Guest count | Time aggregation to 1 hour averages | User-defined metric with value of ‘1’ at the VMware Guest Machine level |
| VCODK1 | Total disk activity of cluster | Total data rate (KB/sec); Number of reads/writes per sec | Instance, all VMware hosts, time, 1 hour averages | Description set to cluster name, heading uses this as Context |
| VCOMG1 | Total cluster used memory in GB | Avg memory used GB | Instance, all VMware hosts, time, 1 hour averages | User-defined metric to convert PDB metric in KB to GB |
| VCTCP1 | Average CPU busy trend | Total VM CPU% utilization | Instance, all VMware hosts, time, 1 hour averages | Description set to cluster name, heading uses this as Context |
| VCTDS1 | Average datastore free space trend | Free space % | All instances, time aggregation to 6 hour average | User defined metric of Free Space MB / Capacity MB * 100 |
| VCTGC1 | Guest count trend | Guest count | All instances, time aggregation to 6 hour averages | User-defined metric with value of ‘1’ at the VMware Guest Machine level |
Important note: Before they can be used, these charts will need to be run through the PDB Path Editor (AXM623) to alter references to the Metron PDB to the user’s own, then each will need editing with Define A Report to set the context correctly for the hosts that belong to a given cluster.
If these supplied charts do not provide a sufficient variety of reports it is a simple task to take one and use the format to build a new chart. If the metric(s) you wish to use are at the same “level” in the metrics tree, you can simply change them with a right-click, take Selections then Edit Selections, and finally Edit Metrics to select the new one you want to replace the current one. Be sure to save the updated chart as a new item unless you mean to overwrite the supplied one.
If you wish to plot metric(s) from somewhere else in the metrics tree, then in the open report, right-click, take Selections, then Add Selection… From here you can go through the process of adding metrics to your chart. Before pressing the Finish button, delete the first selection, so you are only left with your new one. This way you will retain the style of the chart so it fits in with other Advisor reports.
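When building your own charts, derived metrics such as the one behind the VCTDS1 report (Free Space MB / Capacity MB * 100) are worth checking by hand. The sketch below states that formula as a small function. The function name and the guard against a non-positive capacity are additions for illustration and are not part of the supplied chart.

```python
def free_space_pct(free_mb, capacity_mb):
    """Datastore free space as a percentage of capacity."""
    if capacity_mb <= 0:
        raise ValueError("capacity must be positive")
    return free_mb / capacity_mb * 100.0


# Hypothetical datastore with 250 MB free out of 1000 MB.
print(free_space_pct(250.0, 1000.0))
```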