GemStone/S 64 Bit Supplemental Documentation

Cache Monitoring using Prometheus

Overview

GemStone supports Shared Page Cache monitoring via Prometheus, on Linux on x86/64.

GemStone’s statprom executable starts a process that can be queried from the Prometheus monitoring software, to retrieve GemStone cache statistics values. A statprom process may connect to either the Stone’s cache or a remote cache; you will need multiple instances to monitor multiple caches.

When statprom is started, it takes as an argument a configuration file that is customized for a specific cache and its monitoring requirements; this includes the port number, the Stone name, and a list of statistics to report. This configuration file is in JSON format.

While statprom can report any GemStone statistic originating in the shared page cache, host system statistics are not available.

Prometheus

Prometheus is an open-source systems monitoring and alerting toolkit that collects and stores metrics (numeric data) as time series, along with key-value tags. Prometheus is widely used, and the Prometheus GitHub project has an active developer and user community.

In addition to Prometheus, Grafana can be installed and used for live monitoring of Prometheus data; Grafana provides out-of-the-box support for Prometheus, and no additional configuration is needed to collect the GemStone data from Prometheus.

statprom is built on the open-source GitHub project Prometheus Client Library for Modern C++ (jupp0r.github.io/prometheus-cpp). This embeds a web server from the CivetWeb project (civetweb.github.io/civetweb), which handles HTTP requests from Prometheus. statprom in turn uses the GemStone C Statistics Interface (GCSI) to access cache statistics. The GCSI attaches the process to the shared page cache as read-only; it does not create a gem session, and therefore has no view of the repository.

Configuring Statprom and Prometheus

Configure statprom

The statprom process must be started with a configuration file that specifies the stone or cache name, the port on which to expect queries from Prometheus, and the specific cache statistics that can be returned to Prometheus. The statprom configuration file is in JSON format and requires specific keys to be present in a specific structure.

For details on the file format, see Configuration file JSON format. An example configuration file is included in the distribution, $GEMSTONE/examples/Prometheus/stats.json.

The statprom configuration file allows you to monitor the Stone, the Shared Page Cache Monitor, or Gems. Any GemStone cache-based statistic of these processes can be accessed; however, host process statistics cannot be monitored directly.

Additional configuration requirements for monitoring Gems

The Stone and Shared Page Cache Monitor are singletons within a cache, and thus straightforward to monitor. However, a GemStone system contains multiple gems with different purposes and monitoring requirements. Prometheus does not handle multiple instances, so there is additional configuration required for monitoring Gems.

When multiple Gems match the criteria, the statistics values for all of them are added together. The monitoring criteria should be designed carefully so that either a single Gem process is identified, or if there is a chance that multiple Gem processes will be matched, that the specific statistic being monitored can logically be summed over all the matched Gems.

There are two options for filtering on the Gem you wish to monitor:

  • ProcessName

Cache statistics data includes an entry for a String ProcessName, which is set by the System; for example, Gem or TopazL. This can be set to a specific value in your application in several ways: using the -u option on the topaz command line, executing System cacheName: gemName, or set cachename gemName in topaz (note that topaz set cachename takes effect on the next login, and does not affect any current sessions).

statprom matches the Gem ProcessName using regular expressions, e.g. MyGemName.*.

  • GemKind

Starting with this version, cache statistics data includes an entry for an integer GemKind. This is 0 by default, and negative for system Gems. This can be set using System setGemKind: anInt. Negative values are reserved for use by GemTalk.

statprom matches the Gem processes’ GemKind, using a high and low value, forming an inclusive range.

Statprom usage

statprom -f cfgFile [-c] [-d] [-r]
statprom -h | -v
-c	Check the JSON file (-f argument) for errors and exit.
	Requires -f.
-d	Enable printing debug output to stderr.
-f <cfgFile>
	Specifies a configuration file in JSON format which
	determines which processes and statistics are collected.
-h	Print this help screen and exit.
-r	Retry if the shared page cache not running. If the cache
	connection is lost, sleep and attempt to reattach.
	Without -r the process exits if the cache connection is
	lost or not present at startup time.
-v	Print the program version and exit.

Configure Prometheus

Prometheus in turn must be started with a configuration file that specifies the node on which statprom is running and the configured port.

For example, the Prometheus configuration file may include an additional entry such as:

  - job_name: 'gemstone'
    static_configs:
      - targets: ['nodename.gemtalksystems.com:9985']

Monitoring

Once Prometheus is started with its updated configuration file, and statprom has been started with its configuration file, Prometheus will start querying and recording information for the specified statistics.

Prometheus will retrieve and store values with the name provided in the statprom configuration file. In addition to data tags such as the job name, the name of the Stone is provided as a tag with the key StoneName.
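To spot-check names and tags before looking in Prometheus, it can help to read the text that a scrape returns. The following is a minimal Python sketch that parses sample lines in the Prometheus text format into name, tags, and value. The sample text is invented for illustration: it reuses the metric name from the example configuration and the StoneName tag described above, and the actual statprom output may include other tags and comment lines. Fetching the text over HTTP is left out; the path to request (commonly /metrics for exporters) is an assumption that should be checked against your installation.

```python
import re

# One sample line: metric name, optional {tags}, then a numeric value.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric_name, tags_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, label_text, value = match.groups()
        tags = dict(LABEL_RE.findall(label_text or ""))
        samples.append((name, tags, float(value)))
    return samples


# Hypothetical scrape output, invented for illustration only.
EXAMPLE = """\
# HELP gemstone_stone_commit_records Number of commit records.
# TYPE gemstone_stone_commit_records gauge
gemstone_stone_commit_records{StoneName="gs64stone"} 3
"""

samples = parse_samples(EXAMPLE)
for name, tags, value in samples:
    print(name, tags.get("StoneName"), value)
```

The parser keeps escaped characters in tag values as-is, which is sufficient for a quick check of metric names and the StoneName tag.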

Configuration file JSON format

A sample JSON file that can be used as an argument to statprom is included in $GEMSTONE/examples/Prometheus/stats.json.

This provides an example of the main features of statprom configuration.

“http” Section
  • listen_addresses
    This argument provides information on the port that statprom will listen on for connections from Prometheus. It is an Array of strings representing port numbers and interfaces/addresses. Both IPv4 and IPv6 addresses are supported. The default port number for statprom is 9995. The address format is specific to the civetweb webserver, see: https://civetweb.github.io/civetweb/UserManual.html

For example:

"http" : { 
   "listen_addresses" : ["[::]:9985","9985"] 
   }, 

specifies that either an IPv6 or an IPv4 connection is accepted on port 9985 on the local host.

Note that statprom does not support SSL connections at this time.

“gemstone” Section
  • cache_name
    a String which represents the name of the remote shared page cache to monitor, or null. Only relevant when monitoring a remote shared page cache; this must be null when monitoring the Stone’s cache. The cache name may be obtained from gslist. The cache name is derived from a GemStone-generated hostId and should not change over the life of a host.
  • stone_name
    a String which represents the name of the Stone to monitor, or null. Only relevant when monitoring a primary shared page cache; should be null when monitoring a remote cache.
  • sample_interval
    an Integer representing the sample interval, in seconds.

For example:

"gemstone" : { 
   "cache_name" : null, 
   "stone_name" : "gs64stone", 
   "sample_interval": 60 
   }, 

“metrics” Section

Three types of metrics are supported: monitor (shared page cache monitor), stone, and gem. You do not need to include all three types.

For each type, there should be an array of specific metrics to be monitored for that type. The JSON metrics objects have the following members:

  • vsd_name (required)
    a String which represents the vsd name of the statistic to monitor. Must exactly match the name of the statistic. It is an error if the statistic name does not exist for the given metric type.
  • metric_name (required)
    a String which represents the name of the statistic in Prometheus. All values of metric_name in the configuration file must be unique. This is a limitation of Prometheus which does not allow duplicate names.
  • metric_type (required)
    a String representing the Prometheus type for the stat. Accepted values are:
      • Counter – values that only increase and never decrease
      • Gauge – values that may either increase or decrease
      • Histogram – values that will be plotted as a histogram
  • metric_help (required)
    a String which describes the function of the statistic.
  • metric_units (required)
    a String which describes the measurement units of the statistic.

For example:

"metrics" : { 
   "stone" : [ 
   { 
      "vsd_name" : "CommitRecordCount", 
      "metric_name" : "gemstone_stone_commit_records", 
      "metric_type" : "Gauge", 
      "metric_help" : "Number of commit records.", 
      "metric_units" : "Commit Records" 
     }, 

In addition to the above object members, entries for Gems require filter criteria: either cache_name_regex, or both gem_kind_min and gem_kind_max, or all three.

  • cache_name_regex
    a String which represents a regular expression used to match against the cache names of all gems in the cache.
  • gem_kind_min
    an Integer indicating the minimum value of the GemKind statistic used to match against gems in the cache. Requires gem_kind_max be specified.
  • gem_kind_max
    an Integer indicating the maximum value of the GemKind statistic used to match against gems in the cache. Must be greater than or equal to gem_kind_min. Requires gem_kind_min be specified.

All Gems in the cache that match the filter criteria have their statistics values added together for return to Prometheus. You must be careful to ensure that either a unique Gem can be confidently matched to the filter criteria, or that the particular statistic values are meaningful if added together for multiple Gems.

If you have multiple Gems with statistics that need to be separately reported, you will need separate entries, each with a unique Prometheus statistic name, for each Gem/statistic.
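The rules above for the metrics section can be pre-checked with a short script before running statprom. The following Python sketch is an independent illustration, not part of statprom; the function names are hypothetical, and it checks only the rules stated in this section (required members, accepted metric_type values, unique metric_name values, and the Gem filter criteria). It does not know which vsd names exist, so statprom -c remains the authoritative check.

```python
REQUIRED_KEYS = ("vsd_name", "metric_name", "metric_type",
                 "metric_help", "metric_units")
METRIC_TYPES = ("Counter", "Gauge", "Histogram")
METRIC_KINDS = ("monitor", "stone", "gem")


def check_gem_filter(entry, where):
    """Check the filter members that are required for entries under "gem"."""
    problems = []
    has_regex = "cache_name_regex" in entry
    has_min = "gem_kind_min" in entry
    has_max = "gem_kind_max" in entry
    if not (has_regex or has_min or has_max):
        problems.append(f"{where}: no filter criteria")
    if has_min != has_max:
        problems.append(f"{where}: gem_kind_min and gem_kind_max must be given together")
    elif has_min and entry["gem_kind_max"] < entry["gem_kind_min"]:
        problems.append(f"{where}: gem_kind_max is less than gem_kind_min")
    return problems


def check_metrics(config):
    """Return a list of problem descriptions for the "metrics" section."""
    problems = []
    seen_names = set()
    for kind, entries in config.get("metrics", {}).items():
        if kind not in METRIC_KINDS:
            problems.append(f"unknown metric kind: {kind}")
            continue
        for index, entry in enumerate(entries):
            where = f"{kind}[{index}]"
            for key in REQUIRED_KEYS:
                if key not in entry:
                    problems.append(f"{where}: missing {key}")
            if "metric_type" in entry and entry["metric_type"] not in METRIC_TYPES:
                problems.append(f"{where}: unsupported metric_type")
            name = entry.get("metric_name")
            if name is not None:
                # metric_name must be unique across the whole file.
                if name in seen_names:
                    problems.append(f"{where}: duplicate metric_name {name}")
                seen_names.add(name)
            if kind == "gem":
                problems.extend(check_gem_filter(entry, where))
    return problems


# Example input with two deliberate mistakes in the "gem" entry.
config = {
    "metrics": {
        "stone": [{
            "vsd_name": "CommitRecordCount",
            "metric_name": "gemstone_stone_commit_records",
            "metric_type": "Gauge",
            "metric_help": "Number of commit records.",
            "metric_units": "Commit Records",
        }],
        "gem": [{
            "vsd_name": "SessionStat01",
            "metric_name": "gemstone_stone_commit_records",  # duplicate name
            "metric_type": "Counter",
            "metric_help": "Number of widgets produced.",
            "metric_units": "Widgets",
            "gem_kind_min": 5,  # gem_kind_max is missing
        }],
    }
}

problems = check_metrics(config)
for problem in problems:
    print(problem)
```

To check a real file, load it with json.load and pass the resulting dictionary to check_metrics.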

Gem matching examples

Since the values for all Gems matching the filter criteria are summed, you can use this to report a single total when a task is divided over multiple Gems.

For example:

"gem" : [ 
  { 
  "cache_name_regex" : "Widget.*", 
  "vsd_name" : "SessionStat01", 
  "metric_name" : "gemstone_gem_widgets_produced", 
  "metric_type" : "Counter",  
  "metric_help" : "Number of widgets produced.", 
  "metric_units" : "Widgets" 
  }] 

With this example, if there are multiple Gems with names that match Widget.*, the values will be summed. For example, suppose the following Gems are in the cache:

Gem Widget1, SessionStat01 == 6 
Gem Widget2, SessionStat01 == 8 

In this case, Prometheus will show the aggregate value of 14 (6 + 8) for the statistic gemstone_gem_widgets_produced.

Note that this example would also match Gems named WidgetLogging, WidgetDefects, and so on; you must be aware of your application’s Gem cacheName conventions.
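The effect of the filter criteria on the reported value can be modeled in a few lines of Python. This is a sketch of the behavior described above, not statprom's implementation. It assumes that the regular expression must match the whole cache name and that, when a name pattern and a GemKind range are both given, a Gem must satisfy both; neither detail is confirmed here. The WidgetLogging Gem, its value, and the GemKind value 10 are hypothetical.

```python
import re


def summed_value(gems, stat, name_regex=None, kind_min=None, kind_max=None):
    """Sum `stat` over every gem that passes all of the supplied filters."""
    total = 0
    for gem in gems:
        # Assumption: the pattern must match the entire cache name.
        if name_regex is not None and not re.fullmatch(name_regex, gem["name"]):
            continue
        # The GemKind range is inclusive at both ends.
        if kind_min is not None and not (kind_min <= gem["kind"] <= kind_max):
            continue
        total += gem["stats"].get(stat, 0)
    return total


# Widget1 and Widget2 are from the example above; the rest is hypothetical.
gems = [
    {"name": "Widget1", "kind": 10, "stats": {"SessionStat01": 6}},
    {"name": "Widget2", "kind": 10, "stats": {"SessionStat01": 8}},
    {"name": "WidgetLogging", "kind": 0, "stats": {"SessionStat01": 100}},
]

too_broad = summed_value(gems, "SessionStat01", name_regex="Widget.*")
by_name = summed_value(gems, "SessionStat01", name_regex="Widget[0-9]+")
by_kind = summed_value(gems, "SessionStat01", kind_min=10, kind_max=10)

print(too_broad)  # includes WidgetLogging
print(by_name)    # only Widget1 and Widget2
print(by_kind)    # only gems whose GemKind is 10
```

A tighter pattern or a dedicated GemKind value both exclude the logging Gem; which is more convenient depends on how your application names its sessions and sets GemKind.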

If you do not wish to sum the values, you must be able to identify the specific Gem or Gems for reporting, and create specific entries for each individual value. In the above example, if you wished to monitor Widget1’s 6 and Widget2’s 8 separately, rather than summed, you could create the following entries:

"gem" : [ 
  { 
  "cache_name_regex" : "Widget1", 
  "vsd_name" : "SessionStat01", 
  "metric_name" : "gemstone_gem_widgets1_produced", 
  "metric_type" : "Counter",  
  "metric_help" : "Number of widgets produced by Widget1.", 
  "metric_units" : "Widgets" 
  }, 
  { 
  "cache_name_regex" : "Widget2", 
  "vsd_name" : "SessionStat01", 
  "metric_name" : "gemstone_gem_widgets2_produced", 
  "metric_type" : "Counter",  
  "metric_help" : "Number of widgets produced by Widget2.", 
  "metric_units" : "Widgets" 
  }] 
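When many Gems need separate entries, writing them by hand becomes repetitive. The following Python sketch generates one entry per Gem name, each with a unique metric_name, in the same shape as the entries above. The helper name and the naming scheme for metric_name are hypothetical, so the generated names differ slightly from the hand-written example.

```python
import json
import re


def per_gem_entries(gem_names, vsd_name="SessionStat01"):
    """Build one "gem" entry per name, each with its own metric_name."""
    entries = []
    for gem_name in gem_names:
        # Keep only characters that are valid in a Prometheus metric name.
        suffix = re.sub(r"[^a-zA-Z0-9_]", "_", gem_name.lower())
        entries.append({
            # re.escape keeps any regex metacharacters in the name literal.
            "cache_name_regex": re.escape(gem_name),
            "vsd_name": vsd_name,
            "metric_name": f"gemstone_gem_{suffix}_produced",
            "metric_type": "Counter",
            "metric_help": f"Number of widgets produced by {gem_name}.",
            "metric_units": "Widgets",
        })
    return entries


entries = per_gem_entries(["Widget1", "Widget2"])
print(json.dumps({"gem": entries}, indent=2))
```

The printed fragment can be pasted into the metrics section of the configuration file and then checked with statprom -c.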

Validating JSON configuration file

After editing your .json file, you may use statprom to verify that it is valid for use with statprom, using the -c option. With this option, the configuration file is validated, but statprom is not started.

statprom -c -f promfile.json