GemStone/S 64 Bit 3.2.6 is a new version of the GemStone/S 64 Bit object server. This release adds new features, new cache statistics, and fixes a number of bugs in v3.2.4; we recommend everyone using or planning to use GemStone/S 64 Bit v3.2.x upgrade to this new version.
These release notes describe the changes between the previous version of GemStone/S 64 Bit, version 3.2.4, and version 3.2.6. Versions 3.2.4.1, 3.2.4.2, 3.2.4.3, and 3.2.5 were limited distribution special releases; all changes in those releases are included in these release notes. If you are upgrading from a version prior to 3.2.4, review the release notes for each intermediate release to see the full set of changes.
For details about installing GemStone/S 64 Bit 3.2.6 or upgrading from earlier versions of GemStone/S 64 Bit, see the GemStone/S 64 Bit Installation Guide for v3.2.6 for your platform.
GemStone/S 64 Bit version 3.2.6 is supported on the following platforms:
Note that on Linux, GemStone/S v3.2.6 has been compiled on a later kernel; Red Hat 6.1 and 6.4 are not supported with this version.
For more information and detailed requirements for each supported platform, please refer to the GemStone/S 64 Bit v3.2.6 Installation Guide for that platform.
The following versions of GBS are supported with GemStone/S 64 Bit version 3.2.6. You must use GBS version 7.6.1 or later for VisualWorks, or 5.4.2 or later for VA Smalltalk, with GemStone/S 64 Bit v3.2.6.
The GemStone/S 64 Bit v3.2.6 distribution includes VSD version 4.0.2. The previous version of GemStone/S 64 Bit, v3.2.4, included VSD v4.0.
Changes between v4.0 and v4.0.2 include:
For more details, see the Release Notes for VSD v4.0.1 and v4.0.2.
The version of OpenSSL used by GemStone/S 64 Bit v3.2.6 has been updated to 1.0.2a.
When the startcachewarmer script is used to warm a remote cache, it can now be configured to use or create a mid-level cache. With this configured, the cachewarmer will either load pages into the remote cache using the mid-level cache, or warm the mid-level cache as it loads pages in the remote cache.
The following options have been added to startcachewarmer:
-M host name or IP address where the mid-level cache is running or will be created. The -H option (specifying the host name or IP of the Stone’s host) must also be specified with this option.
-C size of the mid-level cache in KB. If omitted, a value of 75000 is used. Only applies if the -M option is also specified and the mid-level cache does not exist.
-N The maximum number of processes that can use the mid-level cache. If omitted, a value of 50 is used. Only applies if the -M option is also specified and the mid-level cache does not exist.
If a mid-level cache host name or IP address is specified (via -M), the mid-level cache will be created if it does not already exist. The -C and -N options specify the size of the mid-level cache and the number of processes that can attach to it, respectively. If the mid-level cache already exists, the -C and -N options are ignored.
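The option rules above can be sketched as a small helper that assembles only the mid-level cache arguments. This is an illustrative sketch, not part of the GemStone distribution: the function name and the host names in the usage note are hypothetical, and a real startcachewarmer invocation will normally need further options that are not shown here.

```python
from typing import List, Optional

# Defaults stated in the option descriptions above; they are applied by
# startcachewarmer itself when -C or -N is omitted.
DEFAULT_MID_CACHE_KB = 75000
DEFAULT_MID_CACHE_PROCS = 50


def mid_cache_args(stone_host: str, mid_host: str,
                   size_kb: Optional[int] = None,
                   max_procs: Optional[int] = None) -> List[str]:
    """Build the -H/-M/-C/-N portion of a startcachewarmer command line.

    Hypothetical helper: it encodes only the rules described in the text.
    """
    if not stone_host or not mid_host:
        # -M is only valid when -H (the Stone's host) is also given.
        raise ValueError("-M requires -H (the Stone's host) as well")
    args = ["-H", stone_host, "-M", mid_host]
    if size_kb is not None:
        args += ["-C", str(size_kb)]      # ignored if the cache already exists
    if max_procs is not None:
        args += ["-N", str(max_procs)]    # ignored if the cache already exists
    return args
```

For example, `mid_cache_args("stonehost", "midhost", size_kb=200000)` yields the arguments `-H stonehost -M midhost -C 200000`, leaving -N to its default of 50.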
The default socket backlog for NetLDI, SPC Monitor, and Stone has been increased to 64.
To allow netldi to queue a configurable number of login requests, an option has been added to startnetldi. This addresses bug #45008, Connection refused errors on NetLDI connect backlog.
The netldi has a new option, -b, to specify the maximum backlog on the listening socket.
Usage: startnetldi [-b backlog] [-h] [-d] [-g|-s] [-n]
[-a account] [-l logFile] [-t seconds] [-P portNumber]
[-A address] [name]
Note that if a value passed in with the -b argument is larger than the OS configuration allows, as on Linux per /proc/sys/net/core/somaxconn, it will be truncated to that limit.
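The truncation described above can be checked before choosing a -b value. The following is a minimal sketch, assuming a Linux host where the limit is readable from /proc/sys/net/core/somaxconn; the function names are illustrative and are not part of GemStone.

```python
from pathlib import Path
from typing import Optional

SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path: Path = SOMAXCONN_PATH) -> Optional[int]:
    """Return the OS listen-backlog limit, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested: int, os_limit: int) -> int:
    """Backlog actually in effect: the request, capped at the OS limit."""
    return min(requested, os_limit)
```

With a limit of 128, a requested backlog of 512 is in effect 128, while a requested backlog of 64 is unchanged. Raising the OS limit itself is a system administration task outside the scope of startnetldi.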
When a Linux system runs low on memory, each process’s oom_score_adj setting is used to determine which processes are killed first. Applications can adjust this value; if the Unix user has the CAP_SYS_RESOURCE capability, oom_score_adj can be set to a lower value, otherwise it can only be increased.
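The rule for adjusting oom_score_adj can be expressed as a small check. This sketch follows the simplified rule stated above and the value range of -1000 to 1000 documented for Linux; it does not show how GemStone itself sets the value, and the function names are illustrative.

```python
from pathlib import Path
from typing import Optional

OOM_ADJ_MIN = -1000  # lowest value Linux accepts for oom_score_adj
OOM_ADJ_MAX = 1000   # highest value Linux accepts for oom_score_adj


def change_allowed(current: int, requested: int,
                   has_cap_sys_resource: bool) -> bool:
    """Apply the rule from the text: raising is always allowed,
    lowering requires the CAP_SYS_RESOURCE capability."""
    if not OOM_ADJ_MIN <= requested <= OOM_ADJ_MAX:
        return False
    if requested >= current:
        return True
    return has_cap_sys_resource


def read_own_oom_score_adj() -> Optional[int]:
    """Read this process's current setting; None when not on Linux."""
    try:
        return int(Path("/proc/self/oom_score_adj").read_text().strip())
    except (OSError, ValueError):
        return None
```

A process that wants to be killed earlier under memory pressure raises its value, which needs no special capability; protecting a process by lowering the value is the case that requires CAP_SYS_RESOURCE.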
The previous behavior had shortcomings:
In GemStone/S 64 Bit, read authorization checks occur when the object is faulted into the VM, rather than when the actual read occurs. This changes the timing of the error relative to 32-bit GemStone/S, and creates conditions under which you could get a SecurityError even when the operations you were performing should not trigger that error. (#45040)
A variation of this problem occurred under some conditions when updating an RcKeyValueDictionary. (#45054)
This release includes changes, and a new configuration parameter, that alter the way unauthorized objects are handled. By default, there is no change in behavior from previous GemStone/S 64 Bit versions. The new configuration parameter, GEM_READ_AUTH_ERR_STUBS, defaults to FALSE. When it is set to TRUE and a read authorization error is encountered, an instance of the new class UnauthorizedObjectStub is created instead of a SecurityError being triggered.
These changes require corresponding changes in GBS, and should not be used for GBS sessions, unless directed otherwise by GBS Engineering when running with certain versions of GBS.
The following configuration parameter has been added:
GEM_READ_AUTH_ERR_STUBS
If TRUE, an in-memory instance of UnauthorizedObjectStub is constructed for an object fault instead of signalling a SecurityError for read authorization denied.
Runtime equivalent: #GemReadAuthErrStubs
Default: FALSE
This should remain FALSE for GBS sessions, unless directed otherwise by GBS Engineering when running with certain versions of GBS.
The name of a gem in cache statistics can be set using the method cacheName:, and this name is visible in VSD and when using the programmatic interface to cache statistics. Starting in version 3.0, assigning a name to a Gem when that name was already in use was disallowed, and attempting to do so raised an error.
While avoiding duplicate cache names is less ambiguous, this error was inconvenient in practice, and duplicate names are now allowed.
Note that accessing statistics via System >> cacheStatsForGemWithName: will return statistics for the first gem with a given name. If your cache naming does not create unique names, use other cache statistics lookup methods that rely on PID or sessionId, to ensure that you get predictable results.
When determining the amount of physical space required to hold objects on disk, the method physicalSize, which returns the space required for the in-memory representation, may overstate the requirement for the object on disk. For more accurate calculation, the following method has been added:
Object >> physicalSizeOnDisk
Returns the number of bytes required to represent the receiver on disk. If the receiver is in special format (which implies that its representation is the same as its OOP), returns zero.
The following bugs in v3.2.4 have been fixed in v3.2.6:
When the OopNumberHighWaterMark grows by a very large amount (more than 50M objects) during MFC, internal structure resizing was not done correctly. This could result in a SEGV, or depending on the specifics of the occurrence, OOPs may be handled incorrectly and introduce corruption. (#45106)
The method Object>>printRecursiveRepresentationOn: called self asString, which in some cases could cause unwanted recursion. (#44825)
Version 1.0.2 of OpenSSL has better retry logic handling than that in 1.0.1x. In addition, the retry logic in GemStone’s SSL login (GciLogin) did not correctly handle retries in all cases. (#44927)
A number of fixes and changes have been made in v3.2.6 in handling of problems in mid-level caches. Systems using mid-level caches are now more tolerant of the loss of connection to a mid-level cache host, and of problems with individual processes supporting a mid-level cache.
If a mid-level cache terminated or the mid-level cache machine became unavailable, all remote sessions using that mid-level cache would encounter a fatal error. Now, these sessions will continue running without the mid-level cache; a message is printed to the Gem’s stdout. (#44993)
Under some conditions, the death of the Shared page cache Monitor on a mid-level cache host may not be handled properly, resulting in errors attempting to connect to the mid-level cache. (#45005).
When a session’s page server on the mid-level cache dies, the session may encounter an error as the remote session attempts to recover the connection. (#44992).
A Gem process's GEMSTONE_NRS_ALL may include a setting for #log, indicating a directory to which the associated logs should be written. If the Gem uses a mid-level cache, it has a page server process on that machine, and the log for this page server should have used the Gem's #log setting, but did not. Instead, this log was written using a #log setting from the mid-level cache machine's NetLDI environment, or to the Unix user's home directory. (#45004)
If the DNS server cannot resolve an address, the execution of GsSocket class >> getHostNameByAddress: may hang, depending on your configuration. Now, it will try five times before reporting an error. (#45077)
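The bounded-retry behavior described for getHostNameByAddress: can be illustrated with a generic reverse lookup. This is a sketch of the general technique using Python's standard socket module, not GemStone's implementation; the resolver parameter exists only so that the lookup can be replaced in tests.

```python
import socket
from typing import Callable


def reverse_lookup(address: str, attempts: int = 5,
                   resolver: Callable = socket.gethostbyaddr) -> str:
    """Resolve an IP address to a host name, giving up after `attempts` tries."""
    last_error = None
    for _attempt in range(attempts):
        try:
            hostname, _aliases, _addresses = resolver(address)
            return hostname
        except OSError as exc:  # socket.herror and socket.gaierror are OSErrors
            last_error = exc
    raise LookupError(
        f"could not resolve {address} after {attempts} attempts"
    ) from last_error
```

Bounding the number of attempts turns an indefinite hang into a reportable error, at the cost of failing lookups that would have succeeded on a later try.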
Some obsolete classes, such as ObsoleteSymbol, were moved to the "ObsoleteClasses" dictionary within Globals as part of the 3.0 upgrade; by leaving the class, existing references would continue to be functional. However, recompile of GemStone kernel classes (other than by upgrade) resulted in unresolved symbols. While this is not generally needed or recommended, tools such as STORE may have caused recompile. (#44990)
Changes in handling of asynchronous events to avoid recursive handling were made in 3.2. There was a case in which this code would suppress a second sigAbort to a GCI or GBS application performing certain GCI executions. (#45067)
It was possible for internal repository I/O time tracking code to execute in a thread-unsafe manner, with a risk of a null pointer and SEGV. (#44511)
When the transaction logs are full, the stone performs special handling of commits and other operations, pausing until space becomes available. If a Gem was performing a commit, some state was cleared by the code that handled the tranlog full condition. In this case, the Gem would terminate with an invalid stone command error. (#44894)
When the Gem terminated due to loss of its page server, the error was not handled correctly. The timing of the close of the socket to the client was incorrect, which for some cases in GBS logins, could cause the GBS client image to hang. (#45065)
An internal value was computed incorrectly, producing a number usually in the vicinity of 250. If the number of remote caches attached to a system grew larger than this, the Stone crashed with a UTL_GUARANTEE error. (#45064)
An operation during reclaim is not thread-safe, so under some conditions, one thread may remove a page from the reclaim pages list, while another thread puts the page back on the list. This results in an attempt to reclaim the same page more than once. The conditions for this bug appear to be rare; however, while this bug may not trigger an immediate error, page errors may occur subsequently. (#45041)
When a reclaim thread finds no pages in an extent that need reclaim, it sleeps for one second before continuing, even if other extents have a large amount of reclaim. This may cause slow performance in systems that are configured with sequential allocation mode and have a large amount of free space. In such systems, the data (and therefore pages needing reclaim) may be entirely in the first extents, leaving later extents in the sequence empty. (#45015)
When the number of Reclaim Gem sessions is changed, the new number of Reclaim Gem sessions is logged in both the Stone log and the Reclaim Gem log. The number in the Stone log is correct, but the number reported in the Reclaim Gem log is one lower than the actual number. (#45013).
On login, gems connect to the netldi on its listening socket. The backlog for this socket is set at 20, and if the number of login requests is much higher than the netldi can process and the backlog exceeds 20, the login will error with “Connection refused”. (#45008)
Now, the default socket backlog for NetLDI, SPC Monitor, and Stone has been increased to 64. The netldi has a new option, -b, to specify the maximum backlog on the listening socket. See "startnetldi added option to configure socket listening backlog".
When the cache warmer completed and detached from the shared cache, the main cache warmer thread's disconnect was not clean, resulting in the need for slot recovery. (#45003)
When the number of sessions was greater than 2034, the results returned by System currentSessions may have included nils. (#45012)
When a socket to a remote cache disconnects while the page server was holding the free PCE spin lock, it was possible for the page server to exit without releasing the spin lock, leaving it stuck. (#45009)
Attempting to restore from a backup that was located on an NFS-mounted drive could error. (#45019)
This release includes a number of bug fixes related to cache statistics, and a number of new statistics.
The following cache statistics were not updated for reads done by a mid-level cache on behalf of a remote gem:
BitmapPageReads
DataPageReads
ObjectTablePageReads
OtherPageReads
PageIoCount
PageIoTimeOverallAvg
PageIoTime10SampleAvg
PageIoTime100SampleAvg
PageReads
The stone, logsender, and logreceiver processes incremented these statistics by the number of read operations, rather than by the number of pages read. (#45006)
Previously, some SPC Monitor statistics were calculated as the sum of the corresponding processes for all active processes in the cache. When a session logged out, the SPC Monitor statistics value could drop, counter-intuitively.
Now, the following SPC Monitor statistics are cumulative for the life of the cache, and will not decrease:
TotalLocalPageCacheHits
TotalLocalPageCacheMisses
TotalWaitsForOtherReader
TotalPageReads
TotalPageWrites
TotalFramesFromFreeList
TotalFramesFromFindFree
TotalFramesAddedToFreeList
TotalOtPageReads
TotalDataPageReads
TotalBmPageReads
TotalMiscPageReads
TotalPcesRemovedFromFreeList
TotalPcesAddedToFreeList
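The difference between the previous and current calculation can be seen in a toy model. This is an illustrative sketch only; the class and its fields are hypothetical and do not reflect the SPC Monitor's internal structures.

```python
class CacheTotals:
    """Toy model of one counter tracked two ways."""

    def __init__(self):
        self.active = {}      # session id -> page reads by that session
        self.cumulative = 0   # running total for the life of the cache

    def record_reads(self, session_id, pages):
        self.active[session_id] = self.active.get(session_id, 0) + pages
        self.cumulative += pages

    def logout(self, session_id):
        # The session's own counter disappears with the session.
        self.active.pop(session_id, None)

    def sum_of_active(self):
        # Previous style: sum over processes currently in the cache.
        return sum(self.active.values())


totals = CacheTotals()
totals.record_reads("session-1", 100)
totals.record_reads("session-2", 50)
totals.logout("session-1")
print(totals.sum_of_active())  # drops once session-1 has logged out
print(totals.cumulative)       # does not decrease
```

Because the cumulative value never decreases, rates computed from successive samples stay non-negative even when sessions log out between samples.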
The -J flag to statmonitor specifies to collect statistics for the Stone, Shared page cache, and Page Manager only. However, the Page Manager statistics were not being collected. (#45081)
The statistics related to network performance were incorrect for Linux hosts; some values were unreasonably high, others were zero. (#45051)
The following cache statistics were expressed on AIX as the number of 4KB pages, rather than KB, and so were understated 4x in the statmonitor data. This has been corrected, and these statistics are now correctly recorded in KB. (#45058)
The Reclaim Gem cache statistic PinnedPagesCount was incorrect. (#45020)
On Linux, cache statistics are collected from /proc/pid/status rather than from /proc/pid/statm. This provides some additional statistics, which are available for all GemStone processes.
MaxImageSize (All on Linux)
The maximum (high water) size of the process's image in kilobytes.
MaxRSS (All on Linux)
The high water mark of the process's resident set size. Note that this counter is always 0 on Solaris.
RSSStack (All on Linux)
The stack resident set size.
PageTablesMemoryKB (All on Linux)
The amount of memory dedicated to low-level page tables.
ThreadCount (All on Linux)
Number of threads currently active in this process. An instruction is the basic unit of execution in a processor, and a thread is the object that executes instructions. Every running process has at least one thread.
VolCSW (All on Linux)
The number of voluntary context switches done by the process. Note that this counter is always 0 on HP-UX.
IVolCSW (All on Linux)
The number of times the process was forced to do a context switch. Note that this counter is always 0 on HP-UX.
As a result of this change, SharedKBytes and RSSDirty are no longer collected on Linux.
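For readers who want to cross-check these values by hand, the following sketch parses the text of /proc/pid/status. The mapping from GemStone statistic names to /proc field names is an assumption made for illustration, based on the descriptions above; these notes do not state which fields GemStone reads.

```python
from typing import Dict

# ASSUMED mapping (not confirmed by these release notes):
# GemStone statistic name -> field name in /proc/<pid>/status
ASSUMED_FIELD_MAP = {
    "MaxImageSize": "VmPeak",
    "MaxRSS": "VmHWM",
    "RSSStack": "VmStk",
    "PageTablesMemoryKB": "VmPTE",
    "ThreadCount": "Threads",
    "VolCSW": "voluntary_ctxt_switches",
    "IVolCSW": "nonvoluntary_ctxt_switches",
}


def parse_status(text: str) -> Dict[str, int]:
    """Collect 'Key: <integer> [unit]' lines; other lines are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def process_stats(text: str) -> Dict[str, int]:
    """Translate parsed fields into the statistic names listed above."""
    fields = parse_status(text)
    return {stat: fields[name]
            for stat, name in ASSUMED_FIELD_MAP.items()
            if name in fields}
```

On a Linux host, passing the contents of /proc/self/status to `process_stats` returns whichever of these values the running kernel reports.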
The following GemStone process cache statistics have been added:
CommitRecordPageReads (All)
The number of commit record pages read by the process since it was started.
PagesAddedToCacheFromDisk (All)
Number of pages added to the shared cache by this process which were read from disk.
PagesAddedToCacheFromMidCache (All)
Number of pages added to the shared cache by this process which were copied from a mid-level shared page cache.
PagesAddedToCacheFromPrimaryCache (All)
Number of pages added to the shared cache by this process which were copied from the primary shared page cache.
PagesAddedToCacheNewlyCreated (All)
Number of pages added to the shared cache by this process which were newly created. For gems and the stone, the pages were created by the process. For page servers, the pages were created by a gem connected to the page server.
PagesInCacheCreatedInLeafCache (ShrPcMonitor)
Number of pages present in the shared cache which were created in a remote shared page cache.
PagesInCacheCreatedInPrimaryCache (ShrPcMonitor)
Number of pages present in the shared cache which were created in the primary shared page cache.
PagesInCacheFromDisk (ShrPcMonitor)
Number of pages present in the shared cache which were read from disk.
PagesInCacheFromMidCache (ShrPcMonitor)
Number of pages present in the shared cache which were copied from a mid-level shared page cache.
PagesInCacheFromPrimaryCache (ShrPcMonitor)
Number of pages present in the shared cache which were copied from the primary shared page cache.
TotalCommitRecordPageReads (ShrPcMonitor)
Total number of commit record pages read into the shared page cache by all processes since the cache was created.
TotalPagesAddedToCacheFromDisk (ShrPcMonitor)
Total number of pages which were read from disk and added to the shared page cache by all processes since the cache was created.
TotalPagesAddedToCacheFromMidCache (ShrPcMonitor)
Total number of pages which were copied from a mid-level shared cache and added to the shared page cache by all processes since the cache was created.
TotalPagesAddedToCacheFromPrimaryCache (ShrPcMonitor)
Total number of pages which were copied from the primary shared cache and added to the shared page cache by all processes since the cache was created.
TotalPagesAddedToCacheNewlyCreated (ShrPcMonitor)
Total number of pages which were newly created and added to the shared page cache by all processes since the cache was created.
The following system stats may now be collected on Linux.
ActiveAnonMemoryKB
The amount of non-file backed memory that has been used more recently.
ActiveFileMemoryKB
The amount of memory used for buffering files that has been used recently.
ActiveMemoryKB
The amount of memory that has been used more recently and usually not reclaimed unless absolutely necessary.
AnonHugePagesKB
The amount of non-file backed memory backed by huge memory pages.
AnonymousMemoryKB
The amount of non-file backed memory mapped into userspace page tables.
BounceMemoryKB
The amount of memory used for bounce buffers for block devices.
CachedMemoryKB
The amount of memory used as cache memory.
CachedSwapKB
The amount of swap used as cache memory.
CommitLimitKB
The total amount of memory currently available to be allocated on the system.
CommittedAsKB
The amount of memory presently allocated on the system, including memory allocated by processes that has not yet been used.
FileBufferSizeKB
The amount of memory used in file buffers.
HardwareCorrupted
A boolean indicating if the system has detected a memory failure.
HugePagesFreeKB
The amount of memory in the huge pages pool that has not yet been allocated.
HugePagesRsvdKB
The amount of memory in the huge pages pool for which a commitment to allocate from the pool has been made, but no allocation has yet been made.
HugePageSize
The size of a huge memory page in bytes.
HugePagesSurpKB
The amount of memory in the huge pages pool above the value in /proc/sys/vm/nr_hugepages.
HugePagesTotalKB
The total amount of memory in the huge pages pool.
InactiveAnonMemoryKB
The amount of non-file backed memory that has not been used recently.
InactiveFileMemoryKB
The amount of memory used for buffering files that has not been used recently.
InactiveMemoryKB
The amount of memory which has been less recently used. It is more eligible to be reclaimed for other purposes.
KernelDataMemoryKB
The amount of memory used by the kernel for caching data structures.
KernelDataReclaimableMemoryKB
The amount of memory used by the kernel for caching data structures that may be reclaimed.
KernelDataUnreclaimableMemoryKB
The amount of memory used by the kernel for caching data structures that cannot be reclaimed.
KernelStackMemoryKB
The amount of memory used by the kernel stack.
LockedMemoryKB
The amount of memory that has been locked using mlock(2) or similar calls. Locked memory cannot be swapped.
MappedMemoryKB
The amount of memory which has been mapped to files.
NfsUnstableMemoryKB
The amount of memory used by NFS pages sent to the server, but not yet committed to stable storage.
PageTablesMemoryKB
The amount of memory dedicated to low-level page tables.
SharedMemoryKB
The amount of memory enabled for sharing between multiple processes via shmat(2) and mmap(2) with the MAP_SHARED attribute set.
UnevictableMemoryKB
The amount of memory that cannot be swapped.
WritebackMemoryKB
The amount of memory which is actively being written back to disk.
WritebackTmpMemoryKB
Amount of memory used by FUSE (Filesystem in Userspace) filesystems.
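Many of the descriptions above correspond closely to fields in the Linux /proc/meminfo file. The sketch below parses that format; both the field mapping and the huge page pool calculation are assumptions made for illustration, since these notes do not state how the statistics are derived.

```python
from typing import Dict

# ASSUMED mapping (a subset, not confirmed by these release notes):
# GemStone statistic name -> field name in /proc/meminfo
ASSUMED_MEMINFO_MAP = {
    "ActiveMemoryKB": "Active",
    "ActiveAnonMemoryKB": "Active(anon)",
    "ActiveFileMemoryKB": "Active(file)",
    "CachedMemoryKB": "Cached",
    "CommitLimitKB": "CommitLimit",
    "CommittedAsKB": "Committed_AS",
    "FileBufferSizeKB": "Buffers",
}


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse 'Key:   <integer> [kB]' lines into a dictionary."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def system_stats(text: str) -> Dict[str, int]:
    info = parse_meminfo(text)
    stats = {stat: info[name]
             for stat, name in ASSUMED_MEMINFO_MAP.items()
             if name in info}
    # ASSUMPTION: /proc/meminfo reports the huge page pool as a page count,
    # so a KB figure is the count multiplied by the page size in kB.
    if "HugePages_Total" in info and "Hugepagesize" in info:
        stats["HugePagesTotalKB"] = info["HugePages_Total"] * info["Hugepagesize"]
    return stats
```

Comparing the output of `system_stats` on the contents of /proc/meminfo with the values recorded by statmonitor is one way to sanity-check a data file, bearing in mind that the two are sampled at different moments.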
The following statistic has been removed:
Also note that SharedKBytes and RSSDirty are no longer collected on Linux; see "Change in GemStone process statistics on Linux".