The following bugs in v3.3.6 are fixed in v3.4.
When the configuration parameter STN_GEM_TIMEOUT is set to a non-zero value, idle Gems should be terminated once the timeout has elapsed; this was not occurring. (#46723)
A gem in transactionless mode automatically responds to sigAbort by aborting. However, if the gem was entirely idle, it did not have a chance to perform this processing, and it could be killed by lostOT handling. (#46290)
When the AIO page servers encounter an error during fsync of the extents, such as running out of disk space, the stone could hang. fsync is a critical operation; now, the stone will shut down under this circumstance.
Additional details on I/O errors are also now printed to the stone log. (#46734, #46735)
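The fail-fast policy described above can be illustrated with a short Python sketch. This is not GemStone's implementation: `flush_extent` and `ExtentSyncError` are hypothetical names, and the sketch only shows the general pattern of logging the I/O error details and then treating an fsync failure as fatal instead of continuing.

```python
import os


class ExtentSyncError(RuntimeError):
    """Hypothetical fatal error raised when fsync fails."""


def flush_extent(fd, log=print):
    """Flush fd to stable storage; treat any failure as fatal."""
    try:
        os.fsync(fd)
    except OSError as exc:
        # Record the errno and message first, so the cause is visible in the log.
        log(f"fsync failed: errno={exc.errno} ({exc.strerror})")
        # Re-raise as a fatal error; the caller is expected to shut down.
        raise ExtentSyncError("fsync failed; shutting down") from exc
```

A caller would let `ExtentSyncError` propagate to its top level and exit, since data that failed to reach disk cannot be assumed safe.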
When an out-of-disk-space error occurred during a checkpoint, it was possible for the stone to hang. In this case, the final tranlog was not completely written and could not be read on stone restart. (#46727)
When a tranlog record contained only one record, a single selective abort, the record was small enough that it was skipped by restore. (#46695)
On a heavily loaded system in which swapping is taking place, there was a race condition in acquiring an exclusive lock on the .LCK file. This could allow multiple ShrPcMonitor processes to be started: the Stone assumed that the earlier one(s) had timed out, and started another one. Once the Stone connected to one of these ShrPcMon processes, it ran correctly, but the extra ShrPcMon processes remained and used resources. (#46928)
When a very large number of new symbols are being created by multiple gem sessions, it was possible for the commands related to SymbolGem communications to intersect in such a way that the Gem was waiting for a response while the SymbolGem was not aware there was work to do. This window is narrow and the problem is rare, but it can result in a commit delay of up to 10-15 seconds. (#46825)
AllSymbols and instances of Symbol are protected against changing the objectSecurityPolicy; however, the internal buckets that implement the AllSymbols collection were not protected. Changing the security policy for these objects caused the SymbolGem to die with error 2115. (#46410)
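As a general illustration of exclusive lock-file acquisition, the following Python sketch takes a non-blocking advisory lock in a single step. It is Unix-only and uses `flock`; whether GemStone locks its .LCK file this way is not stated here, and `try_exclusive_lock` is a hypothetical helper.

```python
import fcntl
import os


def try_exclusive_lock(path):
    """Return an fd holding an exclusive lock on path, or None if already locked."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Another open file description already holds the lock.
        os.close(fd)
        return None
    return fd
```

A process that gets `None` back knows another holder is alive and should not start a second monitor; the lock is released when the holder closes the descriptor or exits.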
When checkpoints have been suspended, if an Epoch GC happened to run during this time window, the suspension was cancelled and checkpoints resumed, without any warning messages in the stone log. If extent copy backups were taking place, which is the usual reason for suspending checkpoints, this could result in the backups being corrupt. (#47133)
Symbol garbage collection can be run manually to remove unreferenced Symbols, per the instructions in the System Administration Guide. This code did not identify and mark for removal any symbols that contain characters over 255 (whose class was therefore DoubleByteString or QuadByteString). (#46614)
When the stone is in startup, it was possible for the waitstone to return an error rather than waiting. (#46518)
When cache warming does not complete because the shared page cache became full, it was logged as an error. This is an expected scenario, so this is now a warning. (#46493)
If the keyfile permits a limited number of CPUs (CPU affinity), but the Stone’s machine has fewer CPUs than this limit, remote gems were restricted to the number of CPUs on the Stone’s machine; they did not use the additional CPUs available on the remote host within the license limit. (#46207)
Keyfile limitations on the number of CPUs did not allow for machines with many CPUs, and would fail if the number of CPUs was 32 or more. The code has been adjusted and can handle up to 1024 CPUs. (#46204)
GemStone upgrade creates temporary categories to install the class comment. It was possible for the removal of these categories at the end of the upgrade process to fail with an error.
On Linux, for pstack to work correctly you may have been required to set the kernel parameter kernel.yama.ptrace_scope=0; by default, it may be set to 1. Setting it to 0 introduced a security hole.
In v3.4, GemStone’s pstack will work with a kernel configuration of kernel.yama.ptrace_scope=1, on distributions using Linux kernel 3.4 or later: Ubuntu 14.04 or later, Red Hat 7.x, and SUSE 12. (#46539)
The first gem on a remote node triggers the creation of a remote shared cache, and the creation of a page server on the stone's node. The page server on the stone's node is multithreaded and shared between all gems on that remote node.
If a subsequent gem on that remote node failed to connect to the shared page server on the Stone's node within the timeout of 20 seconds, previously it would create a private page server. This could result in excessive page servers on the stone's node. (#46946)
Two new configuration parameters have been added to address this behavior:
STN_GEM_PGSVR_CONNECT_TIMEOUT provides control over the timeout specifically for the connection between the remote gem and the page server on the stone's node.
STN_GEM_PRIVATE_PGSVR_ENABLED, if false, prevents the remote gem from starting a private page server if the connection to the shared page server fails or times out. In this case, the remote gem's login would fail.
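The interaction of the two parameters can be sketched as a small decision function in Python. This is a model of the behavior described above, not GemStone code: `login_remote_gem`, `connect_shared`, and `start_private` are hypothetical names, and the 20-second default simply mirrors the timeout mentioned in the text.

```python
def login_remote_gem(connect_shared, start_private,
                     connect_timeout=20, private_enabled=True):
    """Model which page server a remote gem ends up using at login.

    connect_shared(timeout) returns True if the shared page server on the
    stone's node was reached within `timeout` seconds, otherwise False.
    start_private() starts a private page server.
    """
    if connect_shared(connect_timeout):
        return "shared"
    if private_enabled:
        # Fall back to a private page server.
        start_private()
        return "private"
    # With the fallback disabled, the login fails instead.
    raise ConnectionError("login failed: shared page server unavailable")
```

In this model, `connect_timeout` plays the role of STN_GEM_PGSVR_CONNECT_TIMEOUT and `private_enabled` the role of STN_GEM_PRIVATE_PGSVR_ENABLED.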
The connection between the gem's pgsvr on the mid cache and the gem's pgsvr on the stone cache used a random port number; it should have connected to the well-known port number for the pgsvr on the stone cache. (#46382)
There were code paths in which page manager thread processing could be delayed based on the cache timeout. This could result in operations such as stopZombieSession: taking an unreasonably long time for a session on an overloaded remote host. (#46956)
When a remote cache died, the multithreaded page server for that host on the stone’s node was not entirely cleaned up. Entries for the page servers continued to use a slot in the shared page cache monitor client table, although they did not have a process table entry. (#47117)
Sending a message to the results of the private primitive method Object >> _primitiveAt: had a risk of SEGV with instances of the internal, hidden classes LargeObjectNode or NscNode under some specific circumstances. (#47107)
The new class PrivateObject is now the superclass for internal hidden classes, and sending messages to PrivateObjects other than those implemented in PrivateObject will signal a MessageNotUnderstood rather than SEGV or other undesirable behavior. See PrivateObject.
GsExternalSession >> lastResult is used to fetch the results of execution. lastResult previously fetched the result from the session that had the most recent previous access, which in an environment with multiple instances of GsExternalSession performing work, could be a different session than the receiver. (#47021)
When GEMSTONE_NRS_ALL was set to a value that included a #dir:%D, and the NetLDI was started in that environment using the -D argument, an error occurred and the NetLDI did not start. (#47126)
The results of allSelectors could return duplicate symbols, if the superclass implemented the same method. (#46621)
If the UserGlobals dictionary was not present, errors occurred in several methods, including Behavior>>methodCategories invoked by GBS browsers. The correction is in the underlying invocation of GsPackagePolicy currentOrNil. (#46478)
If an object is in an IdentityBag or IdentitySet with more than about 1015 or 2030 elements, respectively, a listReferences: or fastListReferences: operation did not detect the reference. (#46645)
If a large UnorderedCollection (NSC) contained an object that referenced a search object, but did not contain the search object directly, the results of a findReferences: or fastFindReferences: could still have included the NSC, in addition to the correct referencing object. (#47187)
Some indexing processes incremented the ProgressCount statistic but initialized IndexProgressCount. (#45609)
It was possible for the method ExecBlock >> selfValue to return an out of range error, rather than an object or nil. This method is invoked to get process frame contexts by debugger methods, and for GsDevKit continuations contexts. (#46661)
Instances of classes that are defined as DbTransient can be persisted, but their instance variable data is not written to disk. When setting the objectSecurityPolicy of a committed DbTransient object, the change to the security policy was not visible outside of the session that made the change. (#46655)
The handling of commit transactionConflict keys did not correctly handle synchronized commit failures; the commitResult key was incorrectly used as the conflict details key in recent versions. (#46768)
Invoking GsSecureSocket >> useCertificateFile:withPrivateKeyFile:privateKeyPassphrase:, with a keyfile that did not require a passphrase and a nil privateKeyPassphrase argument, resulted in a prompt to stdin for the passphrase from within OpenSSL code. (#46913)
If the last line of a configuration file did not include an end of line indicator (CR or LF), that line was ignored when reading the configuration file. (#46716)
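The class of bug described above is easy to reproduce in any line-oriented reader that only emits a line when it sees a terminator. As a minimal Python sketch (not GemStone's parser; `read_config_lines` is a hypothetical helper and the parameter values are purely illustrative), the following reader keeps a final line that has no end-of-line indicator:

```python
def read_config_lines(text):
    """Return every non-blank line, including a final line with no terminator."""
    # splitlines() treats CR, LF, and CRLF (among other boundaries) as line
    # ends and still returns the last line when it is unterminated.
    return [line for line in text.splitlines() if line.strip()]


# The second setting has no trailing CR or LF, yet it is still returned.
sample = "STN_GEM_TIMEOUT = 30;\nSTN_GEM_PGSVR_CONNECT_TIMEOUT = 20;"
lines = read_config_lines(sample)
```

Here `lines` contains both settings, which is the behavior the fix restores for the last line of a configuration file.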
Before a reclaim operation, the configuration settings are saved, and restored after the reclaim is complete. The ReclaimGem was saving and restoring the complete set of GsUser parameters, not just the ones for Reclaim, which had a risk of overwriting any updates to AdminGem settings. (#46273)
The stone configuration file that is used for extent names must be writable, to allow extents to be added programmatically without creating inconsistency. To avoid risk, the stone should not start up if the configuration file is read-only. This situation was not handled correctly for all cases where configuration files are passed in using the -e and/or -z argument. The problems included unclear error messages, and the stone starting up but failing with an error when an extent was added. (#47054)
If a String ends with characters with codePoint zero, and the repository is in Unicode Comparison Mode, hash was computed incorrectly. (#46932)
The primitive failure handling code for the at: anIndex argument was incorrect, resulting in a meaningless error message. (#46537)
When sending String >> withAll: with certain particularly structured DoubleByteString arguments, codePoints in the result were truncated to less than 256, and the result was an instance of String. (#46879)
If an instance of Utf8 includes encoded Characters with codePoints in the range 128..255, Utf8 >> decodeToString produced a DoubleByteString instead of a String. (#46877)
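The expected result class can be reasoned about from the highest codePoint in the decoded text. The Python sketch below models that rule only; it is not the GemStone implementation. `narrowest_string_class` is a hypothetical helper, and the 65535 boundary between DoubleByteString and QuadByteString is an assumption based on two bytes per character.

```python
def narrowest_string_class(utf8_bytes):
    """Name the narrowest string class able to hold the decoded text."""
    text = utf8_bytes.decode("utf-8")
    highest = max(map(ord, text), default=0)
    if highest <= 255:
        return "String"            # one byte per character
    if highest <= 0xFFFF:          # assumed limit for two bytes per character
        return "DoubleByteString"
    return "QuadByteString"        # anything larger


# U+00E9 (233) is encoded as two bytes in UTF-8 but still fits in a String.
result = narrowest_string_class("\u00e9".encode("utf-8"))
```

The point of the example is that a multi-byte UTF-8 encoding does not by itself imply a wider string class; only the decoded codePoint matters.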
Invoking the method String>>findPatternNoCase:startingAt:, with one of the pattern arguments an instance of Unicode16 or other Unicode string class, resulted in an error if the repository was not in Unicode Comparison Mode. (#46975)
Symbols containing non-alphanumeric characters (other than underscore) normally require quoting, but this rule does not apply to legal binary selectors (which may contain only non-alphanumeric characters). Binary selectors with more than one character that included the $- character incorrectly required quoting to evaluate. (#46603)
When the repository is in Unicode Comparison Mode (StringConfiguration is Unicode16), GsFile methods that return file names outside the ASCII range should decode the file names from UTF8 into Unicode strings. The method GsFile >> contentsAndTypesOfDirectory:onClient: did not do this correctly when onClient: was false. (#46894)
The searchlogs script returned all entries when the sessionid was used as a filter.
It also did not accept IPv4 addresses for the client filter. (#44458)
The argument for GsSocket >> read: is now required to be greater than zero. An argument of zero, which previously returned nil (although no error string was set), will now signal an ArgumentError. (#42322)
This method incorrectly used a 1-based offset for a 0-based C array. (#46919)
The system manages reclaim activity against free space by relying on these two settings, so that reclaim does not use up all free space. However, if these two settings are set inappropriately, the system can get stuck, unable to acquire free space by performing reclaim. In v3.4, the system generates an error if you attempt to set such a configuration and, in cases where these checks are bypassed, prints a warning in the reclaim gem log.
If a transaction log was manually gzipped before being transmitted (e.g. while the logsender and logreceiver were not connected), the transmission failed with an error on reconnect. Now, manually gzipped tranlogs can be read by the logsender. (#46284)
If transaction logs written with record-level compression (using copydbf -c or after being transmitted to the slave by the logreceiver) were manually gzipped, these .gz files were not usable by restore or copydbf. (#46213)
When a Float in the range of SmallDouble was passivated then reactivated, the result was a SmallDouble rather than the original Float. (#44082)