2. Bug Fixes

The following bugs were present in v3.7.4.3 and are fixed in this version.

Risk of corruption after restart during large commit record backlog

If the Stone is stopped or shuts down while there is a commit record backlog, it disposes of that backlog on restart. If free space in the extents runs low during this disposal, so that the Stone attempts to grow the extents, repository corruption can result. This is a risk mainly when the commit record backlog is very large and commits occur before the disposal process has completed. (#51768)

After tranlog full condition, risk of missing logout record

A case was encountered where after a series of tranlog full conditions during which a session terminated in the middle of a commit, the Stone did not recognize that the session was gone, and the oldest tranlog required for restore remained at the one current at the time this session started the commit. No information was lost; the ability to restore from backup, however, became dependent on old tranlog retention. Deficiencies in the Stone’s buffering of these records have been fixed. (#51708)

If Stone killed while opening new tranlog, restart may fail

There is a timing window after the Stone opens a new tranlog but before it writes the tranlog’s root record. If the Stone is killed during this window, restart fails with "readLastRecord fileId ... cannot find Root record". (#51700)

Risk of SEGV if SIGTERM during commit

When a process is in the middle of a commit and receives a SIGTERM, it may SEGV. (#51604)

Printing the Smalltalk stacks did not handle fatal errors properly

When printing the Smalltalk stack using pstack or kill -USR1, a fatal error in the Gem, such as termination due to a lostOT, was not handled correctly; gdb could fail to detach, hanging the Gem. (#51478)

SIGUSR1 is also now ignored if sent less than 2 seconds since the previous such signal.
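
The 2-second rule behaves like a simple debounce. The Python sketch below is illustrative only and is not GemStone's implementation: the class name SignalDebouncer and the injected clock value are hypothetical, and it assumes the interval is measured from the most recent signal received, whether or not that signal was handled.

```python
class SignalDebouncer:
    """Hypothetical helper: decide whether a repeated signal should be handled.

    Assumption: the interval is measured from the previous signal received,
    even if that previous signal was itself ignored.
    """

    def __init__(self, min_interval=2.0):
        self.min_interval = min_interval
        self.last_seen = None  # arrival time of the previous signal, if any

    def should_handle(self, now):
        """Return True if a signal arriving at time `now` (seconds) is handled."""
        previous = self.last_seen
        self.last_seen = now
        # The first signal is always handled; later ones need a large enough gap.
        return previous is None or (now - previous) >= self.min_interval
```

Passing the arrival time in as an argument, rather than reading a clock inside the method, keeps the sketch easy to exercise with fixed timestamps.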

Upgrade cleared Deprecated action

The Deprecated class supports automatically writing to a log or raising an error when a deprecated method is executed, as well as doing nothing (the default). Upgrade reset this setting, removing any non-default deprecated action. Now, upgrading a repository with a non-default deprecated action preserves that action. (#51520)

GsObjectSecurityPolicy dynamicInstanceVariables failed on an upgraded repository

Sending #dynamicInstanceVariables to an instance of GsObjectSecurityPolicy in a repository upgraded from v3.5.6 or earlier reported a corrupt object error. (#51411).

SymbolGem may get commit conflicts in administrative update during user login

When a user with password features enabled logs in, the login triggers the SymbolGem to update the user’s security data, for example with the last login time. If an administrative change that also updated security data was made at the same time in another session, this could cause a commit conflict in the SymbolGem. Previously, the SymbolGem restarted. Now, the SymbolGem aborts the transaction. The login succeeds, but this user’s security data is not updated by the login, which could affect account aging or other limits. (#51611)

When stopstone was prompting for input, it did not respond to Control-C

When stopstone is executed without arguments, it prompts for the Stone name, user name, and password. While waiting at a prompt, the process did not respond to Control-C to terminate. (#51428)

ProfMonitor profiling object creation may crash gem

When ProfMonitor monitors object creation (using the #objCreation option or report) and a scopes overflow occurs, the Gem becomes unusable and eventually either hangs hard or gets a HostCallDebugger. (#51822)

Small memory leak on host password validation failure

There is a small memory leak in validating a host password when the result is not PAM_SUCCESS. (#51597)

Risk of lostOT during reclaimAll

Repository >> reclaimAll waited for an increasing time when reclaim made no progress. This created a risk of a lostOT when the ReclaimGem was committing at a high rate. (#51506)

In-memory garbage collection not aggressive enough

Cases have been fixed in which the in-memory GC of perm_gen and code_gen was not aggressive enough, resulting in out-of-memory issues earlier than necessary. (#51577)

Warming leaf caches could have caused commit record backlog

Cache warming runs in transaction, and when performing a large amount of warming over a slower connection, it could result in a commit record backlog. Now, while still running in transaction, leaf cache warming monitors the commit record backlog and aborts if necessary. (#50103)

With GEM_REMOTE_COMMIT and mid-cache, commits broken

When a remote Gem is using a mid-level cache, and is configured with:
GEM_PGSVR_USE_SSL = true;
GEM_REMOTE_COMMIT = true;

then commits are broken, due to in-memory corruption in the pgsvr for that session. GEM_REMOTE_COMMIT was added in v3.7.4.

Performance of GsSecureSocket secureAccept and close

GsSecureSocket >> secureAccept and GsSecureSocket >> close were unreasonably slow. (#51782)

Copydbf on backup could miss reporting oldest tranlog required

copydbf on a programmatic .dbf backup reports the oldest tranlog required to restore this backup. If the oldest tranlog itself required an earlier tranlog, that was not correctly reported by copydbf of the backup file. This may be the case when a single commit spans two tranlogs. The oldest tranlog required for a tranlog is correctly reported by copydbf on that tranlog. (#51703)
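
The chained dependency described above can be pictured as following "requires" links backward until they stop moving. The Python sketch below is illustrative only and does not reflect how copydbf works internally: the function name and the dictionary of tranlog numbers are hypothetical.

```python
def oldest_tranlog_required(first_needed, oldest_needed_by):
    """Hypothetical helper: resolve the oldest tranlog needed for a restore.

    first_needed     -- tranlog number the backup directly requires
    oldest_needed_by -- dict mapping a tranlog number to the oldest tranlog
                        that tranlog itself requires (made-up sample data)
    """
    current = first_needed
    while True:
        # A tranlog with no recorded dependency is treated as self-contained.
        earlier = oldest_needed_by.get(current, current)
        if earlier >= current:
            return current
        current = earlier  # step back and check that tranlog as well
```

For example, if a commit spans tranlogs 11 and 12 and the backup directly requires tranlog 12, calling oldest_tranlog_required(12, {12: 11}) yields 11.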

Issues related to Hot Standby

Risk of Gem protocol error if session is signaled after logsenderSessionId

After executing logsenderSessionId, interrupt handling may sometimes result in a protocol error, 'Unexpected packet received from Stone'. (#41448)

stoplogreceiver may have silently failed to stop the logreceiver

It was possible for stoplogreceiver to return 0 without printing an error, yet leave the logreceiver process running. (#51472)

Restarting tranlog restore after a stopContinuousRestore has risk of error requiring Stone restart

If stopContinuousRestore is executed while the Stone thread running recovery is processing a large work queue, there is a risk that a subsequent continuousRestoreFromArchiveLogs: or restoreFromArchiveLogs: may fail with a detected Fork in Time error. To continue, the Stone must be restarted. (#51548)

GsFile issues

GsFile opening with mode 'ab+' did not allow read positioning

When a GsFile is opened for read and append, the mode may be specified as 'a+', 'ab+', or 'a+b'. The specification 'ab+' was not handled correctly; the resulting file was open for append but could not be positioned for read. (#51525)
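
One way to see why 'ab+' and 'a+b' should behave the same is to reduce each mode string to a canonical form before interpreting it. The Python sketch below is illustrative only and is not GsFile code: the function names are hypothetical, and it assumes C-style fopen mode strings made of a base letter followed by optional 'b' and '+' flags.

```python
def normalize_mode(mode):
    """Hypothetical helper: canonicalize a C-style fopen mode string.

    The base letter (r, w, or a) comes first, then 'b' if present,
    then '+' if present, so 'a+b' and 'ab+' both become 'ab+'.
    """
    if not mode or mode[0] not in "rwa":
        raise ValueError(f"invalid mode: {mode!r}")
    flags = set(mode[1:])
    # Reject unknown flags and repeated flags such as 'a++'.
    if not flags <= {"b", "+"} or len(flags) != len(mode) - 1:
        raise ValueError(f"invalid mode: {mode!r}")
    return mode[0] + ("b" if "b" in flags else "") + ("+" if "+" in flags else "")


def allows_read(mode):
    """Return True if the mode permits reading as well as writing or appending."""
    canonical = normalize_mode(mode)
    return canonical[0] == "r" or "+" in canonical
```

With this normalization, allows_read returns True for all three of 'a+', 'ab+', and 'a+b', and False for plain 'a'.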

GsFile open with nil path or mode failed to set error message

GsFile operations that fail return nil rather than signalling an error, and put the error message in the server or client error buffer. The error message was not set when the path or mode argument was nil.

GciTsLogin() not thread safe from a GCI main program

There may be initialization errors if multiple C threads in a GCI program call GciTsLogin at the same time. (#51600)

Parser did not always correctly handle true, false, nil, and self as selectors

The ANSI standard states that #true, #false, #nil, #self, and #super are disallowed as selectors. These can be used in GemStone, although this usage is not tested. The parser was unreliable in recognizing #true, #false, #nil, and #self as selector tokens. (#51669)

Configuration file update could be incorrect when adding an extent

When an extent is added programmatically, the new DBF_EXTENT_NAMES and DBF_EXTENT_SIZES are automatically written to the config file. When the existing extent size was defined in units of GB, the programmatic argument was incorrectly interpreted as being in GB rather than MB. (#51605)
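
The unit mix-up can be avoided by keeping each size paired with its own unit. The Python sketch below is illustrative only and is not how GemStone updates its configuration file: the function names are hypothetical, only MB and GB suffixes are handled, and treating a bare number as MB is an assumption.

```python
UNIT_TO_MB = {"MB": 1, "GB": 1024}


def size_to_mb(text):
    """Hypothetical helper: convert a size such as '2GB' or '500MB' to megabytes."""
    text = text.strip().upper()
    for unit, factor in UNIT_TO_MB.items():
        if text.endswith(unit):
            return int(text[: -len(unit)]) * factor
    return int(text)  # assumption: a bare number is already in MB


def append_extent_size(existing_sizes, new_size_mb):
    """Add a size given in MB, writing its unit explicitly.

    The new entry never inherits the unit used by the existing entries.
    """
    return existing_sizes + [f"{new_size_mb}MB"]
```

For example, appending a 500 MB extent to an existing list of ['2GB'] gives ['2GB', '500MB'] rather than a second entry read as 500 GB.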

NRS precedence incorrect for remote Gem

A #dir: or #log: directive in the Stone’s NRS in the login parameters for a remote Gem did not correctly take precedence over the GEMSTONE_NRS_ALL provided to the remote Gem via the remote NetLDI. (#51507)

If listen on :: fails, NetLDI will only listen on localhost

If the listening port provided to startnetldi was already in use by an existing socket, startnetldi succeeded but listened on that port only on localhost. (#51651)

Passivate of subnormal LargeIntegers was incorrect

Normally integers that are within the range of SmallIntegers are instances of SmallInteger; however, in some cases in upgraded applications, LargeIntegers may exist that are in the SmallInteger range. The passivated form was incorrect, and errored on activation. (#51625)

roundedHalfToEven sent to NaN or Infinity gets stack overflow

If Number >> roundedHalfToEven was sent to a kind of NaN or Infinity, it failed with a stack overflow. (#51658)
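
Rounding half to even needs an explicit guard for non-finite values, since NaN and Infinity have no nearest integer. The Python sketch below is illustrative only and is not the GemStone implementation: the function name is hypothetical, and returning the non-finite value unchanged is an assumption, since these notes do not say what the fixed method returns.

```python
import math


def rounded_half_to_even(x):
    """Hypothetical helper: round a float to the nearest integer, ties to even.

    Assumption: NaN and Infinity are returned unchanged instead of being rounded.
    """
    if math.isnan(x) or math.isinf(x):
        return x  # nothing to round; avoid any further computation
    # Python's built-in round() on a float rounds ties to the even integer.
    return round(x)
```

Checking for the non-finite cases first means the rounding logic only ever sees values that actually have a nearest integer.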

Symbol GC may leave internal buckets without back reference

After a very large number of symbols are garbage collected, such that the internal buckets shrink from a large to a small object, the reference back to the KeyValueDictionary may be reset to nil. This is not expected to cause issues; it is caught by AllSymbols audit. It is recommended to audit AllSymbols after Symbol GC. (#51678)

Issues related to continueTransaction

Incorrect conflict results after a failed continueTransaction

When a continueTransaction failed, the methods that report on commit conflicts returned incorrect results. (#51720)

continueTransaction does not clear a dirty write lock

If an object has a write lock that is dirty, a successful continueTransaction was failing to refresh the view to allow a clean view of the lock. (#51647)

Restore from backup did not clear NotTranloggedGlobals

The NotTranloggedGlobals holds objects that are committed, but for which changes are not recorded in tranlogs. This was not getting cleared during restoreFromBackup. (#51672)

After mid-level cache restarted, gems may fail to connect

If a mid-level cache goes down and restarts via a Gem login, sessions may intermittently fail to reconnect to the mid-level cache. (#51664)

Deleting a UserProfileGroup may fail on upgraded repository with few security policies

A repository upgraded from v2.1 or earlier with fewer than 20 object security policies has nils in the SystemRepository (the collection that holds object security policies) for slots under 20. This causes UserProfileGroup >> deleteGroup: to fail. (#51518)

systemLocksDetailedReport did not handle in-logout sessions

System class >> systemLocksDetailedReport could have reported "session does not exist" if a session logged out during execution of the method. (#51474)

Seaside scripts invoked deleted method

The method Breakpoint >> trappable: was removed in recent releases, but was still in use by several Seaside scripts. The affected scripts in $GEMSTONE/seaside/bin (startMaintenance, startMaintenance30, startSeaside_FastCGI, and startSeaside30_Adaptor) have been updated. (#51711)

Incorrect FileReference printOn: output

The printed form of a FileReference previously omitted the @ and quotes, and thus could not be used as-is to recreate the FileReference.

PrimitiveNumber cache statistic incorrect for MFC

The statistic for PrimitiveNumber was recorded incorrectly for a process running MFC. (#51636)

Wrong error message for invalid TimeZone lookup

In some configurations, particularly on Windows, using TimeZone named: with an invalid path resulted in an incorrect error such as a GciTransportError or an MNU. (#51679)

Some Page Cache stats not correct on remote caches

Several cache statistics for the Gem and Shrpcmon for pages in or added to the cache are not correct for remote caches. (#51535)

 
