6. Running GemStone

This chapter shows you how to perform some common GemStone/S 64 Bit system operations:

Starting the GemStone Server
How to start the GemStone repository and troubleshoot Stone startup failures.

Starting a NetLDI
How to start a NetLDI and troubleshoot NetLDI startup failures.

Starting a GemStone Session
How to log in and troubleshoot login failures.

Shutting Down Sessions, the Object Server, and NetLDI
How to stop sessions and shut down server processes.

Logins without a stone running
How to log in a solo session in the absence of a Stone.

Recovering from an Unexpected Shutdown
Troubleshooting unexpected shutdowns.

6.1  Starting the GemStone Server

In order to start a Stone repository monitor, the following must be identified through your operating system environment:

The GEMSTONE environment variable must point to the directory where GemStone is installed, such as /users/gemstone. The directory $GEMSTONE/bin should be in your search path for commands.

The repository monitor must find a configuration file. The default is $GEMSTONE/data/system.conf. Other files can supplement or replace the default file; for information, see How GemStone Uses Configuration Files.

The configuration file must supply the path to one or more repository files (extents) and to the locations used to read and write transaction logs. The default configuration file specifies $GEMSTONE/data/extent0.dbf for the extent file, and places transaction logs in $GEMSTONE/data/. For further information, see Choosing the Extent Location.

To Start GemStone

Follow these steps to start GemStone following installation or an orderly shutdown. (To recover from an abnormal shutdown, refer to Recovering from an Unexpected Shutdown.)

Step 1. Set the GEMSTONE environment variable to the full pathname (starting with a slash) of the directory where GemStone is installed. Ordinarily this directory has a name like GemStone64Bit3.6.0-x86_64.Linux (depending on your platform). For example:

$ GEMSTONE=/users/GemStone64Bit3.6.0-x86_64.Linux
$ export GEMSTONE

If you have been using another version of GemStone, be sure you update or unset previous settings of these environment variables:

  • GEMSTONE
  • GEMSTONE_SYS_CONF
  • GEMSTONE_EXE_CONF
  • GEMSTONE_NRS_ALL

Step 2. Set your UNIX path. One way to do this is to use one of the gemsetup scripts. There is one version for users of the Bourne and Korn shells and another for users of the C shell. These scripts also set your man page path to include the GemStone man pages. Note that these scripts append to the end of your path or man path; you will need to manually remove references to older versions of GemStone.

(Bourne or Korn shell)
$ . $GEMSTONE/bin/gemsetup.sh

or (C shell)
% source $GEMSTONE/bin/gemsetup.csh

Step 3. Start GemStone by using the startstone command:

% startstone [gemStoneName]

where gemStoneName is optional and is the name you want the repository monitor to have. The default name is gs64stone. If you are using encrypted extents, the arguments that supply the keys and passphrases are required. See startstone for additional information.
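The three steps above can be collected into a small script. The following is a minimal sketch for a Bourne-compatible shell, not a GemStone-supplied utility: check_gemstone_env is a hypothetical helper name, and the installation path in the usage comment is the example path from Step 1.

```shell
#!/bin/sh
# check_gemstone_env is a hypothetical helper, not part of GemStone.
# It verifies that GEMSTONE is a full pathname and that its bin
# directory exists before you source gemsetup.sh and run startstone.
check_gemstone_env() {
    case "${GEMSTONE:-}" in
        /*) ;;   # full pathname starting with a slash, as Step 1 requires
        *)  echo "GEMSTONE must be a full pathname: '${GEMSTONE:-}'" >&2
            return 1 ;;
    esac
    if [ ! -d "$GEMSTONE/bin" ]; then
        echo "no bin directory under $GEMSTONE" >&2
        return 1
    fi
    return 0
}

# Usage (left as comments so the sketch can be sourced safely):
#   GEMSTONE=/users/GemStone64Bit3.6.0-x86_64.Linux
#   export GEMSTONE
#   check_gemstone_env || exit 1
#   . "$GEMSTONE/bin/gemsetup.sh"
#   startstone gs64stone
```

Checking the variable first gives a clearer message than a "command not found" error from the later steps.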

To Troubleshoot Stone Startup Failures

If the Stone repository monitor fails to start in response to a startstone command, it’s likely that the cause is one of the following. Inspect the Stone log for clues (the default location is $GEMSTONE/data/gs64stone.log).

  • The GemStone key file is missing or invalid.
  • The shared page cache cannot be attached, usually due to OS limits or because a previous shutdown did not complete cleanly.
  • A problem with the extents: an extent is missing or in use by another process, the current user does not have write permission, or the authorization for encrypted extents failed.
  • A problem with transaction logs: a log needed for recovery is missing, or the log directory or device does not exist.
  • The repository has become corrupted.

Missing or Invalid Key File

The Stone repository monitor must be able to read the GemStone key file. By default, this is $GEMSTONE/sys/gemstone.key. The location and filename can be configured by the KEYFILE configuration parameter.

Ordinarily, you create the key file during installation from information provided by GemStone. Be careful to enter the information correctly. GemStone key files are platform-specific, and key files for earlier versions may not work with new major releases.

If you do not have a valid key file, contact GemStone Technical Support as described under Technical Support.

Shared Page Cache Cannot Be Attached

The shared page cache monitor must be able to create and attach to the shared memory segment that will serve as the shared page cache. Several factors may prevent this from happening:

  • On some platforms, shared memory is not enabled in the kernel by default, or its default maximum size is too small to accommodate the GemStone configuration. GemStone’s default configuration requires a shared memory segment somewhat larger than 75 MB. For specifics about configuring shared memory, refer to the GemStone/S 64 Bit Installation Guide for your platform.
  • If the size of the shared page cache has been increased, the operating system’s limit on shared memory regions may need to be increased accordingly. GemStone includes a utility, $GEMSTONE/install/shmem, that will help you check the configuration; this is described here.
  • The repository executables (the Stone, Gems, and Page servers) must have permission to read and write the shared page cache. Ways to set up access are described in How To Set Up a Raw Partition. In general, users must belong to the same group as the Stone repository monitor. If the Stone is running as root, it is unlikely that other users will be able to access the shared page cache.

Extent Missing or Access Denied

If the Stone repository monitor cannot access a repository extent file, it logs a message like the following:

GemStone is unable to open the file $GEMSTONE/data/extent0.dbf 
       reason = File does not exist 
 
    An error occurred opening the repository for exclusive access.
    Stone startup has failed.

The reason should provide enough information: the extent file could be missing, the permissions on the file or directory could be set incorrectly, or there may be an error in the configuration file that points to the extents. Correct the problem, then try starting GemStone again.
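To pull the file name and reason out of a long Stone log, you can filter for the two lines shown in the message above. This is a minimal sketch that assumes the log uses the message layout shown in this section; stone_failure_reasons is a hypothetical helper name.

```shell
# stone_failure_reasons is a hypothetical helper, not part of GemStone.
# For each "unable to open the file" message in the given log, it prints
# the file name followed by the text of the "reason =" line.
stone_failure_reasons() {
    awk '
        /GemStone is unable to open the file/ { file = $NF; next }
        /reason =/ && file != "" {
            sub(/^[ \t]*reason = */, "")   # drop the "reason = " prefix
            sub(/[ \t]+$/, "")             # drop trailing blanks
            print file ": " $0
            file = ""
        }
    ' "$1"
}

# Usage, with the default Stone log location:
#   stone_failure_reasons "$GEMSTONE/data/gs64stone.log"
```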

Extent Open by Another Process

If another process has an extent file open when you attempt to restart GemStone, a message like the following appears in the Stone log (by default, $GEMSTONE/data/gs64stone.log):

GemStone is unable to open the file $GEMSTONE/data/extent0.dbf 
       reason = exclusive open:  File is open by another process. , file /gshost/GemStone3.6/data/extent0.dbf,  failed with  EWOULDBLOCK 
 
    An error occurred opening the repository for exclusive access.
    Stone startup has failed.

Close any other Gem sessions (including Topaz sessions) that are accessing the repository you are trying to restart. Use ps -ef (the options on your system may differ) to identify any pgsvrmain processes that are still running, and then use kill processid to terminate them. Try again to start GemStone.
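The search for leftover pgsvrmain processes can be sketched as a filter over ps -ef output. The helper name pids_for_command is hypothetical, and the sketch assumes the Linux ps -ef layout in which the PID is the second column and the command is the eighth; on other systems the columns may differ, so review the list before you send any signal.

```shell
# pids_for_command is a hypothetical helper, not part of GemStone.
# It reads `ps -ef` output on stdin and prints the PID of each process
# whose command (eighth column, directory part removed) equals $1.
# Matching on the command column keeps the filter from listing itself.
pids_for_command() {
    awk -v name="$1" 'NR > 1 {
        n = split($8, parts, "/")
        if (parts[n] == name) print $2
    }'
}

# Usage:
#   ps -ef | pids_for_command pgsvrmain
#   kill <processid>     # plain kill, so the process can shut down in order
```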

Extent Already Exists

If GemStone attempts to recover from a system crash that occurred just after an extent was created, and GemStone was not able to write a checkpoint when the extent was added, you will find an error message like the following in the Stone log:

Repository was not shutdown cleanly, recovery needed.
    fileName !@::1#netldi:51234#dbf!/gshost/GemStone3.6/data/extent1.dbf
      already exists, delete it and restart recovery.
    Stone startup has failed.

Check that an extent was being added to the repository at or shortly before the crash. If necessary, look for a message near the end of the Stone log file.

  • If an extent was being added, there is no committed data in the extent file yet. Delete the specified file and do not replace it with anything. Try to start GemStone again. The recovery procedure will recreate the extent file.
  • If an extent was NOT being added, it is possible that an existing extent has been corrupted. For instance, extent0.dbf of a multiple-extent repository may have been overwritten. Try to determine the cause and whether the action can be rectified. You may have to restore the repository from a backup.

Other Extent Failures

At startup, the GemStone system performs consistency checks on each extent listed in DBF_EXTENT_NAMES.

All extents must have been shut down cleanly with a repository checkpoint the last time the system was run. This consistency check is the only one for which GemStone attempts automatic recovery.

The following consistency checks, if failed, cause the startup sequence to terminate. These failures imply corruption of the disk or file system, or that the extents were modified at the operating system level (such as by cp or copydbf) outside of GemStone’s control and in a manner that has corrupted the repository.

  • Extents must be in proper sequence within DBF_EXTENT_NAMES.
  • Extents must be properly sequenced in time.
  • The last checkpoint must have occurred earlier than or at the same time as the current system time (in GMT).
  • Extents must belong to the correct repository.

Transaction Log Missing

If GemStone cannot find the transaction log file for the period between the last checkpoint and an unexpected shutdown, it puts a message like this in the Stone log:

Extent 0 was not cleanly shutdown.
<Repository startup statistics>
 
Repository startup from checkpoint fileId 2 blockId 16, needs recovery
 
ERROR: cannot find log file(s) to recover repository.
To proceed without tranlogs and lose transactions committed
since the last checkpoint use "-N" switch on your startstone
command.
 
An error occurred when attempting to start repository recovery.
Waiting for aiowrites to complete
 
Stone startup has failed.

If the log file was archived and removed from the log directory, restore the file.

If the log file is no longer available, you can use startstone -N to restart from the most recent checkpoint in the repository. However, any transactions that occurred during the intervening period cannot be recovered.

NOTE
When you use startstone with the -N option, any transactions occurring after the last checkpoint are permanently lost.

Other Startup Failures

  • Check /opt/gemstone/locks (or equivalent location, as discussed here) and delete old lock files. On Solaris systems, also check /tmp/gemstone for stoneName..FIFO.
  • Certain unexpected shutdowns may leave UNIX interprocess communication facilities allocated, which can block attempts to restart the repository monitor. Use the command ipcs to identify the shared memory segments and semaphores allocated, then use ipcrm to free those resources allocated to a repository monitor that is no longer running. For information about ipcs and ipcrm, consult your operating system’s documentation.
  • If it takes more than 5 minutes for your cache to complete initialization, the startup timeout may be expiring. In that case, raise the limit by setting the environment variable $GEMSTONE_SPCMON_STARTUP_TIMELIMIT.
  • Check your installation configuration and make sure that all required files and libraries are present and uncorrupted.
  • Try to run pageaudit on the repository. (See Repository Page and Object Audit.)
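For the ipcs and ipcrm step in the list above, the following sketch lists candidate shared memory segments. It assumes the Linux ipcs -m column order (key, shmid, owner, perms, bytes, nattch); unattached_segments is a hypothetical helper name. A segment with no attached processes is only a candidate: confirm that it belonged to a repository monitor that is no longer running before you remove it.

```shell
# unattached_segments is a hypothetical helper, not part of GemStone.
# It reads Linux-style `ipcs -m` output on stdin and prints the shmid of
# each segment owned by $1 that has no attached processes (nattch = 0).
unattached_segments() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner && $6 == 0 { print $2 }'
}

# Usage:
#   ipcs -m | unattached_segments gsadmin
#   ipcrm -m <shmid>     # only after confirming the segment is stale
```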

If you are still unable to start GemStone or determine the reason that startup is failing, contact your local GemStone administrator or GemStone Technical Support.

If this is an existing GemStone repository and the problems reported on startup attempts indicate that the repository is corrupt, you may need to restore from backups, as described in Chapter 11; see “How to Restore from Backup.”

Listing Running Servers

The gslist utility lists all Stone repository monitors, shared page cache monitors, and NetLDIs that are running. The gslist command by itself checks the locks directory (/opt/gemstone/locks, /usr/gemstone/locks, or $GEMSTONE_GLOBAL_DIR/locks) for entries. The -v option causes it to verify that each process is alive and responding. For example:

% gslist -v
Status Version Owner     Started      Type   Name
------ ------- --------- ------------ ------ ----
 OK   3.6.0    gsadmin   Aug 24 12:02 cache  gs64stone~1c9fa07f0412665
 OK   3.6.0    gsadmin   Aug 24 12:02 Stone  gs64stone
 OK   3.6.0    gsadmin   Aug 24 10:13 Netldi gs64ldi
 

By default, gslist lists servers on the local node. The -m host option performs the operation on node host, which must have a compatible NetLDI running.
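In a monitoring script, you may want the status of a single server rather than the whole listing. A minimal sketch, assuming the gslist -v column layout shown above, in which Type and Name are the last two columns; server_status is a hypothetical helper name.

```shell
# server_status is a hypothetical helper, not part of GemStone.
# It reads `gslist -v` output on stdin and prints the Status column of
# the server whose Type is $1 and whose Name is $2.
server_status() {
    awk -v kind="$1" -v name="$2" \
        'NF >= 2 && $(NF-1) == kind && $NF == name { print $1 }'
}

# Usage:
#   if [ "$(gslist -v | server_status Stone gs64stone)" != "OK" ]; then
#       echo "Stone gs64stone is not responding" >&2
#   fi
```

Counting from the end of the line avoids depending on how many words the Started column contains.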

6.2  Starting a NetLDI

You will usually need to start a GemStone Network Long Distance Information (NetLDI) server when starting a Stone repository monitor. NetLDI servers are needed to start up Gem processes for RPC logins, and for starting up caches on behalf of Gems that are on other nodes.

If you are running distributed configurations, you will need to perform these steps on each node that requires a NetLDI.

To start a NetLDI server, perform the following steps on the node where the NetLDI is to run:

Step 1. Set the GEMSTONE environment variable to the full pathname (starting with a slash) of the directory where GemStone is installed. Ordinarily this directory has a name like GemStone64Bit3.6.0-x86_64.Linux (depending on the platform). For example:

$ GEMSTONE=/installDir/GemStone64Bit3.6.0-x86_64.Linux
$ export GEMSTONE

If you have been using another version of GemStone, be sure you update or unset previous settings of the $GEMSTONE_NRS_ALL environment variable.

Step 2. Use one of the gemsetup scripts to set your UNIX path. There is one version for users of the Bourne and Korn shells and another for the C shell. These scripts also set your man page path to include the GemStone man pages.

(Bourne or Korn shell)
$ . $GEMSTONE/bin/gemsetup.sh

or (C shell)
% source $GEMSTONE/bin/gemsetup.csh

Step 3. Start the NetLDI by using the startnetldi command.

% startnetldi
% startnetldi -g -aname

See startnetldi for additional command arguments and further detail. For information about the authentication modes, see under Configuration Decisions.

To Troubleshoot NetLDI Startup Failures

If the NetLDI service fails to start in response to a startnetldi command, it’s likely that the cause is one of the following:

  • The NetLDI is to run as root but the guest mode option is specified. This combination is not allowed.
  • The account starting the NetLDI does not have permission to create or append to its log file.
  • The account starting the NetLDI does not have read and execute permission for $GEMSTONE/sys/netldid.

Check the NetLDI log for clues. By default, the NetLDI log (netLdiName.log) is located in /opt/gemstone/log/. On some systems, this file may be located in /usr/gemstone/log/, and may be overridden using the startnetldi -l argument, or by setting $GEMSTONE_GLOBAL_DIR.
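Because the log may be in one of several directories, a script can try each in turn. This sketch makes one assumption that the text above does not state: when $GEMSTONE_GLOBAL_DIR is set, the log is expected in its log subdirectory, by analogy with the locks directory. It does not account for a location given with startnetldi -l. The name find_netldi_log is hypothetical.

```shell
# find_netldi_log is a hypothetical helper, not part of GemStone.
# It prints the first existing log file for the NetLDI named $1 and
# returns 1 if none is found.
# Assumption: with GEMSTONE_GLOBAL_DIR set, logs are under its log/ directory.
find_netldi_log() {
    name=$1
    for dir in "${GEMSTONE_GLOBAL_DIR:+$GEMSTONE_GLOBAL_DIR/log}" \
               /opt/gemstone/log /usr/gemstone/log; do
        [ -n "$dir" ] || continue          # skip the empty entry when unset
        if [ -f "$dir/$name.log" ]; then
            echo "$dir/$name.log"
            return 0
        fi
    done
    return 1
}

# Usage:
#   log=$(find_netldi_log gs64ldi) && tail -n 50 "$log"
```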

6.3  Starting a GemStone Session

This section tells how to start a GemStone session and log in to the repository monitor. The instructions apply to all logins from the node on which the Stone repository monitor is running.

This section begins with a brief discussion of environment variables, and then presents two examples. The first example starts a linked application and logs in to GemStone. The second example starts an RPC application, which in turn spawns a separate Gem session process that communicates with the GemStone server.

The examples use Topaz as the application because it is part of the standard GemStone Object Server distribution. Other applications may use different steps to accomplish the same purpose. Some users may prefer to make these steps part of an initialization file.

For an explanation of the difference between linked and RPC sessions, see Linked and RPC Applications.

To Define a GemStone Session Environment

In order to start a GemStone session, the following must be defined through your operating system environment:

  • Where GemStone executables and libraries are installed.

All GemStone users must have a GEMSTONE environment variable that points to the GemStone installation directory, such as
/installDir/GemStone64Bit3.6.0-x86_64.Linux (depending on your platform). The directory $GEMSTONE/bin should be in your search path for commands.

  • Which configuration parameters to use.

While system defaults, or a system-wide configuration file, can be used to configure Gem sessions, you may want to configure individualized environments and configuration files for specific sessions. This may involve setting an environment variable, such as GEMSTONE_EXE_CONF. For further information, see How GemStone Uses Configuration Files.

To Start a Linked Session

The following steps show how to start a linked application (here, the linked version of Topaz). The steps for setting the GEMSTONE environment variable and the operating system path for a session are the same as those given here for starting a repository monitor. They are repeated here for convenience.

The procedure assumes that the Stone repository monitor has already been started and has the default name gs64stone.

Step 1. Set the GEMSTONE environment variable to the full pathname (starting with a slash) of the directory where GemStone is installed. Ordinarily this directory has a name like GemStone64Bit3.6.0-x86_64.Linux (depending on your platform). For example:

$ GEMSTONE=/installDir/GemStone64Bit3.6.0-x86_64.Linux
$ export GEMSTONE

If you have been using another version of GemStone, be sure you update or delete previous settings of these environment variables:

  • GEMSTONE
  • GEMSTONE_SYS_CONF
  • GEMSTONE_EXE_CONF
  • GEMSTONE_NRS_ALL

Step 2. Set your UNIX path. One way to do this is to use one of the gemsetup scripts. There is one version for users of the Bourne and Korn shells and another for users of the C shell. These scripts also set your man page path to include the GemStone man pages.

(Bourne or Korn shell)
$ . $GEMSTONE/bin/gemsetup.sh

or (C shell)
% source $GEMSTONE/bin/gemsetup.csh

Step 3. Start linked Topaz:

% topaz -l

Step 4. Set the UserName login parameter:

topaz> set username DataCurator

Step 5. Log in to the Gem session. It will query you for the password.

topaz> login
GemStone Password?
[Info]: LNK client/gem GCI levels = 36000/36000
--- 08/24/20 15:16:35.199 PDT Login
[Info]: User ID: DataCurator
[Info]: Repository: gs64stone
[Info]: Session ID: 5
[Info]: GCI Client Host: <Linked>
[Info]: Page server PID: -1
[Info]: using libicu version 58.2
[Info]: Gave this process preference for OOM killer: wrote to
/proc/26091/oom_score_adj value 250
[08/24/20 15:16:35.762 PDT]
  gci login: currSession 1  linked session 
successful login
topaz 1> 

At this point, you are logged in to a Gem session process, which is linked with the application. The session process acts as a server to Topaz and as a client to the Stone. Information about Topaz is in the manual GemStone Topaz Programming Environment.

When you are ready to end the GemStone session, you can log out of GemStone and exit Topaz in one step by invoking the Topaz exit command:

topaz 1> exit
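The interactive steps above can also be supplied to Topaz as a stream of commands. The following is a sketch only: topaz_login_commands is a hypothetical helper name, the printit expression is just an example, and it assumes that Topaz reads its commands from standard input when input is piped to it. The password is passed as an argument for brevity; GS_PASSWORD in the usage comment is a hypothetical variable that stands in for however you protect the password.

```shell
# topaz_login_commands is a hypothetical helper, not part of GemStone.
# It prints Topaz input that logs in as user $1 with password $2,
# evaluates one expression, and exits.
topaz_login_commands() {
    cat <<EOF
set username $1 password $2
login
printit
System myUserProfile
%
exit
EOF
}

# Usage (requires a running Stone with the default name gs64stone):
#   topaz_login_commands DataCurator "$GS_PASSWORD" | topaz -l
```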

To Start an RPC Session

The following steps show how to start an RPC application (here, the RPC version of Topaz) on the server node. The procedure assumes that the Stone is running under the default name gs64stone and that you are already set up to run a GemStone session as described in Step 1 and Step 2 of the previous example (“To Start a Linked Session”).

Sessions that log in RPC use SRP (Secure Remote Password) and SSL to authenticate passwords for login. If the Gem is running on the server node, the connection reverts to normal socket communication after login completes.

The following steps demonstrate an RPC login from topaz:

Step 1. Use gslist to find out if a NetLDI is already running. The default name for the NetLDI is gs64ldi.

% gslist
Status Version Owner     Started      Type   Name
------ ------- --------- ------------ ------ ----
exists 3.6.0   gsadmin   Aug 24 12:02 cache  gs64stone~1c9fa07f041665
exists 3.6.0   gsadmin   Aug 24 12:02 Stone  gs64stone
exists 3.6.0   gsadmin   Aug 24 10:13 Netldi gs64ldi
 

If necessary, start a NetLDI following the instructions under Starting a NetLDI.

Step 2. Start the RPC application (such as Topaz), then set the UserName.

topaz> set username DataCurator

Step 3. Unless the NetLDI is running in guest mode with a captive account, set the application login parameters, such as HostUserName and HostPassword, after you start the application. For example:

topaz> set hostusername yourUnixId
topaz> set hostpassword yourPassword

Step 4. Set GemNetId (the name of the Gem service to be started) to gemnetobject. This script starts the separate Gem session process for you. For example:

topaz> set gemnetid gemnetobject

Step 5. Log in to the GemStone session.

topaz> login
GemStone Password?
[08/24/20 15:16:35.762 PDT]
  gci login: currSession 1 rpc gem processId 6943 socket 6
successful login
topaz 1> 

At this point, you are logged in through a separate Gem session process that acts as a server to Topaz RPC and as a client to the Stone repository monitor.

When you are ready to end the GemStone session, you can log out of GemStone and exit Topaz in one step by invoking the Topaz exit command:

topaz 1> exit

To Troubleshoot Session Login Failures

Several factors may prevent successful login to the repository:

  • Your GemStone key file may establish a maximum number of user sessions that can simultaneously be logged in to GemStone. (Note that a single user may have multiple GemStone sessions running simultaneously.) The limit itself is encoded in the keyfile used to start the stone (by default, $GEMSTONE/sys/gemstone.key), and reported in the stone log on startup. Look for a line like this:
SESSION MAX: The licensed concurrent session max is 10.
  • The STN_MAX_SESSIONS configuration option can restrict the number of logins to fewer than a particular key file allows. An entry in the Stone log file shows the maximum at the time the Stone started. Look for a line like this:
SESSION CONFIGURATION: The maximum number of concurrent sessions is 40
  • The SHR_PAGE_CACHE_NUM_PROCS configuration option restricts the number of sessions that can attach to a particular shared page cache. This is normally computed based on the setting for STN_MAX_SESSIONS.

Multi-threaded operations use additional slots for their working threads while they are executing. If you are close to your session limit, these operations may prevent other sessions from logging in.

  • The UNIX kernel must provide sufficient semaphores and file descriptors for each logged in session. See your Installation Guide for information on UNIX kernel tuning that may be necessary.
  • The owner of the Gem or a linked application process must have write access to the extent file and to the shared page cache. Use the UNIX command ipcs -m to display permissions, owner, and group for shared memory. For example:
server% ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0xc8010015 278462466  gsadmin    660        132120576  5

Typical problems occur with linked applications, which may be installed without the S bit and therefore rely on group access to the shared page cache and the repository.

Identifying and Stopping Logged-in Sessions

Privileges required: SessionAccess.

To identify the sessions currently logged in to GemStone, send the message System class>>currentSessionsReport. This message returns a report of internal session numbers and the corresponding UserId, executable, and PID. For example:

topaz 1> printit
System currentSessionsReport 
%
2 SymbolUser symbolgem 32103
3 GcUser admingcgem 32210
4 DataCurator gem 21589 on localhost
5 GcUser reclaimgcgem 32213

The session number can be used with other System class methods to stop a particular session. To get the sessionId for the current executing session, use System class >> session.

To get the UserProfile for a given session, execute:

System userProfileForSession:aSessionId

To get the UserProfile for the current session, execute:

System myUserProfile

The method System class>>descriptionOfSession:aSessionId returns an Array of descriptive information, which can be used to find detailed information and status for any session. The values in each slot are defined as follows:

1. The UserProfile of the session; nil if the UserProfile is recently created and not visible from this session's transactional view or the session is in login or processing, or has logged out.

2. A SmallInteger, the process ID of the Gem or topaz -l process.

3. The hostname of the machine running the Gem process. Specifically, the peer's hostname as seen by stone, for the gem to stone network connection used for login. (a String, limited to 127 bytes).

4. Primitive number in which the Gem is executing, or 0 if it is not executing in a long primitive.

5. Time of the session's most recent beginTransaction, commitTransaction, or abortTransaction (from System timeGmt).

6. The session state (a SmallInteger).

7. A SmallInteger whose value is -1 if the session is in transactionless mode, 0 if it is not in a transaction and 1 if it is in a transaction.

8. A Boolean whose value is true if the session is currently referencing the oldest commit record, and false if it is not.

9. The session's serial number (a SmallInteger).

10. The session's sessionId (a SmallInteger).

11. A String containing the IP address of host running the GCI process. If the GCI application is remote, the peer address as seen by the gem of the GCI application to gem network connection. If the GCI application is linked (using libgcilnk*.so or gcilnk*.dll) this is the peer's IP address as seen by stone, for the gem to stone network connection used for login.

12. The priority of the session (a SmallInteger).

13. Unique host ID of the host where the session is running (an Integer).

14. Time of the session's most recent request to stone (from System timeGmt).

15. Time the session logged in (from System timeGmt).

16. Number of commits which have occurred since the session obtained its view.

17. Nil or a String describing a system or gc gem.

18. Number of temporary (uncommitted) object IDs allocated to the session.

19. Number of temporary (non-persistent) page IDs allocated to the session.

20. A SmallInteger: 0 if the session has not voted, 1 if the session's voting is in progress, 2 if the session has voted or voting is not active.

21. A SmallInteger, processId of the remote GCI client process, or -1 if the session has no remote GCI client.

22. The KerberosPrincipal object used for passwordless login to the session, or nil if passwordless login was not used.

23. The sessionId of the hostagent session through which this session is communicating to stone, or -1 if session is not using a hostagent.

24. SmallInteger listening port if this session is a hostagent, or -1.

Refer to the image method comment for the most recent details, as elements are added at the end of the array.

6.4  Shutting Down Sessions, the Object Server, and NetLDI

Stopping Logged-in Sessions

Privileges required: SessionAccess and SystemControl

There are a number of methods on System class that can be used to stop a specific session, or all sessions:

stopSession: aSessionId
Stop the specified session; any transactions that the session was in are aborted, and the session is terminated. This method does not stop the GcGems or SymbolGem.

terminateSession: aSessionId timeout: timeoutSeconds
Stop the specified session; any transactions that the session was in are aborted, and the session is terminated. The method waits up to timeoutSeconds for the session to complete terminating before returning. This method can be used to stop the GcGems, but not the SymbolGem.

stopUserSessions
Stops all sessions other than system Gems; does not stop the GcGems nor SymbolGem. Any transactions that any of the sessions were in are aborted.

NOTE
Be aware that it may take as long as a minute for a session to terminate after you send stopSession:. If the Gem is responsive, it usually terminates within milliseconds. However, if a Gem is not active (for example, sleeping or waiting on I/O), the Stone waits one minute for it to respond before forcibly logging it out. You can bypass this timeout by sending terminateSession:timeout:.

To verify all user sessions have logged out or been terminated, send the message currentSessionNames to System. For example, using Topaz:

topaz 1> printit
System currentSessionNames 
%
session number: 2    UserId: GcUser
session number: 3    UserId: GcUser
session number: 4    UserId: SymbolUser
session number: 5    UserId: DataCurator

The SymbolUser and GcUser sessions are system sessions and will be shut down cleanly when the stone is shut down. The above example includes session 5, which belongs to the user executing the example code.
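If you capture the currentSessionNames output in a script, the system sessions can be filtered out so that only user sessions remain. A minimal sketch, assuming the output format shown in the example above; user_sessions is a hypothetical helper name.

```shell
# user_sessions is a hypothetical helper, not part of GemStone.
# It reads `System currentSessionNames` output on stdin and prints the
# session number and UserId of every session that is not a GcUser or
# SymbolUser system session.
user_sessions() {
    awk '$1 == "session" && $2 == "number:" && $4 == "UserId:" &&
         $5 != "GcUser" && $5 != "SymbolUser" { print $3, $5 }'
}

# Usage, with the report saved in a file named sessions.txt (hypothetical):
#   user_sessions < sessions.txt
```

The session that produced the report is itself a user session, so expect at least one line of output while you are logged in.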

Stopping the Stone

After all user sessions have logged out, use the stopstone command, which performs an orderly shutdown in which all committed transactions are written to the extent files.

% stopstone [StoneName] [gemstoneUserName] [gemstoneUserPassword] [-i]

If you do not supply the name of the Stone repository monitor, GemStone username, or password, stopstone prompts for this information. The user must have the SystemControl privilege (initially, this privilege is granted to SystemUser and DataCurator).

The -i option aborts all current (uncommitted) transactions and terminates all active user sessions. If you do not specify this option and other sessions are logged in, GemStone will not shut down and you will receive a message to that effect.

Stopping the NetLDI

There is a similar command to shut down the NetLDI network service.

% stopnetldi [netLdiName]

For more information, see the command reference in Appendix B; stopstone and stopnetldi.

If you are logged in to a GemStone session, you can invoke System class>>shutDown, which also requires the SystemControl privilege.
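The two shutdown commands can be combined in one script. This is a sketch under stated assumptions: run and shutdown_gemstone are hypothetical helper names, DRY_RUN is a hypothetical variable used to preview the commands, and the default names gs64stone and gs64ldi are taken from this chapter. Because no GemStone user name or password is passed, stopstone prompts for them.

```shell
# run is a hypothetical wrapper: when DRY_RUN is set, it prints the
# command instead of executing it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# shutdown_gemstone is a hypothetical helper, not part of GemStone.
# It stops the Stone named $1, then the NetLDI named $2. The NetLDI is
# left running if stopstone fails, for example because sessions are
# still logged in.
shutdown_gemstone() {
    stone=${1:-gs64stone}
    netldi=${2:-gs64ldi}
    run stopstone "$stone" && run stopnetldi "$netldi"
}

# Usage:
#   DRY_RUN=1 shutdown_gemstone      # preview the commands
#   shutdown_gemstone                # run them
```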

Using OS kill

If you must halt a specific Gem session process or GemStone server processes, be sure to use only kill or kill -term so that the Gem or other process can perform an orderly shutdown.

kill -usr1 will not kill the process, but will cause a GemStone process to write its C and Smalltalk call stacks to the process log file. For linked logins, which do not have a separate process, the stack is written to the application’s stdout.

Do NOT use kill -9 or another uncatchable signal, which does not result in a clean shutdown, unless it is unavoidable. On some platforms, particular failures in disk I/O can result in a process that does not respond to kill.

If for some reason you do need to send kill -9 to a shared page cache monitor, use ipcs and ipcrm to identify and free the shared memory and semaphore resources for that cache. If you send kill -9 to a Stone, use ipcs to determine whether ipcrm should be invoked.
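The advice to prefer kill -term can be sketched as a helper that sends the signal and then waits for the process to finish its orderly shutdown. The name terminate_gracefully and the 60-second default are hypothetical choices for illustration. The sketch never sends kill -9; it only reports when the process is still running.

```shell
# terminate_gracefully is a hypothetical helper, not part of GemStone.
# It sends SIGTERM to process $1, then checks once a second, for up to
# $2 seconds (default 60), whether the process has exited.
# Returns 0 when the process is gone, 1 on timeout, 2 if it could not
# be signaled at all.
terminate_gracefully() {
    pid=$1
    limit=${2:-60}
    if ! kill -TERM "$pid" 2>/dev/null; then
        echo "cannot signal process $pid" >&2
        return 2
    fi
    waited=0
    while kill -0 "$pid" 2>/dev/null; do   # kill -0 only tests for existence
        if [ "$waited" -ge "$limit" ]; then
            echo "process $pid still running after ${limit}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage:
#   terminate_gracefully <processid> 30
```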

Handling “Zombie” Sessions

Very rarely, an unexpected error can occur that leaves a Gem in an unresponsive state, where it is not shut down by stopSession: or a similar method. These are often referred to as “zombie” sessions. The actual cause and symptoms of a zombie session can vary widely. If you encounter issues with a zombie session, check for bugnotes, and contact GemTalk Technical Support for further diagnosis.

A session may be unresponsive for short periods during certain types of execution. This is normal, and not a cause for concern.

  • A session that has encountered an error and is waiting for a debugger to attach is not a true zombie, but may require using kill to terminate.
  • It may be possible to clean up a zombie session by using kill -TERM on the Gem or linked process.
  • The method System stopZombieSession: aSessionId bypasses some safeguards in stopSession: and may allow the session to complete logout.

6.5  Logins without a stone running

Read-only GemStone operations can be performed when a Stone is not running, by using a "solo" session. This makes it simple to set up Smalltalk-based scripting without needing to configure or start a Stone. More details on scripting are provided in the Topaz Users Guide.

Solo logins require access to an extent file, which can be the read-only empty distribution extent. You may also use an extent containing application code, data, or other modifications, provided the following are true for the repository extent:

The configuration parameter GEM_SOLO_EXTENT specifies the extent file to be used by a Solo session. This defaults to the clean, read-only extent within the distribution, $GEMSTONE/bin/extent0.dbf.

Methods that require a connection to a Stone are disallowed in a Solo session; this includes a number of methods in System class and Repository. For example, methods such as markForCollection, reclaimAll, and methods that make and restore backups all require a running Stone. Attempting to execute these methods in a Solo session results in an ImproperOperation Error (#2050).

Solo login from topaz

To login Solo from topaz linked or RPC, execute set solologin on, then login.

For example:

topaz> set solologin on
topaz> set username DataCurator password swordfish
topaz> login
[08/24/20 15:16:35.762 PDT]
  gci login: currSession 1  rpc gem processId 20617 socket 6
    ReadOnly session
[Info]: Read-Only Repository:
    /gshost/GemStone3.6/bin/extent0.dbf
successful Solo login
topaz 1> 

The username and password are required; the setting for gemstone is not used. In topaz RPC, you may perform a solo login while also logged into a GemStone Stone, provided the extent file used by the Solo session (by default, $GEMSTONE/bin/extent0.dbf) is not in use.
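For scripted use, the same topaz commands can be generated and piped into a linked topaz. The sketch below only prints the command text; the function name solo_commands is hypothetical, and the run, logout, and exit commands are added here for illustration. Passing a password as a shell argument is shown for brevity only.

```shell
# Hypothetical helper: print the topaz commands for a Solo login that
# runs one Smalltalk expression and then logs out.
# Usage: solo_commands USERNAME PASSWORD EXPRESSION
solo_commands() {
    user=$1
    password=$2
    expression=$3
    cat <<EOF
set solologin on
set username $user password $password
login
run
$expression
%
logout
exit
EOF
}

# Example (requires a GemStone installation; shown for illustration):
#   solo_commands DataCurator swordfish '3 + 4' | topaz -l
```

Keeping the command text in a function makes it easy to inspect what will be sent to topaz before running it.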

Object creation and memory use

Each Solo RPC or linked Gem also opens a 10MB read-write temporary file, /tmp/gemRO_pid_extent1.dbf, which is deleted on logout or process exit.

Object creation in a Solo session is limited to temporary object memory, but you may create objects as needed up to the limit of memory. To ensure there is sufficient memory, you may:

  • Set a larger value for GEM_TEMPOBJ_CACHE_SIZE in the configuration file used by the topaz or Gem session.
  • For linked sessions, use -T cachesize on the topaz command line.
  • For RPC sessions, include -T cachesize in the NRS gemnetid login parameter.
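The first option can be sketched as a shell function that writes a small Gem configuration file. The function name, the example path, and the values are assumptions. The sketch also assumes the usual NAME = value; entry syntax and that GEM_TEMPOBJ_CACHE_SIZE is given in KB, so check the parameter reference for your version before relying on either.

```shell
# Hypothetical helper: write a Gem configuration file for Solo sessions.
# Usage: write_solo_conf FILE EXTENT_PATH CACHE_KB
write_solo_conf() {
    cat > "$1" <<EOF
# Extent opened by Solo sessions (substitute your own path)
GEM_SOLO_EXTENT = $2;
# Temporary object memory for the session (assumed to be in KB)
GEM_TEMPOBJ_CACHE_SIZE = $3;
EOF
}

# Example:
#   write_solo_conf ./solo.conf "$GEMSTONE/bin/extent0.dbf" 200000
```

How the resulting file is picked up by topaz or the Gem depends on your configuration file setup; see How GemStone Uses Configuration Files.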

Solo sessions other than from topaz

When the GCI flag #GCI_LOGIN_SOLO is used in the login parameters, any GCI application may create a solo login.

6.6  Recovering from an Unexpected Shutdown

GemStone is designed to shut down in response to certain error conditions as a way of minimizing damage to the repository. If GemStone stops unexpectedly, it probably means that one of the following situations has occurred:

GemStone may also shut down if it runs out of extent or transaction log disk space. These are purposeful shutdowns, but since GemStone cannot perform the final writes to disk, it will require recovery on restart. Handling out of space issues is described under Recovering from Disk-Full Conditions.

When GemStone shuts down unexpectedly, check the message at the end of the Stone log file to begin diagnosing the problem. By default, the Stone log is $GEMSTONE/data/gemStoneName.log, but there are a number of ways that this can be configured. The names and locations of the Stone and other process log files are described under GemStone Process Logs.
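A minimal way to look at the end of the log from the shell is shown below. The function name and the 40-line default are arbitrary choices, and the example path assumes the default log location and a Stone named gs64stone.

```shell
# Hypothetical helper: print the last lines of a Stone log.
# Usage: stone_log_tail LOGFILE [LINES]
stone_log_tail() {
    tail -n "${2:-40}" "$1"
}

# Example, assuming the default location and a Stone named gs64stone:
#   stone_log_tail "$GEMSTONE/data/gs64stone.log"
```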

Once the problem is identified, your recovery strategy should take into account the interdependence of GemStone system components. For instance, if an extent becomes unavailable, to restart the system and recover you may have to kill the Stone repository monitor if it is still running. The stopstone command won’t work in this situation, since the orderly shutdown process requires the Stone to clean up the repository before it stops.

Clean Shutdown Message

If you see a shutdown message in the system log file, GemStone has stopped in response to a stopstone command or a Smalltalk System shutdown method, or in response to a kill -TERM:

--- 08/24/20 15:16:35.056 PDT ---
    Starting checkpoint for clean shutdown.
    Waiting for all tranlog writes to complete before shutdown.
    <other shutdown messages>
    Waiting for NetWrite thread to stop
    Waiting for Page Manager thread to stop
 
--- 08/24/20 15:16:35.961 PDT ---
    Now stopping GemStone.

After a clean shutdown, restart GemStone in the usual manner. For instructions, see Starting the GemStone Server.

Disk Failure or File System Corruption

GemStone prints several different disk read error messages to the GemStone log file. For example:

Repository Read failure,
fileName = !#dbf!/gshost/GemStone3.6/data/extent0.dbf
PageId = 94
File = /gshost/GemStone3.6/data/extent0.dbf
too few bytes returned from read()
DBF Operation Read; DBF record 94, UNIX codes: errno=34,...
	"A read error occurred when accessing the repository."

If you see a message similar to the above, or if your system administrator identifies a disk failure or a corrupted file system, try to copy your extents to another node or back them up immediately. The copies may be bad, but it is worth doing, just in case. If you’re lucky, you may be able to copy them back after the underlying problem is solved and start again with the current committed state of your repository.
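The copy step might be scripted along these lines. The function name and the destination directory are hypothetical; the point of the sketch is that a failure on one extent is reported but does not stop the remaining copies.

```shell
# Hypothetical helper: copy each extent file into DEST, continuing past
# failures so that one unreadable file does not block the others.
# Usage: copy_extents DEST EXTENT...
copy_extents() {
    dest=$1
    shift
    failed=0
    for extent in "$@"; do
        if cp -p "$extent" "$dest/"; then
            echo "copied $extent"
        else
            echo "FAILED to copy $extent" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every copy succeeded.
    [ "$failed" -eq 0 ]
}

# Example (destination path is illustrative):
#   copy_extents /backup/rescue "$GEMSTONE/data/extent0.dbf"
```

A zero exit status means only that cp completed for every file; it does not show that the copied extents are intact.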

Otherwise, you may need to restore the repository. For details, see the restore procedures in Chapter 11.

Shared Page Cache Error

If you find a message similar to the following in the GemStone log, the shared page cache (SPC) monitor process (shrpcmonitor) has died. The SPC monitor log, $GEMSTONE/data/gemStoneName_pcmonnnnn.log, may indicate the reason.

--- 08/24/20 15:16:35.762 PDT ---
    The stone's connection to the local shared cache monitor was lost.
    Error Text: 'Network partner has disconnected.'

The unexpected shutdown of a Gem process may, in rare cases, result in a “stuck spin lock” error that brings down the shared page cache monitor and the Stone. GemStone uses spin locks to coordinate access to critical structures within the cache. In most cases, the monitor can recover if a Gem dies while holding a spin lock, but not all spin locks can be recovered safely. Stuck spin locks may result from a Gem crash, but a typical cause is the use of kill -9 to kill an unwanted Gem process. If you must halt a Gem process, be sure to use only kill or kill -TERM so that the Gem can perform an orderly shutdown.

Use startstone to restart GemStone. For instructions, see Starting the GemStone Server.

Fatal Error Detected by a Gem

If a Gem session process detects a fatal error that would cause it to halt and dump a core image, the Stone repository monitor may do the same when it is notified of the event. This response on the part of the Stone is configurable through the STN_HALT_ON_FATAL_ERR configuration option. When that option is set to True and a Gem encounters a fatal error, the Stone prints a message like this in its log file:

Fatal Internal Error condition in Gem
   when halt on fatal error was specified in the config file

By default, STN_HALT_ON_FATAL_ERR is set to False. That setting causes the Stone to attempt to keep running if a Gem encounters a fatal error; it is the recommended setting for GemStone in a production system. You can set STN_HALT_ON_FATAL_ERR to True during development and testing to provide additional checks for potential risks.

Some Other Shutdown Message

In the event of other shutdown messages in the GemStone log:

1. Consider whether the shutdown might have been caused by a disk failure or a corrupt file system, especially if you see an unexpected message such as Object not found. If you suspect one of these conditions, start with a page audit of the repository file (see Repository Page and Object Audit).

If the page audit fails, refer to Disk Failure or File System Corruption, and consult your operating system administrator.

If the audit succeeds, continue to the next step.

2. If you don’t suspect disk failure or a corrupt file system, try using startstone to restart GemStone. For instructions, see Starting the GemStone Server.

3. If the restart fails, you may have to restore the repository. For details, see the restore procedures in Chapter 11.

No Shutdown Message

If the GemStone log doesn’t contain a shutdown message, there has probably been a power failure or an operating system crash. In that event, the Stone repository monitor automatically recovers committed transactions the next time it starts. Use startstone to restart GemStone, as described under Starting the GemStone Server. See startstone for more information on this command.
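The cases in this section can be turned into a rough first-pass triage of the Stone log. The sketch below matches only the message fragments quoted in this section; the function name, the category labels, and the 50-line window are assumptions. It also assumes that a clean shutdown leaves "Now stopping GemStone" as the last non-blank line. Anything it reports as unknown still needs to be read by hand.

```shell
# Hypothetical helper: classify the end of a Stone log using message
# fragments quoted in this section.
# Usage: classify_shutdown LOGFILE [LINES]
classify_shutdown() {
    recent=$(tail -n "${2:-50}" "$1")
    # The last non-blank line decides whether the log ends cleanly.
    last=$(printf '%s\n' "$recent" | grep -v '^[[:space:]]*$' | tail -n 1)
    case $last in
        *"Now stopping GemStone"*) echo clean; return 0 ;;
    esac
    case $recent in
        *"Repository Read failure"*)               echo disk ;;
        *"shared cache monitor was lost"*)         echo spc ;;
        *"Fatal Internal Error condition in Gem"*) echo gem-fatal ;;
        *)                                         echo unknown ;;
    esac
}

# Example, assuming the default location and a Stone named gs64stone:
#   classify_shutdown "$GEMSTONE/data/gs64stone.log"
```

The labels map to the headings above: clean, disk failure or file system corruption, shared page cache error, and fatal error detected by a Gem. The unknown label covers both "some other shutdown message" and "no shutdown message".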
