6. Running GemStone

This chapter shows you how to perform some common GemStone/S 64 Bit system operations:

Starting the GemStone Server
How to start the GemStone repository and troubleshoot Stone startup failures.

Starting a NetLDI
How to start a NetLDI and troubleshoot NetLDI startup failures.

Starting a GemStone Session
How to log in and troubleshoot login failures.

Shutting Down Sessions, the Object Server, and NetLDI
How to stop sessions and shut down server processes

Logins without a stone running
How to log in a solo session in the absence of a Stone.

Recovering from an Unexpected Shutdown
Troubleshooting unexpected shutdowns.

6.1  Starting the GemStone Server

In order to start a Stone repository monitor, the following must be identified through your operating system environment:

The GEMSTONE environment variable must point to the directory where the GemStone product is installed. The directory $GEMSTONE/bin should be in your search path for commands.

The repository monitor must be able to find a configuration file that supplies key information. This can be specified by a startstone argument, by environment variables, or by using a default location, such as $GEMSTONE/data/system.conf. For more on how GemStone locates and uses configuration files, see Stone configuration files.

The configuration file must include the path to one or more extents and the location(s) where transaction logs are read and written. The default configuration file specifies a single extent at $GEMSTONE/data/extent0.dbf, which must exist and be writable, and places transaction logs in $GEMSTONE/data/. For further information, see Choosing the Extent Location.
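
The first prerequisite above can be checked before you attempt a startup. The following is a minimal sketch, not a GemStone utility: the function name check_gemstone_env is hypothetical, and it checks only that GEMSTONE is a full pathname, that $GEMSTONE/bin exists, and that this directory is in your search path.

```shell
# Sketch: verify the environment prerequisites for starting a Stone.
# check_gemstone_env is a hypothetical helper name, not part of GemStone.
check_gemstone_env() {
    # GEMSTONE must be set to a full pathname (starting with a slash).
    case "${GEMSTONE:-}" in
        /*) ;;
        "") echo "GEMSTONE is not set" >&2; return 1 ;;
        *)  echo "GEMSTONE must be a full pathname: $GEMSTONE" >&2; return 1 ;;
    esac
    # The bin directory of the installation must exist ...
    if [ ! -d "$GEMSTONE/bin" ]; then
        echo "missing directory: $GEMSTONE/bin" >&2
        return 1
    fi
    # ... and should be in the search path for commands.
    case ":$PATH:" in
        *":$GEMSTONE/bin:"*) ;;
        *) echo "$GEMSTONE/bin is not in PATH" >&2; return 1 ;;
    esac
    echo "environment looks usable"
}
```

After defining the function in your shell, run check_gemstone_env; a nonzero exit status means one of the checks failed, and the message on stderr says which one. It does not check the configuration file or the extents.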

To Start GemStone

Follow these steps to start GemStone following installation or an orderly shutdown. (To recover from an abnormal shutdown, refer to Recovering from an Unexpected Shutdown.)

Step 1. Set the GEMSTONE environment variable to the full pathname (starting with a slash) of the directory where GemStone is installed. Ordinarily this directory has a name like GemStone64Bit3.7.0-x86_64.Linux (depending on your platform). For example:

os$ GEMSTONE=/users/GemStone64Bit3.7.0-x86_64.Linux
os$ export GEMSTONE

If you have been using another version of GemStone, be sure you update or unset previous settings of these environment variables:

  • GEMSTONE
  • GEMSTONE_SYS_CONF
  • GEMSTONE_EXE_CONF
  • GEMSTONE_NRS_ALL

There are other environment variables that can affect the location of log files; see Appendix E for the complete list.

Step 2. Set your UNIX path. One way to do this is to use one of the gemsetup scripts. There is one version for users of the Bourne and Korn shells and another for users of the C shell. These scripts also set your man page path to include the GemStone man pages. Note that these scripts append to the end of your path or man path; you will need to manually remove references to older versions of GemStone.

(Bourne or Korn shell)
os$ . $GEMSTONE/bin/gemsetup.sh

or (C shell)
os$ source $GEMSTONE/bin/gemsetup.csh

Step 3. Start GemStone by using the startstone command:

os$ startstone [gemStoneName] [-z configFile] [-l stonelogfile] <otherOptions>

where gemStoneName is the name you want the repository monitor to have; the default name is gs64stone. -z supplies the repository configuration file and -l specifies the location for the log files written by system processes.

If you are using encrypted extents, the arguments that supply the keys and passphrases are also required. See startstone for additional information.
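
As an illustration of the startstone syntax in Step 3, the sketch below wraps the command in a shell function. The function name start_my_stone, the DRY_RUN variable, and the default log file location are hypothetical choices made for this example; only the gemStoneName, -z, and -l arguments come from the syntax shown above.

```shell
# Sketch: start a Stone with an explicit name, configuration file, and log file.
# start_my_stone and DRY_RUN are hypothetical; they are not part of GemStone.
start_my_stone() {
    stone_name=${1:-gs64stone}                      # default Stone name
    conf_file=${2:-$GEMSTONE/data/system.conf}      # default configuration file
    log_file=${3:-$GEMSTONE/data/$stone_name.log}   # assumed log location, adjust as needed
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the command instead of running it.
        echo startstone "$stone_name" -z "$conf_file" -l "$log_file"
    else
        startstone "$stone_name" -z "$conf_file" -l "$log_file"
    fi
}
```

Setting DRY_RUN=1 before calling start_my_stone prints the command line so you can review it before starting the repository monitor.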

To Troubleshoot Stone Startup Failures

If the Stone repository monitor fails to start in response to a startstone command, it’s likely that the cause is one of the following. Inspect the Stone log for clues. The Stone’s log file will be reported in the startup failure error on the command line.

Startup problems reported in the stone log include:

  • The GemStone key file is missing or invalid.
  • The shared page cache cannot be attached, usually due to OS limits or because a previous shutdown did not complete cleanly.
  • A problem with the extents: an extent is missing, is in use by another process, or is not writable by the current user, or the authorization for encrypted extents failed.
  • A problem with transaction logs: a log needed for recovery is missing, or the log directory or device does not exist.
  • The repository has become corrupted.

Missing or Invalid Key File

The Stone repository monitor must be able to read the GemStone key file. By default, this is $GEMSTONE/sys/gemstone.key; if this is not found, on some platforms the web edition keyfile $GEMSTONE/sys/community.starter.key is used (this keyfile has lower limits for repository size and number of users). The location and filename can be configured by the KEYFILE configuration parameter.

Ordinarily, you create the key file during installation from information provided by GemStone. Be careful to enter the information correctly. GemStone key files are platform-specific, and key files for earlier versions usually do not apply to new major releases.

If you require a key file, contact GemStone Technical Support as described under Technical Support.

Shared Page Cache Cannot Be Attached

The shared page cache monitor must be able to create and attach to the shared memory segment that will serve as the shared page cache. Several factors may prevent this from happening:

  • On some platforms, shared memory is not enabled in the kernel by default, or its default maximum size is too small to accommodate the GemStone configuration. For specifics about configuring shared memory, refer to the GemStone/S 64 Bit Installation Guide for your platform.
  • If the size of the shared page cache has been increased, the operating system’s limit on shared memory regions may need to be increased accordingly. GemStone includes a utility, $GEMSTONE/install/shmem, that will help you check the configuration; this is described here.
  • The repository executables (the Stone, Gems, and Page servers) must have permission to read and write the shared page cache. Ways to set up access are described in How To Set Up a Raw Partition. In general, users must belong to the same group as the Stone repository monitor. If the Stone is running as root, it is unlikely that other users will be able to access the shared page cache.

Extent Missing or Access Denied

If the Stone repository monitor cannot access a repository extent file, it logs a message like the following:

	GemStone is unable to open the file $GEMSTONE/data/extent0.dbf 
       reason = File does not exist 
	An error occurred opening the repository for exclusive access.
	Stone startup has failed.

The reason should provide enough information: the extent file could be missing, the permissions on the file or directory could be set incorrectly, or there may be an error in the configuration file that points to the extents. Correct the problem, then try starting GemStone again.

Extent Open by Another Process

If another process has an extent file open when you attempt to restart GemStone, a message like the following appears in the Stone log:

	GemStone is unable to open the file $GEMSTONE/data/extent0.dbf 
       reason = exclusive open:  File is open by another process. , file /gshost/GemStone3.7/data/extent0.dbf,  failed with  EWOULDBLOCK 
	An error occurred opening the repository for exclusive access.
	Stone startup has failed.

Close any other Gem sessions (including Topaz sessions) that are accessing the repository you are trying to restart. Use ps -ef (the options on your system may differ) to identify any pgsvrmain processes that are still running, and then use kill processid to terminate them. Try again to start GemStone.
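
One way to pick the pgsvrmain processes out of the ps listing is sketched below. The function name pgsvr_pids is hypothetical, and the sketch assumes the process ID is the second column, as in typical ps -ef output; check the column layout on your system.

```shell
# Sketch: print the PIDs of pgsvrmain processes found in ps -ef style output
# read from standard input. Assumes the PID is in the second column.
pgsvr_pids() {
    # Skip the awk process itself, whose command line also mentions pgsvrmain.
    awk '/pgsvrmain/ && !/awk/ { print $2 }'
}
```

Use it as ps -ef | pgsvr_pids, review the resulting process IDs, and then terminate each one with kill processid.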

Extent Already Exists

If GemStone attempts to recover from a system crash that occurred just after an extent was created, and GemStone was not able to write a checkpoint when the extent was added, you will find an error message like the following in the Stone log:

	Repository was not shutdown cleanly, recovery needed.
    fileName !@::1#netldi:51234#dbf!/gshost/GemStone3.7/data/extent1.dbf
      already exists, delete it and restart recovery.
    Stone startup has failed.

Check that an extent was being added to the repository at or shortly before the crash. If necessary, look for a message near the end of the Stone log file.

  • If an extent was being added, there is no committed data in the extent file yet. Delete the specified file and do not replace it with anything. Try to start GemStone again. The recovery procedure will recreate the extent file.
  • If an extent was NOT being added, it is possible that an existing extent has been corrupted. For instance, extent0.dbf of a multiple-extent repository may have been overwritten. Try to determine the cause and whether the action can be rectified. You may have to restore the repository from a backup.

Other Extent Failures

At startup, the GemStone system performs consistency checks on each extent listed in DBF_EXTENT_NAMES.

All extents must have been shut down cleanly with a repository checkpoint the last time the system was run. This consistency check is the only one for which GemStone attempts automatic recovery.

The following consistency checks, if failed, cause the startup sequence to terminate. These failures imply corruption of the disk or file system, or that the extents were modified at the operating system level (such as by cp or copydbf) outside of GemStone’s control and in a manner that has corrupted the repository.

  • Extents must be in proper sequence within DBF_EXTENT_NAMES.
  • Extents must be properly sequenced in time.
  • The last checkpoint must have occurred earlier than or at the same time as the current system time (in GMT).
  • Extents must belong to the correct repository.

Transaction Log Missing

If GemStone cannot find the transaction log file for the period between the last checkpoint and an unexpected shutdown, it puts a message like this in the Stone log:

Extent 0 was not cleanly shutdown.
<Repository startup statistics>
 
Repository startup from checkpoint fileId 2 blockId 16, needs recovery
 
ERROR: cannot find log file(s) to recover repository.
To proceed without tranlogs and lose transactions committed
since the last checkpoint use "-N" switch on your startstone
command.
 
An error occurred when attempting to start repository recovery.
Waiting for aiowrites to complete
 
Stone startup has failed.

If the log file was archived and removed from the log directory, restore the file.

If the log file is no longer available, you can use startstone -N to restart from the most recent checkpoint in the repository. However, any transactions committed after that checkpoint cannot be recovered and are permanently lost.

Other Startup Failures

  • Check /opt/gemstone/locks (or equivalent location, as discussed here) and delete old lock files. On Solaris systems, also check /tmp/gemstone for stoneName..FIFO.
  • Certain unexpected shutdowns may leave UNIX interprocess communication facilities allocated, which can block attempts to restart the repository monitor. Use the command ipcs to identify the shared memory segments and semaphores allocated, then use ipcrm to free those resources allocated to a repository monitor that is no longer running. For information about ipcs and ipcrm, consult your operating system’s documentation.
  • If it takes more than 5 minutes for your cache to complete initialization, the startup timeout may be expiring. Set the environment variable GEMSTONE_SPCMON_STARTUP_TIMELIMIT to allow more time.
  • Check your installation configuration and make sure that all required files and libraries are present and uncorrupted.
  • Try to run pageaudit on the repository. (See Repository Page and Object Audit.)

If you are still unable to start GemStone or determine the reason that startup is failing, contact your local GemStone administrator or GemStone Technical Support.

If this is an existing GemStone repository and the problems reported on startup attempts indicate that the repository is corrupt, you may need to restore from backups, as described in Chapter 11; see “How to Restore from Backup.”

Listing Running Servers

The gslist utility lists all Stone repository monitors, shared page cache monitors, and NetLDIs that are running. The gslist command by itself checks the locks directory (/opt/gemstone/locks, /usr/gemstone/locks, or $GEMSTONE_GLOBAL_DIR/locks) for entries. The -v option causes it to verify that each process is alive and responding. For example:

os$ gslist -v
Status Version Owner     Started      Type   Name
------ ------- --------- ------------ ------ ----
 OK   3.7.0    gsadmin   Aug 04 12:02 cache  gs64stone~1c9fa07f0412665
 OK   3.7.0    gsadmin   Aug 04 12:02 Stone  gs64stone
 OK   3.7.0    gsadmin   Aug 04 10:13 Netldi gs64ldi
 

By default, gslist lists servers on the local node. The -m host option performs the operation on node host, which must have a compatible NetLDI running.
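
In a monitoring script, the gslist -v output can be checked for a particular server. The sketch below is an illustration only: the function name server_is_ok is hypothetical, and it relies on the column layout shown in the example above (Status first, Name last). It treats any status other than OK as a failure.

```shell
# Sketch: succeed if gslist -v output (read from standard input) reports the
# named server with status OK. server_is_ok is a hypothetical helper.
server_is_ok() {
    awk -v name="$1" '
        $1 == "OK" && $NF == name { found = 1 }
        END { exit !found }
    '
}
```

For example, gslist -v | server_is_ok gs64stone exits with status 0 only if a line for gs64stone is present and marked OK.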

Cache Warming

When the Stone is first started up, the shared page cache is empty; pages are read into the cache from the disk extents as they are accessed by processes. This means that performance is initially slower, because pages are read from disk rather than found in memory.

To avoid this temporary performance issue, you may perform cache warming. This can be done by executing the startcachewarmer utility after startstone completes. More conveniently, you can configure cache warming to start automatically when the repository is started, using the configuration parameter STN_CACHE_WARMER_ARGS. In that case, startstone normally does not return until the cache is warmed; you can adjust this behavior using STN_CACHE_WARMER_WAIT_MODE.

6.2  Starting a NetLDI

You will usually need to start a GemStone NetLDI (Network Long Distance Information) server when starting a Stone repository monitor. NetLDI servers are needed to start up Gem processes for RPC logins, and for starting up caches on behalf of Gems that are on other nodes.

If you are running distributed configurations, you will need to perform these steps on each node that requires a NetLDI.

To start a NetLDI server, perform the following steps on the node where the NetLDI is to run:

Step 1. Set the GEMSTONE environment variable to the full pathname (starting with a slash) of the directory where GemStone is installed. Ordinarily this directory has a name like GemStone64Bit3.7.0-x86_64.Linux (depending on the platform). For example:

os$ GEMSTONE=/installDir/GemStone64Bit3.7.0-x86_64.Linux
os$ export GEMSTONE

If you have been using another version of GemStone, be sure you update or unset previous settings of the $GEMSTONE_NRS_ALL environment variable.

Step 2. Set your UNIX path. You may use one of the gemsetup scripts, gemsetup.sh or gemsetup.csh. These scripts also set your man page path to include the GemStone man pages.

(Bourne or Korn shell)
os$ . $GEMSTONE/bin/gemsetup.sh

(C shell)
os$ source $GEMSTONE/bin/gemsetup.csh

Step 3. Start the NetLDI by using the startnetldi command.

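To start a NetLDI with the default settings:

os$ startnetldi

To start a NetLDI in guest mode (-g) with a captive account (-a followed by the account name):

os$ startnetldi -g -aname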

See startnetldi for additional command arguments and further detail. For information about the authentication modes, see under Configuration Decisions.

To Troubleshoot NetLDI Startup Failures

If the NetLDI service fails to start in response to a startnetldi command, check the NetLDI log for clues. The log file path will be reported in the startnetldi error message; by default, the NetLDI log (netLdiName.log) is located in /opt/gemstone/log/.

Possible problems include:

  • The NetLDI is to run as root but the guest mode option is specified. This combination is not allowed.
  • The account starting the NetLDI does not have permission to create or append to its log file.
  • The account starting the NetLDI does not have read and execute permission for $GEMSTONE/sys/netldid.

6.3  Running GemStone and NetLDI as services using Linux systemd

You can use Linux’s systemd to administer GemStone and NetLDI as services.

The GemStone distribution includes the following examples, under $GEMSTONE/examples/admin/systemd. Use these as a template with your specific Stone name, path, etc.

netldi.service
netldi.env
gemstone.service
gemstone.env

These examples include the required parameters, with placeholders for the required paths, user names, etc.

6.4  Starting a GemStone Session

This section tells how to start a GemStone session and log in to the repository monitor. The instructions apply to all logins from the node on which the Stone repository monitor is running.

The examples include a linked application and an RPC example, using Topaz as the client application. Topaz running on the same node as the Stone provides the simplest interface. For an explanation of the difference between linked and RPC sessions, see Linked and RPC Applications.

To Define a GemStone Session Environment

In order to start a GemStone session, the following must be defined through your operating system environment:

  • Where GemStone executables and libraries are installed.

All GemStone users must have a GEMSTONE environment variable that points to the GemStone installation directory, such as /installDir/GemStone64Bit3.7.0-x86_64.Linux (depending on your platform). The directory $GEMSTONE/bin should be in your search path for commands.

  • Which configuration parameters to use.

Gem sessions do not require a configuration file; they can use default values. By default, they will use the system configuration file at $GEMSTONE/data/system.conf, or another default file in a specific location. For more on the defaults, and the options to set up customized configuration files, see Linked Gem configuration files and RPC Gem Configuration Files.

To Start a Linked Session

The following steps show how to start a linked application (here, the linked version of Topaz). The steps for setting the GEMSTONE environment variable and the operating system path for a session are the same as those given here for starting a repository monitor. They are repeated here for convenience.

The procedure assumes that the Stone repository monitor has already been started and has the default name gs64stone.

Step 1. Set the GEMSTONE environment variable to the full pathname (starting with a slash) of the directory where GemStone is installed. For example:

os$ GEMSTONE=/installDir/GemStone64Bit3.7.0-x86_64.Linux
os$ export GEMSTONE

If you have been using another version of GemStone, be sure you update or delete previous settings for all GEMSTONE* environment variables.

Step 2. Set your UNIX path. One way to do this is to use one of the gemsetup scripts. These scripts also set your man page path to include the GemStone man pages.

(Bourne or Korn shell)
os$ . $GEMSTONE/bin/gemsetup.sh

or (C shell)
os$ source $GEMSTONE/bin/gemsetup.csh

Step 3. Start linked Topaz:

os$ topaz -l

Step 4. Set the UserName login parameter:

topaz> set username DataCurator

Step 5. Log in to the Gem session. It will query you for the password.

topaz> login
GemStone Password?
[Info]: LNK client/gem GCI levels = 37000/37000
--- 08/04/23 15:16:35.043 PDT Login
[Info]: User ID: DataCurator
[Info]: Repository: gs64stone
[Info]: Session ID: 5 login at 08/04/23 15:16:35.048 PST
[Info]: GCI Client Host: <Linked>
[Info]: Page server PID: -1
[Info]: using libicu version 58.2
[08/04/23 15:16:35.051 PDT]
  gci login: currSession 1  linked session 
successful login
topaz 1> 

At this point, you are logged in to a Gem session process, which is linked with the application. The session process acts as a server to Topaz and as a client to the Stone. Information about Topaz is in the manual GemStone Topaz Programming Environment.

When you are ready to end the GemStone session, you can log out of GemStone and exit Topaz in one step by invoking the Topaz exit command:

topaz 1> exit

To Start an RPC Session

The following steps show how to start an RPC application (here, the RPC version of Topaz) on the server node. The procedure assumes that the Stone is running under the default name gs64stone and that you are already set up to run a GemStone session as described in Step 1 and Step 2 of the previous example (“To Start a Linked Session”).

Sessions that log in using RPC use SRP (Secure Remote Password) and SSL to authenticate passwords for login. If the Gem is running on the server node, the connection reverts to normal socket communication after login completes.

The following steps demonstrate an RPC login from topaz:

Step 1. Use gslist to find out if a NetLDI is already running. The default name for the NetLDI is gs64ldi.

os$ gslist
Status Version  Owner     Started      Type   Name
------ -------- --------- ------------ ------ ----
exists 3.7.0    gsadmin   Aug 04 12:02 cache  gs64stone~1c9fa07f041665
exists 3.7.0    gsadmin   Aug 04 12:02 Stone  gs64stone
exists 3.7.0    gsadmin   Aug 04 10:13 Netldi gs64ldi
 

If necessary, start a NetLDI following the instructions under Starting a NetLDI.

Step 2. Start the RPC application (such as Topaz), then set the UserName.

topaz> set username DataCurator

Step 3. Unless the NetLDI is running in guest mode with a captive account, set the application login parameters, such as HostUserName and HostPassword, after you start the application. For example:

topaz> set hostusername yourUnixId
topaz> set hostpassword yourPassword

Step 4. Set GemNetId (the name of the Gem service to be started) to gemnetobject. This script starts the separate Gem session process for you. For example:

topaz> set gemnetid gemnetobject

Step 5. Log in to the GemStone session.

topaz> login
GemStone Password?
[08/04/23 15:16:35.762 PDT]
  gci login: currSession 1 rpc gem processId 6943 socket 6
successful login
topaz 1> 

At this point, you are logged in through a separate Gem session process that acts as a server to Topaz RPC and as a client to the Stone repository monitor.

When you are ready to end the GemStone session, you can log out of GemStone and exit Topaz in one step by invoking the Topaz exit command:

topaz 1> exit

To Troubleshoot Session Login Failures

Several factors may prevent successful login to the repository:

  • Your GemStone key file may establish a maximum number of user sessions that can simultaneously be logged in to GemStone. (Note that a single user may have multiple GemStone sessions running simultaneously.) The limit itself is encoded in the keyfile used to start the stone, and reported in the stone log on startup. Look for a line like this:
SESSION MAX: The licensed concurrent session max is 10.
  • The STN_MAX_SESSIONS configuration option can restrict the number of logins to fewer than a particular key file allows. An entry in the Stone log file shows the maximum at the time the Stone started. Look for a line like this:
SESSION CONFIGURATION: The maximum number of concurrent sessions is 40
  • The SHR_PAGE_CACHE_NUM_PROCS configuration option restricts the number of sessions that can attach to a particular shared page cache. This is normally computed based on the setting for STN_MAX_SESSIONS.

Multi-threaded operations use additional slots for their working threads while they are executing. If you are close to your session limit, these operations may prevent other sessions from logging in.

  • The UNIX kernel must provide sufficient semaphores and file descriptors for each logged in session. See your Installation Guide for information on UNIX kernel tuning that may be necessary.
  • The owner of the Gem or a linked application process must have write access to the extent file and to the shared page cache. Use the UNIX command ipcs -m to display permissions, owner, and group for shared memory. For example:
os$ ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0xc8010015 278462466  gsadmin    660        132120576  5

Typical problems occur with linked applications, which may be installed without the S bit and therefore rely on group access to the shared page cache and the repository.
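
Both session limits described in the first two items of the list above are reported in the Stone log, so they can be found with a simple search. In the sketch below, the function name session_limits is hypothetical; the search patterns are taken from the example log lines shown in that list.

```shell
# Sketch: print the session-limit lines from a Stone log file.
# Pass the path of the Stone log as the only argument.
session_limits() {
    grep -E 'SESSION (MAX|CONFIGURATION):' "$1"
}
```

Comparing the two numbers shows whether logins are being limited by the key file or by the STN_MAX_SESSIONS setting.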

Identifying and Stopping Logged-in Sessions

Privileges required: SessionAccess.

To identify the sessions currently logged in to GemStone, send the message System class>>currentSessionsReport. This message returns the internal session numbers and the corresponding UserId, executable, and PID. For example:

topaz 1> printit
System currentSessionsReport 
%
2 SymbolUser symbolgem 32103
3 GcUser admingcgem 32210
4 DataCurator gem 21589 on localhost
5 GcUser reclaimgcgem 32213

The session number can be used with other System class methods to stop a particular session. To get the sessionId for the current executing session, use System class >> session.

To get the UserProfile for a given session, execute:

System userProfileForSession:aSessionId

To get the UserProfile for the current session, execute:

System myUserProfile

The method System class>>descriptionOfSession: aSessionId returns an Array of descriptive information, which can be used to find detailed information and status for any session. The values in each slot are defined as follows:

1. The UserProfile of the session; nil if the UserProfile is recently created and not visible from this session's transactional snapshot view or the session is in login or processing, or has logged out.

2. A SmallInteger, the process ID of the Gem or topaz -l process.

3. ‘localhost’ if the Gem/topaz -l process is on the same host as the Stone; if the Gem is remote from the Stone, the IP address of the machine running the Gem.

4. Primitive number in which the Gem is executing, or 0 if it is not executing in a long primitive.

5. Time of the session's most recent beginTransaction, commitTransaction, or abortTransaction (from System timeGmt).

6. The session state (a SmallInteger).

7. A SmallInteger whose value is -1 if the session is in transactionless mode, 0 if it is not in a transaction and 1 if it is in a transaction.

8. A Boolean whose value is true if the session is currently referencing the oldest commit record, and false if it is not.

9. The session's serial number (a SmallInteger).

10. The session's sessionId (a SmallInteger).

11. A String containing the IP address of host running the GCI process. If the GCI application is remote, the peer address as seen by the gem of the GCI application to gem network connection. If the GCI application is linked (using libgcilnk*.so or gcilnk*.dll) this is the peer's IP address as seen by stone, for the gem to stone network connection used for login.

12. The priority of the session (a SmallInteger).

13. Unique host ID of the host where the session is running (an Integer)

14. Time of the session's most recent request to stone (from System timeGmt)

15. Time the session logged in (from System timeGmt)

16. Number of commits which have occurred since the session obtained its snapshot view.

17. Nil or a String describing a system or GC gem.

18. Number of temporary (uncommitted) object IDs allocated to the session.

19. Number of temporary (non-persistent) page IDs allocated to the session.

20. A SmallInteger: 0 if the session has not voted, 1 if the session's voting is in progress, 2 if the session has voted or voting is not active.

21. A SmallInteger, processId of the remote GCI client process, or -1 if the session has no remote GCI client.

22. The KerberosPrincipal object used for passwordless login to the session, or nil if passwordless login was not used.

23. The sessionId of the hostagent session through which this session is communicating to stone, or -1 if session is not using a hostagent.

24. SmallInteger listening port if this session is a hostagent, or -1.

25. gcLockKind; 0, or the type of gcLock or repository scan lock held by the session.

26. UserProfile which created the onetime password for the session, or nil if no onetime password was used.

27. Process ID of the gem that created the onetime password for the session, or -1 if no onetime password was used.

Refer to the method comment in the image for the most recent details, since new elements are added at the end of the array.

6.5  Shutting Down Sessions, the Object Server, and NetLDI

Stopping Logged-in Sessions

Privileges required: SessionAccess and SystemControl

There are a number of methods on System class that can be used to stop a specific session, or all sessions:

stopSession: aSessionId
Stop the specified session; any transactions that the session was in are aborted, and the session is terminated. This method does not stop the GcGems or SymbolGem.

terminateSession: aSessionId timeout: timeoutSeconds
Stop the specified session; any transactions that the session was in are aborted, and the session is terminated. Waits up to timeoutSeconds for the session to complete terminating before returning. This method can be used to stop the GcGems, but not the SymbolGem.

stopUserSessions
Stops all sessions other than system Gems; does not stop the GcGems nor SymbolGem. Any transactions that any of the sessions were in are aborted.

NOTE
Be aware that it may take as long as a minute for a session to terminate after you send stopSession:. If the Gem is responsive, it usually terminates within milliseconds. However, if a Gem is not active (for example, sleeping or waiting on I/O), the Stone waits one minute for it to respond before forcibly logging it out. You can bypass this timeout by sending terminateSession:timeout:.

To verify all user sessions have logged out or been terminated, send the message currentSessionNames to System. For example, using Topaz:

topaz 1> printit
System currentSessionNames 
%
session number: 2    UserId: GcUser
session number: 3    UserId: GcUser
session number: 4    UserId: SymbolUser
session number: 5    UserId: DataCurator

The SymbolUser and GcUser sessions are system sessions and will be shut down cleanly when the Stone is shut down. The above example includes session 5, which is the session executing the example code.

Stopping the Stone

After all user sessions have logged out, use the stopstone command, which performs an orderly shutdown in which all committed transactions are written to the extent files.

os$ stopstone [StoneName] [gemstoneUserName] [gemstoneUserPassword] [-i]

If you do not supply the name of the Stone repository monitor, GemStone username, or password, stopstone prompts for this information. The user must have the SystemControl privilege (initially, this privilege is granted to SystemUser and DataCurator).

The -i option aborts all current (uncommitted) transactions and terminates all active user sessions. If you do not specify this option and other sessions are logged in, GemStone will not shut down and you will receive a message to that effect.

Stopping the NetLDI

There is a similar command to shut down the NetLDI network service.

os$ stopnetldi [netLdiName]

For more information, see the entries for stopstone and stopnetldi in the command reference in Appendix B.

If you are logged in to a GemStone session, you can invoke System class>>shutDown, which also requires the SystemControl privilege.

Using OS kill

If you must halt a specific Gem session process or GemStone server processes, be sure to use only kill or kill -TERM so that the Gem or other process can perform an orderly shutdown.

kill -USR1 will not kill the process, but will cause a GemStone process to write its C and Smalltalk call stacks to the process log file. For linked logins, which do not have a separate process, the stack is written to the application’s stdout.

Do NOT use kill -9 or another uncatchable signal, which does not result in a clean shutdown, unless it is unavoidable. On some platforms, failures in disk I/O can result in a process that does not respond to kill.

If for some reason you do need to send kill -9 to a shared page cache monitor, use ipcs and ipcrm to identify and free the shared memory and semaphore resources for that cache. If you send kill -9 to a Stone, use ipcs to determine whether ipcrm should be invoked.
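
The preference for catchable signals can be built into a small helper. The sketch below sends SIGTERM and then waits for the process to exit; it never escalates to kill -9. The function name stop_gently and the 60-second default wait are hypothetical choices for this example, not GemStone settings.

```shell
# Sketch: request an orderly shutdown with SIGTERM and wait for the process
# to exit. stop_gently is a hypothetical helper; it never sends kill -9.
stop_gently() {
    pid=$1
    timeout=${2:-60}    # seconds to wait; an arbitrary default for illustration
    if ! kill -TERM "$pid" 2>/dev/null; then
        echo "cannot signal $pid (no such process, or not permitted)" >&2
        return 1
    fi
    while [ "$timeout" -gt 0 ]; do
        # kill -0 sends no signal; it only tests whether the process still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    echo "process $pid is still running after SIGTERM" >&2
    return 1
}
```

A nonzero return status leaves the decision about any further action to the administrator, in line with the caution above about uncatchable signals.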

Handling “Zombie” Sessions

Very rarely, an unexpected error can leave a Gem in an unresponsive state, in which it cannot be shut down by stopSession: or a similar method. These are generally referred to as “zombie” sessions. The actual cause and symptoms of a zombie session can vary widely. If you encounter issues with a zombie session, check for bugnotes, and contact GemTalk Technical Support for further diagnosis.

A session may be unresponsive for short periods during certain types of execution. This is normal, and not a cause for concern.

  • A session that has encountered an error and is waiting for a debugger to attach is not a true zombie, but may require using kill to terminate.
  • It may be possible to clean up a zombie session by using kill -TERM on the Gem or linked process.
  • The method System stopZombieSession: aSessionId bypasses some safeguards in stopSession: and may allow the session to complete logout.

6.6  Logins without a stone running

Read-only GemStone operations can be performed when a Stone is not running, by using a "solo" session. This makes it simple to set up Smalltalk-based scripting without needing to configure or start a Stone. More details on scripting are provided in the Topaz Users Guide.

Solo logins require access to an extent file, which can be the read-only empty distribution extent. You may also use an extent containing application code, data, or other modifications, provided the following are true for the repository extent:

The configuration parameter GEM_SOLO_EXTENT specifies the extent file to be used by a Solo session. This defaults to the clean, read-only extent within the distribution, $GEMSTONE/bin/extent0.dbf.

Methods that require a connection to a Stone are disallowed in a Solo session; this includes a number of methods in System class and Repository. For example, methods such as markForCollection, reclaimAll, and methods that make and restore backups all require a running Stone. Attempting to execute these methods in a Solo session results in an ImproperOperation Error (#2050).

Solo login from topaz

To login Solo from topaz linked or RPC, execute set solologin on, then login.

For example:

topaz> set solologin on
topaz> set username DataCurator password swordfish
topaz> login
[08/04/23 15:16:35.762 PDT]
  gci login: currSession 1  rpc gem processId 20617 socket 6
    ReadOnly session
[Info]: Read-Only Repository:
    /gshost/GemStone3.7/bin/extent0.dbf
successful Solo login
topaz 1> 

The username and password are required; the setting for gemstone is not used. In topaz RPC, you may perform a solo login while also logged into a GemStone Stone, provided the extent file used by the Solo session (by default, $GEMSTONE/bin/extent0.dbf) is not in use.
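For scripted use, the same topaz commands can be generated by a shell function and piped to topaz. This is a minimal sketch; the function name solo_login_commands is hypothetical, and it emits only the three commands shown in the example above.

```shell
# Sketch: print the topaz commands for a solo login, one per line.
solo_login_commands() {
    gs_user=$1
    gs_password=$2
    cat <<EOF
set solologin on
set username $gs_user password $gs_password
login
EOF
}
```

Assuming topaz reads its commands from standard input and that -l selects linked topaz, a one-off solo session could be started with { solo_login_commands DataCurator swordfish; echo exit; } | topaz -l, adding any Smalltalk to execute between the login and exit lines.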

Object creation and memory use

Each Solo RPC or linked Gem also opens a 10MB read-write temporary file, /tmp/gemRO_pid_extent1.dbf, which is deleted on logout or process exit.

Object creation in a Solo session is limited to temporary object memory, but you may create objects as needed up to the limit of memory. To ensure there is sufficient memory, you may:

  • Set a larger value for GEM_TEMPOBJ_CACHE_SIZE in the configuration file used by the topaz or Gem session.
  • For linked sessions, use -T cachesize on the topaz command line.
  • For RPC sessions, include -T cachesize in the NRS gemnetid login parameter.
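The first option can be sketched as a function that prints a configuration fragment for a solo session. The function name and both example values are illustrative assumptions; GEM_TEMPOBJ_CACHE_SIZE is assumed here to be expressed in KB, so check the parameter's reference entry before choosing a value.

```shell
# Sketch: print a Gem configuration fragment for solo sessions.
# Both arguments are placeholders chosen by the caller.
solo_gem_config() {
    extent_path=$1          # extent the solo session should read
    tempobj_cache_size=$2   # temporary object memory limit (assumed to be KB)
    printf 'GEM_SOLO_EXTENT = %s;\n' "$extent_path"
    printf 'GEM_TEMPOBJ_CACHE_SIZE = %s;\n' "$tempobj_cache_size"
}
```

For example, solo_gem_config '$GEMSTONE/bin/extent0.dbf' 200000 >> mygem.conf appends both settings to a hypothetical configuration file named mygem.conf, which the topaz or Gem session must then be directed to read.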

Solo sessions other than from topaz

When the GCI flag #GCI_LOGIN_SOLO is used in the login parameters, any GCI application may create a solo login.

Superdoit scripting includes the option to execute as a solo session. See the Topaz Users Guide for more details.

6.7  Recovering from an Unexpected Shutdown

GemStone is designed to shut down in response to serious error conditions, to minimize the risk of damage to the repository. If GemStone stops unexpectedly, the cause is most commonly one of the situations described later in this section.

When GemStone shuts down unexpectedly, check the message at the end of the Stone log file to begin diagnosing the problem.

If the GemStone log does not contain a shutdown or error message, there has probably been a power failure or an operating system crash, or kill -9 was sent to the Stone process.

Use startstone to restart the Stone in the usual way, which automatically recovers committed transactions, as described in the next section.

Automatic recovery

Repositories that are in full or partial transaction logging mode automatically recover committed transactions on a restart after a crash or unclean shutdown. This process is called recovery (note that this is distinct from restore from backup, which is a manual process).

If the unexpected shutdown is due to a benign cause, you can use startstone to restart the Stone in the usual way. The Stone will automatically recover committed transactions that are in transaction logs. Provided the Stone reports successful startup, your repository is complete to the point of the last commit before shutdown.

If you are using encrypted extents and transaction logs, all the transaction logs required for recovery must be using the same key as the extents for automatic recovery to succeed. See Restart and Recovery.

If the Stone does not start up successfully, check the Stone log for messages to help diagnose the problem.

Clean shutdown

If you see a shutdown message in the system log file, GemStone has stopped intentionally. This may be due to a stopstone command, or a Smalltalk System shutdown method:

SHUTDOWN command was received from user DataCurator session 5 gem processId 24149.

or in response to a kill -TERM of the Stone process:

signal 15, SIGTERM, received from process 13768 userId 531
<signal details>
SHUTDOWN command was generated by  SIGTERM handler.

followed by:

--- 08/04/23 15:16:35.056 PDT ---
    Starting checkpoint for clean shutdown.
    Waiting for all tranlog writes to complete before shutdown.
    <other shutdown messages>
    Waiting for Page Manager thread to stop...done.
    Waiting for NetRead thread to stop...done.
 
--- 08/04/23 15:16:35.961 PDT ---
    Now stopping GemStone.

This indicates there was no error condition that caused the shutdown; you may need to investigate if you have automated processes that shut down the Stone for some reason.

After a clean shutdown you can restart GemStone as usual; no recovery is needed. See Starting the GemStone Server.

Disk failure or file system corruption

GemStone prints several different disk read error messages to the GemStone log file. For example:

Repository Read failure,
fileName = !#dbf!/gshost/GemStone3.7/data/extent0.dbf
PageId = 94
File = /gshost/GemStone3.7/data/extent0.dbf
too few bytes returned from read()
DBF Operation Read; DBF record 94, UNIX codes: errno=34,...
	"A read error occurred when accessing the repository."

If you see a message similar to the above, or if your system administrator identifies a disk failure or a corrupted file system, try to copy your extents to another node or back them up immediately. The copies may be bad, but it is worth doing, just in case. If you’re lucky, you may be able to copy them back after the underlying problem is solved and start again with the current committed state of your repository.

Otherwise, you may need to restore the repository. For details, see the restore procedures in Chapter 11.
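The precautionary copy described above can be sketched as follows. The function name rescue_copy_extents and the destination path in the example are hypothetical. The sketch continues after a failed copy, because on a failing disk a partial set of files may still be worth having.

```shell
# Sketch: copy extent files to a rescue directory, continuing past failures.
# Run this only while the Stone is down.
rescue_copy_extents() {
    dest_dir=$1
    shift
    failures=0
    mkdir -p "$dest_dir" || return 2
    for extent in "$@"; do
        # -p preserves timestamps and permissions on the copy.
        if cp -p "$extent" "$dest_dir/"; then
            echo "copied $extent"
        else
            echo "FAILED to copy $extent" >&2
            failures=$((failures + 1))
        fi
    done
    # Succeed only if every file was copied.
    [ "$failures" -eq 0 ]
}
```

For example, rescue_copy_extents /mnt/rescue $GEMSTONE/data/extent0.dbf copies the default extent to a directory on another file system. A successful copy does not prove the data is intact; it only preserves the current state for later inspection.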

Shared Page Cache error

If you find a message similar to the following in the GemStone log, the shared page cache (SPC) monitor process (shrpcmonitor) died. The SPC monitor log, $GEMSTONE/data/gemStoneName_pcmonnnnn.log, may indicate the reason.

--- 08/04/23 15:16:35.762 PDT ---
    The stone’s connection to the local shared cache monitor was lost.
    Error Text: ’Network partner has disconnected.’

The unexpected shutdown of a Gem process may, in rare cases, result in a “stuck spin lock” error that brings down the shared page cache monitor and the Stone. GemStone uses spin locks to coordinate access to critical structures within the cache. In most cases, the monitor can clean up such spin locks without shutting down if a Gem dies while holding a spin lock, but not all spin locks can be recovered safely. Stuck spin locks may result from a Gem crash, but a typical cause is the use of kill -9 to kill an unwanted Gem process. If you must halt a Gem process, be sure to use only kill or kill -TERM so that the Gem can perform an orderly shutdown.

Use startstone to restart GemStone and perform automatic recovery. For instructions, see Starting the GemStone Server.

Fatal error detected by a Gem

When a Gem session process detects a fatal error, it terminates, to avoid risk of corruption. If the Stone configuration parameter STN_HALT_ON_FATAL_ERR is set to True, the Stone will also immediately terminate. This is designed to avoid the small risk of corruption to the repository that might originate from the Gem’s fatal error condition.

By default, STN_HALT_ON_FATAL_ERR is set to False, which causes the Stone to keep running if a Gem encounters a fatal error. Keeping this setting at False is strongly recommended for production systems. You can set STN_HALT_ON_FATAL_ERR to True during development and testing so that the Stone halts as soon as a potential risk is detected.

When the Stone shuts down due to a Fatal error in a Gem, the Stone prints a message like this in its log file:

Fatal Internal Error condition in Gem
   when halt on fatal error was specified in the config file

Use startstone to restart GemStone and perform automatic recovery. For instructions, see Starting the GemStone Server.

Out of disk space for extents or transaction logs

If GemStone runs out of extent or transaction log disk space, the Stone will shut down in an orderly way, but since it cannot write to the extents or transaction logs, it is not a clean shutdown. Before restarting, you will need to make more space available.

See Recovering from Disk-Full Conditions.

Other errors

If the shutdown is due to minor disk or file system corruption, or to corruption in GemStone, the error message may not point clearly to the cause; for example, you may see only an error such as Object does not exist.

If the cause of the error and Stone shutdown is not clear, start with a page audit of the repository file (see Repository Page and Object Audit).

  • If the page audit fails, refer to Disk failure or file system corruption.
  • If the audit succeeds, attempt to restart the Stone using startstone and allow automatic recovery to complete. If that succeeds, run an objectAudit to confirm there is no corruption in the GemStone extents.

If the restart fails or if the object audit shows corruption, you may have to restore the repository. For details, see the restore procedures in Chapter 11. Contact GemTalk Technical Support for more detailed analysis of the problem and more specific advice.
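As a first triage step, the log messages quoted in this section can be searched for at the end of the Stone log. The following is a rough sketch, not a diagnostic tool: the function name, the category labels, and the 200-line window are assumptions, and it recognizes only the message texts shown in this section. An unrecognized result still requires reading the log.

```shell
# Sketch: classify the most recent Stone shutdown from the end of its log.
# Matches only the message texts quoted in this section.
classify_stone_shutdown() {
    log_file=$1
    tail_lines=${2:-200}   # hypothetical window size
    recent=$(tail -n "$tail_lines" "$log_file") || return 2

    # Error messages are tested before the clean-shutdown message.
    case $recent in
        *"Repository Read failure"*)
            echo "disk-read-error" ;;
        *"connection to the local shared cache monitor was lost"*)
            echo "shared-page-cache" ;;
        *"Fatal Internal Error condition in Gem"*)
            echo "gem-fatal-error" ;;
        *"Now stopping GemStone."*)
            echo "clean-shutdown" ;;
        *)
            echo "no-shutdown-message" ;;
    esac
}
```

A result of no-shutdown-message is consistent with a power failure, an operating system crash, or kill -9 sent to the Stone, but it can also mean that the relevant message lies outside the window examined or is not one of those covered here.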

 
