Object State Change Tracking

Previous chapter

A.1  Overview

GemStone transaction logs (tranlogs), which provide a way to recover all changes made to the repository in the case of repository crashes, or allow warm standbys to apply changes made to a primary repository, include detailed records of all changes in a encoded and compressed form. Converting the tranlog information to human-readable form, and analyzing this output, provides invaluable information for debugging and testing. It allows us to determine exactly what changes have been made to any of the objects in the repository over time. The tools used to perform this have been used internally to GemStone for many years, and have been provided to customers on occasion when needed to analyze specific application problems.

The ability to track changes to objects may be useful for customers who need to identify details on changes that have been made to application data objects. For this reason, we are making these scripts available as part of the GemStone product. Additional information has been added to the tranlogs to allow tracking of the specific user or machine that originated the object changes.

The scripts used to perform tranlog analysis are located in the $GEMSTONE/bin/ directory and are named:

printlogs – to output the complete contents (optionally filtered) of selected tranlogs in human-readable form.

searchlogs – to search all tranlogs in a directory for selected OOPs (Object Oriented Pointers, or Object Ids), and output the matching entries (optionally filtered) in human readable form.

A GemStone repository performs many automatic operations, including things like garbage collection and checkpoints, that are transparent to end users. The tranlogs, of course, must contain records of any changes made by these operations. Complete details on everything that tranlogs may contain is beyond the scope of this documentation. The information provided here is intended to allow the use of the tranlog analysis scripts to locate and identify the details of changes to application objects.

Object oriented design’s principle of encapsulation allows you to hide internal object complexity. However, to understand the data recorded in the tranlogs, you must have a detailed understanding of the actual structure of the objects. This includes both your own application classes, and the classes that are part of GemStone Smalltalk.

Also, note that since the tranlog analysis scripts are general purpose, used for a wide variety of purposes in which a detailed record of internal repository operation is required, the scripts may output much more information than is necessary for tracking object state changes. You may need to ignore this extraneous information as you perform your analysis.

A.2  Tranlog Analysis Scripts

Script Prerequisites

The environment variable $GEMSTONE must be set, and the $GEMSTONE/bin/ directory must be in your executable path.

The scripts perform the analysis on tranlogs that are in the current working directory. If you are using raw partitions for your tranlogs, locate disks on the file system with adequate space for both the tranlogs themselves, and the script output files, which may be larger than the original tranlogs. Use copydbf to copy the tranlogs from the raw partition to the file system. For more information on the copydbf utility, see Appendix B, “GemStone Utility Commands”.

Output

Output from the scripts goes directly to stdout. To preserve the output and allow it to be used in later steps of analysis, redirect this; for example:

$> printlogs tranlog1.dbf > tranlog1.out

Note that these scripts can produce very large amounts of output, so make sure that you have adequate disk space.

In some cases the resulting files may be too large for unix text editors such as vi or emacs to open. You may find it necessary to use the unix split utility to break up very large output files into more manageable chunks.

Tranlog Assumptions

By default, the tranlogs are assumed to be named using the GemStone convention tranlogNNN.dbf. If you are using a different naming convention, you can override this by setting the environment variable $GS_TRANLOG_PREFIX to the prefix you are using.

The scripts use the creation date of the tranlog file to determine the order in which the tranlogs are analyzed. If you copy or manipulate the tranlogs in a way that changes the creation date, this may cause the analysis to be done out of order. The output will warn of this with the message:

*** Warning:  scanning tranlogs out of order

 

Filter Criteria

The scripts both allow you to filter the results, to locate entries that are related to a particular UserId, GemHost, or ClientIP.

UserId – The userId (user name) of the UserProfile associated with this session: DataCurator, SystemUser, etc. The filter keyword is user.

GemHost – The name or IP address of the host machine running the gem process. For a linked session, which links the gem into the client, this is the same machine as the client.

If the gem is running on the same machine as the stone, use the name of the host machine. Otherwise, if the gem is on a different machine than the stone, use the IP address of the remote machine.

The filter keyword is host.

ClientIP – The IP address of the host machine running the client process.

For an RPC session, this is the machine running the client application. Clients may be topaz -r, GemBuilder for Smalltalk, or GemBuilder for C applications. For a linked session, this is the machine running the linked client/gem (the same machine as the GemHost). However, the ClientIP is always the IP address, even if it is on the same machine as the stone.

The filter keyword is client.

Effective UNIX user ID – The integer that is the effective UNIX user id of the gem process.

The filter keyword is euid.

UNIX user ID – The integer that is the real UNIX user ID of the gem process.

The filter keyword is ruid.

effective UNIX user name – The effective UNIX user name of the gem process.

The filter keyword is euidstr.

UNIX user name – The real UNIX user name running the gem process.

The filter keyword is ruidstr.

process ID – The integer process id (PID) of the gem process.

The filter keyword is gempid.

session ID – The integer session id of the session within GemStone.

The filter keyword is sessionid.

 

WARNING

Information about UserId, GemHost, and ClientIP are derived from a Login tranlog entry created when a session first logs in. This entry associates the UserId/GemHost/ClientIP with a particular sessionID, which is then used as a key for subsequent tranlog entries. If you start analysis from a later tranlog which does not include this Login entry, these fields will be left blank for that session, and printouts/searches using filters based on these fields will not locate any results. Likewise, scanning through tranlogs out of order may result in the wrong Login entry being associated with a given sessionID. This would set UserId/GemHost/ ClientIP incorrectly for that particular session, and produce incorrect results when filtering.

printlogs

This script prints out the contents of one or more tranlogs in the current working directory in a human-readable form.

Warning
This script produce a very large amount of output, which (unfiltered) will exceed the size of original tranlog/s, and depending on the contents may be twice as large as the original tranlogs. Consider disk space, the use of filters, and restricting the set of tranlogs before running this script. Use caution in including the full keyword.

Usage:

printlogs [<filters>] [full] [all] [<tlogA> ... <tlogZ>]

If <filters> are specified, only print out the tranlog entries that match the specified criteria. Filters may be combined. Possible filters are:

user <userId> – Filter by the userId (the user name) of the GemStone UserProfile
host <hostnameOrIP> – Filter by gem/topaz process host or IP address
client <N.N.N.N> – Filter by client IP Address
euid <integer> – Filter by gem's effective UNIX user ID
ruid <integer> – Filter by gem's real UNIX user ID
euidstr <string> – Filter by gem's effective UNIX user name
ruidstr <string> – Filter by gem's real UNIX user name
gempid <integer> – Filter by gem's process ID
sessionid <integer> – Filter by gem's session ID

full— Produce more detailed information. WARNING: this produces much larger output results.

all —Print out the contents of all tranlogs in the current working directory.

Examples

To print out the entire contents of all tranlogs in this working directory:

printlogs all

To print out all entries in a selected number of tranlogs (note that tranlogs in the sequence must be contiguous):

printlogs tranlog5.dbf tranlog6.dbf tranlog7.dbf

To print out all tranlog entries for the user DataCurator in any tranlog:

printlogs user DataCurator all

To print out detailed information for all entries in tranlog5.dbf for the user DataCurator:

printlogs full user DataCurator tranlog5.dbf

searchlogs

This script prints out tranlog entries associated with particular OOP values, according to the arguments in the command line. All tranlogs in the current working directory are scanned.

Usage:

searchlogs [<filters>] <oopA> ... <oopB>]

If <filters> are specified, only print out the tranlog entries that match the specified criteria. Filters may be combined. Possible filters are:

user <userId> — Filter by the userId (the user name) of the GemStone UserProfile
host <hostnameOrIP> — Filter by gem/topaz process host or IP address
client <N.N.N.N> — Filter by client IP Address
euid <integer> — Filter by gem's effective UNIX user ID
ruid <integer> — Filter by gem's real UNIX user ID
euidstr <string> — Filter by gem's effective UNIX user name
ruidstr <string> — Filter by gem's real UNIX user name
gempid <integer> — Filter by gem's process ID
sessionid <integer> — Filter by gem's session ID

When using more than one filter, you must list the filters in the listed order.

Examples

To print out all entries involving OOP 1234 and OOP 5678:

searchlogs 1234 5678

To print out all entries involving OOP 1234 performed by DataCurator:

searchlogs user DataCurator 1234

To print out all entries involving OOP 1234 and OOP 5678 performed by DataCurator while logged in from client machine 10.20.30.40:

searchlogs user DataCurator client 10.20.30.40 1234 5678

A.3  Tranlog Structure

In order to make sense of the output from the scripts, you need a basic understanding of how tranlogs are structured.

GemStone transaction logs consist of a sequence of tranlog records. Tranlog records are written on physical pages of 512 bytes (note that this is different from the larger page size used for extents); a tranlog record may extend over more than one page, but its size is always a multiple of 512 bytes.

Each tranlog record contains one or more tranlog entries (sometimes referred to internally as logical records). The tranlog entries contain the information of interest - the actual changes to objects in the repository (and any other repository operations).

Output from the scripts will include header information for the tranlog record (see Example A.1), followed by data from each of the tranlog entries held by that tranlog record.

Example A.1 Tranlog Record Header Output
Dump of record 11 hdr.pageId 42949672962 , Kind=(16)Tran Log Record
 thisRecordId:(file:2 rec:11) previousRecordId:(file:2 rec:10)
 recordSize: 1024 numLogicalRecs: 2 fileCreationTime: 1297200655
 

Tranlog records are identified by the pageId that they begin on. Since a tranlog record may extend over multiple pages, there may be a gap in the sequence of record ids in the output. For example, the first tranlog record in Example A.2, record 7, has a recordSize of 1536 (three 512-byte physical pages), and so the next tranlog record will be 10.

Example A.2 Gap in Tranlog Record Sequence
Dump of record 7 hdr.pageId 25769803778 , Kind=(16)Tran Log Record
 thisRecordId:(file:2 rec:7) previousRecordId:(file:2 rec:6)
 recordSize: 1536 numLogicalRecs: 2 fileCreationTime: 1297200655
 
2.7.0  BeginData session: 4 beginId:(100979 1)
  numObjs:3  pad1:0 pad2:0
 2195969  240385  20423937
 
2.7.1  Commit session: 4 beginId:(100979 1)  crPage: 1039 commitSeq: 0.158 
timeWritten: 02/08/11 13:31:52 PST
  seqType:normal beginLogRecord:(file:2 rec:7)
---------------------------------------
Dump of record 10 hdr.pageId 30064771074 , Kind=(16)Tran Log Record
 thisRecordId:(file:2 rec:10) previousRecordId:(file:2 rec:7)
 recordSize: 512 numLogicalRecs: 3 fileCreationTime: 1297200655
 
 

Tranlog Entries

Each tranlog entry contains a unique identifying code, a descriptive entry type, and information on the session that originated the tranlog entry.

The identifying code consists of three numbers in the form:

AAA.BBB.CCC 

where:

AAA – the fileId of the tranlog holding the entry
BBB – the pageId for the tranlog record holding the entry
CCC – the entryId: the number of the entry within the tranlog record

Each tranlog entry is of a particular type, according to the event that it represents and the information it contains. Types are indicated by short descriptive terms such as Login, Abort, Commit, and Data. There are a large number of entry types, most of which are not important for tracking object state changes and can be ignored (for example, a Checkpoint entry. The ones that are important are documented below.

Each tranlog entry is associated with an Integer sessionID. SessionsIDs are unique to a specific session at any point in time. However, when a session logs out, the sessionID becomes available again for reuse by a new session logging in, so over a period of time, a sessionId may be used by a number of different sessions.

A sessionID of zero is used for tranlog entries generated by the stone.

If the entry was not originated from the stone (that is, the tranlog entries sessionID is not 0), the tranlog entry header includes more information about the session: the UserId, the GemHost, and the ClientIP address. These are described in more detail under Filter Criteria.

Example A.3 Example Tranlog Entry
2.3.4  Login session: 4 beginId:(100973 1) userProfileOop: 208385
timeWritten: 02/08/11 13:30:56 PST userId: SymbolUser gemhost: myhost
clientIP: 10.20.30.40 processId: 12663 rUid: 531 eUid: 531 realU: gsadmin
effU: gsadmin
 
2.3.5  StartSymbolGem session: 4 beginId:(100973 5)
 

Example A.3 shows that is in the tranlog with fileId 2 (by the default naming convention, tranlog2.dbf), it is in the third physical page and in tranlog record 3, and these are the fourth and fifth tranlog entries in tranlog record 3.

These entries are of the tranlog entry types Login and StartReclaimGcGem. The session is logged in as the user named SymbolUser; the gem session is executing on the same machine as the stone, a machine named myhost; the client is executing on a machine with the IP address 10.20.30.40; and the OS process is owned by the OS userid gsadmin. (beginId contains transaction tracking information that you can ignore).

Other information will follow this basic data, depending on the type of entry.

Tranlog Entry Types

There are a large number of tranlog entry types. Most of these are not relevant to tracking object state history - they record internal operations of the system, such as garbage collection or checkpoints. These tranlog entry types are not documented, although their functions can often be deduced by their names.

Below are the entries important for tracking object state history:

Login

A new session is logging into GemStone. As mentioned earlier, this entry logs the UserId, GemHost, ClientIP and other data for this particular sessionID. If you start analysis on a tranlog that follows this event, these fields will be left blank for that session.

For example, if session 7 logs in while tranlog4 is in effect, and logs out when tranlog5 is in effect, and you begin analysis on tranlog5, entries for this session will not contain any session detail information. If sessionID 7 is reused by a new session logging in later during tranlog5, that login will be recorded in tranlog5, and subsequent entries for this new session will be displayed properly.

AbortLogout

This entry is written when a session logs out or the Stone detects that the session has been killed. This entry is not flushed to the transaction log until a commit occurs.

BeginData
Data
BeginStoreData
StoreData

GemStone uses the above four entry types for recording new or changed object data. The basic entry information is followed by additional information about the changes, including the OOP values of the changed objects. Using the optional “full” flag in the printlogs script will cause the output to include additional information on the exact changes made.

Commit

All the changes recorded in earlier BeginData, Data, BeginStoreData, or StoreData entries are now officially committed to the repository.

Abort

Any changes recorded earlier in this transaction are discarded.

BreakSerialization

This entry indicates that a transaction conflict occurred during an attempt to commit. Any changes recorded earlier in this transaction are not yet permanent. This event is usually followed by an Abort entry, although it's possible that the next event seen for this sessionID is a Login, if the original session logged out and a new session reuses the sessionID.

Very Large Objects

GemStone is designed so that all objects will fit on a single page of 16384 bytes. This means that a byte object can be no larger than about 16000 bytes (since page header information uses some space), and pointer objects can only have about 4000 elements. GemStone internally represents objects larger than this as a tree structure, where the root node object references 2 or more leaf node objects, which then reference the actual elements of the collection object. Extremely large objects, such as large collections, may have internal branch nodes, if the number of leaf objects needed exceeds 4000.

This internal structure is usually transparent to the user. So, for example, you may create and manipulate an Array containing 20,000 elements as if it was a single large object, while the actual representation is a root object that references five leaf objects, each containing a 4000-element chunk of the array. While this makes application development with GemStone much simpler, the entries in the tranlogs reflect the actual implementation; you will need to be aware of this to understand tranlog output relating to collections larger than ~4000 object references or ~16000 bytes. Adding an element to the large Array, for example, may mean the tranlog entry includes a change to an instance of LargeObjectNode (the leaf node object), rather than a change to the Array itself.

Full vs. Normal Mode

When using the printlogs script, you can optionally specify “full” mode to get more details on the changes made to objects in the repository. But this will greatly increase the size of the resulting log files. For example, using normal mode on our test tranlogs generated a log file that contained an entry that looked like this:

Example A.4 Tranlog entry, normal mode
2.11.0  BeginData session: 5 beginId:(100978 1)
  numObjs:3 pad1:0 pad2:0
 25567233  8532993 25880577 
 

which tells you that three objects were created or modified during this event, with oops 25567233, 8532993, and 25880577.

Using “full” mode will produce a much more detailed listing for this event:

Example A.5 Tranlog entry, full mode
2.11.0  BeginData session: 5 beginId:(100978 1)
  numObjs:3 pad1:0 pad2:0
 
objId 25567233 class 114177 segId:9 len 14 gcSz 14 psize 136 bits 0x1 page 1152
Oop values:  50(sI 6) 8532993  5042177  25881601  17979137  26620161 
20272897  25801985
8:  20322817  25559553  20324097  25559297  20423937  25869569
 
objId 8532993 class 111361 segId:9 len 14 gcSz 14 psize 136 bits 0x1 page 1152
Oop values:  162(sI 20) 122(sI 15) 20002(sI 2500) 42(sI 5) 20  18224129  20
25567233
8:  20  25566977  20  25801729  20  25566721
 
objId 25880577 isNewObj=1  class 73985 segId:9 len 23 gcSz 23 psize 208
bits 0x241 page 1152
Oop values:  18(sI 2) 18(sI 2) 20  2(sI 0) 25870081  25871105  20  20
8:  20  20  20  20  20  20  20  20
16:  20  20  20  20  20  20  20
 
 

Note that there is now a description of each individual created or modified object, containing these fields:

objId: The OOP of this object

isNewObj: A Boolean indicating if the object is a newly created object.

class: The OOP of the class of this object

segId: The id of the GsObjectSecurityPolicy (formerly Segment) associated with this object

len: The logical size of this object (in bytes for a byte object, OOPs for an OOP object)

gcSz: The logical size of this object (in bytes for a byte object, OOPs for an OOP object)

psize: The physical size of this object on disk in bytes (including 20 bytes of object header information)

bits: The format bits for this object (internal GS use)

page: The extent page that this object is written on

Bytes: The actual bytes that make up this object (if a byte object)
[or]

Oop values: The actual OOPs that make up this object (if an OOP object)

If the Oop values contain more than eight elements, they are broken into lines of eight items, each of which is prefixed by a counter. For example, an Array of 30 items might look like this:

objId 39056641 isNewObj=1  class 66817 segId:9 len 30 gcSz 30
psize 264 bits 0x201 page 1005
Oop values:  39056385  39056129  39055873  39055617  39055361 
39055105  39054849  39054593
8:  39054337  39054081  39053825  39053569  39053313  39053057 
39052801  39052545
16:  39052289  39052033  39051777  39051521  39051265  39051009 
39050753  39050497
24:  39050241  39049985  39049729  39049473  39049217  39048961
 

Bytes are broken up similarly, but the sections are 60 bytes rather than 8. For example, the source code string for the name: method might look like this:

objId 38711809 isNewObj=1  class 74753 segId:9 len 91 gcSz 0 psize
120 bits 0x280 page 1097
Bytes:  name: newValue
 
   "Modify the value of the instance variabl
     61: e 'name'."
   name := newValue

A.4  Example of Tranlog Analysis

For this example, we created a simple database containing some Employee information and performed some simple operations creating and modifying these Employee objects.

The structure of classes associated with the Employee data is as follows:

Employee:

name - a Name object
age - a SmallInteger
address - an Address object

Name:

last - a String object
first - a String object
middle - a String object

Address:

addr1 - a String object
addr2 - a String object
city - a String object
state - a String object
zip - a SmallInteger

After having User1 create five Employee objects (with associated Name and Address objects), we then had User1 and User2 make some minor changes to one of the Employees:

When completed, we had two tranlogs, tranlog2.dbf and tranlog3.dbf.

Tracking Changes to an Employee

Let's say we want to examine the change history of a particular Employee. Using the method #asOop, we find the OOP of the Employee object of interest is 38808321.

topaz 1> printit
| myEmployee |
myEmployee := <code to locate employee object>.
myEmployee asOop.
%
38808321

The data composing an Employee is contained in subobjects (such as address), as well as directly (such as age). So, we will also need to track changes to these subobjects. Again using #asOop, we find that the OOP of the Name object associated with this Employee is 155073, and that the OOP for the Address object is 155069.

myEmployee name asOop
%
38808577
myEmployee address asOop
%
38808833

You can use #asOop on any persistent object in the repository. For example,

73 asOop
%
586
nil asOop
%
20

We can now search our tranlogs for any events involving these objects. Using the command:

$> searchlogs 38808321 38808577 38808833

results in the following output:

Example A.6 seachlogs output for Employee
Searching for oops: 38808321 38808577 38808833
Searching tranlogs:
 
  tranlog2.dbf
  tranlog3.dbf
 
... header material omitted ...
 
2.118.0  BeginData session: 5 beginId:(101033 0)  newobj  38808321 cls
39017217 onPage 1081,  newobj  38808577 cls 38726401 onPage 1081,  newobj 
38808833 cls 38715649 onPage 1081
 
2.123.0  Commit session: 5 beginId:(101033 0)  timeWritten: 02/08/11
14:43:42 PST
 
2.215.0  BeginData session: 6 beginId:(101664 1)  object  38808321 cls
39017217 onPage 871,
 
2.215.1  Commit session: 6 beginId:(101664 1)  timeWritten: 02/08/11
16:14:48 PST
 
3.22.0  BeginData session: 8 beginId:(101779 1)  object 38808833 cls
38715905 onPage 900,
 
3.22.1  Commit session: 8 beginId:(101779 1)  timeWritten: 02/08/11
16:32:58 PST
 
 

From this output, we can see that in tranlog entry 2.118.0, Session 5 made changes to all three objects (in this case, when the Employee and associated subobjects were first created). In entry 2.215.0, session 6 made a change to the Employee, and then later in entry 3.22.0, session 8 made a change to 38808833, the Address object.

Note that the BeginData entries are each followed by a Commit. You should always confirm that a BeginData/Data/BeginStoreData/StoreData of interest is followed by a Commit. If it doesn't then the reported event was not made persistent in the repository.

Changed vs. new objects

In the above example, while the field of an Address object changed, the Address object itself was the same (had the same OOP). Depending on how the Smalltalk application is written, this may not always be the case. If application that was initiating these changes created a new Address object, and assigned the Employee’s address instance variable to this new object, then the Employee object would reference a new OOP, rather than OOP 38808833. This would make the analysis somewhat different. For example, in the initial stage of the analysis when you look up the OOP of the Address object in your application, you would find the new OOP rather than the original OOP. Looking back in time, you would see when this Address object was created and assigned to the Employee instance.

Details of Changes to an Employee

Having used the searchlogs script to get a general idea of which tranlogs are of interest, you can now use the printlogs script to get more details.

For example, you might want more details on the creation of Employee object 38808321 and its associated subobjects 38808577 and 38808833 in entry 2.118.0. The “2” in “2.118.0” indicates that tranlog2.dbf is the tranlog of interest. The command:

$> printlogs tranlog2.dbf 

will generate a condensed listing of all events in tranlog2.dbf. By searching the resulting file for the entry number 2.118.0 you can find the relevant entry:

Example A.7 Employee modification
3.61.0  BeginData session: 2 user: User1 gemhost: merlin clientIP:
10.20.30.40 beginId:(134 0)
  clusterId: 1,  extentId: 0  numObjs:35 
 
145021  155041  155045  155049  155053  155069  155073  155077  155089 
155093  155097  155101  155121  155273  155285  155317  156589 
156597  156609  156613  156661  156689  156729  156733  156749 
156825  156829  156833  156837  156857  156861  156885  179629 
180045  180653 
 

This shows the oops of *all* objects created during this event. If you want to see more details on the actual changes made, use the “full” argument in the printlogs command:

$> printlogs full tranlog3.dbf

This will produce a more detailed listing of all events. For tranlog entry 3.61.0, you'll find:

Example A.8 Example Employee modifications, output in full mode
2.118.0  BeginData session: 5 beginId:(101033 0)
  numObjs:11  pad1:0 pad2:0
 
[details for other objects omitted]
 
objId 38808321 isNewObj=1  class 39017217 segId:9 len 3 gcSz 3 psize 48
bits 0x201 page 1081
Oop values:  38808577  306(sI 38) 38808833
 
objId 38808577 isNewObj=1  class 38726401 segId:9 len 3 gcSz 3 psize 48
bits 0x201 page 1081
Oop values:  38809089  38809601  20
 
objId 38808833 isNewObj=1  class 38715649 segId:9 len 5 gcSz 5 psize 64
bits 0x201 page 1081
Oop values:  38809857  20  38810113  38810369  777714(sI 97214)
 
objId 38809089 isNewObj=1  class 74753 segId:9 len 7 gcSz 0 psize 32 bits
0x280 page 1081
Bytes:  Patrick
 
objId 38809601 isNewObj=1  class 74753 segId:9 len 5 gcSz 0 psize 32 bits
0x280 page 1081
Bytes:  Ohara
 
objId 38809857 isNewObj=1  class 74753 segId:9 len 13 gcSz 0 psize 40 bits
0x280 page 1081
Bytes:  2556 Fir Blvd
 
objId 38810113 isNewObj=1  class 74753 segId:9 len 7 gcSz 0 psize 32 bits
0x280 page 1081
Bytes:  Ashford
 
objId 38810369 isNewObj=1  class 74753 segId:9 len 2 gcSz 0 psize 32 bits
0x280 page 1081
Bytes:  OR
 
 

So, for the Employee object 38808321, we find:

objId 38808321 isNewObj=1  class 39017217 segId:9 len 3 gcSz 3
psize 48 bits 0x201 page 1081
Oop values:  38808577  306(sI 38) 38808833

Indicating that this is an instance of class 39017217, the Employee class, which has three instance variables: name, age, and address. By position, we can identify the data in the instance variables:

name: 38808577 - the OOP of an instance of Name, found later in the entry
age: 306 - the OOP of the SmallInteger (sI) 38
address: 38808833 - the OOP of a instance of Address, found later in the entry

Looking at the Name object 38808577 we find:

objId 38808577 isNewObj=1  class 38726401 segId:9 len 3 gcSz 3 psize 48 bits 0x201 page 1081
Oop values:  38809089  38809601  20

Indicating that this is an instance of the class with OOP 38726401 (Name). Name contains three instance variables, last, first, and middle. By position, we see the data is:

last: 38809089 - the OOP of a String, described below
first: 38809601 - the OOP of a String, described below
middle: 20 (the OOP of nil) - in this example, no middle name was set

For the last name object 38809089, we find:

objId 38809601 isNewObj=1  class 74753 segId:9 len 5 gcSz 0 psize
32 bits 0x280 page 1081
Bytes:  Ohara

Indicating the last name is the string “Ohara”.

By a similar process you can follow the trail of objects to examine the structure of other subobjects in the Name and Address objects.

A.5  Further Analysis

Class Operations

To find all objects created or modified that belong to a particular class, first generate printlogs output in full mode of the tranlogs of interest. Each time an object of that class is created or modified, the full tranlog entry includes the line

class <OOP>

Use the unix grep command to find all references to the OOP of the class.

For example, to find all creation or modification of any instance of our example class Employee in the printlogs full output.

Since the Employee class is OOP 39017217, execute the grep command:

$>  grep "class: 39017217" tranlogfull.txt

this will produce output of the form:

objId 38804481 isNewObj=1  class 39017217 segId:9 len 3 gcSz 3
psize 48 bits 0x201 page 1087
objId 38808321 isNewObj=1  class 39017217 segId:9 len 3 gcSz 3
psize 48 bits 0x201 page 1081
objId 40062465 isNewObj=1  class 39017217 segId:9 len 3 gcSz 3
psize 48 bits 0x201 page 1317
objId 38808321 class 39017217 segId:9 len 3 gcSz 3 psize 48 bits
0x1 page 871
objId 38808321 class 39017217 segId:9 len 3 gcSz 3 psize 48 bits
0x1 page 900

This gives the first line from the entry creating/modifying the object belonging to that class. From this, you can use other commands to search for and/or track the history of these objects.

Deleted Objects

An object-oriented system doesn’t actually delete objects; objects cease to be referenced and are eventually garbage-collected. Noting the removal of an object requires examining the references to that object (such as from a collection) and identifying when the referencing object was modified in such a way that the object of interest is no longer referenced. Meanwhile, as you examine the printlogs output, you may find references to the OOP of a dereferenced object in garbage collection tranlog entries.

Managing Volume

As noted above, the printlogs produce a very large amount of output. GemStone tranlogs may be multiple GB in size. The output of printlogs in normal mode will be somewhat larger than the original tranlog (the printlogs output, being human readable, is less dense). The output from this script in full mode is much larger.

To manage the volume:

  • Avoid configuring your system with very large tranlogs.
  • Ensure that you have plenty of disk space available before beginning analysis.
  • Print only the tranlogs containing data you need. Use the searchlogs script to identify exactly where the required information is located.
  • Make sure that only the relevant tranlogs are in the current directory; move the unneeded ones elsewhere. However, you must retain a continuous set of tranlogs without gaps in sequence, and you must include the tranlog with the original log entry, in order to have the UserId and other information provided.
  • Once you have printed the output, use the UNIX utility grep -n to locate the lines of interest, and the UNIX utility split -l to break the resulting file up into more manageable size chunks.

Previous chapter