This chapter describes GemStone Smalltalk’s indexing and querying mechanism, a system for efficiently retrieving elements of large collections.
Overview
Reviews the concept of relations.
Defining Queries
Describes the structure of query predicates, the types of queries, and how to construct a query.
Creating Indexes
Discusses GemStone Smalltalk’s facilities for creating indexes on collections.
Results of Executing a GsQuery
How to execute a query and the options for working with the results.
Enumerated and Set-valued Indexes
Describes how to create enumerated and collection-valued indexes and queries.
Managing Indexes
How to perform index management: find out about indexes in your system, remove existing indexes, handle errors, and audit indexes.
Indexing and Performance
Additional factors that can impact the performance of your queries.
Historic Indexing API differences
The older indexing API, using UnorderedCollection methods and select blocks.
Most applications use one or more databases containing business data, which may be very large. Individual records in these databases may be added, removed, and/or updated, and need to be queried in multiple ways for different purposes. All these operations must be performed quickly and efficiently.
In GemStone, a database is represented as an instance of a collection that holds instances of business objects. You may have thousands or millions of objects in a collection, and these objects may be complex composite objects holding many individual strings, dates, numbers, and other basic data types.
The following example shows simple employee data in table form:
In Smalltalk, this can be represented as an Employee class, with instance variables firstName, job, age, and address; and an Address class, with street, city, and state instance variables.
The collection itself may be an instance of a number of different types of Collection subclasses. For scaling, and to support indexes, a subclass of UnorderedCollection is recommended. Hashed collections such as dictionaries may become unbalanced if too many elements hash to the same value, and as a collection grows, may require the entire collection to be rebuilt. Indexed collections such as Array have limitations on adding and removing elements without affecting the entire collection. UnorderedCollections, particularly IdentityBag, IdentitySet, RcIdentityBag, and RcIdentitySet, use an optimized internal tree structure to hold the elements and are the recommended Collection classes for use for large databases. Collection classes are described in Chapter 4.
To make it easy to associate behavior with your set of Employees, it is often useful to define a class SetOfEmployees that is a subclass of IdentitySet. An instance of SetOfEmployees then can contain instances of Employee, with a reference from UserGlobals or from a class variable.
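As a minimal sketch, the classes described above might be defined as follows. The class-creation keyword message is the commonly documented GemStone form and may need adjusting for your version; the global name MyEmployees is an assumption chosen to match the examples in this chapter, and accessor methods are not shown.

```smalltalk
"Sketch only: class and variable names follow the text; MyEmployees is an assumed global."
Object subclass: 'Address'
    instVarNames: #('street' 'city' 'state')
    classVars: #()
    classInstVars: #()
    poolDictionaries: {}
    inDictionary: UserGlobals.

Object subclass: 'Employee'
    instVarNames: #('firstName' 'job' 'age' 'address')
    classVars: #()
    classInstVars: #()
    poolDictionaries: {}
    inDictionary: UserGlobals.

"A dedicated collection class gives you a place to put employee-set behavior."
IdentitySet subclass: 'SetOfEmployees'
    instVarNames: #()
    classVars: #()
    classInstVars: #()
    poolDictionaries: {}
    inDictionary: UserGlobals.

"Keep the collection reachable by referencing it from UserGlobals."
UserGlobals at: #MyEmployees put: SetOfEmployees new.
```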
Since UnorderedCollections aren’t ordered, lookup is by value. To find a particular Employee, you use select:, detect:, or similar messages. For example:
MyEmployees select: [:ea | ea address state = 'OR']
MyEmployees detect: [:ea | ea firstName = 'Sophie']
These iterative messages may not scale well. For the above select: expression, for each employee in the collection, the employee object and the address object must be faulted into memory, and the messages address, state, and = are sent. While this doesn’t matter for small collections, it can become unreasonably slow for very large collections, particularly if objects in the collection are not in the shared page cache and need to be read from disk.
Indexes and indexed queries provide a way to locate specific objects in a collection by value. Indexes are created on specific named instance variables, either by identity or by equality. Creating an index on a collection (for example, on the instance variable firstName) creates parallel internal structures that provide a mapping from the indexed value (such as the firstName ’Sophie’) to the root object in the collection (the employee). Using this index, only a few message sends are needed to look up the collection elements that are equal to, less than, or greater than a particular value.
Identity indexes support queries that are looking for identical values, while equality indexes support queries that compare using equality, or greater or less than, a particular value.
Indexes are created on objects based on instance variables, not on message sends; since the instance variable relationships are known by the system, indexes can be updated automatically as elements are added and removed from the collection, and when references on the path are changed. There are some exceptions to this which require manually updating the indexes.
Indexes may only be created for instances of subclasses of UnorderedCollection.
To take advantage of an index you have built on your collection, you must perform the query using GsQuery syntax, rather than select: or similar iteration methods. A query performed using GsQuery will use indexes, as long as an index exists for the particular instance variable involved in the query. If an index does not exist, then the GsQuery will be performed iteratively, with performance similar to the comparable select: or detect: operation.
When the collection is properly indexed, GsQueries can return results without having to iterate the collection, fault the intermediate objects into memory, or send messages to each object.
GsQueries can be used on most kinds of Collection, not only UnorderedCollection. However, the performance benefit only appears on instances of subclasses of UnorderedCollection for which the appropriate index or indexes exist.
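The difference between the two styles can be sketched as follows. This assumes the MyEmployees collection from the earlier examples and that Employee and Address implement the accessors address and state; both expressions select the employees whose address is in Oregon.

```smalltalk
"Iterative form: every Employee and Address is faulted in and sent messages."
MyEmployees select: [:ea | ea address state = 'OR'].

"Query form: uses an index on each.address.state if one exists;
 otherwise it falls back to iterating the collection."
(GsQuery fromString: 'each.address.state = ''OR''')
    on: MyEmployees;
    queryResult.
```

Writing the lookup as a GsQuery from the start means no code changes are needed if you later decide to add or remove the index.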
As with any kind of optimization, it’s important to consider the application’s performance profile, performance requirements, and the entire context, rather than automatically creating indexes on all possible paths.
The process of creating indexes creates overhead. The additional internal objects created use some space, and building an index may take some time. As the data in the repository changes, including objects added to and removed from the collection itself as well as changes in actual values, the mappings in the index structures need to be updated. Periodically, indexes should be audited to ensure integrity, and rebuilt if necessary; rebuilds are required for some system upgrades. Indexes must be specifically removed when the collection is removed, to ensure the internal infrastructure is cleaned up.
While most collections with more than a few thousand objects will see better performance using indexed queries, it is wise to consider indexes with this overhead in mind. Before going through the trouble of creating an index, you should determine that the index provides value. There are a number of factors that strongly influence queries, both iterative queries and indexed queries. These factors interact with each other and there are other factors, such as caching, that also influence performance.
In order to take advantage of efficient indexed queries on your collection, the following steps need to be done:
1. Determine the queries that can benefit from optimization, and describe them using query syntax. Query syntax is described under Defining Queries.
For example, to query for employees under 21 who live in Oregon, the query string might be:
(each.age < 21) & (each.address.state = 'OR')
2. Create one or more indexes on the collection that specify the particular instance variable paths on which you will perform the query. Creating indexes is described under Creating Indexes.
To support the above query, you may want to create two indexes, for example:
GsIndexSpec new
equalityIndex: 'each.age' lastElementClass: SmallInteger;
equalityIndex: 'each.address.state' lastElementClass: String;
createIndexesOn: myEmployees.
3. Execute the query on that indexed collection, using query protocol. How to define and execute queries is described under Defining Queries and Results of Executing a GsQuery.
(GsQuery fromString: '(each.age < 21) & (each.address.state = ''OR'')')
on: myEmployees;
queryResult
In addition to creating indexes and queries, you will also need to do some management on your indexes and queries. For example, you should evaluate your indexes for performance, remove indexes that are no longer needed, and audit indexes to ensure the structures are correct. Many of these indexing tasks are handled by IndexManager.
GemStone indexing uses several syntactical elements that are either specific to, or primarily used for, index creation and indexed queries.
Indexes are created, and queries formed, using a special syntactic structure called a path, which designates variables for indexing and describes certain features of the index. Path syntax uses a period to represent the object/instance variable name relationship.
For example, given a collection of Employees, in which each employee has an address instance variable, which refers to an Address that has a state instance variable, the path is:
address.state
account.order.address.state
In the simplest case, a path on an instance variable on the collection elements, this is just the instance variable name. For example:
firstName
You may also specify an empty path, meaning the elements of the root collection itself.
Each instance variable name on the path is a pathTerm. In the above example, address and state are each pathTerms. Paths can contain a long string of pathTerms, if the elements of the collection represent a deeply nested tree of objects.
Path-dot syntax can be used anywhere in GemStone code; it is required in index creation and queries, for which message sends are not allowed.
An initial 'each.', where each represents the elements of the collection, is recommended but optional for GsIndexSpec index creation, and required for GsQueries. For example:
each.address.state
A vertical bar | in the path indicates the presence of two alternate instance variables that will be indexed together, as if they were a single variable.
For example, you might want to search on both name and nickname in a single operation. This might look like this:
account.name|nickname
An asterisk * in the path indicates a collection, which must be an instance of an indexable class (an instance of a subclass of UnorderedCollection). A set-valued path term may not be the first term in the path.
For example, if the instance variable children contains an IdentityBag of instances of Child, and a child has the instance variable age:
children.*.age
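As a sketch of how a set-valued path might be used, the following creates an equality index over the ages of each employee's children and then queries on it. It assumes a hypothetical children instance variable on Employee that holds an IdentityBag of Child instances, as in the example above; that variable is not part of the Employee class described earlier in this chapter.

```smalltalk
"Assumption: Employee has a 'children' instance variable holding an IdentityBag of Child."
GsIndexSpec new
    equalityIndex: 'each.children.*.age' lastElementClass: SmallInteger;
    createIndexesOn: myEmployees.

"Query along the same set-valued path."
(GsQuery fromString: 'each.children.*.age < 5')
    on: myEmployees;
    queryResult.
```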
The GsIndexSpec/GsQuery classes provide the general purpose indexing interface. An older syntax using UnorderedCollection methods to create indexes, and selection blocks with curly braces to define queries, is an alternate way to use indexes. This older syntax remains fully supported in order to ensure upgraded applications do not require changes. However, new features are not available using this historic API.
See Managing Indexes for information specific to the historic API.
Creating an equality index creates an internal btree that contains the ordered values of the instance variable that is indexed. For example, an index on firstName creates a btree containing ’Conan’, ’Fred’, and so on. This allows fast lookup of a position in this btree when performing the query, and values that are equal or greater or less than can be returned in order as needed.
Building this btree and providing predictable lookup requires that the values be comparable in well-known and efficient ways. When building indexes, there are choices to make in balancing how tightly the indexed values are restricted against the impact of comparison on query performance.
Performing an identity query creates no such restrictions on the index, since the comparison is by identity (OOP), and any two objects can be compared this way.
To provide the definition of comparison, equality indexes require specifying the lastElementClass. This generally restricts the indexed values to instances of this class or of its subclasses, although string classes have some special handling.
The following classes, and subclasses of these classes, are optimized for indexes. In most cases, the final element you will create an index on will be one of the following. For legacy indexes, the index structures encode the value; for btreePlusIndexes, they can perform optimized comparisons. These classes are subclasses of Magnitude or CharacterCollection.
Character, SmallInteger, SmallDouble, SmallFraction,
String, DoubleByteString, QuadByteString,
Unicode7, Unicode16, Unicode32,
Symbol, DoubleByteSymbol, QuadByteSymbol,
Time, Date, DateTime, DateAndTime,
LargeInteger, Float, DecimalFloat, ScaledDecimal, FixedPoint, Fraction
Boolean is a special case; it is a Special (a class for which equality and identity are the same), and so does not require looking in legacy indexes. However, it does not support optimizedComparison.
You can create indexes where the indexed values are instances of classes other than the above, including classes you have defined yourself.
Identity indexes on instances of your own classes require no extra work, since they compare on the identity of the objects.
If you wish to create an equality index where the values are instances of application classes that are not subclasses of the basic classes listed above, you must ensure these classes implement the comparison operators, as described below.
Some cases of data type comparison have special handling in indexes.
A nil along the path to an indexed slot is a different issue; such missing sections of a reference tree are allowed without special handling.
Indexing on strings has complications, due to the different collation orders it is possible to configure. For more on collation, see Chapter 5.
To summarize, strings come in two "flavors":
Symbols (Symbol, DoubleByteSymbol and QuadByteSymbol) follow the same collation rules as Traditional strings.
A repository in Legacy String Comparison Mode disallows comparison between Unicode strings and Traditional strings or symbols, to avoid unpredictable results. In this mode, you cannot mix Traditional and Unicode strings; it is difficult to avoid errors when using Unicode strings in Legacy String Comparison Mode.
A repository in Unicode Comparison Mode uses Unicode collation for all flavors of strings and symbols. In this mode, you can use Traditional strings and Unicode strings interchangeably.
Constraining the indexed variables using lastElementClass is not effective for strings, since Traditional string, symbol, and Unicode string classes inherit by codePoint range rather than by collation or other behavior. It is allowed, but not recommended, to specify CharacterCollection (the superclass of all kinds of Strings and Symbols), since (depending on the mode and index type) it may create ambiguous indexes.
In both Comparison Modes, specifying a lastElementClass of any of the following will create an index that includes a cached collator:
Unicode7, Unicode16, Unicode32
In Legacy String Comparison Mode, a lastElementClass of any of the following will permit instances of any of these classes:
String, DoubleByteString, QuadByteString,
Symbol, DoubleByteSymbol, QuadByteSymbol
In Unicode Comparison Mode, a lastElementClass of any of the following will permit instances of any of these classes:
String, DoubleByteString, QuadByteString,
Symbol, DoubleByteSymbol, QuadByteSymbol,
Unicode7, Unicode16, Unicode32
Note that some optimized indexes disallow mixing Symbols with any kinds of Strings.
If you create an index on values that are instances of your application classes, these classes must implement the basic comparison operators, at least =, >, <, and <=. You can redefine one or more of these in terms of another.
The operators must be defined to conform to the following rules:
While the indexing subsystem does not use hashing itself, note that redefining = requires that the hash method be kept consistent with the new definition of equality. Objects that are equal must return the same hash value to ensure they behave in a consistent and logical manner in all use cases.
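As an illustration only, the following topaz-format method definitions sketch one way an application class might meet these requirements. Grade and its SmallInteger instance variable rank are hypothetical names, and the class is assumed to already exist; the ordering is defined once in < and the other operators are expressed in terms of it.

```smalltalk
method: Grade
rank
    "Answer the SmallInteger on which all comparisons are based."
    ^ rank
%
method: Grade
= aGrade
    "Equal only to another Grade with the same rank."
    ^ (aGrade isKindOf: Grade) and: [rank = aGrade rank]
%
method: Grade
hash
    "Grades that are equal must answer the same hash."
    ^ rank hash
%
method: Grade
< aGrade
    ^ rank < aGrade rank
%
method: Grade
<= aGrade
    "Defined in terms of <."
    ^ (aGrade < self) not
%
method: Grade
> aGrade
    "Defined in terms of <."
    ^ aGrade < self
%
method: Grade
>= aGrade
    "Defined in terms of <."
    ^ (self < aGrade) not
%
```

Deriving every ordering operator from a single definition keeps them consistent with each other, which is what the index relies on when it places values in the btree.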
Before you can define indexes on your collection, you need to determine the ways in which you will need to search your collection to retrieve elements. The queries you need determine the details of the indexes to create.
At its simplest, a query consists of the specification of an instance variable common to all the objects in the collection, a comparison operator, and a literal to which the value is compared. For example, if you wish to be able to find all employees 21 and older, your query formula could be something like this:
each.age >= 21
In this example, every object in the collection (each) has an instance variable age, which is specified using dot-path notation. The value of that instance variable is compared, greater than or equal, to the literal SmallInteger 21.
While this formula is simple, you can formulate queries based on multiple instance variable values, operators, and constants, and combine them using boolean logic. However, using this query syntax, you cannot include message sends; the indexes are based on structural relationships using instance variable names.
For performance and clarity, it is an advantage to use short and simple queries. However, it may be valuable to compose your queries based on the statement of business logic. This may mean creating a complicated query that is not in its most efficient form. The final query will be automatically optimized to a logically equivalent form that is more efficient for GemStone to execute. See Formulating queries and performance.
A query contains a predicate expression, which is a Boolean expression that, when evaluated with the elements of the collection, returns true or false. In a query, the expression usually compares an instance variable on the collection objects with another instance variable or with a constant.
A predicate contains one or more predicate terms—the expressions that specify comparisons.
A term is a Boolean expression containing an operand and usually a comparison operator followed by another operand. For example, in
each.age >= 18
each.age and 18 are operands, while >= is a comparison operator. The only time you would not have a comparison operator is if the operand is itself a Boolean (true or false).
If you want retrieval of an element to be contingent on the values of two or more of its instance variables, you can join several terms using a conjunction operator & (logical AND) or disjunction operator | (logical OR).
The conjunction operator, &, makes the predicate true if and only if both of the terms it connects are true. The disjunction operator, |, makes the predicate true if either one, or both, of the terms it connects are true.
You may also negate individual predicate terms using not.
Each predicate term must be parenthesized.
For example, the following are legal queries.
(each.name = 'Conan') & (each.job = 'librarian')
(each.age <= 40) | (each.job = 'librarian') not
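As a sketch, these predicates can be turned into executable queries using the GsQuery protocol shown earlier in this chapter. The collection myEmployees is assumed to exist; note that single quotes inside the query string are doubled because the predicate is itself passed as a Smalltalk string.

```smalltalk
"Conjunction: both terms must be true."
(GsQuery fromString: '(each.firstName = ''Conan'') & (each.job = ''librarian'')')
    on: myEmployees;
    queryResult.

"Disjunction with a negated second term."
(GsQuery fromString: '(each.age <= 40) | (each.job = ''librarian'') not')
    on: myEmployees;
    queryResult.
```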
Queries that use less than or greater than, such as each.age >= 18, define a starting (or ending) point in a range query. Specifying both a starting point and ending point creates a range query. For example,
(18 <= each.age) & (each.age <= 65)
These two terms can be combined into a single range predicate:
18 <= each.age <= 65
Range specifications such as this can only be defined with this syntax if the operands and comparison operators truly define a range.
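A brief sketch of the two equivalent forms, assuming the myEmployees collection used elsewhere in this chapter. Both queries describe the same set of employees, so the final comparison is expected to hold.

```smalltalk
| asTwoTerms asRange |
"Two separate terms joined with &."
asTwoTerms := (GsQuery fromString: '(18 <= each.age) & (each.age <= 65)')
    on: myEmployees;
    queryResult.

"The same condition written as a single range predicate."
asRange := (GsQuery fromString: '18 <= each.age <= 65')
    on: myEmployees;
    queryResult.

"Both results should contain the same number of employees."
asTwoTerms size = asRange size
```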
GsQuery is a programmatic way to define a query, allowing you to easily abstract, store and reuse various aspects of the query.
To create a GsQuery, you create an instance of GsQuery using query predicate syntax. The simplest way to create a GsQuery is by passing in a string. For example:
GsQuery fromString: 'each.age >= 18'
Since the fromString: protocol requires a string, if the query includes literal strings, you must double each single quote within the string. For example:
GsQuery fromString: 'each.firstName = ''Fred'''.
This message returns an instance of GsQuery. Before it can be executed, it must be bound to a collection using on:.
The strings used to define GsQuery instances may contain variables: any element of a predicate that is not a literal or a path-dot expression. This allows your query to be stored and executed later using different values.
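A minimal sketch of creating a query, binding it to a collection, and then executing it, using the on: and queryResult messages that appear elsewhere in this chapter. The collection myEmployees is assumed to exist.

```smalltalk
| aQuery |
"Create the query; at this point it is not associated with any collection."
aQuery := GsQuery fromString: 'each.firstName = ''Fred'''.

"Bind the query to the collection it should run against."
aQuery on: myEmployees.

"Execute the query and answer the matching employees."
aQuery queryResult
```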
For example, a query such as
GsQuery fromString: '18 <= each.age <= 65'
can be generalized to a query with variables:
GsQuery fromString: 'min <= each.age <= max'.
The resulting formula in the GsQuery includes 'min' and 'max' as variables. These must be bound to specific values before the query can be executed. Binding is done by sending the bind:to: message to the query. For the above example, to execute the query:
aQuery := GsQuery fromString: 'min <= each.age <= max'.
aQuery
bind: 'min' to: 18;
bind: 'max' to: 65;
on: myEmployees;
queryResult
Note that the “min” and “max” in the query formula are string elements, and are not affected by any temporary or instance variables named min or max in the scope of the code being executed. The only way to resolve min and max is by binding variables.
Queries can be executed without an associated index, but there is no performance benefit. To execute a query efficiently, you need to also create an index on the instance variables for the query. These indexes provide a mapping from the specific key values that you are interested in to the results (the objects in the collection).
The path you provide when creating an index provides the key that is needed to look up the value during a query. These keys are the values of specific instance variables within the elements of a collection, or the elements of the collection itself. For example, given a collection of Employees, and the path each.address.state, the objects at the state instance variable (perhaps two-character Strings) would be the keys.
The values for these keys are the objects in the collection itself, which are the results of the query using that index. For our example, the values are the instances of Employee in myEmployees. When you make an indexed query for Employees with addresses in a given state, that state key is used to look up the matching elements (instances of Employee).
Indexes fall into two main types: Equality Indexes and Identity Indexes. Equality indexes support equality-based queries, including >, >=, <, <=, =, and ~=. Identity indexes support queries containing identity comparisons, == and ~~.
When creating an index, you specify whether an equality or identity index is created. Since identity comparisons are done by OOP, not by the object’s contents, they are faster, and the lastElementClass does not matter; any two objects can be compared for identity.
If you only have an identity index on a variable, but form your query using an equality operator, the query will not have an index to use (and thus, will iterate the collection).
You may create both equality and identity indexes on the same path.
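As a sketch of an identity index and a matching identity query, the following looks up every employee that shares one particular Address object. It assumes the myEmployees collection and the firstName and address accessors from earlier examples; target is a query variable name chosen for illustration.

```smalltalk
| sharedAddress |
"Pick an existing Address object to search for (assumes an employee named Sophie exists)."
sharedAddress := (myEmployees detect: [:ea | ea firstName = 'Sophie']) address.

"Identity index: no lastElementClass is needed."
GsIndexSpec new
    identityIndex: 'each.address';
    createIndexesOn: myEmployees.

"The == operator lets the query use the identity index;
 with only this index in place, an = comparison would iterate the collection."
(GsQuery fromString: 'each.address == target')
    bind: 'target' to: sharedAddress;
    on: myEmployees;
    queryResult.
```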
GemStone supports two different internal structures: the legacy structures, which include a btree and an index dictionary, and the btreePlus structures, which use a btree+ and do not require the dictionary. The query results are the same for each, but the performance profile is different.
The decision of which to use impacts your indexing work.
With a legacy identity index, the index dictionary provides an identity-based lookup for the key. In a btreePlus identity index, the keys are in a btree. As a result, you can stream over the results of an identity query only when using a btreePlus index.
The index structure can be specified for each index; otherwise, the system or configured default is used. Since structures are shared between indexes on a collection, all indexes on a specific collection must use the same internal structure.
Note this is entirely distinct from the historic indexing API (using UnorderedCollection methods to create indexes); creating indexes using the historic API may create either kind of internal structure, depending on the current default.
For details on how to configure each index type, see the discussion of GsIndexOptions below.
Creating an index involves creating an instance of GsIndexSpec, sending it messages to define the index and the parameters and options for that index, and then using this spec to create indexes on a specific collection.
Before creating an index, you must know:
To create an index using GsIndexSpec, do the following:
To define an index, send an index creation message to the GsIndexSpec, including the path you want indexed, the class of the last element (for equality indexes), and options (if used).
The most general index creation methods include:
equalityIndex:lastElementClass:
identityIndex:
While these methods can be used to create indexes on strings, there are additional index creation methods specific to various kinds of string indexes. These methods have variants that allow you to specify the index options.
To actually create the index, send the message createIndexesOn:, providing the specific collection on which you want to create the indexes.
To put this all together, for example:
GsIndexSpec new
identityIndex: 'each.userId';
equalityIndex: 'each.age' lastElementClass: SmallInteger;
equalityIndex: 'each.address.state' lastElementClass: String;
createIndexesOn: myEmployees.
This creates an identity index on userId, an equality index on age, and another equality index on address.state, all on the collection myEmployees.
You can view the indexes by recreating the specification from the indexed collection, using indexSpec. For example:
run
myEmployees indexSpec printString
%
GsIndexSpec new
identityIndex: 'each.userId';
equalityIndex: 'each.age'
lastElementClass: SmallInteger;
equalityIndex: 'each.address.state'
lastElementClass: String;
yourself.
Equality indexes on strings present a variety of options and restrictions, depending on:
The following methods can be used to create equality indexes on strings and/or symbols. Note that each has variants that allow you to specify the index options.
equalityIndex:lastElementClass:
unicodeIndex:
unicodeIndex:collator:
stringOptimizedIndex:
symbolOptimizedIndex:
symbolOptimizedIndex:collator:
unicodeStringOptimizedIndex:
unicodeStringOptimizedIndex:collator:
Which one you should use, and the rules allowing comparisons between different kinds of data, are different for repositories in Legacy String Comparison Mode or in Unicode Comparison Mode.
Comparison Modes are summarized earlier in this chapter.
In Legacy String Comparison mode, it is disallowed to compare Traditional and Unicode strings, so it’s not possible for the indexed variables to contain a mix of Unicode strings and Traditional strings or Symbols.
To create a legacy index on Traditional strings, symbols, or a mix of the two,
use an equalityIndex:* method specifying a lastElementClass of String.
If you are using Unicode strings in Legacy String Comparison Mode,
use a unicodeIndex:* method.
You cannot create an optimizedComparison index on a mix of types.
If your indexed elements are all Traditional strings,
use a stringOptimizedIndex:* method.
If your indexed elements are all Unicode strings,
use a unicodeStringOptimizedIndex:* method.
If your indexed elements are all Symbols,
use a symbolOptimizedIndex:* method.
In Unicode Comparison Mode, Traditional strings are collated exactly like Unicode strings, and indexes make no distinction between them.
Symbols are also collated like Unicode strings, but due to the definition of equality, optimizedComparison indexes do make a distinction between strings and symbols.
To create a legacy index in Unicode Comparison Mode on Traditional strings, Unicode strings, symbols, or any mix, use a unicodeIndex:* method, to ensure the collator is persisted with the index.
optimizedComparison indexes may mix Traditional and Unicode strings, but may not mix strings and symbols.
If your indexed elements are all Traditional or Unicode strings,
use the method unicodeStringOptimizedIndex:*.
If your indexed elements are all Symbols,
use the method symbolOptimizedIndex:*.
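The choice of method for Unicode Comparison Mode can be sketched as follows. It assumes firstName holds Traditional or Unicode strings, and it uses a hypothetical jobCode instance variable that holds only Symbols; neither assumption comes from the Employee class described earlier.

```smalltalk
"Unicode Comparison Mode: strings and symbols are indexed with different methods."
GsIndexSpec new
    "Values are Traditional and/or Unicode strings."
    unicodeStringOptimizedIndex: 'each.firstName';
    "Values are all Symbols (jobCode is a hypothetical instance variable)."
    symbolOptimizedIndex: 'each.jobCode';
    createIndexesOn: myEmployees.
```

In Legacy String Comparison Mode, the corresponding choice for a variable holding only Traditional strings would be stringOptimizedIndex:, as described above.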
With legacy indexes, the indexing internal structures include a dictionary. This dictionary, as a side effect, provides de facto identity indexes with some equality indexes: specifically, for non-terminal pathTerms, and where the lastElementClass is a Special (SmallInteger, SmallDouble, SmallFraction, Character, or Boolean, in which equality and identity are the same). Such indexes are referred to as implicit indexes.
Since with btreePlusIndexes there is no dictionary, there are also no implicit indexes defined.
For clarity, and to avoid dependency on side effects of the internal structures, it is recommended to explicitly define any identity indexes that you require. There is no risk in explicitly creating an identity index that would exist as an implicit index.
An instance of GsIndexOptions specifies features that will be used when creating a particular index on a collection. GsIndexSpec index definition methods all have variants that accept an instance of GsIndexOptions, although some override certain settings. If no GsIndexOptions is explicitly provided, the session or repository default is used.
The GsIndexOptions defines if the index is a legacy index or a btreePlus index, as well as other important indexing features. The options available for GsIndexOptions are:
GsIndexOptions class >> legacyIndex
Defines a legacy index structure, and disables btreePlusIndex and optimizedComparison.
GsIndexOptions class >> btreePlusIndex
Defines a btreePlus index structure, and disables legacyIndex.
GsIndexOptions class >> optimizedComparison
Enables optimized comparisons. This option is only allowed with btreePlusIndex.
GsIndexOptions class >> reducedConflict
Instructs the index to create the internal structures as reduced-conflict; recommended when indexing on a reduced-conflict collection.
GsIndexOptions class >> optionalPathTerms
Instructs the index to allow objects that do not include an indexed instance variable to be present in the indexed collection.
These options are described in more detail starting here.
GsIndexOptions can be combined using the plus operator and removed using the minus or not operators, with the caveat that not all options are compatible with each other. For example:
GsIndexOptions legacyIndex + GsIndexOptions reducedConflict
GsIndexOptions btreePlusIndex + GsIndexOptions optimizedComparison not
If you combine two options that conflict, the later one has precedence.
Creating an instance of GsIndexOptions, using class methods such as GsIndexOptions class >> legacyIndex, begins with the default, repository-wide GsIndexOptions.
The specific value requested by the class method (such as legacyIndex) overwrites the default only for that setting and its dependents.
For example, using GsIndexOptions legacyIndex will return a GsIndexOptions instance with legacyIndex on and both btreePlusIndex and optimizedComparison disabled, regardless of the default. However, the default GsIndexOptions setting for other values, such as reducedConflict, will be retained.
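As a sketch of passing explicit options when defining indexes, the following builds one GsIndexOptions instance and uses it for every index on the collection, so that all of them share the same internal structure. The selectors equalityIndex:lastElementClass:options: and identityIndex:options: are assumed here to be the option-accepting variants mentioned above; confirm them against your GemStone version.

```smalltalk
| opts |
"Combine options with +; settings not mentioned here come from the default GsIndexOptions."
opts := GsIndexOptions btreePlusIndex + GsIndexOptions reducedConflict.

"Use the same options for every index on the collection."
GsIndexSpec new
    equalityIndex: 'each.age' lastElementClass: SmallInteger options: opts;
    identityIndex: 'each.userId' options: opts;
    createIndexesOn: myEmployees.
```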
The initial default GsIndexOptions is:
GsIndexOptions btreePlusIndex + GsIndexOptions optimizedComparison.
In an upgraded application, the system default is instead set to use legacy indexes, to ensure that the behavior does not change from previous releases.
You can manually set the repository-wide default, as SystemUser, by executing GsIndexOptions class >> default:. Do this with care, since it may affect all indexes that are created in the future that do not explicitly set all the GsIndexOptions values.
For example, if you have an upgraded application and want to default to btreePlusIndexes and optimizedComparison, execute
GsIndexOptions default: (GsIndexOptions btreePlusIndex +
GsIndexOptions optimizedComparison)
You may also set a session-wide default that applies only to your session and only until you log out, using GsIndexOptions class >> sessionDefault:.
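For instance, a session that needs to build legacy indexes without changing the repository-wide default might do something like the following sketch. It assumes that no indexes using a different internal structure already exist on myEmployees.

```smalltalk
"Applies only to this session, and only until logout."
GsIndexOptions sessionDefault: GsIndexOptions legacyIndex.

"No options are given here, so the session default is used."
GsIndexSpec new
    equalityIndex: 'each.age' lastElementClass: SmallInteger;
    createIndexesOn: myEmployees.
```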
The options btreePlusIndex, optimizedComparison, and legacyIndex are used to specify the index type.
The following table describes the three combinations:
When using optimizedComparison, certain kinds of objects cannot be mixed in the collection. The following rules apply when using optimizedComparison:
Using the "Optimized" index specification methods to define an index overrides the settings for these three options in the default or argument GsIndexOptions.
In a multi-user system, reduced-conflict collection classes may help avoid transaction conflicts if multiple users simultaneously add or remove objects from the collection; for more on this problem, see Classes That Reduce the Chance of Conflict. For example, using an RcIdentityBag rather than an IdentityBag allows concurrent updates to the collection itself.
If there are concurrent updates of the same indexed instance variable for different objects in the collection (for example, the addresses associated with two different customer objects are both changed), there is not an application object conflict, since the objects are independent. However, there may be a transaction conflict due to the indexes, since both addresses are keys in the same indexing structure.
This doesn’t apply to legacy identity indexes, which are always reduced-conflict.
To avoid transaction conflicts from the indexing internal structures, specify that the indexes are reducedConflict, using GsIndexOptions reducedConflict.
GsIndexSpec new
equalityIndex: 'each.address'
options: (GsIndexOptions reducedConflict)
A homogeneous collection is one in which each element in the indexed collection defines the instance variable described by the index, for each pathTerm in the indexed path. By default, indexes require that the collection be homogeneous. If any element does not have the given instance variable, an error is raised when the element is added to the collection.
If you want to create an index on a non-homogeneous collection, you can define the indexes with optional pathTerms. For example:
GsIndexSpec new
equalityIndex: 'each.nickName'
options: (GsIndexOptions optionalPathTerms)
When creating an optional pathTerm index, it is not an error when the objects in the collection do not implement an instance variable specified by the index. For a multi-pathTerm index, that includes each pathTerm; objects with missing instance variable definitions for any of the pathTerms in the indexed path are not considered when creating query results.
Note that this option bypasses some error detection. If you create an index using an instance variable that does not exist at all (perhaps due to a typing error), then the index is created correctly and does not report an error, even if it does not create the index you might have intended to create.
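The following sketch shows an optional pathTerm index being created and then queried. It assumes a collection named myEmployees in which only some elements define a nickName instance variable; both names are illustrative.

```smalltalk
"Sketch: index a path that only some elements define.
 myEmployees and nickName are assumed for illustration."
GsIndexSpec new
    equalityIndex: 'each.nickName'
    options: (GsIndexOptions optionalPathTerms);
    createIndexesOn: myEmployees.

"Elements that do not define nickName are not considered for the result."
(GsQuery fromString: 'each.nickName = ''Freddie''' on: myEmployees)
    queryResult
```

Since a misspelled path is not reported with this option, it is worth double-checking the path string before creating the index.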
Once you have defined your query, created the GsQuery, and bound it to a collection, there are further options in how to access the results of the query.
To simply get the results, you can send queryResult to the instance of GsQuery.
Like selection block queries, GsQuery >> queryResult returns a new collection of the same class as the base collection, unless protocol such as asArray is used to specify the class of the results.
Also like selection block queries, queries on instances of reduced-conflict (Rc) collections return the equivalent non-Rc collection.
The collection returned from a query has no index structures. Indexes belong to specific instances of collections, rather than the classes. If you want to perform indexed selections on the new collection, you must build the necessary indexes on the new collection.
GsQuery accepts other Collection protocol and, provided the query has been bound to a collection and to query variables, the GsQuery instance responds to these messages as if it were a collection of the results of the query. This means that rather than having to put the results of a query into a temporary variable for further processing, GsQuery can respond directly to the kinds of messages you are likely to send to the query results.
You can convert the type of collection, for example, using asArray or asIdentityBag:
(GsQuery fromString: 'each.address.state = ''OR'''
on: Employees) asArray
Or fetch a single instance from the results:
(GsQuery fromString: 'each.firstName = ''Sophie'''
on: Employees) any
Performing one of the collection operations that are provided for GsQuery simplifies your code, since you may not have to put results in temporary variables. It may or may not allow you to avoid creating query result objects.
Enumeration methods also allow you to execute code while the query is executing, rather than waiting for the results.
While GsQuery responds to messages as if it was a collection, the results of a query are not a static collection. By default, each time you execute any GsQuery collection protocol, the query is performed again. So, for example, sending isEmpty to a GsQuery before sending asArray will execute the underlying query twice.
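One simple way to avoid running the query twice is to hold the result of queryResult in a temporary variable and send further messages to that collection instead of to the GsQuery. This is a sketch that reuses the Employees collection from the earlier examples.

```smalltalk
| query results |
query := GsQuery fromString: 'each.address.state = ''OR''' on: Employees.
results := query queryResult.    "the query executes once, here"
results isEmpty
    ifTrue: [ 'no results' ]
    ifFalse: [ results asArray ]  "operates on the saved collection; no second query"
```

The saved collection is a snapshot; it does not reflect later changes to Employees.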
You can cache the results of your GsQuery using GsQueryOptions cacheQueryResult. By default, it is false. Using this option allows the resultSet of the GsQuery to be cached. Note that this cache will not reflect changes in the root collection that occurred after the query was executed; you are responsible for re-running the query if current results are required.
To create an instance of GsQueryOptions with cacheQueryResult true, use this expression:
GsQueryOptions cacheQueryResult
Then use this instance with GsQuery methods that include the options: keyword.
query := (GsQuery fromString: 'each.address.state = ''OR'''
options: (GsQueryOptions cacheQueryResult)
on: Employees).
query isEmpty ifTrue: [^'no results'].
report := self createReportingStructure.
query do: [:ea | report updateDataWith: ea].
...
Among the collection protocol that GsQuery understands are the methods do:, select:, reject:, collect:, detect: and detect:ifNone:. These may look similar to iterative queries on the root collection, but since the actual query is already provided by the GsQuery, the action is quite different.
With GsQuery, these will operate on the result set of the initial query. In essence, you are adding an additional, non-indexed search criterion to the indexed query. This additional code will be executed for each element in the collection for which the indexed query matches, at the time that the index query is examining that result element.
For example, if you have an index on Employee age, and a query such as:
(GsQuery fromString: 'each.age <= 18' on: Employees)
Using this query, you can add an additional search criterion using select:, so that only Employees who live in Oregon are returned.
(GsQuery fromString: 'each.age <= 18' on: Employees) select:
[:each | each address state = 'OR']
This will return a result set that includes Employees aged 18 or younger who live in Oregon.
The address message is sent only to the elements (Employees) who are 18 or younger; it is not executed for every element in the collection. Also note that the state comparison does not use an index; these are message sends.
Provided there is an index on the query path, the enumeration block operates on each object in the result set in the order specified by the index. However, if you wish to use the result of the select: or other enumeration method, the result will necessarily be a kind of UnorderedCollection, and the objects in the returned collection will not be ordered.
You can still use the enumeration protocol to produce results that are ordered according to the index, by adding each element to a temporary Array. However, for ordered results, you may want to stream over the results instead.
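A minimal sketch of collecting results in index order follows. It assumes an index exists on each.age for the Employees collection, and it gathers elements in an OrderedCollection before converting to an Array.

```smalltalk
| ordered |
ordered := OrderedCollection new.
"With an index on each.age, do: visits matching elements in index order."
(GsQuery fromString: 'each.age <= 18' on: Employees)
    do: [:each | ordered add: each ].
ordered asArray
```

Without an index on the query path, the order in which elements are visited is not defined by any index, so the resulting Array would not be sorted by age.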
It is more efficient to perform an indexed query with multiple predicates using GsQuery, than to add additional criteria using enumeration methods.
For example, the following code returns a collection of all employees who are 26 or younger, and who respond false to hasOtherHealthInsurance.
(GsQuery fromString: 'each.age <= 26' on: myEmployees)
reject: [:each | each hasOtherHealthInsurance]
This may be useful if you have predicates that require message sends. However, if you can formulate the second statement as an indexable predicate, it would be more efficient as a query. If hasOtherHealthInsurance was actually an instance variable, you could write this as:
(GsQuery fromString: '(each.age <= 26) &
(each.hasOtherHealthInsurance) not' on: myEmployees)
queryResult
Since the code in the block provided to select: (and similar methods) is executed for each element that the indexed query itself would return, this provides a way to exit the indexed query early. In this block, you can execute any code (as long as it does not modify the collection or the objects in the collection, in ways that would change the result set). If it’s no longer useful to continue the search, you can exit the block and potentially save a lot of time.
For example, say you have a collection of purchase orders, and you are generating a report of all open purchase orders. If a new order arrives while you are executing this operation, you might not want to bother producing the already-obsolete report.
(GsQuery fromString: 'each.isOpen' on: MyOrders) do:
[:anOrder |
report add: anOrder description.
self checkForNewOrders ifTrue: [^'report canceled']
]
It may be more useful to return the result of an equality query as a stream, instead of a collection, especially if the result set is large. Returning the result as a stream is not only faster, it also avoids the need to have all the result objects in memory simultaneously.
You can stream on an identity query only when using a btreePlusIndex. You cannot stream on the results of an identity legacyIndex.
Streaming on index results returns the results in the order defined by the index, so you can iterate over the returned elements in index order with no extra effort.
To get the results as a stream, use the message GsQuery >> readStream or GsQuery >> reversedReadStream.
These methods return an instance of a specialized subclass of Stream that understands a limited subset of ReadStream protocol. Legal messages to an index stream are:
atEnd
do:
next
reversed
size
Streams do not automatically save the resulting objects. If you do not save them as you read them, the results of the query are lost. You should not modify the objects in the base collection while streaming, nor add or remove objects; doing so can cause an error or corrupt the stream.
For example, suppose your company wishes to send a congratulatory letter to anyone who has worked there for thirty years or more. Once you have sent the letter, you have no further use for the data. Assuming that each employee has an instance variable called lengthOfService, and there is an index on this, you can use a stream to formulate the query as follows:
oldTimers := (GsQuery fromString: 'each.lengthOfService >= 30'
on: myEmployees) readStream.
[ oldTimers atEnd ] whileFalse: [
| anEmployee |
anEmployee := oldTimers next.
anEmployee sendCongratulations. ].
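Streams also make it easy to stop after the first few results. The sketch below uses reversedReadStream to collect at most five employees, starting from the high end of the lengthOfService index. It assumes the same index and myEmployees collection as the example above, and that the reversed stream answers atEnd and next like the forward stream; the limit of five is arbitrary.

```smalltalk
| stream longestServing |
stream := (GsQuery fromString: 'each.lengthOfService >= 30'
              on: myEmployees) reversedReadStream.
longestServing := OrderedCollection new.
"Stop when the stream is exhausted or five elements have been saved."
[ stream atEnd or: [ longestServing size >= 5 ] ]
    whileFalse: [ longestServing add: stream next ].
longestServing
```

Saving the elements as they are read matters here, since the stream itself does not retain them.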
Streams on query results have certain limitations; for example, the predicate in the query must be logically streamable. The following restrictions apply:
Enumerated path terms allow you to query over more than one instance variable value in a single query. This is specified using the vertical bar | in the path term, between the instance variable names.
The instance variables are treated as alternate choices; if any one of the specified instance variables matches the search criteria, the predicate evaluates to true.
For example, you might want to search on both first name and nickname in a single operation. The query might look like this:
(GsQuery fromString: 'each.firstName|nickName = ''Freddie'''
on: MyEmployees) queryResult
When this is executed, the results will include all instances that have either the firstName equal to ‘Freddie’, or the nickName ‘Freddie’, or both.
In order to optimize this query with an index, you need to create an index on the specific enumeration, e.g. 'each.firstName|nickName'. An enumerated path term query will not use an index on the individual instance variables that are enumerated.
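An index on the enumerated path could be created along these lines. This is a sketch that assumes both firstName and nickName hold Strings and that MyEmployees is the collection being queried.

```smalltalk
"Sketch: a single equality index covering the enumerated path term."
GsIndexSpec new
    equalityIndex: 'each.firstName|nickName'
    lastElementClass: String;
    createIndexesOn: MyEmployees.
```

Separate indexes on 'each.firstName' and 'each.nickName' would not be used by the enumerated query, so this index is needed in addition to any of those.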
Your business objects may themselves contain collections; for example, an employee may contain a collection of children; and you may want to search based on some criteria of the objects in that collection. As long as this collection is itself indexable, indexes and queries can include all elements within these contained collections.
Index paths that include collections, and the queries that use these indexes, are generally referred to as Set-valued indexes and queries for historical reasons, although any kind of indexable collection, not just Sets, may be used.
When you wish to specify a path containing an instance of a subclass of UnorderedCollection, the collection is represented by an asterisk *. This syntax may be used to create indexes and perform queries. Only GsQuery may be used to perform set-valued queries.
For example, suppose you want to know which of your employees has children of age 18 or younger. To facilitate such queries, each of your employees has an instance variable named children, which is implemented as a set. This set contains instances of a class that has an instance variable named age.
GsIndexSpec new
equalityIndex: 'each.children.*.age'
lastElementClass: SmallInteger;
createIndexesOn: myEmployees.
When you execute a set-valued query, the results you get will follow the particular semantics of Set-valued queries. Since there are potentially multiple “true” query results for a given element in the base collection, the result of a set-valued query such as this can be larger than the original collection.
For example, consider the following query, using the index created above:
(GsQuery fromString: 'each.children.*.age <= 18'
on: myEmployees) queryResult
In this example, if the root collection myEmployees is a Bag or IdentityBag (rather than a Set or IdentitySet), and an employee has two children that are under 18, then that employee will appear in the results (a Bag or IdentityBag) twice. Employees with three minor children appear in the results three times, and so on. The resulting collection may be several times as large as the original collection, depending on the details of the query and data.
If the root collection myEmployees is a Set, which does not allow multiple instances of the same object, this potential source of confusion does not occur.
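If the root collection is a Bag or IdentityBag and you want each matching employee only once, one option is to convert the result to a set afterward. This sketch uses the standard asIdentitySet conversion on the query result.

```smalltalk
| matches distinctEmployees |
matches := (GsQuery fromString: 'each.children.*.age <= 18'
               on: myEmployees) queryResult.
"matches may contain the same employee several times; the set does not."
distinctEmployees := matches asIdentitySet.
```

Note that if the root collection itself legitimately contains the same employee more than once, this conversion also collapses those occurrences.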
The semantics of set-valued indexes do not allow multiple conjoined predicates that use the same set-valued pathTerm, since each predicate is evaluated separately. (Conjoined predicates are those connected using &.)
In general, it is recommended to avoid queries with multiple set-valued predicates, although some such queries can be optimized, or avoid the problem cases, and are safe and therefore allowed.
You may need to find out about all the indexes in your system, and to remove selected indexes or clean up indexes that were not successfully created. This functionality is provided by the class IndexManager.
IndexManager has a single instance which provides much of the functionality, accessible via IndexManager current.
This instance is lazily initialized, and is stored in the IndexManager class instance variable after it is created. Any configuration you do on IndexManager current, therefore, will be used by all affected operations, if you commit after making the change.
Indexing a large collection will take some amount of time to create the infrastructure and tracking for each indexed object.
The message progressOfIndexCreation returns a description of the current status for an index as it is created.
While the index is being created, the index is write-locked. Any query that would normally use the index is performed directly on the collection, by brute force. If a concurrent user modifies an object that is actively participating in the index at the same time, index creation is terminated with an error.
Creating or removing an index creates and/or modifies many objects related to the internal structures that support indexes. These modifications are uncommitted changes that must be kept in the session’s memory until they are committed. A large number of uncommitted changes places a heavy demand on memory and creates a risk of out-of-memory conditions. Chapter 8, “Transactions and Concurrency Control”, explains uncommitted objects and transactions in more detail, while Chapter 15, “Performance and Optimization” includes information on object memory use.
To avoid problems during index creation, it is often necessary to set the IndexManager to autoCommit. When IndexManager is set to autoCommit, it will commit the partially created index, rather than risk running out of resources and failing the index operation.
By default, autoCommit is false. When you send the following message:
IndexManager autoCommit: true
it configures your IndexManager such that the current transaction is committed during an indexing operation, whenever any of the following occur:
The percentage of temporary object space in use reaches a threshold. The default is 60. This threshold can be changed using IndexManager >> percentTempObjSpaceCommitThreshold: anInt
The number of dirty objects reaches a threshold. The default is SmallInteger maximum value, which means this limit is effectively disabled. This limit can be changed using IndexManager >> dirtyObjectCommitThreshold: anInt
When autoCommit is true, a transaction will be started (if necessary) before the indexing operation begins, and the IndexManager will commit at the completion of the indexing operation. Note that this means that, even if you are in manual transaction mode and not in a transaction, index operations will cause changes to be committed to the repository without you explicitly beginning a transaction.
If you want to enable autoCommit only for the current session, not for all index creation, you can use
You may create indexes on temporary collections containing temporary and persistent objects. However, on abort, any indexes on temporary collections are removed.
For a full description of the indexes on a particular collection, send indexSpec to the collection. This produces a string containing the GsIndexSpec code that would recreate the same indexes, and provides useful documentation on those indexes.
myEmployees indexSpec printString
%
GsIndexSpec new
equalityIndex: 'each.age'
lastElementClass: SmallInteger;
equalityIndex: 'each.address.state'
lastElementClass: String;
options: GsIndexOptions reducedConflict;
identityIndex: 'each.userId';
yourself.
The following IndexManager messages allow you to inquire about all indexes in the repository.
Returns a collection of all UnorderedCollections in the repository that have indexes.
Returns a report on all indexes on all UnorderedCollections in the repository.
There are a number of ways to remove indexes.
Since indexing internal structures create references to the indexed collection and to objects in the collection, before dereferencing a collection, you should be sure to remove all indexes on the collection. This allows the collection to be garbage collected.
As you can create indexes based on an instance of GsIndexSpec, you can also use that specification to remove these indexes.
GsIndexSpec >> removeIndexesFrom: aCollection
This method removes the indexes described by the GsIndexSpec from the collection aCollection. If any of the indexes do not exist, they are not removed and no error is returned.
This is most useful in combination with the method that creates the spec from the existing collection. For example:
(MyEmployees indexSpec)
removeIndexesFrom: MyEmployees.
To remove a single index, you may edit the specification code printed by indexSpec, or create a simple GsIndexSpec with information to remove a single index:
(GsIndexSpec new
equalityIndex: 'each.age' lastElementClass: Object)
removeIndexesFrom: MyEmployees.
IndexManager, which provides a system-wide view of all the indexes in the repository, provides a number of methods to remove indexes individually, by collection, and globally.
IndexManager >> removeEqualityIndexFor: aCollection on: aPathString
Removes an equality index from the collection aCollection with the indexed path described by aPathString. If the path specified does not exist, this method returns an error. Implicit indexes are not removed.
IndexManager >> removeIdentityIndexFor: aCollection on: aPathString
Removes the identity index from the collection aCollection with the indexed path described by aPathString. If the path specified does not exist, this method returns an error. Implicit indexes are not removed.
IndexManager >> removeAllIndexesOn: aCollection
Removes all explicitly created indexes from the collection aCollection. Implicit indexes that were created by these elements participating in other indexed collections are not removed.
IndexManager >> removeAllIndexes
Removes all indexes on all UnorderedCollections, including all implicit and partial indexes.
IndexManager >> removeAllTracking
Removes all indexes on all UnorderedCollections, and all object tracking. While this is the fastest way and most complete way to remove indexing infrastructure, if you are using modification tracking for any other purpose, that tracking will be removed as well.
When objects that participate in an index are modified, the related indexing infrastructure must be updated. This causes some overhead. If you are performing an operation that will modify a large number of objects that participate in multiple indexes, such as a large migration, it may be more efficient to remove some or all of the indexes on the collection before performing the migration, and rebuild those indexes after the migration is complete.
It is also sometimes required to remove and rebuild indexes as part of a GemStone upgrade; certain changes in GemStone kernel classes require you to either rebuild specific kinds of, or all, indexes. Any requirement to do this will be included in upgrade instructions in the Installation Guide for the version of GemStone to which you are upgrading.
To remove and rebuild indexes, you can extract and save the GsIndexSpec, and reuse that after the operation is complete.
| mySpec |
mySpec := myCollection indexSpec.
mySpec removeIndexesFrom: myCollection.
<perform migration or other operation>
mySpec createIndexesOn: myCollection
Using IndexManager >> getAllNSCRoots, you may extend this example to retrieve the GsIndexSpec for each collection in the repository, which will allow you to remove and rebuild the indexes.
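A repository-wide version of this pattern might look like the following sketch. It saves each indexed collection together with its GsIndexSpec, removes the indexes, and later rebuilds them. Commits and error handling are deliberately left out and would need to be added according to your transaction mode and autoCommit setting.

```smalltalk
| saved |
saved := OrderedCollection new.
"Remember each indexed collection together with the spec that recreates its indexes."
IndexManager current getAllNSCRoots
    do: [:nsc | saved add: nsc -> nsc indexSpec ].
"Remove the indexes described by each saved spec."
saved do: [:assoc | assoc value removeIndexesFrom: assoc key ].
"... perform the migration or other operation here ..."
"Rebuild the same indexes from the saved specs."
saved do: [:assoc | assoc value createIndexesOn: assoc key ].
```

Keeping the specs in a persistent, committed object rather than a temporary variable would protect against losing them if the session ends before the rebuild.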
To ensure that indexing structures are consistent, some kinds of errors that may occur during index creation will disable commits. Before creating an index, it is advisable to commit any work in progress, to avoid losing any work if an indexing error does occur.
For example, if you create an index on a collection and one or more of the objects that participate in the index do not implement the instance variable on the path, an error is raised (unless the index uses optionalPathTerms).
If an error occurs partway through index creation, and the autoCommit status (see Auto-commit) means that some portion of the index creation was committed, a collection may have unusable partial indexes. These indexes must be manually removed.
The following IndexManager instance methods allow you to remove incomplete indexes, while not affecting any complete, usable indexes:
IndexManager current removeAllIncompleteIndexes
Removes all incomplete indexes on all UnorderedCollections.
IndexManager current removeAllIncompleteIndexesOn: anNSC
Removes all incomplete indexes on the specified UnorderedCollection.
If you modify objects that participate in an index, try to commit your transaction, and your commit operation fails, query results can become inconsistent. If this occurs, abort the transaction and try again.
Indexes should be audited regularly, as part of your regular application maintenance, to ensure there are no problems.
You can audit the internal indexing structures for a particular collection by executing:
aCollection auditIndexes
This audits all the indexes, explicit and implicit, on the given collection. If indexes are correct, this method returns 'Indexes are OK' or 'Indexes are OK and the receiver participates in one or more indexes.'. If there are no indexes on the collection, a message such as 'No indexes are present.' is returned.
In the case of failure, a list of specific problems is returned.
You can audit all indexes in the entire repository at once using:
IndexManager current nscsWithBadIndexes
which will return an IdentitySet containing all collections that fail auditIndexes. Depending on the number of indexed collections in your system, this may take a considerable time to run.
In the rare case of a problem reported, the usual way to resolve the problem is to remove and rebuild the affected indexes. In some cases, removing all indexes on the collection may succeed even if the internal problems prevent a single index being removed.
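The audit and rebuild steps can be combined, as in the sketch below. It assumes that indexSpec can still be retrieved from a collection whose indexes failed the audit, which may not hold for every kind of damage; commits are omitted.

```smalltalk
| spec |
"Rebuild the indexes of every collection that fails auditIndexes."
IndexManager current nscsWithBadIndexes do: [:nsc |
    spec := nsc indexSpec.                            "save the index description first"
    IndexManager current removeAllIndexesOn: nsc.
    spec createIndexesOn: nsc ].
```

Running aCollection auditIndexes again afterward confirms whether the rebuild resolved the reported problems.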
The purpose of indexes is, of course, to improve performance. It is always recommended to run tests to verify the performance improvement.
Indexing improves query performance dramatically (in most cases), but does have a negative impact on updating the indexed data, since the indexes must be kept up to date.
The performance characteristics of btreePlus and legacy indexes are quite different.
btreePlus indexes without optimized comparison are usually slower than other kinds of indexes. If your desired index cannot support optimizedComparison, you should use a legacyIndex.
btreePlus optimizedComparison indexes are usually considerably faster than a legacy index, but they create a somewhat larger negative impact on data updates.
As your application is in use and the data in the indexed collection changes, the index must be updated. While normally indexing a large collection speeds up queries performed on that collection and has little effect on other operations, there are cases in which maintaining the index can cause a performance bottleneck.
For example, you may notice slower than acceptable performance if you are making a great many modifications to the instance variables of objects that participate in an index, and more than one of the following is true:
Even so, indexing a large collection is still likely to improve performance unless more than one of these circumstances holds true. If you do experience a performance problem, you can work around it in one of two ways:
If you have created relatively few indexes but are modifying many indexed objects, it may be worthwhile to remove the indexes, modify the objects, and then re-create the indexes.
If you are making many modifications to only a few objects, or if you have created a great many indexes, it is more efficient to commit frequently during the course of your work. That is, modify a few objects, commit the transaction, modify a few more objects, and commit again.
The most efficient queries are the ones in which the first predicate will return the smallest result set. This is sometimes easy for a human to determine, but the query cannot predict this without actually running the query. Queries should be manually reviewed for these kinds of domain-specific optimizations.
For example, you might want to query for current orders for a particular customer.
(each.status = #current) & (each.customer.name = 'Smith')
If your application is likely to have only a few current orders, then this ordering is efficient. However, if you are likely to have many current orders, but only a few customers named Smith, it would be more efficient to write the predicates in reverse order.
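Written with the more selective predicate first, the query might look like this sketch. The collection name MyOrders is an assumption, borrowed from the purchase order example earlier in the chapter.

```smalltalk
"Sketch: the predicate expected to match fewer elements comes first."
(GsQuery
    fromString: '(each.customer.name = ''Smith'') & (each.status = #current)'
    on: MyOrders) queryResult
```

Which ordering wins depends on your data, so it is worth timing both forms against a realistic collection.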
Queries, by default, are optimized before execution; for example, the not operator is transformed into the logical equivalent by changing the comparison operator.
In addition, the predicates are reordered as follows, from left to right:
1. predicates involving indexed paths.
2. predicates with identity comparisons on paths without indexes.
3. predicates with equality comparisons on paths without indexes.
Auto-optimization can be disabled using the instance of GsQueryOptions that is associated with each query. The GsQueryOptions instance controls optimization and other query features, including whether automatic query optimization is performed at all; the default is to auto-optimize.
In older versions of GemStone/S and GemStone/S 64 Bit, indexes and queries used a more limited API based on UnorderedCollection methods and a block-like query syntax. This API remains fully supported and interoperates with the GsIndexSpec/GsQuery API, with some limitations. A number of features are not supported by the older API.
UnorderedCollection provides protocol to create indexes. This creates the same index structures as GsIndexSpec, but does not provide access to some index features.
The following index creation methods are defined on UnorderedCollection:
createIdentityIndexOn:
createEqualityIndexOn:withLastElementClass:
The path argument is the same as the path used to create a GsIndexSpec index, except that you may not include the initial "each".
For example, the following three statements create the same indexes that were created earlier in this chapter.
myEmployees createIdentityIndexOn: 'userId'.
myEmployees
createEqualityIndexOn: 'age'
withLastElementClass: SmallInteger.
myEmployees
createEqualityIndexOn: 'address.state'
withLastElementClass: String.
Enumerated and set-valued indexes and queries are not supported using the historic API.
The use of legacyIndex or btreePlusIndex/optimizedComparison is based on the default GsIndexOptions. The session or system default determines the type of index that is created.
Indexes on various kinds of strings follow the same rules as GsIndexSpec string indexes, with the exception that the optimized indexes cannot be created this way.
To create Unicode indexes, specify a lastElementClass of any Unicode string class (Unicode7, Unicode16, or Unicode32). Since no collator can be specified, the index will be created using the current default IcuCollator.
An Rc Equality Index is a type of Equality Index in which internal indexing structures are reduced-conflict. This avoids some transaction conflicts when creating an index on a reduced-conflict (RC) collection, such as RcIdentityBag. Reduced-conflict classes are described in Indexes and Concurrency Control. Rc Equality indexes are described under Reduced-Conflict.
Using UnorderedCollection index creation protocol to create an index, the message is:
createRcEqualityIndexOn:withLastElementClass:
Selection blocks are a kind of block specialized for queries, using curly braces instead of brackets. The compiler understands this syntax and creates the selection block instance when the code or method is compiled.
A selection block query might be written like this:
{:each | each.address.state = 'OR'}
Selection blocks are quite restrictive:
In selection block queries, you can reference temporary, instance or other variables within the block, and these are resolved at runtime as in ordinary blocks.
A selection block is used with select:, reject:, detect:, detect:ifNone:, or selectAsStream: to perform the query over a collection.
Employees select: {:each | each.address.state = 'OR'}
These have the same semantics as with standard blocks executed on a collection. For example, reject: will return a result set that includes all elements for which the block evaluation would return false. The results are in a collection of the same class as the base collection (unless species or speciesForSelect specifies a different class, as with the RC classes).
The collection returned from a query has no index structures. If you want to perform indexed selections on the new collection, you must build the necessary indexes on the new collection.
To get the results as a stream, use UnorderedCollection >> selectAsStream:. This returns an instance of RangeIndexReadStream, which understands the following messages:
next
Returns the next value on a stream of range index values.
atEnd
Returns true if there are no more elements to return through the logical iteration of the stream.
reversed
Creates a ReversedRangeIndexReadStream based on the receiver, allowing you to stream over the results from last to first.
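The stream messages above can be combined into a simple loop. This is a sketch only: it assumes that an equality index already exists on the age path of Employees, and the predicate and variable names are illustrative.

```smalltalk
| stream results |
"Assumption: an equality index exists on the age path of Employees."
results := OrderedCollection new.
stream := Employees selectAsStream: {:each | each.age > 40}.
"Collect each result in turn until the stream is exhausted."
[stream atEnd] whileFalse: [results add: stream next].
```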
If you have existing code that includes selection block queries, you can use those selection blocks to create instances of GsQuery.
GsQuery fromSelectBlock: {:each | each.address.state = 'OR'}
This can be bound using on:, or created using fromSelectBlock:on:, similar to how you create and bind a GsQuery from a string.
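The two ways of binding might look like the following sketch, which reuses the Employees collection from the earlier examples; the temporary variable names are illustrative.

```smalltalk
| query boundQuery |
"Create the query first, then bind it to a collection with on:."
query := GsQuery fromSelectBlock: {:each | each.address.state = 'OR'}.
query on: Employees.
"Or create and bind the query in one step."
boundQuery := GsQuery
    fromSelectBlock: {:each | each.address.state = 'OR'}
    on: Employees.
```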
Sending indexSpec to the collection provides a complete description of the indexes on a collection, and can be used for information without using the GsIndexSpec API; the extra details provided by indexSpec can be ignored.
You can also send messages to the collection that will return quick information on indexed paths.
equalityIndexedPaths and identityIndexedPaths
Returns, respectively, the equality indexes and the identity indexes on the receiver’s contents. Each message returns an array of strings representing the paths in question.
For example, the following expression returns the paths into myEmployees that bear equality indexes:
myEmployees equalityIndexedPaths
%
anArray( 'age', 'address.state')
kindsOfIndexOn: aPathNameString
Returns information about the kind of index present on an instance variable within the elements of the receiver. The information is returned as one of these symbols: #none, #identity, #equality, #identityAndEquality.
equalityIndexedPathsAndConstraints
Returns an array in which the odd-numbered elements are the elements of the path, and the even-numbered elements are the constraints specified when creating an index using the keyword withLastElementClass:.
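The remaining messages are sent in the same way as equalityIndexedPaths. The sketch below assumes that myEmployees carries the indexes shown in the earlier example; results are not shown, since they depend on the indexes actually present.

```smalltalk
"Assumption: myEmployees has indexes on 'age' and 'address.state'."
"Answers a symbol describing the kind of index on the age path."
myEmployees kindsOfIndexOn: 'age'.
"Answers an array alternating paths and their constraints."
myEmployees equalityIndexedPathsAndConstraints.
```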
Indexes can be removed using a GsIndexSpec.
You can also send messages directly to the indexed collection to remove one or all indexes.
UnorderedCollection >> removeEqualityIndexOn: aPathString
Removes an equality index from the path indicated by aPathString. If the path specified does not exist, this method returns an error. Implicit indexes are not removed.
UnorderedCollection >> removeIdentityIndexOn: aPathString
Removes the identity index on the specified path. If the path specified does not exist, this method returns an error. Implicit indexes are not removed.
UnorderedCollection >> removeAllIndexes
Removes all explicitly created indexes from the receiver. Implicit indexes that were created by these elements participating in other indexed collections are not removed.
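As a brief sketch, the removal messages might be used as follows, assuming myEmployees has an equality index on the age path as in the earlier example.

```smalltalk
"Remove the equality index on the age path only."
myEmployees removeEqualityIndexOn: 'age'.
"Remove every remaining explicitly created index on the collection."
myEmployees removeAllIndexes.
```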