21. Testing and SUnit

SUnit is a framework that supports the creation of automated unit and application tests. This chapter discusses the role of repeatable unit tests and how to easily create these tests using SUnit.

This chapter was originally based on “SUnit Explained” by Stéphane Ducasse (http://www.iam.unibe.ch/~ducasse/Programmez/OnTheWeb/Eng-Art8-SUnit-V1.pdf) and is used by permission.

Application Testing
describes the general goals of automated testing, and the benefits of the SUnit framework.

SUnit example: ExampleSetTest
presents a step-by-step example that illustrates the use of SUnit.

The SUnit Framework and Implementation
describes the core classes of the SUnit framework.

Running a Single Test
explores key aspects of the implementation by following the execution of a test and test suite.

21.1 Application Testing

The value of testing

Many traditional development methodologies treat testing as a step that follows coding, and this step is often cut short when time pressures arise. Developing automated tests in parallel with the application can save time: it helps ensure that each part of the application works as intended as it is written and that new development does not break existing functionality, and it allows you to change the application with much higher confidence.

The Smalltalk community has a long tradition of testing, due to the incremental development supported by its programming environment. Testing is commonly done during development: a typical practice is to compile a method and then, from a workspace, write a small expression to test it, perhaps even keeping the expression in method comments. However, these examples can be difficult to understand after some time passes, and there is no easy way to keep track of them and to automatically run them. Unfortunately, tests that you cannot automatically run are less likely to be run. Moreover, having a code snippet to run in isolation often does not readily indicate the expected result.

Automated tests play several roles:

Automated tests are an active and always synchronized documentation of the functionality they cover. Tests can demonstrate how the application classes and methods are meant to be used together, and are a valuable training resource.
Writing tests at the same time as the code, or even before it, forces you to think about the functionality you want to design. You are required to clearly state the context in which your functionality will run, the way it will interact with other code, and, more importantly, the expected results. Moreover, when you are writing tests, you are your first client and your code will naturally improve.
The application lifecycle includes initial development, and also further development, bug fixes, and new features. With automated testing, you can have confidence that the changes you introduce are not breaking existing functionality, and you can easily locate and address any issues that arise.

By using “test-first” or “test-driven” development, eXtreme Programming proposes to write tests even before writing code. While this is counter-intuitive to the traditional “design-code-test” mindset, it can have a powerful impact on the overall result. Test-driven development can improve the design by helping you to discover the needed interface for a class and by clarifying when you are done (the tests pass!).

Writing good tests

It is clear that we cannot test all the aspects of an application. Covering a complete application is simply impossible and should not be the goal of testing. Even with a good test suite, defects can creep into the application and be left hidden waiting for an opportunity to damage your system. While there are a variety of test practices that can address these issues, the goal of regression tests is to ensure that a previously discovered and fixed defect is not reintroduced into a later release of the product.

Writing good tests is a skill that is easily learned through practice. To deliver maximum benefit, tests should have the following properties:

Repeatable. We should be able to easily repeat a test and get the same result each time.
Automated. Tests should be run without human intervention. You should be able to run them during the night.
Tell a story. A test should cover one aspect of a piece of code. A test should act as a specification for a unit of code.
Resilient. You should not have to change many tests for every application change. One way to achieve this property is to write tests based on the interfaces of the tested functionality.

In addition, for test suites, the number of tests should be roughly proportional to the bulk of the tested functionality. For example, changing one aspect of the system might break some tests, but it should not break all of them. This is important because 100 broken tests should send you a much stronger message than 10 failing tests.

Why SUnit?

SUnit is a mature public-domain framework, widely available for most dialects of Smalltalk. SUnit is easy to use for basic testing, and also includes support for more sophisticated testing requirements.

The value of SUnit is that it provides a code framework to describe the context of your tests as well as to run them automatically. With a set of SUnit tests, you can quickly add tests that become part of an automated test suite. This represents a vast improvement over writing small code snippets in an ephemeral workspace.

SUnit was originally developed by Kent Beck and was extended by Joseph Pelrine and others. Its influence is not limited to Smalltalk: SUnit inspired testing frameworks in many other languages, such as JUnit for Java and RUnit for R. The language-independent family of frameworks is known as xUnit.

"There are simpler ways of running automated tests, and many, many more complicated ways, but this architecture seems to hit a sweet spot."

Portable SUnit classes and GemStone’s SUnit classes

The SUnit framework is available in many Smalltalk dialects. Writing application tests using the base SUnit classes (TestCase, TestResult, TestSuite, etc.) allows your tests to be portable to other Smalltalk dialects.

The GemStone distribution also includes a GemStone-specific refinement of SUnit, via the subclasses GsTestCase, GsTestResult, and GsTestSuite. These classes provide additional test methods and support for GemStone-specific behavior in testing.

There are some differences in behavior. When using GsTestCase, results are written to log files, including SUnit.log and SUnitDefects.log. Also, an instance of GsTestResult does not preserve the actual TestCases, as TestResult does; instead, the printString of each is stored.

Terminology: test errors vs. failures

SUnit recognizes two kinds of defects: not getting the correct answer (a failure) and not completing the test (an error).

Usually, a failure indicates that there is a problem in the code you are testing; the test is working correctly to uncover defects in your code. Of course, it may also be the case that the test itself includes incorrect logic.

An error indicates just that: an unhandled code error occurred, such as a divide by zero, and the test could not continue running. This may indicate a problem in the SUnit test code itself. For example, your test class may send an application class a message that it does not implement, so that a message-not-understood error occurs. You should either remove that message send from the test, or set up your test to expect that particular error.
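The distinction can be seen in a pair of deliberately broken test methods. This is a minimal sketch: the class name MyDefectTest and both selectors are hypothetical and are not part of the GemStone distribution, and it assumes that the division by zero is left unhandled, as in the case described above.

```smalltalk
"Hypothetical TestCase subclass, shown only to contrast the two kinds of defect."

MyDefectTest>>testReportedAsFailure
	"The assertion receives false: the test completes but gets the wrong
	 answer, so it is counted as a failure."
	self assert: (Set new includes: 5).

MyDefectTest>>testReportedAsError
	"The division signals an exception that the test does not handle,
	 so the test cannot complete and is counted as an error."
	| quotient |
	quotient := 1 / 0.
	self assert: quotient = 0.
```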

21.2 SUnit example: ExampleSetTest

Before going into the details of SUnit, let’s look at a simple example. This example tests the class Set. The example test class, ExampleSetTest, is included in the GemStone distribution, so that you can read the code directly in the image and execute the tests.

Define the new testing class

To start off, look at the definition of the class ExampleSetTest, which is a subclass of TestCase.

Example 21.1

TestCase subclass: 'ExampleSetTest'
	instVarNames: #( full empty)
	classVars: #()
	classInstVars: #()
	poolDictionaries: #()
	inDictionary: Globals

The class ExampleSetTest here is intended to contain all tests related to the class Set.

It establishes the context of all the tests that we will specify. Here, the context is described by specifying two instance variables, full and empty, that represent a full and empty set, respectively.

This example is a subclass of TestCase; you may also subclass GsTestCase for testing that involves GemStone-specific behavior.

Define setUp and tearDown methods

Testing often requires a context in which to run the tests, which is established by the method setUp. This method is automatically invoked before the execution of each test method defined in a subclass of TestCase.

Here, we initialize the instance variable empty to refer to an empty set, and the full instance variable to refer to a set containing two elements.

Example 21.2

ExampleSetTest>>setUp
	empty := Set new.
	full := Set with: 5 with: #abc.

The complementary method tearDown allows you to clear or release any resources that were used during the tests, so you can restore state to exactly as it was prior to testing.

A setUp or tearDown method is not required if the testing code does not need to take any action. ExampleSetTest neither needs nor has a tearDown method.
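If a test class did hold something that needed releasing, tearDown would mirror setUp. The sketch below is purely illustrative: ExampleSetTest as shipped has no tearDown method, and clearing the two instance variables is only an assumption about what such a method might do.

```smalltalk
ExampleSetTest>>tearDown
	"Hypothetical: drop the references created by setUp so that
	 no state carries over once the test method has finished."
	empty := nil.
	full := nil.
```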

Define the actual testing methods

We start off by defining three methods to test the class Set. Each method represents one test.

The selectors for methods that represent tests normally start with 'test'. Methods that match this pattern are automatically collected by SUnit for testing. You may need to implement support methods that are not themselves tests; the selectors of these methods should not begin with 'test'.
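As an illustration of the naming rule, the sketch below pairs a support method with a test that uses it. Both selectors are hypothetical additions and are not part of the ExampleSetTest class in the distribution; the instance variable full is the one initialized by setUp in Example 21.2.

```smalltalk
ExampleSetTest>>assertFullIncludes: anObject
	"Support method. Its selector does not begin with 'test',
	 so it is not collected as a test in its own right."
	self assert: (full includes: anObject).

ExampleSetTest>>testIncludesViaHelper
	"Collected as a test, because its selector begins with 'test'."
	self assertFullIncludes: 5.
	self assertFullIncludes: #abc.
```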

Example 21.3

ExampleSetTest>>testIncludes
	self assert: (full includes: 5).
	self assert: (full includes: #abc).

ExampleSetTest>>testOccurrences
	self assert: (empty occurrencesOf: 0) = 0.
	self assert: (full occurrencesOf: 5) = 1.
	full add: 5.
	self assert: (full occurrencesOf: 5) = 1.

ExampleSetTest>>testRemove
	full remove: 5.
	self assert: (full includes: #abc).
	self deny: (full includes: 5).

Each method starts executing with an initial state defined by the setUp method. So the testIncludes method tests that, after running the setUp method in Example 21.2, the full instance variable includes both 5 and #abc.

While the testRemove method removes an element from full, this has no effect on any other tests that might run later, since the initial state for each test comes from the setUp method.

Execute the tests

Tests can be run by executing code in Topaz, GBS, or other tools.

To run a single test, for example, execute the following code:

ExampleSetTest run: #testRemove.

To run all tests on the class, execute

ExampleSetTest suite run

These expressions return an instance of the TestResult class, showing the tests that errored, failed, and passed. You must examine the returned object to determine whether the tests passed.
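The returned object can be examined in code as well as in an inspector. The following sketch assumes that TestResult responds to the accessors found in the standard SUnit protocol (hasPassed, runCount, failureCount, errorCount, failures, and errors); check the class in your image before relying on these names.

```smalltalk
| result |
result := ExampleSetTest suite run.

"Assumed TestResult accessors; verify them in your image."
result hasPassed.      "true only when there are no failures and no errors"
result runCount.       "number of tests that were run"
result failureCount.   "number of tests whose assertions failed"
result errorCount.     "number of tests that could not complete"
result failures.       "collection describing the failed tests"
result errors.         "collection describing the tests that errored"
```

Each expression can be evaluated on its own to display the corresponding value.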

During test development, it is much more useful to run the test and bring up a walkback if an error or failure occurs. To do this, execute:

ExampleSetTest debug: #testRemove.

Assertions and other claims in test methods

SUnit provides a number of methods to support verifying the results from code execution.

Checking result value with assert: and deny:

The method TestCase>>assert: requires a single argument, a boolean that represents the value of a tested expression. When the argument is true, the expression is considered to be correct, and we know that the test is valid. When the argument is false, then the test failed.

The method deny: is the negation of assert:. Hence

self deny: Set new size > 0.

is equivalent to

self assert: (Set new size > 0) not.

There are other methods, such as assert:description: and assert:equals:; see the image for details.
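As a rough sketch of how these two selectors are typically used, the hypothetical test method below attaches a description to one assertion and compares two values in another. The argument usage shown follows common SUnit practice and is an assumption here; confirm the exact behavior against the method comments in your image.

```smalltalk
ExampleSetTest>>testSizeWithDescriptions
	"Hypothetical test method, not part of the distribution."

	"The description string is reported if the assertion fails."
	self assert: (full includes: 5) description: 'full should contain 5'.

	"Passes when the two arguments are equal."
	self assert: full size equals: 2.
```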

The method should: takes a block. For example,

self should: [Set new size > 0].
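Note that the expression above illustrates the syntax only: a new Set has a size of 0, so that particular assertion would fail. The sketch below shows should: with a block that evaluates to true, alongside its negation shouldnt:. The test selector is hypothetical; full and empty are the instance variables initialized by setUp in Example 21.2.

```smalltalk
ExampleSetTest>>testSizeUsingBlocks
	"Hypothetical test method, not part of the distribution."

	"should: passes when the block evaluates to true."
	self should: [full size > 0].

	"shouldnt: passes when the block evaluates to false."
	self shouldnt: [empty size > 0].
```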

Checking for exceptions with should:raise:

In addition to testing for correct results, you may need to test how your application responds to abnormal input or unexpected conditions by raising an error. This is supported by the should:raise: and shouldnt:raise: methods.

Test for general error

Because SUnit runs on a variety of Smalltalk dialects, the SUnit framework factors out the variant parts (such as the name of the exception). This pattern is important if you plan to write tests that are intended to be cross-dialect.

The following assertion catches Error, or any subclass of Error:

self should: [empty at: 5] raise: TestResult error.

Specific named error

You can also test for specific exception classes. This code may be portable provided that the target Smalltalk dialect implements the specific Error class. For example, MessageNotUnderstood is a fairly standard error. However, GemStone and other Smalltalk dialects will include many dialect-specific error classes that will not be portable.

self should: [empty foo] raise: MessageNotUnderstood.
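The two methods can be combined in one test: one assertion states that an exception must occur, the other that it must not. This is a sketch only; the test selector is hypothetical, and foo is simply the unimplemented selector used in the expression above.

```smalltalk
ExampleSetTest>>testUnknownMessage
	"Hypothetical test method, not part of the distribution."

	"Passes because Set does not understand foo."
	self should: [empty foo] raise: MessageNotUnderstood.

	"Passes because a well-behaved expression raises no Error."
	self shouldnt: [full includes: 5] raise: TestResult error.
```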

GemStone-specific legacy error numbers

GemStone’s legacy error handling used numbers, rather than classes, and some ANSI error classes map to multiple more specific error numbers. For example, [empty at: 5] throws ANSI class Error, with legacy number 2007/#rtErrShouldNotImplement.

A test for Error or TestResult error will catch MessageNotUnderstood as well as Error. Using a subclass of GsTestCase, rather than of TestCase, you may use the legacy error numbers in place of the error class to test for more specific errors.

In a subclass of GsTestCase:

self should: [empty at: 5] raise: 2007.

This will catch #rtErrShouldNotImplement, but will not catch MessageNotUnderstood (legacy error 2010).

With a subclass of GsTestCase, you may use both ANSI exception classes and GemStone legacy error numbers interchangeably. However, testing for legacy error numbers will not be portable to other Smalltalk dialects.

21.3 The SUnit Framework and Implementation

While you can use SUnit without understanding its implementation, knowing how it works may be useful when you need to customize your test suite.

SUnit Core Classes

SUnit is implemented by four main classes: TestSuite, TestCase, TestResult, and TestResource. See Figure 21.1. (Note that this is an object composition diagram, not a class hierarchy diagram.)

Figure 21.1 The SUnit Core Classes

TestSuite

The class TestSuite represents a collection of tests. A TestSuite contains instances of subclasses of TestCase and other instances of TestSuite. The classes TestSuite and TestCase form a composite pattern in which TestSuite is the composite and TestCase is the leaf.
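The composite structure can be sketched by building a suite by hand. The example assumes that TestSuite responds to addTest:, as in standard SUnit; the suite created here is an illustration and, as written, runs testRemove twice (once inside the nested suite and once as a separate leaf).

```smalltalk
| suite result |
suite := TestSuite new.

"Composite: add a whole suite built from the test class."
suite addTest: ExampleSetTest suite.

"Leaf: add a single TestCase instance for one selector."
suite addTest: (ExampleSetTest selector: #testRemove).

result := suite run.
```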

TestCase

The class TestCase represents a family of tests that share a common context. The context is specified by instance variables on a subclass of TestCase and by the specialization method setUp, which initializes the context in which each of the tests will be executed. The method setUp is invoked before the execution of every test.

The class TestCase also defines the method tearDown, which is responsible for cleanup, including releasing the objects allocated by setUp. The method tearDown is invoked after the execution of every test.

TestResult

The class TestResult represents the results of a TestSuite execution. This includes a description of which tests passed, which failed, and which had errors.

TestResource

Recall that the setUp method is used to create a context in which the test will run. Often that context is quite inexpensive to establish, as in Example 21.2 seen earlier, which creates two instances of Set and adds two objects to one of those instances.

At times, however, the context may be comparatively expensive to establish. In such cases, the prospect of re-establishing the context for each run of each test might discourage frequent running of the tests. To address this problem, SUnit introduces the notion of a resource that is shared by multiple tests.

The class TestResource represents a resource that is used by one or more tests in a suite, but instead of being set up and torn down for each test, it is established once before the first test and reset once after the last test. By default, an instance of TestSuite defines as its resources the list of resources for the TestCase instances that compose it.

As shown in Example 21.4, a resource is identified by overriding the class method resources. Here, we define a subclass of TestResource called MyTestResource. We associate it with MyTestCase by overriding the class method resources on MyTestCase to return an array of the resource classes that the test case uses.

Example 21.4

MyTestCase class>>resources
	"associate a resource with a testcase"
	^ Array with: MyTestResource.

As with a TestCase, we use the method setUp to define the actions that will be run during the setup of the resource.
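A resource class might look like the following sketch. Everything in it is hypothetical: the instance variable largeSet, the accessor, the test method, and the choice of UserGlobals as the dictionary. It assumes that the framework invokes setUp when it creates the resource and tearDown when the resource is reset, and that tests reach the shared instance through the class method current.

```smalltalk
TestResource subclass: 'MyTestResource'
	instVarNames: #( largeSet)
	classVars: #()
	classInstVars: #()
	poolDictionaries: #()
	inDictionary: UserGlobals

MyTestResource>>setUp
	"Runs once for the resource, not once per test."
	largeSet := Set new.
	1 to: 10000 do: [:i | largeSet add: i].

MyTestResource>>tearDown
	"Runs when the resource is reset after the tests have finished."
	largeSet := nil.

MyTestResource>>largeSet
	^largeSet

MyTestCase>>testLargeSetSize
	"The test reads the shared context instead of rebuilding it."
	self assert: MyTestResource current largeSet size = 10000.
```

Tests that use a shared resource should treat it as read-only; a test that modifies largeSet would affect every test that runs after it.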

Running a Single Test

To execute a single test, we evaluate the expression

(TestCase selector: aSymbol) run.
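In practice the receiver is a concrete subclass of TestCase and the argument is one of its test selectors. For the earlier example this might look like the sketch below, which is expected to produce the same kind of TestResult as the expression ExampleSetTest run: #testRemove shown in section 21.2.

```smalltalk
| result |
"Create a TestCase instance bound to one test selector, then run it."
result := (ExampleSetTest selector: #testRemove) run.
result
```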

The method TestCase>>run creates an instance of TestResult to contain the result of the executed tests, and then invokes the method TestCase>>run:, which in turn invokes the method TestResult>>runCase:. See Figure 21.2.

Figure 21.2 TestCase instance methods run and run: (source code)

TestCase>>run
	| result |
	result := TestResult new.
	[self run: result]
		ensure: [TestResource resetResources: self resources].
	^result.

TestCase>>run: aResult
	aResult runCase: self.

The runCase: method (Example 21.5) invokes the method TestCase>>runCase, which executes a test. Without going into the details, TestResult>>runCase: watches for any exception that may be raised during the execution of the test, runs the TestCase by calling the method runCase, and counts the errors, failures, and passed tests.

Example 21.5 TestResult instance method runCase: (source code)

TestResult>>runCase: aTestCase
	[aTestCase runCase.
	self addPass: aTestCase]
		on: self class failure , self class error
		do: [:ex | ex sunitAnnounce: aTestCase toResult: self]

As shown in Example 21.6, the method TestCase>>runCase calls the methods setUp and tearDown.

Example 21.6 TestCase instance method runCase (source code)

TestCase>>runCase
	self resources do: [:each | each availableFor: self].
	[self setUp.
	self performTest]
		ensure: [
			tornDown ifNil: [
				tornDown := true.
				self tearDown]]

Running a TestSuite

To execute more than a single test, we invoke the method TestSuite>>run on a TestSuite (see Example 21.7). The class TestCase provides the functionality to build a test suite from its methods. The expression MyTestCase suite returns a suite containing all the tests defined in the class MyTestCase.

The method TestSuite>>run creates an instance of TestResult, verifies that all the resources are available, then invokes the method TestSuite>>run: to run all the tests that compose the test suite. All the resources are then reset.

Example 21.7 TestSuite instance methods run and run: (source code)

TestSuite>>run
	| result |
	result := TestResult new.
	[self run: result]
		ensure: [TestResource resetResources: self resources].
	^result

TestSuite>>run: aResult
	self tests do: [:each |
		self sunitChanged: each.
		each run: aResult]

The class TestResource and its subclasses use the class method current to keep track of their currently created instances (one per class). This instance is cleared when the tests have finished running and the resources are reset. The resources are created as needed. See Example 21.8.

Example 21.8 TestResource class methods isAvailable and current (source code)

TestResource class>>isAvailable
	^self current notNil

TestResource class>>current
	current ifNil: [current := self new].
	^current