19. The Foreign Function Interface

This chapter describes the Foreign Function Interface (FFI) classes and methods, and how you can use them to build an interface to an existing C library.

Overview of the Foreign Function Interface
The purpose and use of the FFI.

Using the FFI
Describes the specific steps and classes involved in using the FFI.

Example using Zlib
Simple example using Zlib.

Example using CCalloutStructs and variable arguments with RabbitMQ®
An example illustrating the use of CCalloutStructs and variable arguments.

19.1 Overview of the Foreign Function Interface

For certain applications, you may need to provide functionality that is not readily available within GemStone Smalltalk. This may include functionality that can be provided by third-party products that provide a C interface, such as the zlib compression library, data encryption, Oracle, MySQL, or other databases, and so on.

To interact with third-party products such as these, you can use the FFI to make C library calls from within GemStone Smalltalk. Using the FFI, you can access C functions in external libraries without the need to write and compile UserAction libraries.

The FFI is composed of a number of classes: CLibrary, CHeader, CDeclaration, CCallout, CCalloutStructs, CByteArray, and others.

These classes can be used to hand-craft access to a single or a few simple functions; they also allow you to parse the C header file to create new classes for C libraries, with Smalltalk methods that access each C function, and new classes for complex C data types.

Parsing C header files requires that you have gcc/g++ installed in your development environment. Once you have created C library interface classes for your application, the C compiler is not needed to execute the FFI code.

The GemStone image includes the class GciLibrary, which is an FFI interface class that allows calls to GemStone’s C interface (GCI). You may examine the class GciLibrary as an example of the results of creating an FFI wrapper class.

19.2 Using the FFI

Making a C function call involves defining an instance of CCallout for the specific function, and then invoking that CCallout with the appropriate arguments at runtime.

The CCallout definition includes the specifications of the various argument and return types, and the name of the compiled library that supports that function. The defined CCallout instances can be persistent and are reusable.

The FFI supports parsing the header file that accompanies C libraries, to simplify and automate the process of defining CCallouts and making these easy to use from GemStone Smalltalk.

CLibrary: defining the compiled C or C++ library

To make a call to a third-party or custom C or C++ library, you will need the compiled C library file (.so or .dylib). The file is loaded into the gem or linked process by a primitive that calls dlopen; see "man dlopen" for the operating system details of library search order. An absolute path can be used, or the environment variable LD_LIBRARY_PATH can be used to specify a search path.

Instances of CLibrary are created using:

CLibrary class >> named:libraryName

passing in the path and name of the C shared library to be loaded. The platform-specific extension (such as .so) is optional.

To generate support classes and methods, the FFI interface uses the header file, which is also located by name. These are normally made available along with compiled libraries.

CCallout: function definitions in GemStone

Functions in a C library are accessed via instances of CCallout or a subclass, including CCalloutStructs; there is one CCallout instance for each function you wish to call directly from the FFI. The CCallout is created with the function name and declarations for the result and argument types.

For example,

myCCallout := CCallout 
	library: 	(CLibrary named: 'myLib.so')
	name: 'DoSomething'
	result: #'uint64'
	args: #(#'char*' #'uint64').

You will need a CCalloutStructs if there is a structure (passed by value) either as an argument or a result. These are handled somewhat differently and have more restrictions than CCallouts; see CCalloutStructs: Using C Structures passed by value.

Creating CCallouts by hand is tedious and error prone, especially if variable arguments or structures are involved. In general, you should parse the C header file, using CHeader class wrapper:* methods, to create the CCallout definitions.

The CCallout definitions include the C types of all the arguments (except variable arguments, which are declared when the function is called), and the type of the result. For the supported argument type symbols, see C type symbols.

To ensure you have the correct version of the function, you may send version: to the CCallout; this string will be used as the third argument to dlvsym().

Note that while UserActions creation performs checks of your code against function prototypes of the external library, the FFI does not do this checking. You are responsible for ensuring that the CCallout definition matches the C call.
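To make the matching requirement concrete, the following is a minimal sketch of what the C side of the DoSomething callout might look like. The function body is purely hypothetical (this chapter never shows it); only the signature matters here. A char* argument, a 64-bit unsigned argument, and a 64-bit unsigned result line up with args: #(#'char*' #'uint64') and result: #'uint64'.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical library function; its behavior is invented for illustration.
 * The signature corresponds to:
 *     result: #'uint64'   args: #(#'char*' #'uint64')
 * It fills buf with size-1 'x' characters, terminates the string,
 * and returns the number of characters written. */
uint64_t DoSomething(char *buf, uint64_t size)
{
    assert(buf != NULL && size > 0);   /* caller must supply a real buffer */
    uint64_t n = size - 1;
    memset(buf, 'x', (size_t)n);
    buf[n] = '\0';
    return n;
}
```

If the CCallout declared #'uint32' as the result or swapped the two argument symbols, nothing in the FFI would detect the mismatch; the call would simply pass or return the wrong data.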

CHeader: Parsing the C header file

The class CHeader allows you to parse the C header file or files that correspond to the C library you wish to call. You can examine the functions in that library, and define a class and instance methods to invoke the C functions.

Header files are parsed using the methods:

CHeader class >> path: headerFileOrPath
You may pass in the full path and name of a header file, or just the name of the header file. If headerFileOrPath does not fully specify a file, the parser looks in the system search path, including the current directory, /usr/include/, and /usr/local/include/. These paths are also used to locate any files that are parsed due to include statements. You can look up the current search path using: CPreprocessor new allSearchPaths

CHeader class >> path: headerFileOrPath searchPath: aPath
Search for headerFileOrPath and its include files by looking first in aPath, and then in the current directory and the system search path.

CHeader class >> path: headerFileOrPath searchPaths: collOfPaths
Search for headerFileOrPath and its include files by searching first in the directories in collOfPaths in order, then in the current directory and the system search path.

Parsing the header file - that is, creating the instance of CHeader - does not give you the location of the actual C library file that you will be calling. To define or invoke the call to an FFI function, you will also need to specify the compiled C library name.

Parsing the file does not itself create a class or methods.

CHeader parsing is also used to create classes and methods for struct data types; see CByteArray: Allocating memory for data.

CDeclaration: Finding details on a C function

The instance of CHeader that is returned by the parsing methods contains definitions for the functions, data types, and other C elements in the header file and in all included header files. The specific function calls are presented as instances of CDeclaration, and you may view the source code and the processed representation of the function prototype.

For example:

((CHeader path: 'myLib.hf') functions at: #DoSomething) 		
	source 
%
extern uint32 DoSomething (char *__buf, size_t __size)
	__attribute__ ((__nothrow__ , __leaf__)) ;
 
((CHeader path: 'myLib.hf') functions at: #DoSomething)
	printString
%
extern uint32 DoSomething(uint8 *__buf, typedef uint64 size_t __size)

Creating a Class and Methods from the CHeader

The instance of CHeader can be used to create classes and methods for the CDeclaration types and functions that were parsed. The resulting class contains methods that allow you to invoke the C function from a Smalltalk method. The class will include reusable persistent instances of CCallout and CCalloutStructs.

  • The name of the class to create is supplied as an argument. Method variants exist that create a default name, which is the library name without ’lib’; see the image for details.
  • When creating this class, the library name is an additional argument, and is stored in a class variable. Method variants exist that allow you to include an expression that is executed to look up the library name; see the image for details.
  • The instance of CHeader includes definitions for every function in the header file and in every included header file. You may limit the set of functions that are made accessible in the Smalltalk class using a select block. The select block should evaluate to a Boolean that indicates whether or not to include the particular function.

For example, the following code parses a C header file and creates a class named MyLibrary that contains methods for all functions whose names begin with an uppercase letter.

| wrapperClass |
wrapperClass := (CHeader path: 'myLib.hf')
	wrapperNamed: 'MyLibrary'
	forLibraryAt: 'myLib.so'
	select: [:each | each name first isUppercase].
wrapperClass initializeFunctions.
UserGlobals at: wrapperClass name put: wrapperClass.

The instance methods created by this code invoke a persistent instance of CCallout or CCalloutStructs for the specified library name, so you can invoke the method with only the arguments that are specific to that function.

CByteArray: Allocating memory for data

The FFI, in general, allows you to work with C objects without needing to malloc and free memory yourself.

A CByteArray represents an allocation of C memory. When objects such as pointers or strings are passed to or from C functions, creating a CByteArray, using a method that invokes CByteArray gcMalloc:, ensures that the memory will be valid following the call; and that once the object is dereferenced, the associated heap memory will be freed.

When creating a String, for example, using CByteArray class>>withAll:, the operation mallocs memory for the string, fills the CByteArray from the string contents, and adds the terminal 0 byte.

You may allocate memory using CByteArray >> malloc:, which is not automatically freed; this may be useful for some interfaces that require long-lived data types.

You can add to and retrieve Smalltalk objects from a CByteArray using data type specific methods. For example:

CByteArray >> int64At: zeroBasedOffset
CByteArray >> int64At: zeroBasedOffset put: anInteger

See the image for the many other available access methods, including methods to add and extract Arrays and adding and extracting strings as UTF-8 bytes.
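The zero-based offset convention of these accessors is the ordinary C view of a block of bytes. The following sketch shows the equivalent operations in plain C, under the assumption that the accessors read and write a native 8-byte integer at the given byte offset. The helper names put_int64 and get_int64 are hypothetical and are not part of any GemStone API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 64-bit integer at a zero-based byte offset inside a block of
 * `size` bytes. memcpy is used so that unaligned offsets are safe. */
void put_int64(unsigned char *mem, size_t size, size_t offset, int64_t value)
{
    assert(offset + sizeof value <= size);   /* stay inside the allocation */
    memcpy(mem + offset, &value, sizeof value);
}

/* Fetch the 64-bit integer stored at a zero-based byte offset. */
int64_t get_int64(const unsigned char *mem, size_t size, size_t offset)
{
    int64_t value;
    assert(offset + sizeof value <= size);
    memcpy(&value, mem + offset, sizeof value);
    return value;
}
```

With a 16-byte block, offset 0 addresses the first 8-byte slot and offset 8 the second; an offset of 9 or more would run past the end of the block.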

Complex data types

For struct and class data types, the CHeader that results from parsing C header files can be used to create subclasses of CByteArray, using CHeader>>wrapperForTypeNamed:. This allows creating instances of the correct size for the struct, and retrieving individual elements using instance methods rather than by offset into the CByteArray.

For example, the GCI includes a struct type GciFetchObjInfoArgsSType. To see the CDeclaration for this struct:

((CHeader path: '$GEMSTONE/include/gci.hf') types at: 	
		'GciFetchObjInfoArgsSType') source
%
typedef struct { int64 startIndex; int64 bufSize;  
	int64 	numReturned; GciObjInfoSType *info; ByteType *buffer; 
	int retrievalFlags; BoolType isRpc; } GciFetchObjInfoArgsSType;

A GemStone data class can be created for this definition by executing:

UserGlobals at: #GciFetchObjInfoArgsSType put: 
	((CHeader path: '$GEMSTONE/include/gci.hf')
		wrapperForTypeNamed: 'GciFetchObjInfoArgsSType')

The resulting class GciFetchObjInfoArgsSType is a subclass of CByteArray that includes instance creation methods that gcMalloc 48 bytes for the new instance,

GciFetchObjInfoArgsSType class >> new
	^self gcMalloc: 48 

and has (logical) getter and setter instance methods for startIndex, bufSize, and so on, that access the correct offsets within the CByteArray.

GciFetchObjInfoArgsSType >> bufSize
	^ self int64At: 8 
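The size of 48 bytes and the offset of 8 for bufSize follow from the C layout of the struct. The sketch below reproduces that arithmetic in plain C. The three stand-in typedefs are assumptions made so that the example compiles on its own; they are not copied from gci.hf, and the per-field offsets in the comments assume a platform with 8-byte pointers and 4-byte int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs: assumptions for illustration only. */
typedef struct GciObjInfoSType GciObjInfoSType;   /* opaque here */
typedef unsigned char ByteType;
typedef int BoolType;

typedef struct {
    int64_t startIndex;        /* offset 0  */
    int64_t bufSize;           /* offset 8  */
    int64_t numReturned;       /* offset 16 */
    GciObjInfoSType *info;     /* offset 24 */
    ByteType *buffer;          /* offset 32 */
    int retrievalFlags;        /* offset 40 */
    BoolType isRpc;            /* offset 44 */
} GciFetchObjInfoArgsSType;

/* bufSize always follows exactly one int64, whatever the pointer size. */
_Static_assert(offsetof(GciFetchObjInfoArgsSType, bufSize) == 8,
               "bufSize is expected at byte offset 8");

/* Runtime check of the total size; only meaningful with 8-byte pointers
 * and 4-byte int, so it is skipped on other platforms. */
void check_layout(void)
{
    if (sizeof(void *) == 8 && sizeof(int) == 4) {
        assert(sizeof(GciFetchObjInfoArgsSType) == 48);
    }
}
```

Three 8-byte integers, two 8-byte pointers, and two 4-byte fields add up to 48, which matches the gcMalloc: 48 in the generated instance creation method.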

Making FFI calls

To call the C function via the FFI, you will create an instance of CCallout that describes the C function, and send #callWith: to that instance of CCallout, passing in an array containing the argument values.

These arguments may include instances of CByteArray or CPointer, which handle allocating memory and ensure that the memory remains allocated while needed.

Executing a CCallout created by hand

For example, if the definition was created by hand:

| doSomethingCallout |
doSomethingCallout := CCallout 
	library: (CLibrary named: 'myLib.so')
	name: 'DoSomething'
	result: #'uint64'
	args: #(#'char*' #'uint64').

Then it can be invoked:

result := doSomethingCallout callWith: { 'string1' . 22 } 

Execute CHeader-generated definition

When using CHeader to generate the CCallouts and store them in a class, you will invoke the generated instance method; each argument is separate. The instance method names that are generated use _: to indicate each argument, in the order required by the C header file’s function definition.

To execute the function DoSomething in the generated MyLibrary class, for example:

result := MyLibrary new DoSomething_: 'string2' 	_: 33

Internally, when using wrapperNamed:forLibraryAt:select: to generate the MyLibrary class, the following method pairs were created, one pair for each function that matched the select block:

MyLibrary class >> initializeFunction_DoSomething_inLibrary: 
Function_DoSomething := CCallout 
		library: (CLibrary named: 'myLib.so')
		name: 'DoSomething'
		result: #'uint64'
		args: #(#'char*' #'uint64').

This method is executed by initializeFunctions, which stores the CCallout in a Class variable. The CCallout is accessed via the corresponding instance method. The method that was called in the example is defined as:

MyLibrary >> DoSomething_: theString _: theNumber
	^ Function_DoSomething callWith: { theString . theNumber }

Arguments and Return Values

When a CCallout is configured, the arguments are described as an array of C type symbols. Then, when the CCallout is used to make a C call, the values for these arguments are provided as an array that matches the CCallout’s configuration in type, size, and order.

For example, in the CCallout instance creation:

... args: #(#'char*' #'uint64').

And when this CCallout is invoked:

myCallout callWith: { 'string1' . 22 }

See C type symbols for the supported argument type symbols.

String type arguments

For a function argument of type char*, const char*, UChar* or const UChar*, an instance of CCallout will accept an argument of the appropriate kind of string, and copy the body of that string to C memory. However, that C memory is only valid for the duration of the call; after the function completes, the reference to memory is no longer valid.

If the string will be updated by the function or needs to be referenced later, a CByteArray should be created for the string arguments. Creating these via CByteArray class>>withAll: mallocs memory, adds the terminal nul, and performs tracking so that this memory is later freed.

For example:

( CByteArray withAll: 'abc' ) printString
%
aCByteArray size=4 gcFree=0x3 dead=false address=0x561d9150ade0

In this result, note the size includes the terminal zero byte, and the gcFree indicates that free() will be called when the instance is garbage collected.
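The reported size of 4 for the three-character string 'abc' is the usual C accounting for a zero-terminated string. The following sketch shows the same bookkeeping in plain C. The helper copy_with_terminator is hypothetical and is not the GemStone implementation; unlike a gcMalloc-style allocation, the caller here must call free() explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy a string into freshly malloc'd memory, including the terminating
 * zero byte, and report the allocation size through size_out.
 * Returns NULL if malloc fails. The caller must free() the result. */
char *copy_with_terminator(const char *s, size_t *size_out)
{
    assert(s != NULL && size_out != NULL);
    size_t size = strlen(s) + 1;     /* +1 for the terminal zero byte */
    char *mem = malloc(size);
    if (mem != NULL) {
        memcpy(mem, s, size);        /* copies the characters and the zero */
        *size_out = size;
    }
    return mem;
}
```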

Variable Arguments

Functions may also have variable arguments, as indicated by ,... in the C function header. When making a call to a function with variable arguments, the callWith: array should include, in addition to any regular arguments (which are only the value), two elements for each variable argument: the type definition for the argument, followed by the argument value.

For example, if the C header function is defined as:

myfunct( const char* arg1, int arg2,...) 

then to pass in two string variable arguments, along with the two fixed arguments:

myCCallout callWith: { 
		'string1' . 
		22 . 
		#'const char*' . 'string2' . 
		#'const char*' . 'string3' }

Functions using varArgs generally may have a maximum of 20 variable arguments. This limit is lower if native code is disabled for this session, as described under Limitations with native code disabled.

For an example of a function that accepts variable arguments, see Example using CCalloutStructs and variable arguments with RabbitMQ®.
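A look at the C side suggests why a type must accompany each variable argument. A C prototype describes only the fixed arguments; the function itself reads each additional argument with va_arg and a type it chooses, so the caller has to supply values of exactly those types. The sketch below uses a hypothetical function, total_length, whose second fixed argument is a count of the const char* arguments that follow. That convention is an assumption of this example and is not implied by the myfunct prototype above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variadic function: returns the combined length of `first`
 * and of `extra_count` further const char* arguments. */
size_t total_length(const char *first, int extra_count, ...)
{
    assert(first != NULL && extra_count >= 0);
    size_t total = strlen(first);
    va_list ap;
    va_start(ap, extra_count);
    for (int i = 0; i < extra_count; i++) {
        /* The callee names the type of each variable argument here;
         * nothing in the prototype records it. */
        const char *s = va_arg(ap, const char *);
        total += strlen(s);
    }
    va_end(ap);
    return total;
}
```

A call such as total_length("string1", 2, "string2", "string3") passes two fixed arguments followed by two variable ones, which is the same shape as the callWith: array shown above.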

CCalloutStructs: Using C Structures passed by value

The class CCalloutStructs is a kind of CCallout, that allows you to use C Structures passed by value (rather than by reference). CCalloutStructs can be used on Linux and Mac only, on x86 or ARM platforms.

The C structs that are used in a CCalloutStructs are in the form of a CByteArray, which must be the correct size to accommodate the data types within the struct. These should be created using CHeader>>wrapperForTypeNamed:. This ensures that the size of the CByteArray is correct, and allows you to set and fetch the individual struct fields by name rather than offset.

While you can hand-build the definition for a CCalloutStructs, it is recommended to generate these using CHeader wrapper* methods, to ensure the definition and function match correctly. The CHeader wrapper* methods that generate CCallouts will generate a CCallout or CCalloutStructs, and the instance method that invokes it, depending on the declaration in the header file.

When the C function returns a struct, there are special considerations; these do not apply when an argument is a struct, but the return value is a simple data type.

  • The generated instance method has an initial argument, in the first position, to hold the struct result. The argument passed in must be an instance of a kind of CByteArray of the correct size. It is strongly recommended that this be an instance of a data type subclass of CByteArray that was generated for the specific struct type that is returned.
  • If invoking this by hand, the method CCalloutStructs callWith:structResult:errno: must be used.

For an example of a function that returns a struct, see Example using CCalloutStructs and variable arguments with RabbitMQ®.

CPointer

CPointer encapsulates a C pointer that does not have auto-free semantics. New instances are created by CFunction calls with result type #ptr, and are also used for certain arguments of CFunctions.

Errno Handling

C functions may set errno following execution. To access this value, instead of using callWith: to call the function, use callWith:errno:, and pass in a 1-element array (not invariant). The errno for the function execution will be placed in the first position in this array.

If the element in the first position of the 1-element array is a SmallInteger, errno will be initialized to this value before making the C call.

For example:

| errnoArray myErr myCCallout |
errnoArray := { 0 }.
myCCallout callWith: #( 3 0 ) errno: errnoArray.
myErr := errnoArray at: 1
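The one-element array plays the same role as the usual C idiom of clearing errno before a call and reading it immediately afterwards. The following sketch shows that idiom with the standard library function strtol, which sets errno to ERANGE when the value does not fit in a long. The wrapper parse_long is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal long and store the errno value observed after the call
 * in *errno_out, in the spirit of the one-element errno array. */
long parse_long(const char *text, int *errno_out)
{
    assert(text != NULL && errno_out != NULL);
    errno = 0;                        /* initialize before the call */
    long value = strtol(text, NULL, 10);
    *errno_out = errno;               /* capture right after the call */
    return value;
}
```

Capturing errno immediately matters because any later library call may overwrite it, which is presumably why the FFI returns it as part of the call rather than leaving it to be read afterwards.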

CCallin: Creating and invoking callbacks

A CCallin represents a signature for a C function to be called by C code. The resulting CCallin may be used as a type within the argumentTypes array when defining a CCallout.

For example, with the following C definitions compiled into the library myLib.so,

 
typedef double (*CbdFType)(double p1, int64 p2);
 
extern "C" double myCallBack( int64 a1, double a2, CbdFType a3)
{ 
	double d = (*a3)(a2, a1);
	return d; 
}

The following example creates the callback and invokes it:

Example 19.1 Callback example

topaz 1> run
| myCallin myCallback myResult |
myCallin := CCallin 
	name: 'doMath' 
	result: #double 
	args: {  #double . #int64 }.
 
myCallback := CCallout 
	library: (CLibrary named:'myLib.so') 
	name: 'myCallBack' 
	result: #double
	args: { #int64 . #double . myCallin } .
 
myResult := myCallback callWith: { 
		115 . 
		338.0 . 
		[:a1 :a2 | a1 + a2 ] 
		}.
myResult
%
 453.0
 

C type symbols

Table 19.1 lists the symbols used for creating resType (result type) and argumentTypes arguments when creating CCallouts. See the comments in the method CCallout class >> library:name:result:args: for details.

Table 19.1 C Type

| Type symbol | Return type | Argument type |
|---|---|---|
| #int64 | Integer | Integer representable as a C int64 |
| #uint64 | Integer | Integer representable as a C uint64 |
| #int32 | SmallInteger | Integer representable as a C 32-bit int |
| #uint32 | SmallInteger | Integer representable as a C 32-bit uint |
| #int16 | SmallInteger | Integer representable as a C 16-bit short |
| #uint16 | SmallInteger | Integer representable as a C 16-bit ushort |
| #int8 | SmallInteger | Integer representable as a C 8-bit signed char |
| #uint8 | SmallInteger | Integer representable as a C 8-bit uchar |
| #bool | true or false | Boolean |
| #double | SmallDouble or Float | SmallDouble or Float |
| #float | SmallDouble or Float | SmallDouble or Float |
| #'char*' | nil or a String | a non-nil String or Unicode7. The body is copied to C memory before the call, and a zero byte appended; and copied from C memory (and possibly grown/shrunk) after call. C memory will not be valid after the call finishes. A CByteArray can be used if the argument should be available after the call. |
| #'const char*' | | nil (to pass NULL), String, Unicode7, or Utf8. The body is copied to C memory before the call, and a zero byte appended. C memory will not be valid after the call finishes. A CByteArray can be used if the argument should be available after the call. |
| #void | nil | |
| #'UChar*' | | Unicode7, Unicode16, or Unicode32. The body is copied to C memory in UTF16 encoding before the call, and a zero byte appended; and copied from C memory (and possibly grown/shrunk) after call, which may cause the class to change. C memory will not be valid after the call finishes. A CByteArray can be used if the argument should be available after the call. |
| #'const UChar*' | | nil (to pass NULL), or Unicode7, Unicode16, or Unicode32. The body is copied to C memory in UTF16 encoding before the call, and a zero byte appended. C memory will not be valid after the call finishes. A CByteArray can be used if the argument should be available after the call. |
| #'ptr' | nil or a CPointer | One of: nil, in which case C NULL is passed; a CByteArray, in which case the address of the body is passed; a CPointer, in which case the encapsulated pointer is passed. |
| #'&ptr' | | a CPointer. The CPointer’s value will be passed and updated on return. |
| #'struct*' | a CByteArray (CCalloutStructs only) | a CByteArray (CCalloutStructs only) |

Limitations with native code disabled

If the generation of native code is disabled, there are further limitations:

  • Functions using varArgs may have a maximum of four fixed and 10 total arguments.
  • Functions not using varArgs are limited to a maximum of 15 total arguments.
  • Arguments and results of C type float are not supported.
  • Functions with one or more args of C type double are limited to a maximum of four arguments.
  • CCallin cannot be used.

Native code generation is on by default, but it can be disabled by configuration, and it becomes disabled when breakpoints are set. See the System Administration Guide for more information on native code generation.

19.3 Example using Zlib

The following examples use zlib, a commonly available software library for data compression that is available on many platforms. The examples are based on zlib v1.2.8 on Linux; with other versions of zlib.h or on other platforms, you may need to experiment. Documentation on zlib is available at http://zlib.net/manual.html.

Parsing the header file

For this example, the ZLib header file is installed in /usr/include/zlib.h, which is on the machine search path.

The C library is installed at /lib/x86_64-linux-gnu/libc.so.6.

The following example analyzes a header file and stores the result in a variable in UserGlobals:

Example 19.2 Create a CHeader for zlib.h

topaz 1> printit
UserGlobals at: #'ZLibHeader' put: 
	(CHeader path: '/usr/include/zlib.h').
%
 

Given the location of zlib.h on the C compiler search path /usr/include/, the following expressions can also be used to look up the header file:

(CHeader path: 'zlib.h')
(CHeader path: 'zlib.h' searchPath: '/usr/include/')
(CHeader path: 'zlib.h' searchPaths: {'/usr/include/'})
(CHeader path: 'include/zlib.h' searchPath: '/usr/')
(CHeader path: 'include/zlib.h' searchPaths: {'/usr/'})

Once you have a CHeader object, you can get information about the various things defined in the header file and those it includes.

Example 19.3 CDeclaration for compress()

topaz 1> printit
(ZLibHeader functions at: 'compress')
%
a CDeclaration
  header              a CHeader
  name                compress
  storage             extern
  linkageSpec         nil
  type                int32
  count               nil
  pointer             0
  fields              nil
  parameters          a Array
  enumTag             nil
  isConstant          false
  includesCode        false
  isVaryingArgCount   false
  bitCount            nil
  source              extern int compress (Bytef *dest, uLongf
		*destLen, const Bytef *source, uLong sourceLen)
  file                /usr/include/zlib.h
  line                1060
 

While the compress() function is directly in zlib.h, this isn’t necessarily the case for all functions in ZLibHeader. Functions that are defined in any header file that is #included in the parsed header file also will have definitions in the instance of CHeader.

For example, on Linux the zlib.h file #includes unistd.h, so functions such as getcwd() also have definitions in the instance of CHeader:

topaz 1> printit
(ZLibHeader functions at: 'getcwd') file.
%
/usr/include/unistd.h

On other platforms, zlib.h may not #include unistd.h, in which case these definitions are not included in ZLibHeader. If you want to access these functions from GemStone, you can create a separate instance of CHeader for unistd.h:

topaz 1> printit
UserGlobals at: #'UnistdLibHeader' 
	put: (CHeader path: '/usr/include/unistd.h').
%

Simple function call – getcwd()

To take an example that is declared in unistd.h, viewing the source for the getcwd() function declaration will let us see the argument declarations.

The CHeader allows you to view the function source, and a version with resolved types:

topaz 1> printit
(ZLibHeader functions at: 'getcwd') source
%
extern char *getcwd (char *__buf, size_t __size) __attribute__
	((__nothrow__ , __leaf__)) ;
 
topaz 1> printit
(ZLibHeader functions at: 'getcwd') printString
%
extern uint8 *getcwd(uint8 *__buf, typedef uint64 size_t __size)
 

This tells us that the function takes two arguments, a pointer to a string and an integer, and returns a pointer to a string. Knowing that the function defined by this header is in libc, and the library path and filename is /lib/x86_64-linux-gnu/libc.so.6, we can manually create a call to this function:

Example 19.4 CCallout to invoke getcwd()

topaz 1> printit
| string ccallout_getcwd result |
string := String new: 200.
ccallout_getcwd := CCallout library: 
		(CLibrary named: '/lib/x86_64-linux-gnu/libc.so.6')
	name: 'getcwd'
	result: #'char*'
	args: #(#'char*' #'uint64').
result := ccallout_getcwd callWith: 
	(Array with: string with: string size).
%
 

It’s important to note the way arguments are defined, since C handles memory differently from Smalltalk. The temporary string that is created as an argument to the function must be created with a size larger than the expected result. This is required for heap space to be allocated for the C function; if it is not large enough, the function will error. Also keep in mind that it’s very important that the specified size of the string in the second argument not be larger than the actual size of the string. The C function will write results to memory limited by the second argument.

getcwd() updates the argument as well as returning a value; both contain the same string, but different instances. In both cases, after execution the String’s size is the actual size; the String is truncated from the original size of 200.
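The buffer rules described above come directly from the C contract of getcwd(), which the following sketch exercises in plain C. The wrapper names current_directory and too_small_errno are hypothetical. The second function shows what happens when the buffer cannot hold the result: getcwd() returns NULL and sets errno to ERANGE.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Returns buf on success, NULL on failure. `size` must not exceed the
 * real capacity of buf, because getcwd() may write up to `size` bytes. */
char *current_directory(char *buf, size_t size)
{
    assert(buf != NULL);
    return getcwd(buf, size);
}

/* Returns the errno left by getcwd() when the buffer is too small
 * to hold even the shortest possible path plus its terminator. */
int too_small_errno(void)
{
    char tiny[1];
    errno = 0;
    if (getcwd(tiny, sizeof tiny) == NULL) {
        return errno;
    }
    return 0;
}
```

On success the return value is the same address as the buffer argument, which corresponds to the observation that the argument and the result hold the same string.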

More complex function call – compress()

A more complex example is the ZLib function compress(). This is defined in zlib.h as follows:

ZEXTERN int ZEXPORT compress OF((Bytef *dest, uLongf *destLen,
const Bytef *source, uLong sourceLen));

You can view a simplified and clarified definition using CHeader printString:

topaz 1> printit
(ZLibHeader functions at: 'compress') printString
%
extern int32 compress(uint8 *dest, uint64 *destLen, const uint8 *source, uint64 sourceLen)

This tells us that compress() takes four arguments:

  • a pointer to a destination buffer
  • a pointer to the length of the destination buffer
  • a pointer to the source data
  • the length of the source data

The function compresses the source data and places the result in the destination buffer. The destination length is updated with the space actually used. The function returns a flag indicating success or the type of error experienced.

We can manually create a call to this function using the core classes described in 19.2:

CCallout
	library: (CLibrary 
		named: '/lib/x86_64-linux-gnu/libz.so.1')
	name: 'compress'
	result: #'int32'
	args: #(#'ptr' #'ptr' #'const char*' #'uint64').

This creates an object that can be used to call the compress() function in the library. The constructor takes four arguments: (1) an instance of CLibrary; (2) the name of the function; (3) the result type; and (4) a list of the types of the arguments.

In order to call the function from Smalltalk we need to create the arguments. The source string and the source length are easy; they are just instances of a Smalltalk String and Integer. The destination and destination length are a bit more complex. They are both pointers to memory locations where the function will retrieve information (destLen starts as the available length of the destination buffer) as well as return information (dest, where the result is placed, and destLen, the amount of dest actually used).

In general, C libraries cannot deal directly with Smalltalk objects since the format is different and objects can move in memory with various garbage collection operations. As part of making the C function call, the virtual machine converts the Smalltalk objects to C data and constructs a C stack before making the C library call. For many objects this works fine; as we saw in the getcwd() example above, simple String and Integer objects are handled properly. But when an argument is a pointer to a chunk of memory in which the C library will place arbitrary data, we need to explicitly allocate that space and pass a pointer to it.

The class CByteArray represents a chunk of memory that is outside the Smalltalk object space (it is on the "heap"), and when an instance of CByteArray is passed as a #'ptr' type, the virtual machine puts a pointer to the space on the stack before making the function call. There are methods in CByteArray to place various Smalltalk objects in the allocated memory and to retrieve Smalltalk objects from the memory.

To allocate memory for the destination buffer, we can do the following:

destination := CByteArray gcMalloc: 100.

The gcMalloc constructor says to create space on the heap (outside of Smalltalk's object memory) and create a Smalltalk object (in object memory) that references the external memory. The heap memory will be automatically freed when the Smalltalk object is garbage collected. We don't need to put anything into the memory since the compress() function will not retrieve anything from the buffer. We pick a size that is enough to hold the expected result (we made an educated guess for this example; in real use we could get a better estimate by calling compressBound() with the source length).

To allocate memory for the destination size, and put a value in the location, we can do the following:

dest_size := CByteArray gcMalloc: 8.
dest_size uint64At: 0 put: destination size.

This allocates 8 bytes in the heap and puts the integer 100 (or whatever size we have allocated for the destination buffer) in that memory location (starting at a zero-based offset of 0). When we call the function we will pass a pointer to the number, not the number itself. This is so we provide a place for the function to tell us the amount of the destination buffer actually used (reusing the memory we allocated). After we make the call we can get the size back from the memory location:

used := dest_size uint64At: 0.
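
The same in/out convention can be sketched in C. The function copy_like_compress below is a hypothetical stand-in that only copies bytes; it does not compress and is not part of zlib, but it takes its arguments the way compress() does, so it shows why the size must be passed by pointer.

```c
#include <assert.h>
#include <string.h>

/* Stand-in with the calling convention of compress(): on entry *dest_len
 * is the capacity of dest; on return it is the number of bytes written.
 * Returns 0 on success, or -5 (the value of zlib's Z_BUF_ERROR) when dest
 * is too small. */
static int copy_like_compress(unsigned char *dest, unsigned long *dest_len,
                              const unsigned char *source,
                              unsigned long source_len)
{
    assert(dest != NULL && dest_len != NULL && source != NULL);
    if (*dest_len < source_len)
        return -5;
    memcpy(dest, source, source_len);
    *dest_len = source_len;              /* overwrite capacity with bytes used */
    return 0;
}

/* Caller side: returns the number of bytes used, or 0 on failure. */
static unsigned long call_with_capacity(unsigned long capacity)
{
    unsigned char dest[100];
    const unsigned char source[] = "The quick brown fox";
    unsigned long dest_len = capacity;   /* like: uint64At: 0 put: */
    if (capacity > sizeof dest)
        return 0;
    if (copy_like_compress(dest, &dest_len, source, sizeof source - 1) != 0)
        return 0;
    return dest_len;                     /* like: uint64At: 0 */
}
```

Because the callee writes through the pointer, the caller's variable holds the capacity before the call and the used length after it, which is the role the 8-byte CByteArray plays on the Smalltalk side.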

Once we know the amount of the destination actually used, we can extract the zip data. The zip data is generic binary data, not a string, and may include bytes with a value of 0, so it cannot be treated as a C string. Note that we are again dealing with zero-based offsets since our underlying structures are C memory:

compressed := destination byteArrayFrom: 0 numBytes: used.
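
The reason an explicit byte count is needed can be shown with a few lines of C. The bytes in sample are made up for illustration (they are not real compress() output), and the helper names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Made-up binary data with a zero byte at offset 3. */
static const unsigned char sample[6] = { 'x', 0x9c, 0x0b, 0x00, 'H', 'U' };

/* Length as a C-string function sees it: counting stops at the first
 * zero byte, so the trailing bytes are silently lost. */
static size_t length_as_c_string(void)
{
    return strlen((const char *)sample);
}

/* Copy exactly count bytes starting at a zero-based offset, zero bytes
 * included. Returns the number of bytes copied. */
static size_t copy_range(unsigned char *out, size_t offset, size_t count)
{
    assert(out != NULL && offset + count <= sizeof sample);
    memcpy(out, sample + offset, count);
    return count;
}
```

Here length_as_c_string() reports 3 even though the data is 6 bytes long, while copy_range() with an explicit count preserves every byte.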

We can put this all together and pass a source string to be compressed:

Example 19.5 CCallout to invoke compress()

topaz 1> printit
| ccallout_compress source destination dest_size result used compressed |
ccallout_compress := CCallout
	library: (CLibrary 
		named: '/lib/x86_64-linux-gnu/libz.so.1')
	name: 'compress'
	result: #'int32'
	args: #(#'ptr' #'ptr' #'const char*' #'uint64').
source := 'The quick brown fox jumped over the lazy dog'.
destination := CByteArray gcMalloc: 100.
dest_size := CByteArray gcMalloc: 8.
dest_size uint64At: 0 put: destination size.
result := ccallout_compress callWith: 
{ destination . dest_size . source . source size }.
result == 0 ifFalse:[ Error signal:'compress failed' ].
used := dest_size uint64At: 0.
compressed := destination byteArrayFrom: 0 numBytes: used.
%
 

If the result is zero (Z_OK), then the function executed successfully, and compressed will reference a ByteArray that contains the compressed data.
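
A nonzero result carries more information than "compress failed". The following C sketch maps the return codes that zlib.h documents for compress() to readable text; the function names are hypothetical, and the numeric values are those defined in zlib.h.

```c
#include <assert.h>

/* Describe the return codes documented for compress() in zlib.h:
 * Z_OK (0), Z_MEM_ERROR (-4), and Z_BUF_ERROR (-5). */
static const char *describe_compress_result(int result)
{
    switch (result) {
    case 0:  return "Z_OK";
    case -4: return "Z_MEM_ERROR: not enough memory";
    case -5: return "Z_BUF_ERROR: destination buffer too small";
    default: return "unexpected result code";
    }
}

/* Usage: abort in a debug build if a call did not succeed. */
static void expect_ok(int result)
{
    assert(result == 0 && "compress() did not return Z_OK");
}
```

A Z_BUF_ERROR result is the one most likely to appear when the destination size is a guess, and it suggests retrying with a larger buffer.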

Using CHeader wrapper methods to create a class

The CHeader object can be used to create a new Smalltalk class and automatically generate methods to invoke the C functions.

The method CHeader >> wrapperForLibraryAt: can be used to create a Smalltalk class with default name and methods for each function. The default name is the library name without the ‘lib’, so for zlib.h, the resulting class name is simply “Z”.

When Smalltalk methods are generated for C functions that take arguments, each function argument is represented with “_:”. For example, for the getcwd() function, which has two arguments, the equivalent Smalltalk method is:

getcwd_: buffer _: size
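
The naming rule can be stated compactly: the function name followed by one “_:” per argument. The C helper below, wrapper_selector, is a hypothetical illustration of that rule only; it is not how CHeader is implemented, and it does not model the extra keywords that appear for struct results or variable arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "name" followed by one "_:" per argument into out.
 * Returns the selector length, or -1 if out is too small. */
static int wrapper_selector(char *out, size_t out_size,
                            const char *function_name, int arg_count)
{
    size_t used;
    int i;
    assert(out != NULL && function_name != NULL && arg_count >= 0);
    used = (size_t)snprintf(out, out_size, "%s", function_name);
    if (used >= out_size)
        return -1;                       /* name alone did not fit */
    for (i = 0; i < arg_count; i++) {
        if (used + 2 >= out_size)
            return -1;                   /* no room for "_:" plus terminator */
        memcpy(out + used, "_:", 3);     /* copies the terminator as well */
        used += 2;
    }
    return (int)used;
}
```

For getcwd with two arguments this yields getcwd_:_:, and for a function with no arguments, such as getpid, it yields the bare name.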

To generate a wrapper class for the zlib library, in the most simple case you could use the following code:

Example 19.6 Create wrapper class using default

topaz 1> printit
| header wrapperClass |
header := CHeader path: '/usr/include/zlib.h'.
wrapperClass := header wrapperForLibraryAt:
	'/lib/x86_64-linux-gnu/libz.so.1'.
wrapperClass initializeFunctions.
UserGlobals at: wrapperClass name put: wrapperClass.
%
 

After this is executed, you can use a code browser to view the class-side methods that create the CCallout instances, and the instance-side methods that call the functions.

As mentioned earlier, the header file may include many functions beyond those provided in the library: all the functions that are defined in the referenced include files. We can call any of these functions through this library, due to the way the C function lookup occurs.

For example, the function getpid() is defined to take no arguments and return a 32-bit number. This makes it very easy to call once we have defined a wrapper class:

Example 19.7 Invoke Z function getpid

topaz 1> printit
Z new getpid
%
22753
 

We probably don’t want to allow the Z class to have access to every function that is included; for example, it might be better not to have access to sethostid(), which sets the host identifier of the current machine. It’s better to be more selective about which functions to include in the wrapper. It’s also desirable to have a more descriptive name for the library wrapper class.

The method CHeader >> wrapperNamed:forLibraryAt:select: allows you to specify the name and a select block to determine the specific functions to include. The select block should evaluate to a Boolean that indicates whether or not to include the particular function.

For example, to create a wrapper for various compress functions, you could do the following:

Example 19.8 Create wrapper class specifying name and functions

topaz 1> printit
| header class |
UserGlobals removeKey: #'ZLib' ifAbsent: [].
header := CHeader path: '/usr/include/zlib.h'.
class := header
	wrapperNamed: 'ZLib'
	forLibraryAt: '/lib/x86_64-linux-gnu/libz.so.1'
	select: [:each | 
		each name includesString: 'compress'].
class initializeFunctions.
UserGlobals at: class name put: class.
%
 

This code creates a wrapper class, ZLib, that contains only four functions: compress(), uncompress(), compress2(), and compressBound(), all the ones that happen to include the string “compress”. The select block may be considerably more complex, depending on which specific functions you want to include.
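
The effect of a substring filter like this one can be checked against a list of names. The C sketch below uses strstr() on an illustrative subset of function names; the list is an assumption for the example and is not the full content of zlib.h on any particular system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of function names; not the full content of zlib.h. */
static const char *const names[] = {
    "deflate", "inflate", "compress", "compress2",
    "compressBound", "uncompress", "crc32", "adler32"
};

/* Same test as the select block: does the name contain "compress"? */
static int selected(const char *name)
{
    assert(name != NULL);
    return strstr(name, "compress") != NULL;
}

/* Count how many names in the list pass the filter. */
static size_t count_selected(void)
{
    size_t count = 0;
    size_t i;
    for (i = 0; i < sizeof names / sizeof names[0]; i++)
        if (selected(names[i]))
            count++;
    return count;
}
```

A substring match picks up every name that contains the string, so it is worth reviewing the generated class to confirm it holds exactly the functions you intended.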

To invoke compress() using the ZLib class rather than manually creating a CCallout:

Example 19.9 Invoke ZLib function compress()

topaz 1> printit
| source destination dest_size result used compressed |
source := 'The quick brown fox jumped over the lazy dog'.
destination := CByteArray gcMalloc: 100.
dest_size := CByteArray gcMalloc: 8.
dest_size uint64At: 0 put: destination size.
result := ZLib new
      compress_: destination
      _: dest_size
      _: source
      _: source size.
used := dest_size uint64At: 0.
compressed := destination byteArrayFrom: 0 to: used - 1. 
compressed
%
x\u9c^KÉHU(,ÍLÎVH*Ê/ÏSH˯PÈ*Í-HMQÈ/K-R(^AÊç$VU*¤ä§^C.k\u93^P0
 

19.4 Example using CCalloutStructs and variable arguments with RabbitMQ®

When interfacing to a library that includes functions that have arguments or return values that are structs, you will use CCalloutStructs. There are some differences from using CCallout, especially when the function returns a struct.

The following example calls a function in the open-source RabbitMQ messaging library, amqp_login, which includes variable arguments and returns a struct.

Note that for simplicity of illustration, the examples here use (CPointer newNull) for the argument that requires an established connection from an earlier amqp call. The methods as described can be executed, but obviously will not function correctly.

Example 19.10 Source for the amqp_login function

((CHeader path: '/usr/include/amqp.h') 
	functions at: 'amqp_login'	) source 
%
 __attribute__ ((visibility ("default")))amqp_rpc_reply_t
amqp_login(amqp_connection_state_t state, char const *vhost,
int channel_max, int frame_max, int heartbeat,
amqp_sasl_method_enum sasl_method, ...);
 

As you can determine, this function has six regular arguments, includes variable arguments, and returns a struct of type amqp_rpc_reply_t.
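
Both features have consequences for the caller. A C prototype says nothing about the types behind "...", so the callee simply assumes what follows, and a struct returned by value needs space that the caller provides. The sketch below shows the shape of such a function; login_stub and login_reply_t are hypothetical stand-ins, and the real amqp_rpc_reply_t has different fields.

```c
#include <assert.h>
#include <stdarg.h>
#include <string.h>

/* Hypothetical reply struct, returned by value. */
typedef struct {
    int ok;
    size_t user_len;
    size_t password_len;
} login_reply_t;

/* Stand-in with the shape of amqp_login: fixed arguments, then "...".
 * When sasl_method is 0 the callee assumes that two const char *
 * values follow (user name, then password). */
static login_reply_t login_stub(const char *vhost, int sasl_method, ...)
{
    login_reply_t reply = { 0, 0, 0 };
    va_list ap;
    assert(vhost != NULL);
    va_start(ap, sasl_method);
    if (sasl_method == 0) {
        const char *user = va_arg(ap, const char *);
        const char *password = va_arg(ap, const char *);
        reply.user_len = strlen(user);
        reply.password_len = strlen(password);
        reply.ok = 1;
    }
    va_end(ap);
    return reply;
}
```

A C compiler infers the variable argument types from the call site; an FFI caller has no call site to inspect, so it must be told the type of each variable argument along with its value.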

To generate a data type class for the return struct amqp_rpc_reply_t, and save that class in UserGlobals:

Example 19.11 Create wrapper: CByteArray class for data type

"create wrapper for the struct"

| wrapper |
wrapper := (CHeader path: '/usr/include/amqp.h')
wrapperForTypeNamed: 'amqp_rpc_reply_t'.
UserGlobals at: wrapper name put: wrapper.
 
 

It is strongly recommended that the CCalloutStructs instances also be created using the CHeader wrapper methods, to avoid error-prone hand construction. The generated instances contain the same code as a hand-constructed call, and you can use the information in the generated methods to build your own class and methods.

Hand creation of CCalloutStructs

The following example shows the creation of a hand-constructed CCalloutStructs, and how it is invoked.

In this example, note that the argument arrays for the args:, structSizes:, and callWith: keywords have different sizes, as they include or omit the result and the variable arguments:

  • The first element in the structSizes: argument is for the return value; this must be the size of the return value. In this case, since the function returns an instance of amqp_rpc_reply_t (a subclass of CByteArray), that is the required size.
  • The function takes two variable arguments: strings that represent the userid and password. Variable arguments require passing in both type and value in the array of arguments. In the example, the arguments and variable arguments are composed separately and appended to make the call.

Example 19.12 Hand-constructed CCalloutStructs

topaz 1> run
| theResult  myCO fArgs varArgs |
theResult := amqp_rpc_reply_t new.
 
"create the CCalloutStructs"
myCO := CCalloutStructs 
	library: (CLibrary named: '$GEMSTONE/testlib/testmq')
	name: 'amqp_login'
	result: #struct
	args: #( #'ptr' #'const char*' #'int32' #'int32' #'int32' #'int32' )
	varArgsAfter: 6
	structSizes: { (theResult size) . nil . nil . nil . nil . nil . nil }.
fArgs := { (CPointer newNull) . '/'  . 0 . 131072 . 0 . 0 }.
varArgs := #( #'const char*' 'guest' #'const char*' 'guest' ).
 
"invoke the CCalloutStructs"
myCO 
	callWith: (fArgs, varArgs) 
	structResult: theResult 
	errno: nil.
theResult 
%
 

Using CHeader wrapper methods to create CCalloutStructs

Defining the CCalloutStructs correctly, including the structSizes and varArgs, is challenging, so it is strongly recommended to use the CHeader utilities to create these methods.

Example 19.13 Creating and using the Smalltalk class with the amqp_login function

"create the Wrapper class MyAmq and the CCalloutStructs"
topaz 1> run
| class wrapper |
wrapper := (CHeader path: '/usr/include/amqp.h')
	wrapperNamed: 'MyAmq'
	forLibraryAt: '$GEMSTONE/testlib/testmq'
	select: [:each | each name = 'amqp_login'].
wrapper initializeFunctions.
UserGlobals at: wrapper name put: wrapper.
%
 
"call the Method on the wrapper class"
topaz 1> run
| theResult |
theResult := MyAmq new
	amqp_login_: (amqp_rpc_reply_t new)
	_: (CPointer newNull)
	_: '/'
	_: 0
	_: 131072
	_: 0
	_: 0
	varArgs: #(#'const char*' 'guest' #'const char*' 'guest').
theResult
%
 

Further refinement: creating a convenience method

The generated method does not have a meaningful name, and it takes both the arguments that matter to the caller and arguments that may be static for your application. You can create a further method for ease of use, as in the following example.

Example 19.14 Convenience method

topaz 1> method MyAmq
amqpLoginTo: conn userId: userId password: pass
	| result varArgsArray |
	result :=  amqp_rpc_reply_t new.
	varArgsArray := Array 
		with: #'const char*' with: userId
		with: #'const char*' with: pass.
	self 
		amqp_login_: result 
		_: conn _: '/'
		_: 0 _: 131072 _: 0 _: 0 
		varArgs: varArgsArray.
	^result
%
topaz 1> commit
 
" execute the convenience method "
topaz 1> run
MyAmq new 
	amqpLoginTo: (CPointer newNull) 
	userId: 'guest' 
	password: 'guest'
%
 
 
