Many GemStone applications need to read and write files. GemStone provides two interfaces: the older GsFile interface and the newer FileSystem interface.
This chapter explains how to use GsFile and FileSystem to read, write, and manage files and directories.
Accessing Files using GsFile
describes the protocol provided by class GsFile to open and close files, read their contents, and write to them.
FileSystem
describes FileSystem, a set of classes ported and adapted from Pharo to support a wide range of File and Directory operations.
The class GsFile provides basic protocol to create and access operating system files. Most functions of class GsFile can also be performed using FileSystem.
This section provides a few examples of the more common operations using GsFile. For more information, see the GsFile methods in the image.
Instances of GsFile understand most protocol common to Streams.
Many of the methods in GsFile distinguish between files on the client and those on the server machine. In this context, the term client refers to the machine on which the GCI interface is executing, and the server refers to the machine on which the Gem is executing. This terminology is historic, and may be misleading, since the server in this case does not mean the machine the Stone is running on, if the Gem is remote from the Stone.
In the case of a linked interface, the interface and the Gem execute as a single process, so the client machine and the server machine are the same.
In the case of an RPC interface, the interface and the Gem are separate processes, and the client machine may be different from the server machine. When using GBS or topaz on Windows, GsFile client methods can operate on files in the Windows file system.
Methods that include "OnServer" operate on server files, accessible in the file system of the machine that the Gem is running on. Methods that do not mention "OnServer" operate on files on the machine that the client is running on.
GsFile is implemented in UserActions (for historic reasons), and thus client-side file access is not supported in external sessions using GsTsExternalSession or GsExternalSession, nor within code invoked via GciNb* functions.
Many of the methods in the class GsFile take as arguments a file specification, which is any string that constitutes a legal file specification in the operating system under which GemStone is running. Wildcard characters and environment variables are legal in a file specification.
If you supply an environment variable instead of a full path when using the methods described in this chapter, the way in which the environment variable is expanded depends upon whether the process is running on the client or the server machine.
You can create a new operating system file from GemStone Smalltalk, using class methods to open a GsFile for write or append; these methods will create a file if it does not exist, as well as returning the open file.
Example 7.1 creates a file named aFileName in the current directory on the server.
| myFilePath myFile |
myFilePath := 'aFileName'.
myFile := GsFile openWriteOnServer: myFilePath.
"Here would go code to write data to the file"
myFile close
The equivalent code using openWrite:, which does not mention "OnServer", creates the file on the client machine instead.
| myFilePath myFile |
myFilePath := 'aFileName'.
myFile := GsFile openWrite: myFilePath.
"Here would go code to write data to the file"
myFile close
These methods return the instance of GsFile that was created, or nil if an error occurred. See GsFile Errors for more on how to find the details of why a GsFile operation returned nil.
GsFile provides a wide variety of protocol to open files. For a complete set of methods, see the image. These methods return the GsFile instance if successful, or nil if an error occurs.
The following methods close the current instance, or multiple open files:
Your operating system limits the number of files a process can concurrently access. Instances of GsFile automatically have their C state closed when the instance is garbage collected or when a persistent instance drops out of memory. However, you should still close files as soon as you are done using them.
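One way to make sure a file is closed even if the write code signals an error is to wrap the writes in an ensure: block. The following is a minimal sketch, not a prescribed pattern; the file name and the text written are placeholders chosen for illustration.

```smalltalk
| myFile |
myFile := GsFile openWriteOnServer: 'aFileName'.
myFile
    ifNil: [ GsFile serverErrorString ]   "open failed; answer the reason"
    ifNotNil: [
        "ensure: runs the close whether or not the writes complete normally"
        [ myFile nextPutAll: 'some data'; cr ]
            ensure: [ myFile close ].
        'written' ]
```

The same shape applies to client files, using openWrite: in place of openWriteOnServer:.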
After you have opened a file for writing, you can add new contents to it in several ways.
For example, the instance methods addAll: and nextPutAll: take strings as arguments and write the string to the end of the file specified by the receiver. The method add: takes a single character as argument and writes the character to the end of the file. And various methods such as cr, lf, and ff write specific characters to the end of the file—in this case, a carriage return, a line feed, and a form feed character, respectively.
The write methods return the number of bytes that were written to the file, or nil if an error occurs.
For example, the following code writes the two strings specified to the file myFile.txt, separated by end-of-line characters.
myFile := GsFile openWrite: 'myFile.txt'.
myFile nextPutAll: 'All of us are in the gutter,'.
myFile cr.
myFile nextPutAll: 'but some of us are looking at the stars.'.
myFile close.
If the text you wish to write contains characters outside the ASCII range (that is, with codePoints greater than 127), then there are additional considerations.
Characters with codePoints greater than 255 require more than one byte to represent. Text containing such Characters must be encoded before it can be written to a GsFile. Encoding to UTF-8 is done by using GsFile >> nextPutAllUtf8: or by passing an instance of Utf8 to the write methods.
For example, the Euro character € has the Unicode value U+20AC.
| myfile str |
myfile := GsFile openWrite: 'extendedCharacterExample.txt'.
str := String new.
str add: 'How to write a Euro character '.
str add: (Character codePoint: 16r20AC).
str add: ' to a file'; lf.
myfile nextPutAllUtf8: str.
myfile close.
For Characters with codePoints in the range of 128..254, the legacy output and the UTF-8 encoding differ, which creates ambiguity. Traditionally, GsFiles write Characters as byte data without encoding or decoding. Most modern systems encode files as UTF-8, and expect files to be encoded as UTF-8. However, to ensure legacy uses of GsFile do not create invalid Strings, the GsFile write methods continue to write data as bytes.
While unlikely, it is possible that bytes in the range 128..254 could be either a portion of a UTF-8 encoding or an 8-bit character. Most often, assuming the wrong format results in a badly formed UTF-8 error, or in un-decoded bytes.
Since for ASCII Characters (codePoints in the 7-bit range), the legacy output and the UTF-8 encoding are the same, encoding all writes (and decoding all reads) is an effective way to remove ambiguity.
Instances of GsFile can be accessed in many of the same ways as instances of Stream subclasses. Like streams, GsFile instances also include the notion of a position, or pointer into the file. When you first open a file, the pointer is positioned at the beginning of the file. Reading or writing elements of the file ordinarily repositions the pointer as if you were processing elements of a stream.
A variety of methods allow you to read some or all of the contents of a file from within GemStone Smalltalk. For example, the contents method (at the end of Example 7.3) returns the entire contents of the specified file and positions the pointer at the end of the file.
In Example 7.5, next:into: takes the 12 characters after the current pointer position and places them into the specified string object. It then advances the pointer by 12 characters.
| myFile result |
result := String new.
myFile := GsFile openRead: 'myFileName'.
myFile next: 12 into: result.
myFile close.
result
To read a file containing data encoded in UTF-8, you may read the file as usual, and then send decodeFromUTF8ToString or decodeFromUTF8ToUnicode to decode the results. Alternatively, you may use the method
GsFile >> contentsAsUtf8
which you can then decode from the instance of Utf8 similarly using decodeToString or decodeToUnicode.
Note that when reading files whose contents logically contain Characters with codePoints larger than 127, you must know whether the file is encoded in order to decode it appropriately. GsFile reads the bytes and does not distinguish between encoded and un-encoded contents. A UTF-8 encoded file that is read using a GsFile and not explicitly decoded will be garbled, and an attempt to decode a file written as 8-bit characters will almost always result in a badly formed UTF-8 error.
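The encode-on-write, decode-on-read approach can be sketched as a round trip through a server file. This is an illustrative example only: the file name is a placeholder, and it uses just the methods named in this section (nextPutAllUtf8:, contentsAsUtf8, and decodeToString).

```smalltalk
| path str out in text |
path := 'utf8RoundTrip.txt'.          "hypothetical file name"
str := String new.
str add: 'Price in '.
str add: (Character codePoint: 16r20AC).   "Euro sign, needs more than one byte"

"Write: encode to UTF-8 on the way out"
out := GsFile openWriteOnServer: path.
out ifNil: [ ^ GsFile serverErrorString ].
out nextPutAllUtf8: str.
out close.

"Read: fetch the bytes as Utf8, then decode explicitly"
in := GsFile openReadOnServer: path.
in ifNil: [ ^ GsFile serverErrorString ].
text := in contentsAsUtf8 decodeToString.
in close.
text
```

Because both sides agree on UTF-8, the text read back should compare equal to the text written; reading the same file with contents and no decoding step would return the raw encoded bytes instead.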
If you are writing a file in topaz format, for example source code, you may include a header line in the output file, either:
fileformat utf8
fileformat 8bit
You can also reposition the pointer without reading characters, or peek at characters without repositioning the pointer. The instance method:
GsFile >> peek
allows you to view the next character in the file without advancing the pointer.
To advance the pointer without reading the intervening characters, use:
GsFile >> skip: anInteger
To determine the position or set the position specifically, use:
GsFile >> position
GsFile >> position: anInteger
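The positioning methods can be combined as in the following sketch. The file name is a placeholder, and the comment about returning to the start of the file assumes that positions are counted from zero, which you should confirm against the method comments in the image.

```smalltalk
| myFile firstChar posAfterSkip |
myFile := GsFile openReadOnServer: 'myFileName'.
myFile ifNil: [ ^ GsFile serverErrorString ].
firstChar := myFile peek.          "look at the next character; pointer unchanged"
myFile skip: 5.                    "advance past five characters without reading them"
posAfterSkip := myFile position.   "where the pointer is now"
myFile position: 0.                "assumption: 0 is the start of the file"
myFile close.
Array with: firstChar with: posAfterSkip
```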
The class GsFile provides a variety of methods that allow you to determine facts about a file.
To test for existence of a file, use:
GsFile exists: aFileNameString
GsFile existsOnServer: aFileNameString
These methods return true if the file exists, false if it does not, and nil if an error occurred.
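Because the result has three possible values, code that tests it with ifTrue:ifFalse: alone would fail on nil. A minimal sketch of handling all three cases follows; the path is a placeholder.

```smalltalk
| status |
status := GsFile existsOnServer: '/tmp/myfile.txt'.
status == nil
    ifTrue: [ GsFile serverErrorString ]      "the check itself failed"
    ifFalse: [
        status
            ifTrue: [ 'file exists' ]
            ifFalse: [ 'file does not exist' ] ]
```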
Files on the client or server can be renamed or moved. For example:
GsFile rename: '/tmp/myfile.txt' to: '/tmp/newname.txt'.
GsFile renameFileOnServer: '$GEMSTONE/data/system.conf'
to: '/users/david/mysystem.conf'.
To remove a file from the client machine, use an expression of the form:
GsFile closeAll.
GsFile removeClientFile: mySpec.
To remove a file from the server machine, use the method removeServerFile: instead. These methods return the receiver or nil if an error occurred.
To get a list of the names of files in a directory, send GsFile the message contentsOfDirectory: aFileSpec onClient: aBoolean. This message acts very much like the UNIX ls command, returning an array of file specifications for all entries in the directory.
If the argument to the onClient: keyword is true, GemStone searches on the client machine. If the argument is false, it searches on the server instead.
GsFile contentsOfDirectory: '$GEMSTONE/examples/admin' onClient: false
%
an Array
#1 /dbf/gsadmin/GS6437/examples/admin/.
#2 /dbf/gsadmin/GS6437/examples/admin/..
#3 /dbf/gsadmin/GS6437/examples/admin/onlinebackup.sh
#4 /dbf/gsadmin/GS6437/examples/admin/archivelogs.sh
If the argument is a directory name, this message returns the full pathnames of all files in the directory, as shown in Example 7.6. However, if the argument is a filename, this message returns the full pathnames of all files in the current directory that match the filename. The argument can contain wildcard characters such as *. The following example shows a different use of this message.
GsFile contentsOfDirectory: '$GEMSTONE/ver*' onClient: false
%
an Array
#1 /dbf/gsadmin/GS6432/version.txt
If you wish to distinguish between files and directories, you can use the message contentsAndTypesOfDirectory:onClient: instead. This method returns an array of pairs of elements. After the name of the directory element, a value of true indicates a file; a value of false indicates a directory. For example:
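The following sketch separates the result into files and directories. It assumes the result is a flat Array in which each name is immediately followed by its Boolean; if your version returns nested pairs instead, the indexing would need to change. The directory is the one used in the earlier listing.

```smalltalk
| entries files dirs |
entries := GsFile
    contentsAndTypesOfDirectory: '$GEMSTONE/examples/admin'
    onClient: false.
entries ifNil: [ ^ GsFile serverErrorString ].
files := OrderedCollection new.
dirs := OrderedCollection new.
"Assumption: odd indexes hold names, even indexes hold true (file) or false (directory)"
1 to: entries size by: 2 do: [:i |
    (entries at: i + 1)
        ifTrue: [ files add: (entries at: i) ]
        ifFalse: [ dirs add: (entries at: i) ] ].
Array with: files with: dirs
```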
GsFile operations return nil in cases where an error occurs during the operation. For this reason, most GsFile operations should check for nil return. There are separate methods to check for errors within file operations on server files and client files.
To check for errors in an operation on a server file, use the class method GsFile class >> serverErrorString. It returns nil if no error has occurred. The error remains available until the next GsFile operation is executed.
| myFile |
myFile := GsFile openReadOnServer: 'nonexistentfile'.
myFile
ifNil: [GsFile serverErrorString]
ifNotNil: ['Successfully opened'].
%
No such file or directory : nonexistentfile
GsFile class protocol allows you to write messages to stdout of either the Gem (server) or the client. Note that for clients running without a console (as may be the case, for example, using GBS on Windows), linked output may not be accessible.
GsFile gciLogServer: aString
gciLogServer: writes to stdout of the Gem. For an RPC login, this is the Gem log file. For a linked login, this is the console or (when using topaz) an output file as controlled by an output push command.
GsFile gciLogClient: aString
gciLogClient: writes to stdout of the GCI application. This is the console for both linked and RPC logins (or on topaz, an output file as controlled by an output push command). If it is not possible to perform GsFile client writes (e.g., within an external session or a non-blocking GCI execution), it falls back to executing gciLogServer:.
FileSystem is a set of classes, ported and adapted from Pharo, that support operations on files and directories. FileSystem provides a much more flexible and feature-rich environment than GsFile.
The term ’FileSystem’ can be used to refer to the entire set of classes that support file and directory operations, or to the specific class named FileSystem. The specific class FileSystem represents the underlying file environment. Most operations within the general FileSystem environment use the class FileReference, which represents a file or directory.
Unlike GsFile, FileSystem is implemented using FFI, which makes it easily extensible to support specific behavior under different OS platforms. However, FileSystem does not support operations on the client (that is, on the node on which the GCI client is running); it supports only operations on the server (the node on which the Gem is running, which may or may not be the node on which the Stone is running). If you are running a client application on Windows, for example, you will need to continue to use GsFile to access files on the Windows client. FileSystem is also not supported on AIX.
FileSystem is still under refinement, and may be missing features or contain unexpected behavior. The low level support classes in particular may be refactored and/or have protocol modifications.
FileReference is the primary entry point for file and directory operations. A FileReference represents a file or directory, which may or may not exist on disk.
File references can be created from Strings or using FileReference or FileSystem class methods. For example:
'/gshost/test/foo.txt' asFileReference
See Specifying a FileReference for other options.
Using the FileSystem environment, you do not normally "open" a file; instead, you create a read or write stream on a file to perform file operations on the contents. Unlike GsFile, FileReference instances themselves do not understand stream protocol.
FileReference includes many methods to get information about a file or directory, access it for read and write, decompose the filename and path, and find parent and child directories and files. Many of these are inherited from its superclass, AbstractFileReference.
FileLocator is a sibling class of FileReference, which provides much the same file and directory behavior inherited from AbstractFileReference. It is explicitly designed to allow an environment-independent specification of paths. The actual physical file or directory is resolved according to the environment at runtime. This allows you to move code between environments without requiring explicit management of the paths.
FileLocator provides a number of common environments; these can be listed using
FileLocator class >> supportedOrigins
Class methods are available for the various supportedOrigins, including home, cache, temp, userData, tranlog, preferences, extent1Directory, extent1, documents, desktop, and workingDirectory.
To find the resolved value of the FileLocator for your current environment, you can send the message absolutePath; this returns an instance of a kind of Path.
FileLocator extent1 absolutePath printString
Path / 'gshost' / 'GemStone3.7' / 'data' / 'extent0.dbf'
While the primary entry point in the FileSystem environment is FileReference, the class FileSystem provides additional behavior.
FileSystem includes support for both ordinary disk based file systems (FsDiskFileSystem) and in-memory file systems (FsMemoryFileSystem). These can be retrieved using class methods. For example:
FileSystem disk
returns a disk-based FileSystem, which can be used with @ or / to create a FileReference, e.g. FileSystem disk / '/gshost/test/foo.txt'.
FileSystem memory
returns an in-memory FileSystem, which can be used with @ or / to create a FileReference, e.g. FileSystem memory / 'foo' / 'bar'.
FileSystem workingDirectory
creates a FileReference to the current working directory (disk-based).
FileSystem instances should not be used to perform file and directory operations; these instance methods are subject to change or removal in a future version. Obtaining instances of FileReference or changing the working directory are valid operations for FileSystem instances.
There are many ways to specify a FileReference to a particular file: using methods on FileReference or FileSystem, or by creating a FileReference from a String.
The following are all equivalent:
'/gshost/test/foo.txt' asFileReference
FileReference / '/gshost/test/foo.txt'
FileReference / 'gshost' / 'test' / 'foo.txt'
FileReference disk @ '/gshost/test/foo.txt'
FileSystem disk / '/gshost/test/foo.txt'
'/gshost/test/' asFileReference / 'foo.txt'
These all resolve to a FileReference that is printed as:
FileReference disk @ '/gshost/test/foo.txt'
The FileReference and FileSystem / operator and FileSystem @ operator create a new instance of FileReference with its argument string interpreted as a path or file relative to the path of the receiver. If the argument string is an absolute path (that is, includes a leading / in the string), then the new FileReference has that absolute path.
The working directory is a special case of environment-independent path. You can use the working directory as a file location without needing to use FileLocator.
Using FileSystem class >> workingDirectory, you can create a FileReference that can be used to access or create a file in the directory that is the current working directory when the code is executed.
FileSystem workingDirectory / 'myLogFile.txt'
The workingDirectory can be modified using:
FileSystem disk setWorkingDirectory: aFileReference
The working directory is also accessible from FileLocator, and other environment-independent root paths can be specified using FileLocator. For example,
FileLocator home / 'output.txt'.
FileLocator temp / 'testLogs' / 'performance.log'
To read from or write to a file, open the file for read or write using:
aFileReference readStream
aFileReference writeStream
These methods return a kind of ZnStream, which understands standard stream protocol.
rdStream := '/gshost/test/foo.txt' asFileReference readStream.
[rdStream atEnd] whileFalse:
[report add: rdStream nextLine; lf.].
rdStream close.
These methods have a number of variants. For example, readStreamIfAbsent: and writeStreamIfPresent: provide easy checking for some common error conditions.
Streams that have been opened should be closed when they are no longer needed.
Before closing a write stream, you should call flush to ensure all data is written to the file.
wrStream := '/gshost/test/foo.txt' asFileReference writeStream.
wrStream nextPutAll: SystemRepository fileSizeReport; lf.
wrStream flush; close.
The readStreamDo: and writeStreamDo: variants close the file after the Do: block is complete, avoiding the need for an explicit close.
'/gshost/test/foo.txt' asFileReference readStreamDo:
[:str |
[str atEnd] whileFalse: [
report add: str nextLine; lf]]
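The write-side variant follows the same shape. The sketch below writes two lines with writeStreamDo: and reads them back with readStreamDo:, relying on the Do: variants to close the streams. The path is the one used elsewhere in this section, and the variable names are illustrative.

```smalltalk
| ref lines |
ref := '/gshost/test/foo.txt' asFileReference.

"Write two lines; the stream is closed when the block completes"
ref writeStreamDo: [:strm |
    strm nextPutAll: 'first line'; lf.
    strm nextPutAll: 'second line'; lf ].

"Read the lines back into a collection"
lines := OrderedCollection new.
ref readStreamDo: [:strm |
    [ strm atEnd ] whileFalse: [ lines add: strm nextLine ] ].
lines
```

Collecting the lines into a variable declared outside the block avoids depending on what readStreamDo: itself returns.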
The above methods read and write the disk file using the default encoding, UTF-8. You can write Strings or Unicode strings containing Characters outside the ASCII range to a file, and read them back from a file, without having to explicitly convert them to or from the raw file bytes.
FileReference also supports reading files using GemStone’s legacy 8-bit encoding. To do this, use the methods that include Encoded:, which take an instance of a kind of ZnCharacterEncoder. For example,
rdStream := '/gshost/test/foo.txt' asFileReference
readStreamEncoded: '8bit' asZnCharacterEncoder.
[rdStream atEnd] whileFalse:
[report add: rdStream nextLine; lf.].
rdStream close.
You can also read files as binary, which allows you to do your own processing of the results. The binary*Stream methods return an instance of FsBinaryFileStream, which reads and writes instances of ByteArray.
rdStream := '/gshost/test/foo.txt' asFileReference
binaryReadStream.
[rdStream atEnd] whileFalse:
[report add: rdStream contents decodeFromUTF8ToString; lf].
rdStream close.
There are a number of options for operating on files and directories; see the image for the full set of methods.
createFile
Create the file; signal an exception if the parent does not exist.
ensureCreateFile
Create the file, if it does not exist, including parents if needed.
createDirectory
Create the directory; signal an exception if the parent does not exist.
ensureCreateDirectory
Create the directory, if it does not exist, including parents if needed.
delete
Delete the file or directory. If the file or directory does not exist, or the directory is not empty, signal an exception.
ensureDelete
Similar to delete, but does not signal an exception if the file or directory does not exist.
deleteAll
Delete the file or directory, including all files and directories under the directory. If the file or directory does not exist, signal an exception.
ensureDeleteAll
Similar to deleteAll, but does not signal an exception if the file or directory does not exist.
moveTo: anotherFileReference
Move the file at the receiver's location to the location of the argument.
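Several of these operations can be combined, as in the following sketch. The scratch directory /tmp/gsFileExample and the file names are hypothetical, and the comments describe the behavior expected from the descriptions above rather than verified output.

```smalltalk
| dir draft final |
dir := '/tmp/gsFileExample' asFileReference.   "hypothetical scratch directory"
draft := dir / 'draft.txt'.
final := dir / 'final.txt'.

draft ensureCreateFile.    "creates the parent directory as well, if needed"
final ensureDelete.        "make sure the move target is not already present"
draft moveTo: final.
final exists
    ifFalse: [ ^ 'move did not produce the target file' ].
dir ensureDeleteAll.       "clean up the directory and its contents"
'done'
```

The ensure* variants are used where a missing file or directory is an acceptable starting state, so that the sketch can be run repeatedly.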
FileReference contains many methods providing information about a file or directory. The following methods are available for FileReference and FileLocator.
FileReference methods allow you to test the status of a file or directory.
Some available testing methods are:
exists
isReadable
isWriteable
isFile
isDirectory
isExecutable
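As an illustration, the testing methods can be combined to classify a path before operating on it. This sketch uses only exists, isDirectory, and isReadable from the list above; the path and the result strings are placeholders.

```smalltalk
| ref |
ref := '/gshost/test/foo.txt' asFileReference.
ref exists
    ifFalse: [ 'missing' ]
    ifTrue: [
        ref isDirectory
            ifTrue: [ 'directory' ]
            ifFalse: [
                ref isReadable
                    ifTrue: [ 'readable file' ]
                    ifFalse: [ 'unreadable file' ] ] ]
```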
Operations can also be performed based on file status. For example,
ifAbsent: absentBlock
ifExists: existBlock
ifExists: existBlock ifAbsent: absentBlock
ifFile: fileBlock ifDirectory: directoryBlock ifAbsent: absentBlock
FileReference returns information about a file or directory. There are many methods for file attributes; see the image for available methods.
exec '/gshost/test/foo.txt' asFileReference modificationTime %
2023-02-27 23:17:32.768
exec '/gshost/test/foo.txt' asFileReference size %
1823
File permissions use another class, FileSystemPermission, to represent a file or directory’s permissions. FileSystemPermission includes methods to query for read and write permission.
exec '/gshost/test/foo.txt' asFileReference permissions %
a FileSystemPermission
posixPermission 420
exec '/gshost/test/foo.txt' asFileReference permissions printString %
rw-r--r--
exec '/gshost/test/foo.txt' asFileReference permissions ownerExecute %
false
Operations on files often need to filter out one segment of a path or filename. FileReference has a rich set of options to access segments of a path and filename.
exec '/gshost/test/foo.txt.gz' asFileReference basename %
foo.txt.gz
exec '/gshost/test/foo.txt.gz' asFileReference
pathSegments printString %
anArray( 'gshost', 'test', 'foo.txt.gz')
exec '/gshost/test/foo.txt.gz' asFileReference
extensions printString %
anOrderedCollection( 'txt', 'gz')
exec '/gshost/test/foo.txt.gz' asFileReference
basenameWithoutExtension: 'gz' %
foo.txt
FileReference uses the term children for the files and directories that are under the current FileReference.
For example, the directory /gshost/test/ contains two files and one subdirectory. The following expressions report all children, and then only the children that are files.
exec '/gshost/test' asFileReference children printString %
anArray( File @ /gshost/test/foo.txt, File @ /gshost/test/logs, File @ /gshost/test/bar.txt)
exec '/gshost/test' asFileReference files printString %
anArray( File @ /gshost/test/foo.txt, File @ /gshost/test/bar.txt)
Methods such as allChildren, allFiles and allDirectories will recursively return all files and/or directories underneath the receiver.
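For instance, the recursive variants make it straightforward to aggregate over a directory tree. The following sketch totals the sizes of all files under a directory, using allFiles together with the size method shown earlier; the directory is the one from the examples above.

```smalltalk
| root total |
root := '/gshost/test' asFileReference.
total := 0.
"allFiles answers the files at every level beneath root"
root allFiles do: [:each | total := total + each size ].
total
```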
There are a number of classes supporting FileSystem; many of these are for internal use in supporting FileSystem features. The following are some important subsystems of FileSystem.
FsFileDescriptor represents the file itself. This is a lower level support class; however, it is possible to use it directly.
Within the FileSystem environment, you do not normally "open" a file; instead, you create a read or write stream (a kind of ZnStream) on a file to perform file operations on the contents.
FileReference open: methods return an instance of FsFileDescriptor. Instances of FsFileDescriptor read and write ByteArrays, rather than strings.
Subclasses of ZnObject, including ZnBuffered*Stream, ZnEncoded*Stream, and Zn*Encoder, provide read and write stream support for FileSystem. Zn*Encoder streams handle the encoding/decoding from UTF-8 and 8-bit (GemStone legacy) encoded files, and the creation of Legacy or Unicode String instances.
FsError and its subclasses represent errors associated with FileSystem. FileSystem exceptions represent errors related to the FileSystem; FileException and its subclasses represent FileReference/FileLocator errors.
Other errors such as FsUnixError and its subclasses represent low level UNIX file errors, which are generally handled by public API.
Path, RelativePath, and AbsolutePath encapsulate a path. A Path can be obtained from a FileReference and vice versa. While Path is an abstract class, you can use it to create instances, e.g. Path from: '/gshost/test/' will return an instance of AbsolutePath.
FsFileOpeningOptions and its subclasses provide the operating-system specific options for opening files.
FileSystemStore and subclasses represent the OS file system or memory file system, with differences for specific operating systems.