This document is a proposal for a Standard ML Basis Library. This library provides a rich initial basis for Standard ML, which complements the language described by the Definition of Standard ML. The goals of the Basis Library are to:
In this chapter, we discuss the principles used in the design of the Library, and present a high-level view of the library structure.
By design, the Basis Library is meant to provide a fairly rich collection of general-purpose modules that can serve as the basis for applications programming or for more domain-specific libraries. One criterion for inclusion in the Basis Library was that a type or value requires compiler or run-time system support. In addition, the Library defines a standard minimal environment that anyone using SML interactively can expect to find. The Library also attempts to provide similar functions in similar contexts. Thus, the traditional app function for lists, which applies a function to each member of a list, has also been provided for arrays and vectors.
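As a small illustration of this uniformity, the sketch below passes one side-effecting function to the app function of three different structures. The names total and add are introduced here for illustration only and are not part of the Library.

```sml
(* A running total updated by the function handed to each app. *)
val total = ref 0
fun add x = total := !total + x

(* The same function is applied to a list, an array and a vector. *)
val () = List.app   add [1, 2, 3]
val () = Array.app  add (Array.fromList [10, 20])
val () = Vector.app add (Vector.fromList [100])

val result = !total
```

Because all three app functions take their arguments in the same order, switching the container requires changing only the structure name.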
An opposite design force has been the desire to keep the basis small. In general, a function has been included only if it has clear or proven utility, with additional emphasis on those that are complicated to implement, require compiler support, or are more concise or efficient than an equivalent combination of other functions. Some exceptions were made for historical reasons.
The Basis Library is contained in a set of structures. Almost every type, exception constructor and value belongs to some structure. Although some identifiers are also bound in the initial top-level environment, we have attempted to keep the number of top-level identifiers small. Infix declarations and overloading are specified for the top-level environment.
We have divided the modules into required and optional modules. Any conforming implementation of the SML Standard Library will provide implementations of all of the required modules. In addition, if an implementation provides any of the services covered by the optional modules, then those services shall conform to the given interfaces.
Many of the structures are variations on some generic module (e.g., single and double-precision floating-point numbers). The following table lists the required generic signatures.
Signature | Description |
---|---|
CHAR | Generic character interface |
INTEGER | Generic integer interface |
MATH | Generic math library interface |
IMPERATIVE_IO | Imperative I/O interface |
MONO_ARRAY | Mutable monomorphic arrays |
MONO_VECTOR | Immutable monomorphic vectors |
PRIM_IO | System-call operations for IO |
REAL | Generic real number interface |
STREAM_IO | Stream I/O interface |
STRING | Generic string interface |
SUBSTRING | Generic substring interface |
TEXT_IO | Text I/O interface |
TEXT_STREAM_IO | Text stream I/O interface |
WORD | Generic word (i.e., unsigned modular integer) interface |

Signature |
---|
BIN_IO |
BOOL |
BYTE |
COMMAND_LINE |
DATE |
GENERAL |
IEEE_REAL |
IO |
LIST |
LIST_PAIR |
OPTION |
OS |
OS_FILE_SYS |
OS_IO |
OS_PATH |
OS_PROCESS |
STRING_CVT |
TIME |
TIMER |
Module | Signature | Status | Description |
---|---|---|---|
Array | ARRAY | OM | Mutable polymorphic arrays |
BinIO | BIN_IO | | Binary input/output types and operations |
BinPrimIO | PRIM_IO | M | Low-level binary IO |
Bool | BOOL | O | Boolean type and values |
Byte | BYTE | M | Conversions between Word8 and Char |
Char | CHAR | OM | Ordinary characters |
CharArray | MONO_ARRAY | M | Mutable arrays of characters |
CharVector | MONO_VECTOR | M | Immutable arrays of characters |
CommandLine | COMMAND_LINE | M | Program name and arguments |
Date | DATE | M | Calendar operations |
General | GENERAL | OM | General-purpose types, exceptions and values |
IEEEReal | IEEE_REAL | M | Floating-point classes and hardware control |
Int | INTEGER | OM | Default integer structure |
IO | IO | | Basic I/O types and exceptions |
LargeInt | INTEGER | M | Structure providing largest integer |
LargeReal | REAL | M | Largest floating-point representation |
LargeWord | WORD | M | Structure providing largest word |
List | LIST | O | List type and utility functions |
ListPair | LIST_PAIR | | List of pairs and utility functions |
Math | MATH | | Default math structure |
Option | OPTION | O | Optional values and partial functions |
OS | OS | M | Basic operating system services |
OS.FileSys | OS_FILE_SYS | M | File status and directory operations |
OS.IO | OS_IO | M | Support for polling I/O devices |
OS.Path | OS_PATH | | Pathname operations |
OS.Process | OS_PROCESS | M | Simple process operations |
Position | INTEGER | M | File system positions |
Real | REAL | OM | Default real structure |
String | STRING | OM | Ordinary strings |
StringCvt | STRING_CVT | | Conversions between strings and various types |
Substring | SUBSTRING | O | Substrings |
TextIO | TEXT_IO | O | Text input/output types and operations |
TextPrimIO | PRIM_IO | M | Low-level text IO |
Time | TIME | M | Representation of time values |
Timer | TIMER | M | Timing operations |
Vector | VECTOR | OM | Immutable polymorphic vectors |
Word | WORD | OM | Default word structure |
Word8 | WORD | M | 8-bit words |
Word8Array | MONO_ARRAY | M | Arrays of 8-bit words |
Word8Vector | MONO_VECTOR | M | Vectors of 8-bit words |

Module | Signature | Description |
---|---|---|
BoolArray | MONO_ARRAY | Mutable arrays of booleans |
BoolVector | MONO_VECTOR | Immutable arrays of booleans |
FixedInt | INTEGER | Largest fixed precision integers |
ImperativeIO | IMPERATIVE_IO | Functor to convert stream I/O into imperative IO |
IntInf | INT_INF | Arbitrary-precision integers |
IntN | INTEGER | N-bit, fixed precision integers |
IntArray | MONO_ARRAY | Mutable arrays of default integer |
IntNArray | MONO_ARRAY | Mutable arrays of N-bit integers |
IntVector | MONO_VECTOR | Immutable vectors of default integers |
IntNVector | MONO_VECTOR | Immutable vectors of N-bit integers |
Locale | LOCALE | Support for locale-dependent applications |
MultiByte | MULTIBYTE | Support for multibyte characters |
PackRealNBig | PACK_REAL | Big-endian packing for N-bit floats |
PackRealNLittle | PACK_REAL | Little-endian packing for N-bit floats |
PackRealBig | PACK_REAL | Big-endian packing for default floats |
PackRealLittle | PACK_REAL | Little-endian packing for default floats |
PackNBig | PACK_WORD | Big-endian packing for N-byte words |
PackNLittle | PACK_WORD | Little-endian packing for N-byte words |
Posix | POSIX | Root POSIX structure |
Posix.Error | POSIX_ERROR | POSIX error values |
Posix.FileSys | POSIX_FILE_SYS | POSIX file system operations |
- | POSIX_FLAGS | Generic POSIX flag interface |
Posix.IO | POSIX_IO | POSIX I/O operations |
Posix.ProcEnv | POSIX_PROC_ENV | POSIX process environment operations |
Posix.Process | POSIX_PROCESS | POSIX process operations |
Posix.Signal | POSIX_SIGNAL | POSIX signal types and values |
Posix.SysDB | POSIX_SYS_DB | POSIX system database types and values |
Posix.TTY | POSIX_TTY | Control of POSIX TTY drivers |
PrimIO | PRIM_IO | Functor to build PRIM_IO structure |
RealArray | MONO_ARRAY | Mutable arrays for default reals |
RealVector | MONO_VECTOR | Immutable vectors for default reals |
RealN | REAL | N-bit floating-point numbers |
RealNArray | MONO_ARRAY | Mutable arrays of N-bit floating-point numbers |
RealNVector | MONO_VECTOR | Immutable vectors of N-bit floating-point numbers |
StreamIO | STREAM_IO | Functor to convert primitive I/O into stream I/O |
SysWord | WORD | Words sufficient for OS operations |
WideChar | CHAR | Support for wide characters |
WideString | STRING | Support for wide strings |
WideSubstring | SUBSTRING | Support for wide substrings |
WideTextPrimIO | PRIM_IO | Low-level wide char IO |
WideTextIO | TEXT_IO | Text I/O on wide characters |
WordN | WORD | N-bit words |

Signature |
---|
INT_INF |
LOCALE |
MULTIBYTE |
PACK_REAL |
PACK_WORD |
POSIX |
POSIX_ERROR |
POSIX_FILE_SYS |
POSIX_FLAGS |
POSIX_IO |
POSIX_PROC_ENV |
POSIX_PROCESS |
POSIX_SIGNAL |
POSIX_SYS_DB |
POSIX_TTY |
We specify certain relationships among the modules.
To permit users to compile programs written under the old basis, we require that each implementation provide the structure SML90. This structure contains the top-level bindings specified in the Definition, along with one or more substructures that define the top-level bindings of various implementations. For example, a user might write:
local open SML90 SML90.NJ in (* user's program *) end
to compile a user's program under the old SML/NJ basis.
We expect that at some future point, the SML90 module will be deemed obsolete, and will be dropped from the standard basis.
Conforming implementations must provide modules that exactly match the signatures defined in the SML Standard Library. For example, the Int structure provided by an implementation should not match a superset of the INTEGER signature. Additional structures should be provided for extensions to the basis, other libraries, or access to implementation-specific information.
We use a new set of spelling and capitalization conventions. Some of these conventions, e.g., the capitalization of value constructors, seem to be widely accepted in the user community. Other decisions were based less on dominant style or compelling reason than on compromise and the need for consistency and some sense of good taste. We hope users will accept the conventions and concentrate on the issues of semantics.
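The conventions we use are:
- Alphanumeric value identifiers are in mixed case, with a leading lowercase letter; e.g., map, openIn.
- Type identifiers are all lowercase, with words separated by underscores; e.g., word, file_desc.
- Signature identifiers are all uppercase, with words separated by underscores; e.g., PACK_WORD, OS_PATH. We refer to this as the signature convention.
- Structure and functor identifiers are in mixed case, with the initial letter of each word capitalized; e.g., General, WideChar. We refer to this as the structure convention.
- Alphanumeric datatype constructors follow the signature convention; e.g., SOME, A_READ, FOLLOW_ALL. In certain cases, where external usage or aesthetics dictates otherwise, the structure convention should be followed; e.g., Jan, Mon. Within the basis library, the only use of the latter convention occurs with the months and weekdays in Date. The only exceptions to these rules are the identifiers nil, true and false, where we bow to tradition.
- Exception identifiers follow the structure convention; e.g., Domain, TerminatedStream.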
Similar values should have similar names, with similar type shapes, following the conventions outlined above. For example, the function Array.app has the type:
val app : ('a -> unit) -> 'a array -> unit
which has the same shape as List.app. Names should be meaningful, but concise. However, we have broken this rule in certain instances where previous usage seemed compelling. For example, we have kept the name app rather than adopt apply. More dramatically, we have purposely kept most of the traditional Unix names in the optional Posix modules, to capitalize on the familiarity of these names and the available documentation.
Many structures define a type ty along with a comparison function
val compare : ty * ty -> order
plus the expected relational operators >, >=, < and <=. In all cases, the standard relationships hold between these functions. For example, we have x > y = true if and only if compare(x, y) = GREATER. If, in addition, ty is an equality type, we assume that the operators = and <> satisfy the usual relationships with compare and the relational operators. For example, if x = y, then compare(x, y) = EQUAL. Note that these assumptions are not quite true for real values; see the REAL signature for more details.
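The relationships described above can be spot-checked for integers. The following is a minimal sketch; the helper consistent is a name introduced here for illustration, and the check covers only a handful of sample pairs rather than proving the property.

```sml
(* For one pair of integers, check that each relational operator
   agrees with the result of Int.compare. *)
fun consistent (x, y) =
  let
    val c = Int.compare (x, y)
  in
    (x > y)  = (c = GREATER)  andalso
    (x < y)  = (c = LESS)     andalso
    (x >= y) = (c <> LESS)    andalso
    (x <= y) = (c <> GREATER) andalso
    (x = y)  = (c = EQUAL)
  end

val allOk = List.all consistent [(1, 2), (2, 1), (3, 3), (~5, 0)]
```

Real values are deliberately left out of the sketch, since the text notes that the assumptions do not quite hold for them.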
Types that have a standard or obvious linear order come with the full set of relational operators plus a compare function. Certain abstract types, e.g., OS.FileSys.file_id, provide a compare function for use with, for example, ordered binary trees.
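A uniform compare function lets ordered containers be written once and reused at any type that supplies one. The sketch below uses a sorted list instead of a binary tree to keep it short; insert and sortBy are hypothetical helpers, not Library functions.

```sml
(* Insert x into an already sorted list, using only cmp to order elements. *)
fun insert cmp (x, []) = [x]
  | insert cmp (x, y :: ys) =
      case cmp (x, y) of
        GREATER => y :: insert cmp (x, ys)
      | _ => x :: y :: ys

(* Insertion sort built from insert; works for any type with a compare. *)
fun sortBy cmp xs = List.foldl (insert cmp) [] xs

val sortedInts = sortBy Int.compare [3, 1, 2]
val sortedStrs = sortBy String.compare ["pear", "apple", "fig"]
```

The same approach would apply to an abstract type such as OS.FileSys.file_id, which offers compare but no relational operators.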
Most structures defining a type provide conversion functions to and from other types. When unambiguous, we use the naming convention toT and fromT, where T is some version of the name of the other type. For example, in WORD, we have
val fromInt : Int.int -> word
val toInt : word -> Int.int
If this naming is ambiguous (e.g., a structure defines multiple types that have conversions from integers), we use the convention TFromTT and TToTT. For example, in POSIX_PROC_ENV, we have
val uidToWord : uid -> SysWord.word
val gidToWord : gid -> SysWord.word
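Both naming conventions can be seen in a short sketch. The first part uses the Word and Word8 structures from the table of required modules. The structure Ids in the second part is hypothetical, written here only to show why a structure with two types needs the longer names; it does not depend on the optional Posix modules.

```sml
(* Unambiguous case: one type per structure, so toT/fromT suffices. *)
val w = Word.fromInt 255
val i = Word.toInt w
val b = Word8.fromInt 300      (* the value is reduced modulo 2^8 *)
val j = Word8.toInt b

(* Ambiguous case (hypothetical structure): two types both convert
   to and from word, so the type name is part of each function name. *)
structure Ids =
struct
  datatype uid = UID of word
  datatype gid = GID of word
  fun uidToWord (UID x) = x
  fun wordToUid x = UID x
  fun gidToWord (GID x) = x
  fun wordToGid x = GID x
end

val u = Ids.uidToWord (Ids.wordToUid 0w7)
```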
There should be conversions to and from strings for most types. Following the convention above, these functions are typically called toString and fromString. Usually, modules provide additional string conversion functions that allow more control over format and operate on an abstract character stream. These functions are called fmt and scan. The input accepted by fromString and scan consists of printable ASCII characters. The output generated by toString and fmt consists of printable ASCII characters.
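The following sketch shows fmt choosing a radix and scan reading from an abstract character stream, here modelled as a character list. The reader listReader is a helper defined for this example; StringCvt.scanString is the Library's wrapper for the common case where the stream is a string.

```sml
(* fmt gives control over the output radix. *)
val bin = Int.fmt StringCvt.BIN 5

(* A character stream represented as a char list (helper for this sketch). *)
fun listReader [] = NONE
  | listReader (c :: cs) = SOME (c, cs)

(* scan returns the parsed value together with the rest of the stream. *)
val scanned = Int.scan StringCvt.DEC listReader (String.explode "42 rest")

(* scanString applies a scan function to a string and drops the remainder. *)
val viaString = StringCvt.scanString (Int.scan StringCvt.DEC) "42 rest"
```

Returning the unconsumed stream is what allows several scan functions to be chained over the same input.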
We adopt the convention that conversions from strings should be forgiving, allowing initial white space and multiple formats, and ignoring additional terminating characters. On the other hand, we have tried to specify conversions to strings precisely. In addition, for basic types, scanning functions should accept legal SML literals, and formatting functions should, whenever possible, produce the value part of a valid SML literal but, for flexibility, may omit certain annotations. For example, String.toString produces a valid SML string constant, but without the enclosing quotes, and Word.toString produces a word constant without the "0wx" prefix.
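A few concrete calls illustrate the asymmetry between forgiving input and precise output. This is a sketch using only fromString and toString on the Int, Word and String structures; the value names are arbitrary.

```sml
(* Input is forgiving: leading blanks and trailing text are tolerated. *)
val n1 = Int.fromString "  42abc"
val n2 = Int.fromString "abc"        (* no number at the start *)

(* Output is precise: the value part of an SML literal. *)
val neg = Int.toString ~5            (* uses the SML minus sign ~ *)
val hex = Word.toString 0wxFF        (* no "0wx" prefix *)
val esc = String.toString "a\nb"     (* newline becomes an escape, no quotes *)
```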
The old basis did not provide a character type, only a string type. To manipulate characters, programmers used integers corresponding to the character's code. This was unsatisfactory for several reasons:
We replace the string type provided by the Definition with the types string and char, where the string type is a vector of characters. In addition, we define the optional types WideString.string and WideChar.char, in which the former is again a vector of the latter, for handling character sets more extensive than Latin-1.
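With a distinct char type, character manipulation no longer has to go through integer codes. A brief sketch using the String and Char structures follows; the value names are chosen for this example.

```sml
val s = "Basis"

(* Indexing a string yields a char, not an integer code. *)
val first = String.sub (s, 0)

(* Integer codes remain available through explicit conversions. *)
val code = Char.ord first
val back = Char.chr code

(* Character-level functions lift to strings. *)
val upper = String.map Char.toUpper s
val roundTrip = String.implode (String.explode s)
```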
Functional arguments that are evaluated solely for their side-effects should be required to have a return type of unit. For example, the list application function should have the type:
val app : ('a -> unit) -> 'a list -> unit
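One practical consequence is that a function returning a useful value cannot be passed to app by accident; the result has to be discarded explicitly. The sketch below assumes a hypothetical function record that returns an int, and uses the top-level ignore to adapt it.

```sml
val log : string list ref = ref []

(* record stores s and returns the previous length of the log,
   so its type is string -> int rather than string -> unit. *)
fun record s =
  let val n = length (!log)
  in log := s :: !log; n
  end

(* List.app record [...] would be rejected by the type checker;
   composing with ignore makes the discarded result explicit. *)
val () = List.app (ignore o record) ["a", "b", "c"]

val recorded = rev (!log)
```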
Last Modified January 21, 1997
Copyright © 1996 AT&T