|
RESEARCH INTERESTS | ||||||
|
Shape Theory
GeneralShape in Computing is a position paper on the importance of shape in computing. It begins:"Values associated with a data type usually have a shape, too. For example, the shape of a matrix is its size. Shape refers to data structures into which data can be inserted at various positions, or ``holes''. For example, the shape of a labelled graph is its underlying, unlabelled graph; that of a record is its set of field names. Shapes, like types, are important in the specification and semantics of programs, and should support tools for static error detection and improved compilation techniques. This position paper will outline the semantic underpinnings of shape theory, and suggest how it might be exploited in a variety of computational settings." Separating Shape from Data is the summary of an invited address emphasising the semantic aspects of shape. It begins:
"Shape theory gives a precise categorical account of how data is
stored within data structures, or shapes..."
Work on the semantics of shape lead to the discovery of the covariant types which form a sub-system
of system F of polymorphic lambda calculus based on a type constructor
that represents (natural) transformations. This can be used to
construct an alternative presentation of F that uses transformations
instead of universal quantification: Here is a puzzle. Each of the constructions -> and => has a set-theoretic interpretation, and yet we know that system F has no set-theoretic model, at least, no model satisfying Reynold's (mild) assumptions. How are these facts to be reconciled?
Robin Cockett matched this with another pullback, describing labelled tree types:
These two equivalent views, using a shape-data decomposition, and using a class of functors and natural transformations, are both important, and powerful. The original account is in our joint paper Shapely Types and Shape Polymorphism . The full picture can be found in A semantics for shape.
If F and G are functors shapely over lists then one can (typically)
construct an object F => G of natural transformations from F to G.
The theory of functors shapely over lists extends to functors of
several variables. So now we can consider transformations which are natural in B. For each A there is an object of transformations FA => G which combine to produce a functor. However, such functors need not be shapely over lists, essentially because exponentiating by a fixed object need not be so (unless the object is finite, in the sense of Finite Objects in a Locos .
Rather, they are data functors . Their properties are
developed in Data Categories
though there are some unsolved existence problems waiting to be sorted
out. In addition to the question of existence of initial algebras for
data functors, ther is the problem of establishing that they form data
functors themselves. In the paper I claimed to have solved this latter
problem (Theorem 5.7) but Bart Jacobs was kind enough to point out an
error, so the question is still open.
Initially, it was not at all clear that shape polymorphic algorithms exist. The first construction was given in the experimental language P2 described in Polynomial Polymorphism .
Eugenio Moggi then proposed that shape polymorphism be viewed as
quantification over functors, or functorial polymorphism
. This was incorporated in an extension of ML called Functorial ML or FML written by us with
Gianna Belle. It supports an additional layer of functors underneath
the types which supports most of the structure. Then the types T are
either variables X, obtained by application of functors F to existing
types, or function types: This approach exposes some essential subtleties in algorithms such as mapping a function over a data structure. The high-level mapping algorithm can be expressed as The second step is easy. The first one is difficult in a polymorphic setting. Is a particular number part of the shape or of the data. FML uses combinators representing canonical isomorphisms to make the shape-data distinction in a polymorphic manner.
Most recently, this approach has been applied to the study of traversals which generalise mapping by
performing side-effects when each datum is found. More generally, we
can use a monad to represent the computation at each
node. Representing the shape by a functor F and letting M be the
monad, a traversal can be captured as natural transformation
Note that for general monads, e.g. those performing side-effects, the order of traversal is important, so that a given functor typically has many associated traversals.
The array type constructor has been studied in two forms. In the early version of FISh The functional imperative: shape! it represents one dimensional arrays. In the later version Poly-dimensional regular arrays in FISh it represents finite dimensional regular arrays. Paul Steckler and I have implemented both versions for the SunOS operating system, or using the ocaml byte-code interpreter. You can download either. The implementation works by first translating to a simple imperative language Turbot, and translating this into C. Since shape analysis has determined the exact storage requirements for every array, there is no need to box data. The C code produced is easily read, and efficient. Benchmarking against Ocaml shows a speed-up of about 4-8 times when higher-order functions appear in loops over arrays.
FISh now has a formal
semantics in which shape analysis and evaluation of Turbot
programs are described formally.
Before choosing between data distributions, one requires a means of comparing the relative costs of the alternatives. Shape information was used to produce a PRAM cost model for VEC in A Monadic Calculus for Parallel Costing in a Functional Language of Arrays written by Murray Cole, Milan Sekanina, Paul Steckler and myself. This paper also introduced the idea of using computational monads to track costs in the parallel setting.
This idea will be used in a parallel version of FISh called GoldFISh.
GoldFISh will be a purely functional language that will use shape
analysis to determine costs, and hence appropriate distributions. Then
the code for each processor will be translated into efficient FISh
code. Latest ideas on this can be found in Costing parallel programs as a function of
shapes .
This an area of active research.
|
||||||
|
|
![]()
Main |
Personal Details |
Research Interests |
Research Publications Please feel free to send any comments.
Copyright Barry Jay © 1998
|