Terms: Languages. Keywords: Haskell.
to provide evidence for the validity of assertions. We expect our tools to assist the programmer with evidence management, using certificates to record evidence, and to provide whatever translation of Haskell code needed to enable the use of theorem provers and other tools that can serve as sources of evidence.
The Programatica Tools, while still a work in progress, can manipulate Haskell programs in various ways and have some support for evidence management. In Section tools, we describe a selection of the functionality provided by the tools, starting with functionality that might be of interest to Haskell programmers in general, and ending with functionality more directly aimed at supporting the goals of the Programatica Project. Section implementation contains some notes on the implementation.
The tools operate on a project, which is simply a collection of files containing Haskell source code. The command pfe new creates a new project, while pfe add adds files and pfe remove removes files from a project. There is also a command pfe chase to search for source files containing needed modules. This is the only command that assumes a relationship between module names and file names. A wrapper script pfesetup uses pfe new and pfe chase to create a project in a way that mimics how you would compile a program with ghc --make [ghc] or load it in Hugs [hugs].
Like batch compilers, pfe caches various information in files between runs (e.g., type information). pfe also caches the module dependency graph, while other systems parse source files to rediscover it every time they are run. This allows pfe to detect if something has changed in a fraction of a second, even for projects that contain hundreds of modules.
Once a project has been set up, the user can use pfe for various types of queries. For example, pfe iface M displays the interface of module M, pfe find x lists modules that export something called x, and pfe uses M.x lists all places where the entity called x, defined in module M, is referenced.
While the syntax highlighting provided by editors such as Emacs and Vim is based on a simple lexical analysis, our tool also makes use of syntax analysis to distinguish between type constructors and data constructors. Also, for hyperlinking identifiers to their definitions, a full implementation of the module system and the scoping rules of Haskell is used. Although the source code is parsed, the HTML code is not generated by pretty-printing the abstract syntax tree, but by decorating the output from the first pass of our lexical analyzer [lhih-online], preserving layout and comments.
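The decoration step can be pictured as follows. This is an illustrative sketch only (the token kinds and function names are invented, not pfe's actual types): each token keeps its original text, so rendering amounts to wrapping non-whitespace tokens in styled spans while layout and comments pass through verbatim.

```haskell
import Data.Char (toLower)

-- Hypothetical token kinds; a real lexer distinguishes many more.
data TokKind = Keyword | Varid | Conid | Comment | Space
  deriving Show

escape :: String -> String
escape = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc c   = [c]

-- Each token carries its original text, so whitespace, layout and
-- comments survive verbatim; only the markup around the text is new.
toHtml :: [(TokKind, String)] -> String
toHtml = concatMap tok
  where
    tok (Space, s) = escape s
    tok (k, s)     =
      "<span class=\"" ++ map toLower (show k) ++ "\">"
      ++ escape s ++ "</span>"
```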
We also support a simple markup language for use in comments, keeping the plain ASCII presentation of source text readable, while, at the same time, making it possible to generate nice looking LaTeX and HTML from it. We used it when preparing our paper on the Haskell Module System [haskellmodules].
Our HTML renderer is a tool for documenting and presenting source code, in contrast to Haddock [haddock], which is a tool for producing user documentation for libraries.
While HTML rendering makes it possible to use a standard web browser to browse Haskell programs, we have also implemented a dedicated Haskell browser (Figure pfebrowser). It assumes that a project has already been set up with pfe, and is started with the command pfebrowser. It provides a syntax highlighted and hyperlinked source view similar to the one provided in the HTML rendering, but it also has some additional functionality. For example, the user can click on an identifier to find out what its type is. The browser can reload and retypecheck a module after it has been edited. Information is cached internally to make this quick. Reloading and retypechecking a moderately sized (a few hundred lines) module takes just a second or two. (It is nowhere near the speed of Hugs, though...)
At present, the tools type check complete modules, and do not provide any type information when type checking fails. In the future, we might make the type checker more incremental, providing partial information in the presence of type errors.
In the future, we might also turn the browser into a Haskell editor. Another possibility is creating a dedicated proof tool for Haskell, perhaps similar to the proof tool Sparkle [sparkle] for Clean.
The dependency analysis is performed on type-checked code, allowing unused instance declarations to be eliminated. However, to keep the slicer simple, top-level declarations are treated as atomic units (actually, type signatures and fixity declarations that mention more than one identifier are split up), and this can make the slicing coarse, for example by including the definitions of default methods (and everything they depend on) from class declarations, even if they aren't needed by any instance declarations.
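The core of such a slicer can be pictured as plain graph reachability over a map from declaration names to the names they depend on. This is a sketch under invented types, not the actual implementation:

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Dependencies between top-level declarations, by name.
type DepGraph = Map.Map String [String]

-- The slice is everything reachable from the given root declarations;
-- declarations outside the result can be eliminated.
slice :: DepGraph -> [String] -> Set.Set String
slice g = go Set.empty
  where
    go seen []     = seen
    go seen (d:ds)
      | d `Set.member` seen = go seen ds
      | otherwise           = go (Set.insert d seen)
                                 (Map.findWithDefault [] d g ++ ds)
```

For example, with main depending on f and f on g, a declaration unused that depends on h would fall outside the slice rooted at main.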
The slicer preserves the module structure of the source program. Import and export declarations are adjusted appropriately.
Because of the annoying monomorphism restriction, eliminating a definition that is not used can change the meaning of a declaration that is used. Dependencies caused by this side effect are currently not taken into account by our slicer.
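A concrete (hypothetical) instance of the problem: in the module below, the binding unused is dead code as far as main is concerned, yet slicing it away changes what main prints, because it is the only thing that resolves the monomorphic type of n.

```haskell
-- Under the monomorphism restriction, n is not generalized: it gets
-- one monomorphic type, fixed by its uses (or by defaulting).
n = 1 / 3

-- Never needed by main, so a slicer could remove it -- but it is
-- the binding that forces n :: Float.
unused :: Float
unused = n

main :: IO ()
main = putStrLn (show n)
-- with unused present: prints "0.33333334" (n :: Float)
-- with unused sliced away: n defaults to Double and the
-- program prints "0.3333333333333333"
```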
The syntax of Structured Type Theory is simpler than that of Haskell, so a number of simplifying program transformations are performed as part of the translation. This affects, for example, list comprehensions, the do-notation, pattern matching, derived instances and literals. There are also various ad-hoc transformations to work around syntactic restrictions and limitations of the type checker used in Alfa. Apart from this, the translated code looks fairly similar to the original Haskell code. (An example can be seen in Figure Proofs.)
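For instance, list comprehensions are eliminated along the lines of the translation given in the Haskell 98 Report; the translator's actual output may differ in details, but the flavour is this:

```haskell
squaresOfEvens :: [Int] -> [Int]
squaresOfEvens xs = [ x * x | x <- xs, even x ]

-- the same function after the standard desugaring:
squaresOfEvens' :: [Int] -> [Int]
squaresOfEvens' xs = concatMap (\x -> if even x then [x * x] else []) xs
```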
While Haskell has separate name spaces for types/classes, values and modules, Structured Type Theory has only one, so some name mangling is required to avoid name clashes. This is done in a context-independent way, so that when some Haskell code is modified, the translation of unmodified parts remains unchanged, allowing proofs about unmodified parts to be reused unchanged.
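A context-independent scheme can be as simple as the following sketch, where the mangled name is computed from the name space, the defining module and the identifier alone, so editing unrelated code can never change it. (Illustrative only; the translator's actual scheme differs.)

```haskell
data NameSpace = ValueNS | TypeNS | ModuleNS

-- The result depends only on the three arguments, never on context.
mangle :: NameSpace -> String -> String -> String
mangle ns m x = tag ns ++ "_" ++ map undot m ++ "_" ++ x
  where
    undot '.' = '_'
    undot c   = c
    tag ValueNS  = "v"
    tag TypeNS   = "t"
    tag ModuleNS = "m"
```

With this scheme a value and a type that share a name in Haskell get distinct translated names, e.g. a value map and a type Map in different name spaces can never clash.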
While type theory formally only deals with total functions over finite data structures, the translator allows any Haskell program to be translated. Alfa contains a syntactic termination checker [abel:foetus] that can verify that a set of structurally recursive function definitions are total. When the translation falls outside that set, any correctness proofs constructed in Alfa entail only partial correctness, and we leave it to the user to judge the value of such proofs. In the future, we might use a more sophisticated termination checker and/or change the translation to use domain predicates [bove:thesis] to make partiality explicit and to make the user responsible for providing termination proofs.
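To illustrate the distinction (with made-up examples, not code from the paper): the first definition below is structurally recursive, since each recursive call is on a subterm of the argument, so a foetus-style checker accepts it; the second recurses on computed values rather than subterms, so a syntactic check must reject it even though it may well terminate.

```haskell
-- structurally recursive: the recursive call is on the tail
len :: [a] -> Int
len []     = 0
len (_:xs) = 1 + len xs

-- not structurally recursive: the recursive calls are on computed
-- values, not subterms, so a syntactic termination check rejects it
collatz :: Integer -> Integer
collatz 1 = 0
collatz m
  | even m    = 1 + collatz (m `div` 2)
  | otherwise = 1 + collatz (3 * m + 1)
```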
While the type system of plain Haskell 98 can be mapped in a straightforward way to the predicative type system of Structured Type Theory, we foresee problems extending the translation to cover existential quantification in data types and perhaps also higher rank polymorphism.
When source code is changed, the validity of existing certificates can be affected. Revalidating certificates might require some work by the user. To help identify those changes that can actually influence the validity of a certificate, dependencies are tracked on the definition level (rather than the module level). Changes are detected by comparing hash values computed from the abstract syntax. As a result, we avoid false alarms triggered by changes to irrelevant parts of a module, or by changes that only affect comments or the layout of code.
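The mechanism can be sketched as follows, assuming some Show-able abstract syntax type and an arbitrary string hash (the actual hash and AST types in Programatica differ). Because the fingerprint is computed from the abstract syntax rather than the source text, edits to comments or layout leave it unchanged:

```haskell
-- Fingerprint a definition via a deterministic rendering of its
-- abstract syntax; a djb2-style hash, purely as an illustration.
fingerprint :: Show ast => ast -> Integer
fingerprint = foldl step 5381 . show
  where
    step h c = h * 33 + toInteger (fromEnum c)

-- A certificate needs attention only if some definition it
-- depends on has a new fingerprint.
changed :: Show ast => ast -> ast -> Bool
changed old new = fingerprint old /= fingerprint new
```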
To give a simple example, Figure Simple shows a Haskell module containing three valid certificates: two Alfa certificates and one QuickCheck certificate. Figure Proofs shows the translation of the module to Alfa and the proofs of the two Alfa certificates.
Not surprisingly, our implementation has a lot in common with a Haskell compiler front-end, and is likely to be reusable in many contexts outside the Programatica project. For example, it has already been used in the refactoring project at the University of Kent [refactor-fp].
Our tools are implemented in Haskell, including the browser, which uses Fudgets [fudgets] for the user interface. Amongst other things, our implementation contains
an abstract syntax;
a lexer and a parser;
a pretty printer;
some static analyses, including inter-module name resolution;
some program transformations;
a type checker and
definition-level dependency tracking.
Some parts -- the abstract syntax, the parser and the pretty printer -- were inherited from another source [hssource] and modified, while most other parts were written from scratch. Some parts -- the lexer [lhih-online] and the module system [haskellmodules] -- are implemented in pure Haskell 98 and can serve as reference implementations, while others make use of type system extensions, in particular multi-parameter classes with functional dependencies [jones:funcdep]. Together with a two-level approach to the abstract syntax [sheard01generic] and other design choices, this makes the code more modular and reusable, but perhaps also too complicated to serve as a reference implementation of Haskell.
The fact that the first pass of our lexer preserves white space and comments made it easy to implement the HTML renderer. The modular structure of the lexer also allowed us to quickly create a simple tool that someone asked for on the Haskell mailing list: a program that removes comments and blank lines from Haskell code [stripcomments].
While implemented from scratch, key design choices in the type checker were influenced by the simplicity of Typing Haskell in Haskell [jones99typing], the efficiency of the type checkers in HBC and NHC, and the constraint-based approach used in one of Johan Nordlander's type checkers for O'Haskell [ohaskell]. It performs the dictionary translation and inserts enough type annotations to make the output suitable for translation to an explicitly typed language like Structured Type Theory or System F.
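In miniature, the dictionary translation (the standard technique, shown here with hypothetical names rather than pfe's actual output) turns a class into a record of methods, an instance into a record value, and a class constraint into an extra function argument:

```haskell
-- class Eq a where (==) :: a -> a -> Bool, as a record of methods:
data EqD a = EqD { eq :: a -> a -> Bool }

-- instance Eq Int, as a record value:
eqInt :: EqD Int
eqInt = EqD (==)

-- elem :: Eq a => a -> [a] -> Bool, with the constraint made into
-- an explicit dictionary argument:
elemD :: EqD a -> a -> [a] -> Bool
elemD d x = any (eq d x)
```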
Acknowledgments The author would like to thank Johan Jeuring, Mark Jones and John Matthews for suggesting improvements to this abstract. Also, although this abstract has one author, many people contributed to the work it describes.