1. Introduction

JML is a notation for formally specifying the behavior and interfaces of Java [Arnold-Gosling-Holmes00] [Gosling-etal00] classes and methods.

The goal of this reference manual is to precisely record the design of JML. We include both informal semantics (intentions) and where possible [[[we will eventually include]]] formal semantics (describing when an implementation satisfies a specification). We also discuss the implications for various tools (such as the run-time assertion checker, static checkers such as ESC/Java2, and documentation generators such as jmldoc [Burdy-etal03]).

In this manual we also try to give examples and explanations, and we hope that these will be helpful to readers trying to learn about formal specification using JML. However, this manual is not designed to give all the background needed to write JML specifications, nor to give the prospective user an overview of a useful subset of the language. For this background, we recommend starting with the papers "Design by Contract with JML" [Leavens-Cheon06] and "JML: A notation for detailed design" [Leavens-Baker-Ruby99], and continuing with the paper "Preliminary Design of JML" [Leavens-Baker-Ruby06]. These are all available from the JML web site `http://www.jmlspecs.org/', where further readings and examples may also be found.

Readers with the necessary background, and users wanting more details may, we hope, profit from reading this manual. We suggest reading this manual starting with chapters 1-3, skimming chapter 4 quickly, skimming chapter 5 to get the idea of what declarations mean in JML, and then reading the chapters on class specifications (chapter 6) and method specifications (chapter 7), paying particular attention to the examples. After that, one can use the rest of this manual as a reference.

The rest of this chapter describes some of the fundamental ideas and background behind JML.

1.1 Behavioral Interface Specifications

1.2 A First Example

1.3 What is JML Good For?

1.4 Status and Plans for JML

1.5 Historical Precedents

1.6 Acknowledgments

1.1 Behavioral Interface Specifications

JML is a behavioral interface specification language (BISL) that builds on the Larch approach [Guttag-Horning93] [Guttag-Horning-Wing85b] and that found in APP [Rosenblum95] and Eiffel [Meyer92b] [Meyer97]. In this style of specification, which might be called model-oriented [Wing90a], one specifies both the interface of a method or abstract data type and its behavior [Lamport89]. In particular JML builds on the work done by Leavens and others in Larch/C++ [Leavens-Baker99] [Leavens96b] [Leavens97c]. (Indeed, large parts of this manual are adapted wholesale from the Larch/C++ reference manual [Leavens97c].) Much of JML's design was heavily influenced by the work of Leino and his collaborators [Leino95] [Leino95b] [Leino98] [Leino-etal00] [Leino-Nelson-Saxe00]. JML continues to be influenced by ongoing work in formal specification and verification. A collection of papers relating directly to JML and its design is found at `http://www.jmlspecs.org/papers.shtml'.

The interface of the method or type is the information needed to use it from other programs. In the case of JML, this is the Java syntax and type information needed to call a method or use a field or type. For a method the interface includes such things as the name of the method, its modifiers (including its visibility and whether it is final) its number of arguments, its return type, what exceptions it may throw, and so on. For a field the interface includes its name and type, and its modifiers. For a type, the interface includes its name, its modifiers, its package, whether it is a class or interface, its supertypes, and the interfaces of the fields and methods it declares and inherits. JML specifies all such interface information using Java's syntax.

A behavior of a method or type describes a set of state transformations that it can perform. A behavior of a method is specified by describing: a set of states in which calling the method is defined, a set of locations that the method is allowed to assign to (and hence change), and the relations between the calling state and the state in which it either returns normally, throws an exception, or for which it might not return to the caller. The states for which the method is defined are formally described by a logical assertion, called the method's precondition. The allowed relationships between these states and the states that may result from normal return are formally described by another logical assertion called the method's normal postcondition. Similarly the relationships between these pre-states and the states that may result from throwing an exception are described by the method's exceptional postcondition. The states for which the method need not return to the caller are described by the method's divergence condition; however, explicit specification of divergence is rarely used in JML. The set of locations the method is allowed to assign to is described by the method's frame axiom [Borgida-etal95]. In JML one can also specify other aspects of behavior, such as the time a method can use to execute and the space it may need.

The behavior of an abstract data type (ADT), which is implemented by a class in Java, is specified by describing a set of abstract fields for its objects and by specifying the behavior of its methods (as described above). The abstract fields for an object can be specified either by using JML's model and ghost fields [Cheon-etal05], which are specification-only fields, or by specifying some of the fields used in the implementation as spec_public or spec_protected. These declarations allow the specifier using JML to model an instance as a collection of abstract instance variables, in much the same way as other specification languages, such as Z [Hayes93] [Spivey92] or Fresco [Wills92b].

1.2 A First Example

For example, consider the following JML specification of a simple Java abstract class IntHeap. (An explanation of the notation follows the specification. This specification, like the others in this manual, ships with the JML release in the `JML/org/jmlspecs/samples/jmlrefman' directory.)

package org.jmlspecs.samples.jmlrefman; // line 1 // line 2 public abstract class IntHeap { // line 3 // line 4 //@ public model non_null int [] elements; // line 5 // line 6 /*@ public normal_behavior // line 7 @ requires elements.length >= 1; // line 8 @ assignable \nothing; // line 9 @ ensures \result // line 10 @ == (\max int j; // line 11 @ 0 <= j && j < elements.length; // line 12 @ elements[j]); // line 13 @*/ // line 14 public abstract /*@ pure @*/ int largest(); // line 15 // line 16 //@ ensures \result == elements.length; // line 17 public abstract /*@ pure @*/ int size(); // line 18 }; // line 19

The interface of this class consists of lines 1, 3, 15, and 18. Line 3 specifies the class name, and the fact that the class is both public and abstract. Lines 15 and 18, apart from their comments, give the interface information for the methods of this class.

The behavior of this class is specified in the JML annotations found in the special comments that have an at-sign (@) as their first character following the usual comment beginning. Such lines look like comments to Java, but are interpreted by JML and its tools. For example, line 5 starts with an annotation comment marker of the form //@, and this annotation continues until the // towards the end of the line, which starts a comment within the annotation which even JML ignores. The other form of such annotations can be seen on lines 7 through 14, line 17, and on lines 15 and 18. These annotations start with the characters /*@ and end with either @*/ or */; within such annotations, at-signs (@) at the beginnings of lines are ignored by JML. Note that there can be no space between the start of comment marker, either // or /* and the first at-sign; thus // @ starts a comment, not an annotation. (See section 4. Lexical Conventions, for more details about annotations.)

The first annotation, on line 5 of the figure above, gives the specification of a field, named elements, which is part of this class's behavioral specification. Ignoring, for the moment the extra JML modifiers, one should think of this field, in essence, as being declared like:

public int[] elements;
That is, it is a public field with an integer array type; within specifications it is treated as such. However, because it is declared in an annotation, this field cannot be manipulated by Java code. Therefore, for example, the fact that the field is declared public is not a problem, because it cannot be directly changed by Java code.

Such declarations of fields in annotations should be marked as specification-only fields, using the JML modifier model.(1) A model field should be thought of as an abstraction of a set of concrete fields used in the implementation of this type and its subtypes. (See section 8.4 Represents Clauses, for a discussion of how to specify the connection between the concrete fields and such model fields. See also the paper by Cheon et al. [Cheon-etal05].) That is, we imagine that objects that are instances of the type IntHeap have such a field, whose value is determined by the concrete fields that are known to Java in the actual object. Of course at runtime, objects of type IntHeap have no such field, the model fields are purely imaginary. Model fields are thus a convenient fiction that is useful for describing the behavior of an ADT. One does not have to worry about their cost (in space or time), and should only be concerned with how they clarify the behavior of an ADT.

The other annotation used on line 5 is non_null. This just says that in any publicly-visible state, the value of elements must not be null. It is thus a simple kind of invariant (see section 8.2 Invariants).

In the above specification of IntHeap, the specification of each method precedes its interface declaration. This follows the usual convention of Java tools, such as JavaDoc, which put such descriptive information in front of the method. In JML, it is also possible to put the specification just before the semicolon (;) following the method's interface information (see section 9. Method Specifications), but we will usually not to do that in this document.

The specification of the method largest is given on lines 7 through 15. Line 7 says that this is a public, normal behavior specification. JML permits several different specifications for a given method, which can be of different privacy levels [Ruby-Leavens00] [Leavens-Mueller07]. The modifier public says that the specification is intended for use by clients. (If the privacy modifier had been protected, for example, then the specification would have been intended for subclasses.)

The keyword normal_behavior tells JML several things. First, it says that the specification is a heavyweight method specification, as opposed to a lightweight method specification like that given on line 17. A heavyweight specification uses one of JML's behavior keywords, like normal_behavior, which tells JML that the method specification is intended to be complete. By contrast, a lightweight specification does not use one of JML's behavior keywords, and tells JML that the specification is incomplete in the sense that it contains only some of what the specifier had in mind.(2) Second, the keyword normal_behavior tells JML that when the precondition of this method is met, then the method must return normally, without throwing an exception. In other words, it says that the exceptional postcondition is false, which prohibits the method from throwing an exception when the precondition holds. (Third, it says that the divergence condition defaults to false. See section 9. Method Specifications, for more details.)

The heart of the method specification of largest is found on lines 7 through 13. This part of the specification gives the method's precondition, on line 8, frame axiom, on line 9, and normal postcondition, on lines 10 through 13. The precondition is contained in the requires clause on line 8. The frame axiom is contained in the assignable clause on line 9. The normal postcondition is contained in the ensures clause on lines 10-13.(3)

The precondition in the requires clause on line 8 says that the length of elements must be at least 1 before this method can be called. If that is not true, then the method is under no obligation to fulfill the rest of the specified behavior.

The frame axiom in the assignable clause on line 9 says that the method may not assign to any locations (i.e. fields of objects) that are visible outside the method and which existed before the method started execution. (The method may still modify its local variables.) This form of the frame axiom is quite common.(4) Note that in assignable clauses and in assertions, JML uses keywords that start with a backslash (\), to avoid interfering with identifiers in the user's program. Examples of this are \nothing on line 9 and \result on line 10.

The postcondition in the ensures clause, on lines 10 through 13, says that the result of the method (\result) must be equal to the maximum integer found in the array elements. This postcondition uses JML's \max quantifier (lines 11 through 13). Such a quantifier is always parenthesized, and can consist of three parts. The first part of a quantifier is a declaration of some quantified variables, in this case the integer j on line 11. The second part is a range predicate, on line 12, which constrains the quantified variables. The third part is the body of the quantifier, on line 13, which in this case describes the elements of the array from which the maximum value is taken.

The methods largest and size are both specified using the JML modifier pure. This modifier says that the method has no side effects, and allows the method to be used in assertions if desired.

The method size is specified using a lightweight specification, which is given on line 17. The ensures clause on line 17 says nothing about the precondition, frame axiom, exceptional postcondition, or divergence condition of size, although the use of pure on line 18 gives an implicit frame axiom. Such a form of specification is useful when one only cares to state (the important) part of a method's specification. It is also useful when first learning JML, and when one is using tools, such as ESC/Java2, that do not need heavyweight specifications.

The specifications of the method largest above is very precise: it gives a complete specification of what the method does. Even the specification of size has a fairly complete normal postcondition. We can also give JML specifications that are far less detailed. For example, we could just specify that the result of size is non-negative, with a normal postcondition such as

//@ ensures \result >= 0;
instead of the postcondition given earlier. Such incomplete specifications give considerably more freedom to implementations, and can often be useful for hiding implementation details. However, one should try to write specifications that capture the important properties expected of callers (preconditions) and implementations (postconditions) [Meyer92a] [Liskov-Guttag86].

1.3 What is JML Good For?

JML is a formal specification language tailored to Java. Its basic use is thus the formal specification of the behavior of Java program modules. As it is a behavioral interface specification language, JML specifies how to use such Java program modules from within a Java program; hence JML is not designed for specifying the behavior of an entire program. So the question "what is JML good for?" really boils down to the following question: what good is formal specification for Java program modules?

The two main benefits in using JML are:

the precise, unambiguous specification of the behavior of Java program modules (i.e., classes and interfaces), and documentation of Java code,
the possibility of tool support [Burdy-etal03].

Although we would like tools that would help with reasoning about concurrent aspects of Java programs, the current version of JML focuses on the sequential behavior of Java code. While there is work in progress on extending JML to support concurrency, the current version of JML does not have features that help specify how Java threads interact with each other. JML does not, for example, allow the specification of elaborate temporal properties, such as coordinated access to shared variables or the absence of deadlock. Indeed, we assume, in the rest of this manual, that there is only one thread of execution in a Java program annotated with JML, and we focus on how the program manipulates object states. To summarize, JML is currently limited to sequential specification; we say that JML specifies the sequential behavior of Java program modules.

In terms of detailed design documentation, a JML specification can be a completely formal contract about an interface and its sequential behavior. Because it is an interface specification, one can record all the Java details about the interface, such as the parameter mechanisms, whether the method is final, protected, etc.; if one used a specification language such as VDM-SL or Z, which is not tailored to Java, then one could not record such details of the interface, which could cause problems in code integration. For example, in JML one can specify the precise conditions under which certain exceptions may be thrown, something which is difficult in a specification language that is not tailored to Java and that doesn't have the notion of an exception.

When should JML documentation be written? That is up to you, the user. A goal of JML is to make the notation indifferent to the precise programming method used. One can use JML either before coding or as documentation of finished code. While we recommend doing some design before coding, JML can also be used for documentation after the code is written.

Reasons for formal documentation of interfaces and their behavior, using JML, include the following.

One can ship the object code for a class library to customers, sending the JML specifications but not the source code. Customers would then have documentation that is precise, unambiguous, but not overly specific. Customers would not have the code, protecting proprietary rights. In addition, customers would not rely on details of the implementation of the library that they might otherwise glean from the code, easing the process of improving the code in future releases.
One can use a formal specification to analyze certain properties of a design carefully or formally (see [Hall90] and Chapter 7 of [Guttag-Horning93]). In general, the act of formally specifying a program module has salutary effects on the quality of the design.
One can use the JML specification as an aid to careful reasoning about the correctness of code, or even for formal verification [Huisman01] [Jacobs-Poll01] [Ruby06].
JML specifications can be used by several tools that can help debug and improve the code [Burdy-etal03].

There is one additional benefit from using JML. It is that JML allows one to record not just public interfaces and behavior, but also some detailed design decisions. That is, in JML, one can specify not just the public interface of a Java class, but also behavior of a class's protected and private interfaces. Formally documenting a base class's protected interface and "subclassing contract" allows programmers to implement derived classes of such a base class without looking at its code [Ruby-Leavens00] [Ruby06].

Recording the private interface of a class may be helpful in program development or maintenance. Usually one would expect that the public interface of a class would be specified, and then separate, more refined specifications would be given for use by derived classes and for detailed implementation (and friend classes). (See section 16. Refinement, for how to record each level in JML.)

The reader may also wish to consult the "Preliminary Design of JML" [Leavens-Baker-Ruby06] for a discussion of the goals that are behind JML's design. Apart from the improved precision in the specifications and documentation of code, the main advantage of using a formal specification language, as opposed to informal natural language, is the possibility of tool support. One specific goal that has emerged over time is that JML should be able to unify several different tool-building efforts in the area of formal methods.

The most basic tool support for JML -- simply parsing and typechecking -- is already useful. Whereas informal comments in code are typically not kept up to date as the code is changed, the simple act of running the typechecker will catch any JML assertions referring to parameter or field names that no longer exist, and all other typos of course. Enforcing the visibility rules can also provide useful feedback; for example, a precondition of a public method which refers to a private field of an object is suspect.

Of course, there are more exciting forms of tool support than just parsing and typechecking. In particular JML is designed to support static analysis (as in ESC/Java [Leino-etal00]), formal verification (as in the LOOP tool [Huisman01] [Jacobs-etal98]), recording of dynamically obtained invariants (as in Daikon [Ernst-etal01]), runtime assertion checking (as in JML's runtime assertion checker, jmlc [Cheon-Leavens02b] [Cheon03]), unit testing [Cheon-Leavens02], and documentation (as in JML's jmldoc tool). The paper by Burdy et al. [Burdy-etal03] is a recent survey of tools for JML. The utility of these tools is the ultimate answer to the question of what JML is good for.

1.4 Status and Plans for JML

JML is still in development. As you can see, this reference manual is still a draft, and there are some holes in it. [[[And some notes for the authors by the authors that look like this.]]]

Influences on JML that may lead to changes in its design include our desire to specify programs written using the unique features of MultiJava [Clifton-etal00], an eventual integration with Bandera [Corbett-etal00] or other tools for specification of concurrency, aspect-oriented programming, and the evolution of Java itself. Another influence is the ongoing effort to use JML on examples, in designing the JML tools, and efforts to give a formal semantics to JML.

1.5 Historical Precedents

JML combines ideas from Eiffel [Meyer92a] [Meyer92b] [Meyer97] with ideas from model-based specification languages such as VDM [Jones90] and the Larch family [Guttag-Horning93] [LeavensLarchFAQ] [Wing87] [Wing90a]. It also adds some ideas from the refinement calculus [Back88] [Back-vonWright89a] [Back-vonWright98] [Morgan-Vickers94] [Morgan94] (see section 16. Refinement). In this section we describe the advantages and disadvantages of these approaches. Readers unfamiliar with these historical precedents may want to skip this section.

Formal, model-based languages such as those typified by the Larch family build on ideas found originally in Hoare's work. Hoare used pre- and postconditions to describe the semantics of computer programs in his famous article [Hoare69]. Later Hoare adapted these axiomatic techniques to the specification and correctness proofs of abstract data types [Hoare72a]. To specify an ADT, Hoare described a mathematical set of abstract values for the type, and then specified pre- and postconditions for each of the operations of the type in terms of how the abstract values of objects were affected. For example, one might specify a class IntHeap using abstract values of the form empty and add(i,h), where i is an int and h is an IntHeap. These notations form a mathematical vocabulary used in the rest of the specification.

There are two advantages to writing specifications with abstract values instead of directly using Java variables and data structures. The first is that by using abstract values, the specification does not have to be changed when the particular data structure used in the program is changed. This permits different implementations of the same specification to use different data structures. Therefore the specification forms a contract between the rest of the program in the implementation, which ensures that the rest of the program is also independent of the particular data structures used [Liskov-Guttag86] [Meyer97] [Meyer92a] [Parnas72]. Second, it allows the specification to be written even when there are no implementation data structures, as is the case for IntHeap.

This idea of model-oriented specification has been followed in VDM [Jones90], VDM-SL [Fitzgerald-Larsen98] [ISO96], Z [Hayes93] [Spivey92], and the Larch family [Guttag-Horning93]. In the Larch approach, the essential elaboration of Hoare's original idea is that the abstract values also come with a set of operations. The operations on abstract values are used to precisely describe the set of abstract values and to make it possible to abbreviate interface specifications (pre- and postconditions for methods). In Z one builds abstract values using tuples, sets, relations, functions, sequences, and bags; these all come with pre-defined operations that can be used in assertions. In VDM one has a similar collection of mathematical tools to describe abstract values, and another set of pre-defined operations. In the Larch approach, there are some pre-defined kinds of abstract values (found in Guttag and Horning's LSL Handbook, Appendix A of [Guttag-Horning93]), but these are expected to be extended as needed. (The advantage of being able to extend the mathematical vocabulary is similar to one advantage of object-oriented programming: one can use a vocabulary that is close to the way one thinks about a problem.)

However, there is a problem with using mathematical notations for describing abstract values and their operations. The problem is that such mathematical notations are an extra burden on a programmer who is learning to use a specification language. The solution to this problem is the essential insight that JML takes from the Eiffel language [Meyer92a] [Meyer92b] [Meyer97]. Eiffel is a programming language with built-in specification constructs. It features pre- and postconditions, although it has no direct support for frame axioms. Programmers like Eiffel because they can easily read the assertions, which are written in Eiffel's own expression syntax. However, Eiffel does not provide support for specification-only variables, and it does not provide much explicit support for describing abstract values. Because of this, it is difficult to write specifications that are as mathematically complete in Eiffel as one can write in a language like VDM or Larch/C++.

JML attempts to combine the good features of these approaches. From Eiffel we have taken the idea that assertions can be written in a language that is based on Java expressions. We also adopt the "old" notation from Eiffel, which appears in JML as \old, instead of the Larch-style annotation of names with state functions. To make it easy to write more complete specifications, however, we use various semantic ideas from model-based specification languages. In particular we use a variant of abstract value specifications, where one describes the abstract value of an object implicitly using several model fields. These specification-only fields allow one to implicitly partition the abstract value of an object into smaller chunks, which helps in stating frame axioms. More importantly, we hide the mathematical notation behind a facade of Java classes. This makes it so the operations on abstract values appear in familiar (although perhaps verbose) Java notation, and also insulates JML from the details of the particular mathematical logic used to do reasoning.

1.6 Acknowledgments

The work of Leavens and Ruby was supported in part by a grant from Rockwell International Corporation and by NSF grant CCR-9503168. Work on JML by Leavens, and Ruby was also supported in part by NSF grant CCR-9803843. Work on JML by Cheon, Clifton, Leavens, Ruby, and others has been supported in part by NSF grants CCR-0097907, CCR-0113181, CCF-0428078, and CCF-0429567. Support from the NSF continues under a Computing Research Infrastructure (CRI) grant jointly to several institutions: CNS 08-08913 (Leavens at U. of Central Florida, and a subcontact to Rajan and Basu at Iowa State Unviversity), CNS 07-07874 (Cheon at UTEP), CNS 07-07701 (Clifton at Rose Hulman), CNS 07-07885 (Flanagan at U. Cal. Santa Cruz), CNS 07-08330 (Naumann at Stevens), and CNS 07-09169 (Robby at Kansas State). The work of Poll is partly supported by the Information Society Technologies (IST) Programme of the European Union, as part of the VerifiCard project, IST-2000-26328.

Thanks to Bart Jacobs, Rustan Leino, Arnd Poetzsch-Heffter, and Joachim van den Berg, for many discussions about the semantics of JML specifications. Thanks for Raymie Stata for spearheading an effort at Compaq SRC to unify JML and ESC/Java, and to Rustan and Raymie for many interesting ideas and discussions that have profoundly influenced JML. Thanks to Leo Freitas, Robin Greene, and Jesus Ravelo for comments and questions on earlier versions of this document. Thanks to the many who have worked on the JML checker used to check the specifications in this document. Leavens thanks Iowa State University and its computer science department for helping foster and support the initial work on JML.

See the "Preliminary Design of JML" [Leavens-Baker-Ruby06] for more acknowledgments relating to the earlier history, design, and implementation of JML.

[ << ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

This document was generated by Gary Leavens on March, 16 2009 using texi2html