4. Lexical Conventions

This chapter presents the lexical conventions of JML, that is, the microsyntax of JML.

Throughout this chapter, grammatical productions are to be understood lexically. That is, no white-space (see section 4.1 White Space) may intervene between the characters of a token. (However, outside this chapter, the opposite of this convention is in force.)

The microsyntax of JML is described by the production microsyntax below; it describes what a program looks like from the point of view of a lexical analyzer [Watt91].

In the rest of this section we provide more details on each of the major nonterminals used in the above grammar.

4.1 White Space

4.2 Lexical Pragmas

4.3 Comments

4.4 Annotation Markers

4.5 Documentation Comments

4.6 Tokens

4.1 White Space

Blanks, horizontal and vertical tabs, carriage returns, formfeeds, and newlines, collectively called white space, are ignored except as they serve to separate tokens. Newlines and carriage returns are special in that they cannot appear in some contexts where other whitespace can appear, and are also used to end Java-style comments (see section 4.3 Comments).

white-space ::= non-nl-white-space | end-of-line non-nl-white-space ::= a blank, tab, or formfeed character end-of-line ::= newline | carriage-return | carriage-return newline newline ::= a newline character carriage-return ::= a carriage return character

4.2 Lexical Pragmas

ESC/Java [Leino-etal00] has a single kind of "lexical pragma", nowarn, whose syntax is described below in general terms. The JML checker currently ignores these lexical pragmas, but nowarn is only recognized within an annotation. Note that, unlike ESC/Java, the semicolon is mandatory. This restriction seems to be necessary to prevent lexical ambiguity.

lexical-pragma ::= nowarn-pragma nowarn-pragma ::= nowarn [ spaces ] [ nowarn-label-list ] ; spaces ::= non-nl-white-space [ non-nl-white-space ] ... nowarn-label-list ::= nowarn-label [ spaces ] [ , [ spaces ] nowarn-label [ spaces ] ] ... nowarn-label ::= letter [ letter ] ...

See section 4.6 Tokens, for the syntax of letter.

4.3 Comments

Both kinds of Java comments are allowed in JML: multiline C-style comments and single line C++-style comments. However, if what looks like a comment starts with the at-sign (@) character, or with a sequence of annotation-keys and an at-sign (@), then JML considers it to be the start of an annotation (see section 4.4 Annotation Markers), and not a comment. Furthermore, if what looks like a comment starts with an asterisk (*), then it is a documentation comment, which is parsed by JML.

comment ::= C-style-comment | C++-style-comment C-style-comment ::= /* [ C-style-body ] C-style-end C-style-body ::= non-at-plus-minus-star [ non-stars-slash ] ... | + non-letter [ non-stars-slash ] ... | - non-letter [ non-stars-slash ] ... | stars-non-slash [ non-stars-slash ] ... non-letter ::= any character except _, a through z, or A through Z non-stars-slash ::= non-star | stars-non-slash stars-non-slash ::= * [ * ] ... non-star-slash non-at-plus-minus-star ::= any character except @, +, -, or * non-star ::= any character except * non-slash ::= any character except / non-star-slash ::= any character except * or / C-style-end ::= [ * ] ... */ C++-style-comment ::= // [ + ] end-of-line | // non-at-plus-minus-end-of-line [ non-end-of-line ] ... end-of-line | //+ non-letter-end-of-line [ non-end-of-line ] ... end-of-line | //- non-letter-end-of-line [ non-end-of-line ] ... end-of-line non-letter-end-of-line ::= any character except _, a through z, A through Z, a newline, or a carriage return non-end-of-line ::= any character except a newline or carriage return non-at-plus-minus-end-of-line ::= any character except @, +,-, newline, or carriage return non-at-end-of-line ::= any character except @, newline, or carriage return

4.4 Annotation Markers

If what looks to Java like a comment starts with an at-sign (@) as its first character, or starts with a sequence of annotation-keys followed by an at-sign, then it is not considered a comment by JML. We refer to the tokens between //@ and the following end-of-line, and between pairs of annotation start ( /*@ ) and end ( */ or @*/ ) markers as annotations. The definition of an annotation marker is given below.

annotation-marker ::= // [ annotation-key ]... @ [ ignored-at-in-annotation ] ... | /* [ annotation-key ]... @ [ ignored-at-in-annotation ] ... | [ ignored-at-in-annotation ] ... @+*/ | [ ignored-at-in-annotation ] ... */ annotation-key ::= positive-key | negative-key positive-key ::= + ident negative-key ::= - ident ignored-at-in-annotation ::= @

Within annotations, on each line, initial white-space and any immediately following at-signs (@) are ignored.

Note that JML annotations are not the same as Java annotations (see section 6.2.2 Java Annotations). Besides the syntactic differences, JML annotations can appear anywhere a comment may appear, not just attached to declarations.

An annotation-key is a + or - sign followed by an ident (see section 4.6 Tokens). Note that no white space can appear within, before, or after the annotation-key. Tools will provide a way to enable a selection of annotation-key identifiers. These identifiers, hereafter called "keys" provide for conditional inclusion of JML annotations as follows:

a JML annotation with no keys is always included,
a JML annotation with at least one positive-key is only included if at least one of these positive keys is enabled and there are no negative-keys in the annotation that have enabled keys, and
a JML annotation with an enabled negative-key is ignored (even if there are enabled positive-keys).

For example, a comment beginning with //+ESC@ is included as a JML annotation only if the ESC key is enabled; a comment beginning with //-ESC@ is included except when the ESC key is enabled.

Annotations must hold entire grammatical units of JML specifications, in the sense that the text of some nonterminals may not be split across two separate annotations. For example the following is illegal, because the postcondition of the ensures clause is split over two annotations, and thus each contains a fragment instead of a complete grammatical unit.

//@ ensures 0 <= x // illegal! //@ && x < a.length;
However, implementations are not required to check for such errors. On the other hand, ESC/Java [Leino-Nelson-Saxe00] and ESC/Java2 assume that nonterminals that define clauses are not split into separate annotations, and so effectively do check for them.

Annotations look like comments to Java, and are thus ignored by it, but they are significant to JML. One way that this can be achieved is by having JML drop (ie., ignore) the character sequences that are annotation-markers, as well as the ignored-at-in-annotations. However, note that this technique does not properly check for annotations that do not contain entire grammatical units of JML specifications, as described in the previous paragraph.

Note that JML will recognize jml-keywords only within JML annotations.

4.5 Documentation Comments

If what looks like a C-style comment starts with an asterisk (*) then it is a documentation comment. The syntax is given below. The syntax doc-comment-ignored is used for documentation comments that are ignored by JML.

doc-comment ::= /** [ * ] ... doc-comment-body [ * ] ... */ doc-comment-ignored ::= doc-comment

At the level of the rest of the JML grammar, a documentation comment that does not contain an embedded JML method specification is essentially described by the above, and the fact that a doc-comment-body cannot contain the two-character sequence */.

However, JML and javadoc both pay attention to the syntax inside of these documentation comments. This syntax is really best described by a context-free syntax that builds on a lexical syntax. However, because much of the documentation is free-form, the context-free syntax has a lexical flavor to it, and is quite line-oriented. Thus it should come as no surprise that the first non-whitespace, non-asterisk (ie., not *) character on a line determines its interpretation.

doc-comment-body ::= [ description ] ... [ tagged-paragraph ] ... [ jml-specs ] [ description ] description ::= doc-non-empty-textline tagged-paragraph ::= paragraph-tag [ doc-non-nl-ws ] ... [ doc-atsign ] ... [ description ] ... jml-specs ::= jml-tag [ method-specification ] end-jml-tag [ jml-tag [ method-specification ] end-jml-tag ] ...

The microsyntax or lexical grammar used within documentation comments is as follows. Note that the token doc-nl-ws can only occur at the end of a line, and is always ignored within documentation comments. Ignoring doc-nl-ws means that any asterisks at the beginning of the next line, even in the part that would be a JML method-specification, are also ignored. Otherwise the lexical syntax within a method-specification is as in the rest of JML. This method specification is attached to the following method or constructor declaration. (Currently there is no useful way to use such specifications in the documentation comments for other declarations.) Note the exception to the grammar of doc-non-empty-textline.

A jml-tag marks the (temporary) end of a documentation comment and the beginning of text contributing to a method specification. The corresponding end-jml-tag marks the reverse transition. The end-jml-tag must match the corresponding jml-tag.

4.6 Tokens

Character strings that are Java reserved words are made into the token for that reserved word, instead of being made into an ident token. Within an annotation this also applies to jml-keywords. The details are given below.

ident ::= letter [ letter-or-digit ] ... letter ::= _, $, a through z, or A through Z digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 letter-or-digit ::= letter | digit

Several strings of characters are recognized as keywords or reserved words in JML. These fall into three separate categories: Java keywords, JML predicate keywords (which start with a backslash), and JML keywords. Java keywords are truly reserved words, and are recognized in all contexts. The nonterminal java-reserved-word represents the reserved words in Java (as in the JDK version supported by the tool in question, hopefully the latest official release).

The jml-keywords are only recognized as keywords when they occur within an annotation, but outside of a spec-expression store-ref-list or constrained-list. JML predicate keywords are also only recognized within annotations, but they are recognized only inside spec-expressions, store-ref-lists, and constrained-lists.

There are options to the JML tools that extend the language in various ways. For example, when an option to parse the syntax for the Universe type system [Dietl-Mueller05] is used, the words listed in the nonterminal java-universe-reserved also act like reserved words in Java (and are thus recognized in all contexts). When an option to recognize the Universe system syntax in annotations is used, these words instead act as jml-keywords and are only recognized in annotations. However, even when no Universe options are used, pure is recognized as a keyword in annotations, since it is also a jml-keyword. (The Universe type system support in JML is experimental. Most likely the list of java-universe-reserved will be added to the list of jml-keywords eventually.)

However, even without the Universe option being on, the jml-universe-pkeyword syntax is recognized within JML annotations in the same way as JML predicate keywords are recognized.

The details are given below.

The following describes the special symbols used in JML. The nonterminal java-special-symbol is the special symbols of Java, taken without change from Java [Gosling-Joy-Steele96].

special-symbol ::= java-special-symbol | jml-special-symbol java-special-symbol ::= java-separator | java-operator java-separator ::= ( | ) | { | } | `[' | `]' | ; | , | . | @ java-operator ::= = | < | > | ! | ~ | ? | : | == | <= | >= | != | && | `||' | ++ | -- | + | - | * | / | & | `|' | ^ | % | << | >> | >>> | += | -= | *= | /= | &= | `|=' | ^= | %= | <<= | >>= | >>>= jml-special-symbol ::= ==> | <== | <==> | <=!=> | -> | <- | <: | .. | `{|' | `|}' | <# | <#=

The nonterminal java-literal represents Java literals which are taken without change from Java [Gosling-Joy-Steele96].

java-literal ::= integer-literal | floating-point-literal | boolean-literal | character-literal | string-literal | null-literal integer-literal ::= decimal-integer-literal | hex-integer-literal | octal-integer-literal decimal-integer-literal ::= non-zero-digit [ digits ] [ integer-type-suffix ] digits ::= digit [ digit ] ... digit ::= 0 | non-zero-digit non-zero-digit ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 integer-type-suffix ::= l | L hex-integer-literal ::= hex-numeral [ integer-type-suffix ] hex-numeral ::= 0x hex-digit [ hex-digit ] ... | 0X hex-digit [ hex-digit ] ... hex-digit ::= digit | a | b | c | d | e | f | A | B | C | D | E | F octal-integer-literal ::= octal-numeral [ integer-type-suffix ] octal-numeral ::= 0 octal-digit [ octal-digit ] ... octal-digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 floating-point-literal ::= digits . [ digits ] [ exponent-part ] [ float-type-suffix ] | . digits [ exponent-part ] [ float-type-suffix ] | digits exponent-part [ float-type-suffix ] | digits [ exponent-part ] float-type-suffix exponent-part ::= exponent-indicator signed-integer exponent-indicator ::= e | E signed-integer ::= [ sign ] digits sign ::= + | - float-type-suffix ::= f | F | d | D boolean-literal ::= true | false character-literal ::= ' single-character ' | ' escape-sequence ' single-character ::= any character except ', \, carriage return, or newline escape-sequence ::= \b // backspace | \t // tab | \n // newline | \r // carriage return | \' // single quote | \" // double quote | \\ // backslash | octal-escape | unicode-escape octal-escape ::= \ octal-digit [ octal-digit ] | \ zero-to-three octal-digit octal-digit zero-to-three ::= 0 | 1 | 2 | 3 unicode-escape ::= \u hex-digit hex-digit hex-digit hex-digit string-literal ::= " [ string-character ] ... " string-character ::= escape-sequence | any character except ", \, carriage return, or newline null-literal ::= null

An informal-description looks like (* some text *). It is used in predicates (see section 12.1 Predicates) and in store-ref expressions (see section 12.7 Store Refs) as an escape from formality. The exact syntax is given below.

informal-description ::= (* non-stars-close [ non-stars-close ] ... *) non-stars-close ::= non-star | stars-non-close stars-non-close ::= * [ * ] ... non-star-close non-star-close ::= any character except ) or *

[ << ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

This document was generated by U-leavens-nd\leavens on May, 31 2013 using texi2html