The Eiffel Compiler / Interpreter (tecomp)

The Modern Eiffel project has two major goals: design and implement a language which (a) allows formal verification and (b) is usable by programmers.

The second goal has not yet been fully met, because the interface to the proof engine is not easy to grasp for programmers coming from object-oriented languages like C++, C#, Java, Scala, etc.

This needs to be improved, because "formal methods will never have a significant impact until they can be used by people who don't understand them" (attributed to Tom Melham).

The automation features of Modern Eiffel's proof engine (i.e. assertions without explicit proof are proved by the proof engine with the implicit command "resolve") already reduce the need for explicit proofs significantly.

However, not all provable assertions can be proved automatically. In some cases user intervention is required.

This article describes a user interface to the proof engine which makes it possible to write proofs like procedures. Thus proving an assertion becomes more like programming (which programmers might like more than issuing proof commands).

We have not introduced procedures so far, because we have not yet worked with mutable objects.

A procedure in Modern Eiffel looks like any other procedure in other programming languages except that it has preconditions and postconditions.

    some_procedure(a:A, b:B, ...)
        require
            pre_1
            pre_2
            ...
        do
            cmd_1
            check
                a; b; c
            end
            cmd_2
            cmd_3
            ...
        ensure
            post_1
            post_2
            ...
        end

The preconditions specify which conditions must be valid at the entry point. It is the caller's responsibility to satisfy the preconditions and it is the responsibility of the called procedure to establish the postconditions. The pre- and postconditions form a contract between the caller and the called procedure. The contract is the specification of a procedure.

The body is a sequence of commands interleaved with some assertions (i.e. the "check ... end" commands). At the end of the body the postcondition has to be established.
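The same contract discipline can be sketched outside Modern Eiffel with runtime checks. The following Python decorator and the procedure it wraps are purely illustrative names invented for this sketch, not part of tecomp:

```python
# Sketch: a procedure with an explicit contract, checked at runtime.
# "contract" and "integer_division" are illustrative, not Modern Eiffel code.

def contract(pre, post):
    """Wrap a procedure so its pre- and postconditions are checked."""
    def wrap(proc):
        def checked(*args):
            # The caller must satisfy the precondition ...
            assert pre(*args), "precondition violated (caller's fault)"
            result = proc(*args)
            # ... the callee must establish the postcondition.
            assert post(*args, result), "postcondition violated (callee's fault)"
            return result
        return checked
    return wrap

@contract(pre=lambda a, b: a >= 0 and b > 0,
          post=lambda a, b, q: 0 <= a - q * b < b)
def integer_division(a, b):
    return a // b

print(integer_division(17, 5))  # → 3
```

Runtime checking only detects contract violations on the executions that actually happen; the point of a proof engine is to establish the contract for all executions.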

Now assume that we want to prove an assertion like

    all(x:X, y:Y)
        a => b => c => check p; q end

    -- this assertion is equivalent to the two assertions

    all(x:X, y:Y) a => b => c => p
    all(x:X, y:Y) a => b => c => q

With a little syntactic sugar we rewrite the assertion as

    all(x:X, y:Y)
        require
            a; b; c
        ensure
            p; q
        end

In this form the assertion already looks like an unnamed procedure without body.

Note that the preconditions can be empty.

    all(x:X, y:Y) check p; q end

    -- is equivalent to

    all(x:X, y:Y)
        ensure
            p; q
        end

I.e. an assertion can be viewed as a procedure callable with any arguments satisfying the precondition and ending in a state satisfying the postcondition. Since it should be a proof procedure, it must not modify anything.

The body of a proof procedure is the actual proof. The simplest body of a proof procedure is just a sequence of assertions.

    all(x:X, y:Y)
        require
            a; b; c
        do
            a1      -- state: a, b, c : a1
            a2      --        a, b, c, a1 : a2
            ...
        ensure
            p       -- state: a, b, c, a1, a2, ... : p
            q       --        a, b, c, a1, a2, ..., p : q
        end

A proof procedure is correct if the proof engine can prove all intermediate assertions and the postconditions by using all previous assertions and the already proved assertions as assumptions. In the above code snippet the corresponding start state of the proof engine is added as a comment.

At each intermediate assertion and each postcondition the proof engine enters the corresponding state and tries to prove the assertion under the assumptions on the assumption stack and with the help of the already proved assertions. If all assertions can be proved automatically, then the proof procedure is valid.

In this example we want to prove the commutativity of "and" stated as

all(a,b:BOOLEAN) a and b => b and a

This assertion can be proved by the proof engine automatically; therefore no explicit proof is required. But in order to demonstrate the technique we give a very pedantic proof in the form of the following proof procedure.

    all(a,b: BOOLEAN)
        require
            a and b
        do
            elim1: a and b => a
            elim2: a and b => b
            a
            b
            intro: b => a => b and a
        ensure
            b and a
        end

"elim1" and "elim2 are valid, because they have already been proved within the introduction of the boolean operation "and".

The next two assertions are immediate consequences of the precondition and "elim1" or "elim2" respectively.

"intro" is valid; it has already been proved within the "and" operation.

The postcondition is an immediate consequence of "a", "b" and "intro".
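Since the law ranges over the two booleans only, every proof obligation of the pedantic proof can also be checked by exhaustive enumeration. The following Python sketch is only an illustration; the proof engine works symbolically, not by enumeration:

```python
# Exhaustive check of the proof obligations of the commutativity proof
# over all boolean valuations. Illustrative only, not Modern Eiffel code.

def implies(p, q):
    return (not p) or q

for a in (False, True):
    for b in (False, True):
        assert implies(a and b, a)              # elim1
        assert implies(a and b, b)              # elim2
        assert implies(b, implies(a, b and a))  # intro
        assert implies(a and b, b and a)        # the law itself
print("all cases checked")
```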

For comparison, here is a proof with explicit proof commands (at the same pedantic level).

    all(a,b:BOOLEAN)
        check
            a and b => b and a
        proof
            enter
            premise(b,a)
            premise(a and b)
            remove
            premise(a and b)
            remove
        end
    end

If we have an alternative "a or b" and we know that "a" is false, then "b" must be true. I.e. we expect the following to be valid.

all(a,b:BOOLEAN) (a or b) => not a => b

We prove this assertion with the following proof procedure.

    all(a,b:BOOLEAN)
        require
            a or b
        do
            contra:  a => not a => b    -- from a contradiction follows anything
            trivial: b => not a => b
            elim: all(x:BOOLEAN) (a or b) => (a=>x) => (b=>x) => x
        ensure
            not a => b
        end

The assertions "contra" and "elim" are already proved assertions (just made more specific than the general laws). The assertion "trivial" is a triviality. The postcondition is discharged implicitely with the commands

    premise(a or b, a => (not a => b), b => (not a => b))
    remove(3)

The intermediate assertions "contra" and "trivial" are not necessary. They have been inserted to make some intermediate steps more explicit.

Let us compare the above proof procedure with a proof with proof commands. We could prove the above assertion like

    all(a,b:BOOLEAN)
        check
            a or b => not a => b
        proof
            enter
            case_split(a, b)
            remove
            resolve
        end
    end

The proof with proof commands is more compact. The proof procedure is more verbose. It is a matter of taste which proof one writes. However from the perspective of programmers the proof by a proof procedure might be preferable.

In order to prove the contrapositive law

all(a,b:BOOLEAN) (a => b) => (not b => not a)

we can write the proof procedure

    all(a,b: BOOLEAN)
        require
            a => b
            not b
        do
            a1: (a => b) => (a => not b) => not a
        ensure
            not a
        end

The intermediate assertion "a1" matches an already proved assertion. The postcondition "not a" is proved by entering the state

    a => b
    not b
    (a => b) => (a => not b) => not a
    =================================
    not a

The goal "not a" is discharged automatically by the proof engine with the commands

    -- proof commands generated by the proof engine
    premise(a => b, a => not b)
    remove
    enter
    remove

The above proof procedure can be improved with respect to readability and clarity. An alternative form is:

    all(a,b: BOOLEAN)
        require
            a => b
            not b
        do
            require
                a            -- Assume the opposite ...
            ensure
                b; not b     -- ... and derive a contradiction
            end
        ensure
            not a            -- therefore `not a' must be valid
        end
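The structure of this proof by contradiction can again be checked by exhaustive enumeration over the booleans. The Python sketch below is an illustration only, not tecomp code:

```python
# Exhaustive check of the contrapositive law, mirroring the proof by
# contradiction: whenever "a => b" and "not b" hold, "a" cannot hold,
# since it would yield both "b" and "not b". Illustrative only.

def implies(p, q):
    return (not p) or q

for a in (False, True):
    for b in (False, True):
        if implies(a, b) and not b:     # the preconditions
            assert not a                # the postcondition
        # the law itself, in all valuations
        assert implies(implies(a, b), implies(not b, not a))
print("contrapositive checked for all booleans")
```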

Since proof procedures are more verbose than proofs by proof commands it is important to split the specification and the implementation part.

The statement of an assertion belongs to the specification part, the actual proof to the implementation part.

I.e. in the above example "commutativity" the specification part should include just

all(a,b:BOOLEAN) a and b => b and a

and the implementation part should include the proof procedure

    all(a,b: BOOLEAN)
        require
            a and b
        do
            elim1: a and b => a
            elim2: a and b => b
            a
            b
            intro: b => a => b and a
        ensure
            b and a
        end

This keeps the specification part compact and readable. As described in the article "Specification and implementation of modules" Modern Eiffel allows this physical separation of specification and implementation.

In Modern Eiffel there are two versions of the alternative command: the if-command and the inspect-command. They look like

    if c1 then
        stmt1
    elseif c2 then
        stmt2
    ...
    else
        stmt_else
    end

    inspect
        e
    case p1 then
        stmt1
    case p2 then
        stmt2
    ...
    end

If one of these alternative commands is embedded within a sequence of assertions in a proof procedure, then the proof procedure does not have only one sequence of assertions but n sequences of assertions. Each of these n sequences must be provable as a linear sequence.

In case of nesting, the corresponding sequences multiply accordingly. Deep nesting is not advisable in either executable or proof procedures.

The code snippet below shows the possibilities for a four branch if-command.

    all(x:X, y:Y)
        require
            a; b; c
        do
            ...
            s
            if c1 then
                a1    -- state: ..., s, c1 : a1
                ...
                e1    -- state: ..., s, c1, a1, ... : e1
            elseif c2 then
                a2    -- state: ..., s, not c1, c2 : a2
                ...
                e2    --        ..., s, not c1, c2, a2, ... : e2
            elseif c3 then
                a3    -- state: ..., s, not c1, not c2, c3 : a3
                ...
                e3    --        ..., s, not c1, not c2, c3, a3, ... : e3
            else
                a4    -- state: ..., s, not c1, not c2, not c3 : a4
                ...
                e4    --        ..., s, not c1, not c2, not c3, a4, ... : e4
            end
            t         -- states: ..., s, ..., e1 : t
                      --         ..., s, ..., e2 : t
                      --         ..., s, ..., e3 : t
                      --         ..., s, ..., e4 : t
            ...
        ensure
            p
            q
        end

The commutativity of "or" can be stated in the specification file like

all(a,b:BOOLEAN) a or b => b or a

The following proof procedure proves this assertion by using an if-command to realize the case split.

    all(a,b:BOOLEAN)
        require
            a or b
        do
            if a then
                a
                b or a
            else
                proved: (a or b) => not a => b
                not a
                b
                b or a
            end
        ensure
            b or a
        end

There should be no problem to read and verify this proof procedure.

Proof procedures do nothing. If called, instead of executing instructions, they prove the postcondition provided that the precondition is valid.

Since they "do" nothing, the question arises: "Why not call them recursively?" And indeed this is possible, provided that the recursion is guaranteed to terminate (i.e. at least one argument which is bounded from below is decreasing).

Suppose we have the module "unary_natural" with the following class and function definition.

    case class UNARY_NATURAL
    create
        zero
        succ(pred:UNARY_NATURAL)
    end

    feature
        + (a,b:UNARY_NATURAL): UNARY_NATURAL =
            inspect b
            case zero then a
            case succ(p) then succ(a+p)
            end
    end

The corresponding type UNARY_NATURAL is an inductive type. Therefore recursion is well behaved.
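The inductive type and its recursive "+" can be modeled in Python as a small class hierarchy. The class and function names below are invented for this sketch:

```python
# Illustrative Python model of UNARY_NATURAL: a value is either Zero()
# or Succ(pred). Not Modern Eiffel code.
from dataclasses import dataclass

class Unary:
    """Base class for unary natural numbers."""

@dataclass(frozen=True)
class Zero(Unary):
    pass

@dataclass(frozen=True)
class Succ(Unary):
    pred: Unary

def add(a: Unary, b: Unary) -> Unary:
    # Mirrors the Eiffel definition of "+": recursion on the second argument.
    if isinstance(b, Zero):
        return a
    return Succ(add(a, b.pred))

def to_int(n: Unary) -> int:
    return 0 if isinstance(n, Zero) else 1 + to_int(n.pred)

two = Succ(Succ(Zero()))
three = Succ(two)
print(to_int(add(two, three)))  # → 5
```

Structural equality of the dataclasses even lets one test instances of the law proved below, e.g. `add(Zero(), three) == three`.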

Suppose we want to prove the assertion:

all(a:UNARY_NATURAL) zero + a = a

Let us rewrite the assertion into an equivalent form which looks more like a procedure.

    all zero_left_neutral(a:UNARY_NATURAL)
        do
            ...
        ensure
            zero + a = a
        end

We have given the procedure a name. The name is just a local binding within the proof procedure and does not "pollute" the namespace of the enclosing module.

The procedure "zero_left_neutral", if called with any actual argument "arg" should establish the truth of "zero+arg=arg". In the following proof procedure we mimick the recursive definition of "+". I.e. as the next refinement we get

    all zero_left_neutral(a:UNARY_NATURAL)
        do
            inspect a
            case zero then
                a = zero
                zero + zero = zero
                zero + a = a
            case succ(p) then
                ...
            end
        ensure
            zero + a = a
        end

We do a pattern match on the argument. In the case that "a" is zero the proof is a triviality. All intermediate assertions (spelled out pedantically, which is not necessary) are direct consequences of the definition of "+" in the case "a=zero".

In the case "a=succ(p)" the pattern match instantiates a fresh variable "p" with a value which is strictly smaller than "a". Therefore we can call the procudure "zero_left_neutral" with the argument "p". This call proves "zero+p=p". We can combine this fact with the fact "a=succ(p) and the definition of "+" and we get the following complete proof procedure.

    all zero_left_neutral(a:UNARY_NATURAL)
        do
            inspect a
            case zero then
                a = zero
                zero + zero = zero
                zero + a = a
            case succ(p) then
                zero_left_neutral(p)   -- recursive call!
                zero + p = p
                a = succ(p)
                succ(zero+p) = succ(p)
                zero + a = a
            end
        ensure
            zero + a = a
        end

This proof procedure is very detailed. Since many steps are trivial, we can let the proof engine do them automatically. A more compact version of the above proof reads like

    all zero_left_neutral(a:UNARY_NATURAL)
        do
            inspect a
            case zero then
                -- trivial
            case succ(p) then
                zero_left_neutral(p)   -- recursive call!
                succ(zero+p) = succ(p)
            end
        ensure
            zero + a = a
        end
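For comparison, the same induction can be carried out in a mainstream proof assistant. In Lean 4, whose built-in `Nat.add` also recurses on its second argument, a sketch of the proof procedure reads:

```lean
-- Lean 4 sketch of zero_left_neutral. The succ branch uses the
-- induction hypothesis, corresponding to the recursive call above.
theorem zero_left_neutral : ∀ a : Nat, 0 + a = a := by
  intro a
  induction a with
  | zero => rfl                 -- trivial case: 0 + 0 = 0 by definition
  | succ p ih =>                -- ih : 0 + p = p  ("recursive call")
      rw [Nat.add_succ, ih]     -- succ (0 + p) = succ p
```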

One might be tempted to convert the recursive proof procedure into an iterative one. Using a loop we can establish the truth of "zero+a=a" bottom up. We write a loop with a variable "i" which is initialized to zero.

The loop should maintain the invariant "zero+i=i". The loop advances by incrementing "i" by one each iteration until it reaches the value of "a". At loop termination the truth of "zero+a=a" is established.

The proof procedure looks like:

    all(a:UNARY_NATURAL)
        local
            i: UNARY_NATURAL
        do
            from
                i := zero
            invariant
                zero <= i;  i <= a
                zero + i = i
            variant
                a - i
            until
                i = a
            loop
                i < a
                succ(i) <= a
                zero + i = i
                succ(zero+i) = succ(i)
                zero + succ(i) = succ(i)
                i := succ(i)
            end
            zero + i = i;  i = a
        ensure
            zero + a = a
        end

In order to verify this procedure formally we need the assignment axiom

    check
        e[x:=exp]
    end
    x := exp
    check
        e
    end

The assignment axiom tells us the following: If we want to establish the assertion "e" after the assignment "x:=exp", we have to prove that "e[x:=exp]" is a valid assertion before the assignment command.

It is not difficult to see that the loop maintains the invariant and decreases the variant at each iteration.

At termination we have the invariant and the exit condition which together imply the postcondition.
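The iterative argument can be mimicked at run time on plain integers: execute the loop and check the invariant and the strictly decreasing variant at every iteration. This Python sketch (function name invented, the proof engine of course works symbolically) is only an illustration:

```python
# Run-time mirror of the iterative proof of "zero + a = a", with the
# invariant and variant checked at each iteration. Illustrative only.

def zero_left_neutral_loop(a: int) -> None:
    assert a >= 0                   # integers model UNARY_NATURAL values
    i = 0                           # from i := zero
    variant = a - i
    while i != a:                   # until i = a
        assert 0 <= i <= a          # invariant: zero <= i and i <= a
        assert 0 + i == i           # invariant: zero + i = i
        i = i + 1                   # i := succ(i)
        assert a - i < variant      # the variant strictly decreases
        variant = a - i
    # invariant plus exit condition give the postcondition:
    assert 0 + a == a
    print("postcondition established for a =", a)

zero_left_neutral_loop(10)
```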

In my opinion the recursive proof procedure is more readable and understandable than the iterative one. But some might prefer a loop ...

This article has developed the notion of a proof procedure as an alternative to a proof by proof commands.

It seems that proof procedures are a little bit more verbose than proofs by proof commands. But they look more understandable than proof commands. Furthermore they are closer to the activity which software developers have to do anyhow: Programming.

In the next articles I will experiment more with proof procedures to get a better feeling about the advantages and disadvantages of this approach.
