The Eiffel Compiler / Interpreter (tecomp)

A function is a computation object which calculates a result given its arguments. A procedure is a computation object which establishes a postcondition provided that its precondition is satisfied. Functions and procedures do not interact with their environment during their lifetime. They just have an entry point and an exit point. What happens between is not observable (at least if we abstract the duration).

A process can interact with its environment. A complete program is usually a process, because in order to have some value it has to interact with its environment, which can be a user (equipped with mouse, keyboard and monitor) or a filesystem or other processes. This is the reason why in most operating systems running programs are called processes.

The interactions between a process and its environment are called events. The only thing an event can do is "happen".

In this article we use Tony Hoare's theory of "Communicating sequential processes" presented in a book with the same title in 1985. We just adapt the syntax to fit our programming language.

In the following description we use sets and relations extensively.

Remember: If T denotes a type then T? denotes the type of a predicate over objects of type T. All objects which satisfy the predicate form a set. Since the connection between predicates and sets of objects which satisfy the predicate is so strong, we identify both concepts. I.e. the type T? denotes a predicate over the type T and a set of objects of type T at the same time.

This notation is extended to relations. The type [T,U] denotes pairs of objects of type T and type U. Consequently [T,U]? is a predicate with two free variables or a set of pairs. I.e. we use the type [T,U]? to denote a relation between objects of type T and objects of type U.

In order to understand the notation in this article fully it might be convenient to read (at least the basic concepts) of the articles

- Functions, ghost functions and higher order functions
- Predicates as sets
- Complete lattices and closure systems
- Binary relations, endorelations and transitive closures

Consider a simple vending machine which sells chocolates. This vending machine has two events:

- coin: The event of a coin being inserted in the slot of the machine
- choc: The event of extraction of a chocolate from the dispenser of the machine

Events are treated as features in our programming language. Their declaration is very simple

feature coin choc end

I.e. events look like argument-less procedures without body. Since the only thing an event can do is "happen", there is neither an implementation nor a postcondition (but events can have arguments and preconditions as we will see later).

Having these two events we can define the process which represents a simple vending machine.

feature vm_choc ensure Process = (coin -> choc -> vm_choc) end end

I.e. a process looks syntactically like a procedure which gives in its postcondition an expression which states the property of the process. The variable 'Process' represents the process and is just a placeholder for 'vm_choc' i.e. for the process to be defined. Therefore the definition states the equality

vm_choc = (coin -> choc -> vm_choc)

which is a recursive equation. We can expand the recursive definition as often as we like to get the equations

vm_choc = (coin -> choc -> (coin -> choc -> vm_choc)) vm_choc = (coin -> choc -> (coin -> choc -> (coin -> choc -> vm_choc))) ...

The above definition says that the process 'vm_choc' can engage first in the event 'coin' then in the event 'choc' and then behaves like 'vm_choc'. I.e. the process 'vm_choc' can engage in the potentially infinite sequence of events

coin, choc, coin, choc, ...

The prefix operator '->' is a right associative binary operator which has on its left side an event and on its right side a process. I.e. we have to 'parse' the above definition it the following way.

vm_choc ensure Process = (coin -> (choc -> vm_choc)) end

A process like 'vm_choc' alone does nothing because events are interactions. Therefore the process has to be placed into an environment where it can interact. An environment can be a customer which can engage in the same events, i.e. a customer having a coin which he inserts into the slot of the machine and wants to extract a chocolate from its dispenser.

A process can engage in events or refuse events. Initially the vending machine 'vm_choc' can engage in the event 'choc' and refuses to dispense a chocolate. After having received the coin it can engage in the event 'choc' and refuses to accept a new coin until the chocolate has been dispensed. The set of all events which a process can accept or refuse is called its alphabet. The alphabet of a process is a constant through its lifetime. Events outside its alphabet are not known to a process, i.e. it can neither engage in nor refuse them.

Since our programming language is strongly typed all objects must have a type. All event objects like 'coin' and 'choc' are of type EVENT, all process objects like 'vm_choc' are of type PROCESS.

The vending machine vm_choc has just a linear sequence of events. It does not offer its customers any choice. The customer just can insert a coin and extract a chocolate. In the programming language it is easy to express choice. A vending machine which offers its customers a chocolate or a toffee after inserting a coin can be defined as

vm_choc_toffee ensure Process = (coin -> ( choc -> vm_choc_toffee | toffee -> vm_choc_toffee)) end

where the operator '|' expresses a choice.

The processes vm_choc and vm_choc_toffee are defined by a recursion equation of the form 'p=f(p)'. The definitions

proc ensure Process = f(proc) end proc ensure Process = x: f(x) -- a fresh variable `x' end

are equivalent. With this notation we can express the definition of our two vending machines in the following form.

vm_choc ensure Process = x: coin -> choc -> x end vm_choc_toffee ensure Process = x: coin -> (choc -> x | toffee -> x) end

I.e. the expression x:f(x) defines a process if 'f' is a function of type PROCESS->PROCESS (Note: The function 'f' has to be guarded, i.e. it has to start with an event. The expression 'x:x' is not a proper process expression).

One important aspect of a process is the sequence of events in which it can engage. At any point in time a process has a history of events in which it has engaged up to this point in time. We use the type LIST[EVENT] to talk about the sequence of events and call them traces.

TRACE = LIST[EVENT]

The following are valid traces of vm_choc_toffee

[] [coin, toffee, coin] [coin] [coin, choc, coin choc, coin, toffee, coin]

We can define the set of all traces ts:TRACE? in which a process can engage. Note that any prefix of a trace is a valid trace. Therefore the set of traces of a process has to be prefixclosed.

is_prefixclosed(ts:TRACE?): ghost BOOLEAN ensure Result = ts.is_closed({u,v: v.is_prefix(u)}) end

Since the empty trace is a prefix of any trace, the empty trace is contained in the trace set of any process.

Every traceset defines the set of events which occur in any of its traces

alphabet(ts:TRACE?): ghost EVENT? -- The alphabet of the trace set `ts'. ensure Result = {e: some(t:TRACE) t in ts and e in t} end

With this function we get

vm_choc.traces.alphabet = {coin, choc} vm_choc_toffee.traces.alphabet = {coin, choc, toffee}

where the function 'traces' returns the trace set of a process.

The set of all prefixclosed trace sets is a closure system. An arbitrary traceset does not represent a valid traceset of a process. The closure of a trace set with respect to the prefix relation is the smallest set which contains the trace set and is prefixclosed. I.e. the set

{[coin,choc,coin]}

is not a valid traceset. We can generate a valid trace set by closing the set.

{[coin,choc,coin]}.closed(is_prefixclosed) = {[], [coin], [coin,choc], [coin,choc,coin]} -- Note: The expression is_prefixclosed is just a shorthand for -- {ts:TRACE?: ts.is_prefixclosed}

- For details on closures see the article "Complete lattices and closure systems".

Let us try to find a closed expression for the trace set of the process vm_choc.

First we note: If t is a trace then [coin,choc]+t is a trace as well. The expression {nil}.closed(t->[coin,choc]+t) therefore generates a set of valid traces of vm_choc. This traceset is not yet prefixclosed. The full set of valid traces can be generated by the formula

vm_choc.traces = {nil}.closed(t->[coin,choc]+t) .closed(is_prefixclosed) -- Note: t->[coin,choc]+t is the function which maps any trace t -- to the trace [coin,choc]+t -- The type checker of our programming language can resolve this -- ambiguity between function expressions and process expressions

In a similar manner we can give a closed expression for the traces of the process 'vm_choc_toffee'.

vm_choc_toffee.traces = {nil} .closed({u,v: v=[coin,choc ]+u or v=[coin,toffee]+u}) .closed(is_prefixclosed)

Note that we have used three possibilities to close a set: (a) close a set with respect to a closure system, (b) close a set under a function and (c) close a set under a relation. Let us shortly repeat what these closures mean. Let c be a set, cs be a set of sets which are a closure system, f be a function and r a relation. Then we have

c.closed(cs) = ({x:c<=x}*cs).infimum -- least set of cs which contains c c.closed(f) = {c + c.image(f) + c.image(f).image(f) + ... } c.closed(r) = {c + c.image(r) + c.image(r).image(r) + ... } -- Note: '<=' for sets is the subset relation i.e. {x:c<=x} is the set of -- all sets which contain c.

where c.image(f) is the image of the set c under the function f and c.image(r) is the image of the set c under the relation r.

A specific trace t of a process characterizes in some way the state of a process. We might be interested in the behaviour of a process after having engaged in the events of the trace t. Clearly t must be a valid trace of the process, otherwise this question does not make any sense. The behaviour of a process after having engaged in t is characterized by a set of traces as well.

We define the function

/ (ts: TRACE?, t:TRACE): ghost TRACE? -- The traceset of a process with traceset `ts' after having -- engaged in the events of the trace `t'. ensure Result = ts.image(u -> u.without_prefix(t)) end

Note that the function 'image' for a function f of type A->B is defined as

image(p:A?, f:A->B): ghost B? ensure Result = {b:B: some(a:A) a in f.domain and b=f(a)} end

The initial events of a traceset are defined by the following function.

initials(ts:TRACE?): ghost TRACE? -- The set of initial events of the traceset `ts'. ensure Result = {e: some(u:TRACE) e::u in ts} end

There might be traces in a traceset which do not have any continuation. These are called endtraces.

endtraces(ts:TRACE?): ghost TRACE? ensure Result = {t: t in ts and ts/t = 0} end

Any process after having engaged in the events of one of its endtraces cannot continue to engage in events. There are two possible reasons for a process to reach any of its endtraces: (a) The process has done what it is supposed to do successfully and has therefore terminated. (b) The process has reached some deadlock situation. Successful termination or deadlock cannot be distinguished by the traces of a process.

The possibility of having endtraces reveals the fact that the alphabet of a traceset as defined above cannot represent the alphabet of the process. Consider a process with the traceset ts and an endtrace t. Then the alphabet of the traceset ts/t is empty. This is not inline with the requirement that the alphabet of a process has to be an invariant during its lifetime.

If we have a traceset ts of a process we know a lot about the behaviour of the process. The traceset defines the sequence of all possible events in which the process can engage.

But we can look at a process as well by asking what it does not do, i.e. what it refuses to do. E.g. our vending machine for chocolates refuses to dispense a chocolate at the beginning. It expects first a coin to be inserted. After a coin has been inserted the machine refuses the insertion of another coin. Before the insertion of another coin the chocolate has to be extracted.

We define a refusal as a set of events which are refused by a process at a certain state.

REFUSAL = EVENT?

If the environment offers the events of a refusal then the combination of the process with its environment cannot continue, it is blocked.

If r is a refusal (i.e. a set of events) and the environment offers a subset s (i.e. s<=r) then s must be a refusal as well. Therefore at any state the process has some set of refusals which is downclosed (i.e. with each set it contains the subsets as well).

The state of a process is characterized by the sequence of events in which it has engaged up to a certain point in time. The combination of a trace t and a refusal r i.e. the pair [t,r] is called a failure of the process. We call [t,r] a failure because the combination of the process with its environment fails after engaging in the events of t if the environment offers the events in r.

The set of all trace refusal pairs of a process is called its failure relation (note that this a relation between traces and set of refusals, i.e. set of sets).

Our intuition tells us that the refusals have to be in some way a complement of what the traceset of a process is describing. This stems from the conjecture that a process can either engage in an event or refuse an event and both actions being mutually exclusive. This is in fact true for deterministic processes.

Let us try to find an expression of the set of refusals of a process with the traceset ts after having engaged in the events of the trace t. The process is going to accept events of the set (ts/t).initials. The refusal sets have to be subsets of the complement of (ts/t).initials. But complement with respect to what? In order to answer this question the alphabet of a process enters the scene.

Let a be the alphabet of a process described by the traceset ts. In order to be a valid alphabet it has to be a superset of ts.alphabet. After having engaged in the events of the trace t the set {r: r <= a-(ts/t).initials} is the set of refusals which is inline with our intuition about deterministic processes.

We can use this method to define a function which returns the failure relation of a deterministic process with the traceset ts and the alphabet a.

failures(ts:TRACE?, a:EVENT?): ghost [TRACE,REFUSAL]? ensure Result = {t,r: t in ts and r <= a - (ts/t).initials} end

The function failures calculates a meaningful failure relation of a process only if the traceset is prefixclosed and the alphabet is a superset of the alphabet defined by the traceset.

Note that the relation calculated by the function failures has a very specific property. Any refusal after engaging in the trace t is a subset of a-(ts/t).initials which is the greatest refusal after trace t. If fr is the failure relation calculated by the function 'failures' then t.image(fr).supremum is this greatest refusal. Note that this is not the most general case. The most general case is that t.image(fr).supremum is not contained in t.image(fr) which characterizes a non-deterministic process (see next chapter).

The failure relation of a process contains more information than its traceset. We can define some useful functions based on the failure relation of a process.

initials(fr:[TRACE,REFUSAL]?): ghost EVENT? -- The set of the initial events in which a process -- characterized by the failure relation `fr' can engage. ensure Result = fr.domain.initials end refusals(fr:[TRACE,REFUSAL]?): ghost REFUSAL? -- The set of all initial refusals of a process -- characterized by the failure relation `fr'. ensure Result = nil.image(fr) end events(fr:[TRACE,REFUSAL]?): ghost EVENT? -- The set of all events which a process characterized by -- the failure relation `fr' can initially engage in or refuse. ensure Result = fr.initials + fr.refusals.supremum end alphabet(fr:[TRACE,REFUSAL]?): ghost EVENT? -- The set of all events which can occur either in a trace or -- in a refusal of the failure relation `fr'. ensure Result = {e: some(t,r) [t,r] in fr and (e in t or e in r)} end

/ (fr:[TRACE,REFUSAL]?, t:TRACE): ghost [TRACE,REFUSAL]? -- The failure relation of a process characterized by the -- failure relation `fr' after having engaged in `t'. ensure Result = {u,r: t+u in fr.domain and r in (t+u).image(fr)} end

As we have seen above an arbitrary relation between traces and refusals is not a valid failure relation of a process. With the defined functions we can formulate the constraints of the failure relation precisely.

is_process(fr:[TRACE,REFUSAL]?): ghost BOOLEAN ensure Result = ( fr.domain.is_prefixclosed and (all(t:TRACE) t.image(fr).is_downclosed) and (all(t:TRACE) t in fr.domain => fr.alphabet = (fr/t).events) ) end

These conditions require that (a) the traceset characterized by the domain of the relation has to be prefixclosed, (b) each subset of a refusal is a refusal and (c) the alphabet is an invariant through the lifetime of a process.

It is interesting to note that the set of all failure relations which describe a valid process is a closure system. I.e. we can use a arbitrary relation fr between traces and refusals and form a valid one by building the closure fr.closed(is_process). The relation 'fr.closed(is_process)' is the least relation which contains 'fr' and satisfies the requirements to be a valid process.

The processes we have seen up to now are all deterministic. The environment has full control over the behaviour via the events it offers. E.g. interacting with the vending machine vm_choc_toffee the customer can insert a coin and can decide whether to extract a chocolate or a toffee.

vm_choc_toffee ensure Process = x: coin -> (choc -> x | toffee -> x) end

The choice after inserting the coin is done by the environment.

Let us construct a machine like vm_choc which dispenses only toffees.

vm_toffee ensure Process = x: coin -> toffee -> x end

We can combine vm_choc and vm_toffee with the choice operator.

vm_choc | vm_toffee

- Note: In Tony Hoare's book about CSP this operator has been displayed as a small square. Since squares are difficult to display in ASCII we use the | operator.

The choice operator | suggests that the environment has a choice like in the machine vm_choc_toffee after inserting the coin. But there is no real choice here because both processes vm_choc and vm_toffee require a coin to be inserted at the beginning. However the expression vm_choc|vm_toffee requires a choice to be made. The only reasonable solution is to say that the process described by the expression vm_choc|vm_toffee makes this choice nondeterministically.

Once the choice has been made the process behaves subsequently either like vm_choc or vm_toffee. The environment can note the consequence of this choice after having inserted a coin and extracted one of the two possibilities. From this point on the process behaves deterministic like a vm_choc or a vm_toffee.

Note that the environment (i.e. a potential customer) has to be prepared for both possibilities. If the customer insists to extract a toffee and the process has made the non-deterministic choice to behave like a vm_choc then the system of process and environment is deadlocked and cannot continue.

Some possible traces of vm_choc|vm_toffee are

{ [], [coin], [coin, choc], [coin, choc, coin, choc, coin, choc], [coin, toffee], [coin, toffee, coin, toffee, coin, toffee], ... }

The traceset of vm_choc|vm_toffee is the union of the tracesets of the components.

It is interesting to look at the initial events and the refusal set of the combined machine after having engaged in the first insertion of a coin

((vm_choc|vm_tofffee)/[coin]).initials = {choc,toffee} ((vm_choc|vm_tofffee)/[coin]).refusals = { 0, {coin}, {choc}, {coin,choc}, {toffee}, {coin,toffee} }

The combination might accept the extraction of a chocolate and it might accept the extraction of a toffee. The internal choice made by the process is not yet visible to the environment.

The set of refusals does not have a greatest element. The complement of the initials with respect to the alphabet which is the one element set {coin} is no longer the greatest refusal like it would be if the process were deterministic. We can use this observation to formally define deterministic and non-deterministic processes.

is_deterministic(fr:[TRACE,REFUSAL]?): ghost BOOLEAN ensure Result = all(t,r) require [t,r] in fr ensure r <= fr.alphabet - (fr/t).initials end end

In this basic discussion about non-deterministic processes we have glossed over an important detail. The expression vm_choc|vm_toffee is illegal because the | operator requires identical alphabets of both operands. The process vm_choc has the alphabet {coin,choc} and the process vm_toffee has the alphabet {coin,toffee}. Therefore the above is valid only if we augment the alphabet of vm_choc by the set {toffee} and the alphabet of vm_toffee by {choc}. The augmentation of alphabets of a process is treated formally later. But the basic semantics is that augmentation does not change the traces. It adds some refusals to convert the alphabet into the augmented alphabet.

The failure relation does not describe a process completely. The missing information is obvious if we look at endtraces of a process. By definition an endtrace has no continuation. A process in this situation can be either deadlocked or might have terminated successfully.

Termination can be deterministic or nondeterminisitic, i.e. a process might make some internal choices to terminate or not terminate after engaging in some trace. In order to treat non-determinism properly for terminations as well we have to partition the set of traces into

terminations: TRACE? nonterminations: TRACE?

with the obvious constraint

```
fr.domain = terminations + nonterminations
```

For a deterministic process we require that both sets are disjoint.

The combination of the failure relation the termination and nontermination traces is sufficient to describe the behaviour of a process.

Local Variables: mode: outline coding: iso-latin-1 outline-regexp: "=\\(=\\)*" End:

LocalWords: endtraces