SourceForge.net LogoThe Eiffel Compiler / Interpreter (tecomp)

doc/internals/virtual machine/virtual machine

virtual machine

General principles

Objects

Eiffel objects are located either on the stack or on the heap. No object exists outside the stack and the heap.

The execution of a Eiffel program starts with the creation of the root object. If the root object is of expanded type, the object is pushed onto the stack. If the root object is of reference type, it is allocated on the heap and a reference is pushed onto the stack.

Then the creation procedure of the root object is executed. The creation procedure can create more objects and execute routines until the creation procedure is at its end.

The basic computing element is a feature call (either qualified or unqualified)

       t.f(a1,a2,...)
       f(a1,a2,...)
 

In the EVM a feature call is executed by loading the target onto the stack (for qualified calls only), loading the arguments and then calling the feature

  tgt  -- for qualified calls and creation calls
  a1
  a2
  ...
  cl "f" or uc "f" or ccl "f" -- qualified / unqualified / creation call

Each load instruction can be either an elementary load instruction

or the execution of a function (which loads its result).

  stack before:  ... tgt a1 a2 ...
        after:   ... res

The load instructions load temporaries onto the stack. Temporaries are removed by

Objects can be attached to entities by

Expanded type objects have copy semantics. Therefore they cannot be shared and each entity (local variable, attributes, arguments) has to have its own copy of the object. There is one exception: The current target can be a reference to either a local variable or an attribute.

The default copy semantics is field by field copy. However the user can redefine {ANY}.copy in any class. Therefore the load instructions loc, arg, uc (attribute) might cause the implicit execution of the customized copy routine.

Procedure for loading an expanded type object with customized copy routine:

  load empty object:         ... empty
  load reference to that:    ... empty   eref
  load field by field copy:  ... empty   eref  fieldcpy
  call copy (qualified call) ... copied

An alternative would be to use a creation call

  load empty object:         ... empty
  load field by field copy:  ... empty  fieldcpy
  call copy (creation call)  ... copied

The latter one seems to be more elegant.

Context

The execution is always within a routine (current routine) and has a target (the current object). The arguments, local variables and temporaries are on the stack, starting at position `bp'. The data needed to executed a routine (the current routine, target, bp, instruction pointer, ...) are called the context.

The contexts are linked. The last context in the linked list is the context of the creation procedure of the root object.

The current object

If the current object is of reference type, it is always located on the heap. There is a position on the stack which points to the object.

If the current object is of expanded type, there are more possibilities:

During the execution there exists an execution context

There is a stack and a heap of objects. The stack contains expanded type objects and references to heap objects. The heap contains all reference type objects. Objects can contain expanded type objects or references to other reference type objects (or to itself).

Stack layout

Each routine has access to its portion of the stack its so called stack frame.

    stack frame of a routine:
        bp
        v
    ... a1 a2 ... an [res] l1 l2 ... lm  tmps

Command overview

   cr               i    create object of type of used feature i
   uc/cl   [tgt]    i    call used feature i as un/qualified call  
   ccl     [tgt]    i    call used feature i as creation call
   arg/loc [tgt]    i    load argument/local i
   stl/sta          i    store local i / attribute(used feature i)
   const_int/bool   i    load value i
   op_id/nid	    i	 perform [not] identity with type pair i
   op_eq/neq	    i	 perform [not] equality with type pair i
		       
   jmp_if_not       i    jump i steps if top false, pop top  
   jmp_end_if       i    jump i steps
		       
   var_max          [i]  load maximum unsigned integer
   skip_inv         i    jump i steps, if loop invariant/variant not monitored
   jmp_end_loop     i    jump i+1 steps
   jmp_next_loop    i    jump i steps backward to next loop iteration
   assert_var            check the 2 topmost values v0 v1 as loop variant 
   			 (v0 > v1 > 0) v0:= v1 pop one
   assert_check     i    assert top, pop if ok, i indicates tag	
   assert_loop      i             "
   assert_inv       i             "
   assert_post      i             "
   assert_pre       i             "

Creation of an object

  cr i
  stack before: ...
	after:  ... obj
        obj: if expanded type:  the object, 
	     if reference type: a reference to the object

An expanded type object is allocated on the stack and a reference type object is allocated on the heap and a reference is pushed onto the stack.

Allocation includes zeroing out the object.

Calling of a routine

  uc/cl [tgt] i
  stack before: ... [tgt] a1 a2 ... an
	after:  ... [res]

Beginning of the stack frame of the called routine

A routine r can be executed, if its arguments are all on top of the stack.

The routine r knows its argument types and via r.argument_size the start of the stack frame of the called routine is known.

Target

The target is either the current target (in case of an unqualified call) or the target is located immediately before the arguments. The target is a pointer to the object in case of reference type objects or non basic expanded type objects or the object itself in case of a basic type object (like INTEGER, CHARACTER, etc.)

Dynamic bind

Now the target is known. If it is a reference type, the dynamic type of the target will be found in the object tag.

If the call is qualified and r.type is a reference type, the called routine has to be found via dynamic bind (if r.type is expanded, no dynamic bind is necessary).

   if r.type.is_reference and then r.type /= tgt.type
   then
       dyn_feat  :=  tgt.type.dyn_bind ( r )
   else
       dyn_feat  :=  r
   end
 

This is necessary, if the type of the called routine r is different from the target type. The target type might have a redefined version of the routine r. This redefined version has to be called.

If the dynamic target type is expanded and different from the static type, the target is a boxed expanded. In that case, the target has to be unboxed, because the dynamic routine expects the expanded target unboxed.

The dynamic feature can be a routine with covariantly redefined arguments. Conformance between actual argument types and formal argument types of the dynamic routine has to be checked before entering the routine. Also boxed expandeds which are expected unboxed by the dynamic routine have to be unboxed.

It might turn out after dynamic bind, that the called routine has been redefined into an attribute. Then no routine has to be called. The attribute has to be loaded (see Load data item). However the target has to be removed before loading the attribute in case of a qualified call (calls with dynbind are always qualified).

Creation of the complete stack frame

Now space for the locals (zeroed out) and the temporaries have to be allocated on the stack. If the procedure is a query, the result object has to be included in the local variables.

   stack:  ... a1 a2 ... an [res] l1 l2 ... lm  tmps

Execution of the called routine

Now the routine can be executed

   execute ( routine, target )
 

Clean up

   stack before:  ... [tgt] a1 a2 ... an [res] l1 l2 ... lm  tmps
    	 after:   ... [res]

As the last step, the stack has to be cleaned up. Cleaning up is removing the stack frame and in case of a qualified call also the target .

For a command, the temporary space of the calling routine will be cleaned up (freed) also.

For a query the result has to be moved to its destination before removing the stack frame.

The moving of the result is just a copying, if the result is not used as a target or is a basic expanded type object.

If the result is used as a target and the result is a non basic expanded type object, some special treatment is necessary. This is because the target is needed as a reference (see note above). Therefore the result is pushed to the temporary space of the calling routine and instead of the result object a reference to the temporary space is left on the stack.

Creation call

    ccl [tgt] i
    stack before: ... obj a1 a2 ... an
    	  after:  ... obj

A creation call is basically a qualified call. In case of an expanded type object the object (never a reference) will be located in front of the arguments, even in case of a non basic expanded type.

In order to find the target, the object size and the argument size has to be used.

In the cleanup only the stack frame without the target is removed. The creation call must leave a reference to the object in case of reference type or the object in case of expanded type on the stack.

Usually the created object is used for storing. In this case everything is ok, because storing requires a reference for reference type objects and the object itself for expanded type objects.

But the created object might be also used as the target of a call (creation expression used as target, this is a very unusual case, because the created object is basically unusable except as a target of a qualified call).

For expanded type objects, the target is still in front of the arguments (the full object on the stack, no reference!). The call with the target from the creation expression is executed like any other qualified call with an expanded target.

Old version, discard

In this case a special treatment is necessary for non basic expanded types, because a reference instead of the object is necessary on top of the stack. In this case, the object is moved to the top the the temporary space of the calling routine and a reference is pushed onto the stack.

    stack before: .......... obj a1 a2 ... an
    	  after:  ..obj..... oref

Moving data around

Objects or references to objects are moved around by the the following procedures:

Load a data item

    arg/loc/uc [tgt] i

The object or the reference to the object is copied from its origin to the top of the stack. If the entity is an argument or a local variable, its origin is in the stack frame of the current routine. If the entity is an attribute, its origin is in the current object. The current object might reside on the stack or on the heap.

E.g. arg 2: load argument 2

    stack before: ... a1 a2 ... an [res] l1 l2 ... lm  tmps ...
    	  after:  ... a1 a2 ... an [res] l1 l2 ... lm  tmps ... a2

At least one word is occupied on top of the stack, because each item on the stack occupies at least one word, even if its size is only one byte. In case of expanded type objects, the size might exceed one word. In this case the complete object is moved.

In case that an expanded type object has a customized copy procedure some modification is necessary, because field by field copy does not yield the required result.

    push empty object                            ... empty
    get pointer aptr to empty object
    push field by field copy of original object  ... empty  obj
    get copy routine r
    execute( r, aptr )                           ... copied obj
    remove field by field copy from stack        ... copied

If the data is loaded as a target and the object is of non basic expanded type, some special treatment is necessary. The object will not be moved to the top of the stack but to the top of the temporary space of the current routine and a reference to the temporary space is pushed onto the stack. If the object has a customized copy procedure, the copying to the temporary space is done via the custumized copy procedure.

    stack before: ... a1 a2 ... an [res] l1 l2 ... lm  t1 ... tn .......
    	  after:  ... a1 a2 ... an [res] l1 l2 ... lm  t1 ... tn val ... vptr

Some subtleties with loading

For reference types loading implies just loading of the reference. This is straightforward.

For user defined expanded types without customized copy routine loading is a field by field copy from the source, regardless whether the source resides on the heap (as an attribute of a reference type object) or on the stack (local variable, attribute or attribute of an expanded type object). It is a one to one copy byte by byte.

For user defined expanded types with customized copy routine and basic types the thing gets more complex.

1. User defined expanded types with redefined copy routine:

  load empty object:                  ... empty
  load field by field copy:           ... empty   fieldcpy  -- optional boxing
  execute creation call with copy:    ... filled

Boxing might be necessary, if in the signature of copy

    class ANY feature ...
       copy (other: like Current)   do ... end
    end
 
    class P inherit 
       ANY redefine copy end  
    feature ...
       copy (other: P)    do ... end
    end
 
    class EXP inherit P ... end
 

the argument `other' has been unanchored in a parent `P' of the expanded class. This is a strange case, but not forbidden.

2. Basic types:

Subword basic types (e.g. CHARACTER_8, NATURAL_8/16, ...) are stored packed within objects. On the stack they occupy one word and have to be embedded into the corresponding one word type. This depends on the endianness of the architecture.

But this applies only to loading from an attribute. Loading from the stack (local variable, argument) always loads the complete word as is (without any embedding).

The exception cases are:

Store a data item

    stl/sta i

The destination can be either a local variable (including Result) or an attribute of the current object.

The source is the top of the stack.

    stack before: ... val
    	  after:  ...

For expanded type objects (basic or non basic type) the value is on the stack (never a reference), for reference type objects a reference to the object is on the stack.

The object or its reference is copied to its destination and removed from the stack.

If the object is smaller than a word and the destination is an attribute, care has to be taken, that only the real object size is copied. If the destination is a local variable, no problem of this type arise because on the stack at least one word is reserved for an object even if its size is smaller.

Custumized copy need not to be taken into account, because a customized copy procedure will be applied at load time if necessary. Since the object is already on the stack, it has already been loaded with its customized copy procedure if necessary.

Comparing objects

Comparing objects is done by the equality operators "~" and "/~" and the identity operators "=" and "/=".

For expanded type objects the semantic of equality and identity is the same. For reference type objects the identity operators just compare references and the equality operators compare the content.

The notion of equality is based on the feature is_equal from the class ANY. Two entities e1 and e2 (or expressions) are equal if and only if they are both attached to objects and the feature call e1.is_equal(e2) evaluates true.

  stack before: ... e1    e2
	after:  ... e1~e2

Feature call to is_equal

                    tgt   bp
                    v     v
  stack before: ... e1    e2
        in call:... e1    e2  res locs
	after:  ... e1~e2

Flow control

Conditional

  if                       --
         e1                --     e1 
  then                     --     jmp_if_not L1
         s1                --     s1
                           -- L1: jmp_end_if LE
  elseif                   --
         e2	           --     e2
  then                     --     jmp_if_not L2
	 s2                --     s2
	   		   -- L2: jmp_end_if LE
  else                     --
         s3                --     s3
  end                      -- LE:
 

Loop

  from                     --     var_max L0           stack: ...v0
  	       s1          --     s1
  invariant                -- L0: skip_inv L1
  	       inv         --     inv
  until                    --
	       ec          -- L1: ec                   stack: ...v0 ec
                           --     jmp_end_loop LE      stack: ...
  loop                     --
	       s2          --     s2
  variant                  --     skip_inv LE
               var_exp     --     var_exp              stack: ...v0 v1
	       assert_var  --     check variant        stack: ...v1
  end                      -- LE: jmp_next_loop L0
 

Remark: var_max does not need a label. However using it, it is easy to scan the structure of the loop statement by just following the jump instructions. jmp_end_loop LE does not jump to LE but to LE+1.

Multibranch

  inspect
      exp           --      exp              exp
  when              --
    'a',            --      'a'              exp  a
                    --      mb_comp type i   exp  bool
                    --      jmp_mb_then LB1
    'b'             --      'b'              exp  b
                    --      mb_comp type i   exp  bool
                    --      jmp_mb_then LB1
    then            -- LB1: jmp_mb_nxt L02
      ...           --      ...
                    -- L02: jmp_mb_end LE
  when              --
    'c',            --      'c'              exp  c
                    --      mb_comp type i   exp  bool
                    --      jmp_mb_then LB2
    'd'..'f'        --      'd'              exp  d
                    --      'f'              exp  d     f
                    --      mb_compi type i  exp  bool
                    --      jmp_mb_then LB2
                    -- LB2: jmp_mb_else LBE
    then            --
      ...           --      ...
                    -- LBE: jmp_mb_end LE
  else              --
      ...           --       ...
  end               -- LE:  
 

Backpatch:

  jmp_mb_end LE:    chain all commands and backpatch, when end LE reached
  jmp_mb_when LBi:  chain all choices and backpatch, 
                    when start of branch LBi reached
  jmp_mb_nxt L0i:   backpatch, when start of next alternative reached

Commands:

  stack before: exp val
  stack after:  exp bool
  check, if exp is equal to val (both of type i)
  stack before: exp v1   v2
  stack after:  exp bool
  check, if exp is in the interval v1 .. v2 (exp, v1 and v2 of type i)
  stack before:         exp bool
  stack after (true):    
  stack after (false):  exp    
  if bool, unstack exp,bool and continue at cur_pos+i+1  
  if not bool unstack bool
  stack before: ...
  stack after:  ...
  continue at cur_pos+i

Validation:

All choices and intervals need to be collected during validation. It has to be checked, that all are of the same type (the type of exp) and no overlap occurs.

If the else part is missing an empty else with bad inspect value has to be inserted (it is an illegal path).

Semistrict boolean expressions

  a  and then b      --     a
     	      	     --     jmp_and_then L1     jump if false otherwise pop
		     --     b
		     -- L1: ...                 top of stack has value
 
  a  or else  b      --     a
                     --     jmp_or_else  L1     jump if true otherwise pop
                     --     b
		     -- L1: ...                 top of stack has value
 
  a  implies  b      --     a
     	      	     --     jmp_implies  L1     jump if false and toggle, 
		     --                         otherwise pop
		     --     b
		     -- L1: ...			top of stack has value
 

Assertions

Each assertion (require, ensure, check, loop invariant, loop variant, invariant) is a sequence of tagged or untagged boolean expressions (or comments).

The sequence of boolean expression are and thened. I.e. all boolean expression must hold. The first boolean expression that fails, fails the whole assertion without evaluating the remaining expressions.

If assertion monitoring is disabled for a specific type of assertions (e.g. loop invariant), the corresponding assertions are not evaluated.

Bypassing an assertion is done by the commands:

  skip_inv        i
  skip_check      i

Evaluating a boolean expression leaves a boolean value on the stack. In case of True, the boolean is popped off the stack and the next tagged or untagged boolean expression of the assertion has to be evaluated. In case of False, the assertion fails, i.e. a diagnostic message is written with the assertion type, the file/line information and the optional tag and the execution stops (not taking into account exception handling).

The commands

  ass_chk       i
  ass_loop      i
  ass_inv       i
  ass_ens       i
  ass_req       i

check the boolean value on top of the stack. In case of True they just pop the value. In case of False the argument i indicates the optional tag. A tag zero indicates no tag.

 Local Variables: 
 mode: outline
 coding: iso-latin-1
 outline-regexp: "=\\(=\\)*"
 End:
Table of contents

- General principles

- Objects

- Context

- Stack layout

- Command overview

- Creation of an object

- Calling of a routine

- Beginning of the stack frame of the called routine

- Target

- Dynamic bind

- Creation of the complete stack frame

- Execution of the called routine

- Clean up

- Creation call

- Moving data around

- Load a data item

- Some subtleties with loading

- Store a data item

- Comparing objects

- Flow control

- Conditional

- Loop

- Multibranch

- Semistrict boolean expressions

- Assertions


ip-location