The Eiffel Compiler / Interpreter (tecomp)

Standard Types, Operators and Expressions

 ----------------------------
 --------- DRAFT ------------
 ----------------------------

Entities in Eiffel are attributes (variable or constant), formal arguments, local variables, Result and Current.

All entities have a type; there are no untyped entities in Eiffel. An entity of type T can be attached to objects which conform to T or convert to T (conformance and conversion are explained later). Attachment happens either by assignment or by argument passing. It is not possible to attach objects to constant attributes or to Current: Current is always attached to the current object, and a constant attribute is always attached to its constant.

All objects have a type. The simplest types are the basic types.

This chapter explains the basic data types, the operators and expressions. Since complex data types can be built arbitrarily from the basic types it is important to understand the basic types first.

Many languages define builtin types and build more complex types on top of them. In Eiffel the standard types are not outside the type system. Each of the standard types like INTEGER, REAL, etc. is represented by an Eiffel class written in a class file (e.g. integer_32.e) in the kernel library.

However, the compiler must "know" these types because it must represent them by corresponding types of the concrete machine. Most features (operations like "+", "-", etc.) of these standard types are builtin features because they cannot be written in Eiffel. The Eiffel compiler, interpreter or virtual machine must make sure that it translates the builtin features/operations into specific machine code operations which meet the semantics described below.

The standard types consist of the basic types (INTEGER, REAL, ... ) and some other types (like STRING) with clearly defined standardized semantics.

Basic types

The basic types available in Eiffel are

   BOOLEAN
   CHARACTER, CHARACTER_8, CHARACTER_32
   INTEGER,   INTEGER_8,   INTEGER_16,   INTEGER_32,   INTEGER_64
   NATURAL,   NATURAL_8,   NATURAL_16,   NATURAL_32,   NATURAL_64
   REAL,      REAL_32,     REAL_64
 

The basic types are all expanded. I.e. an entity of type INTEGER represents an integer value (i.e. an integer object) and not a reference to an integer object. All expanded types have copy semantics, i.e. assignment causes a copy of the value and not just the assignment of a reference.

The basic types are not just expanded, they are also immutable. It is not possible to change the value of an INTEGER, i.e. with a variable i of type INTEGER, there is no operation like i.increment. The only possibility to change the value of a variable of a basic type is to assign a new value to it, e.g. like i := i + 1.

All basic types have default values. They need not be initialized explicitly. The default values are zero, False and the character with code zero for INTEGERs/NATURALs/REALs, BOOLEANs and CHARACTERs respectively.
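
The copy semantics and the default values described above can be observed with a small sketch. The class name VALUE_DEMO and its creation procedure make are invented for illustration; only the behaviour of the basic types is taken from the text.

```eiffel
class
   VALUE_DEMO      -- hypothetical class, for illustration only
create
   make
feature
   make
      local
         i, j: INTEGER
         b:    BOOLEAN
         c:    CHARACTER
      do
         -- locals of basic types start with their default values
         check i = 0 end
         check not b end
         check c = '%U' end

         i := 5
         j := i          -- j gets a copy of the value 5
         i := i + 1      -- i is attached to a new value, j is untouched
         print (i.out + " " + j.out + "%N")   -- i is 6, j is still 5
      end
end
```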

CHARACTER, INTEGER, NATURAL and REAL have the sized variants listed above. CHARACTER, INTEGER, NATURAL and REAL are not individual types; they are just synonyms for one of their sized variants. The possibilities are

   CHARACTER: CHARACTER_8 or CHARACTER_32
   INTEGER:   INTEGER_32  or INTEGER_64
   NATURAL:   NATURAL_32  or NATURAL_64
   REAL:      REAL_32     or REAL_64
 

The sized variant can be chosen by a compiler option. The usual default is CHARACTER_8, INTEGER_32, NATURAL_32 and REAL_32.

A BOOLEAN can hold the truth values True and False.

An INTEGER_n represents a signed integer value in the range -2^(n-1) .. 2^(n-1) - 1, where n is either 8, 16, 32 or 64. I.e.

   INTEGER_8:     -128                  ..   127
   INTEGER_16:    -32768                ..   32767
   INTEGER_32:    -2147483648           ..   2147483647
   INTEGER_64:    -9223372036854775808  ..   9223372036854775807
 

The NATURALs are unsigned and represent values in the range 0 .. 2^n - 1, where n is again 8, 16, 32 or 64. I.e.

   NATURAL_8:     0  ..   255
   NATURAL_16:    0  ..   65535
   NATURAL_32:    0  ..   4294967295
   NATURAL_64:    0  ..   18446744073709551615
 

The REALs are floating point numbers in IEEE format. There is a 32 bit variant REAL_32 and a 64 bit variant REAL_64.

Constants

BOOLEAN constants are True and False.

An INTEGER constant is any sequence of decimal digits within the range of INTEGER (remember INTEGER is either a synonym for INTEGER_32 or INTEGER_64). For better readability underscores can be used to group the digits (recommendation: groups of three decimal digits).

Examples of decimal INTEGERs:

    1234
    1_000_000_000
    -256
 

INTEGER constants can be given in hexadecimal (base 16, prefix 0x), octal (base 8, prefix 0c) and binary (base 2, prefix 0b) as well. Eiffel uses prefixes to indicate the number base.

    0xFF                -- decimal value 255
    0xa                 -- decimal value 10
    0x8000_0000         -- decimal value -2147483648
    0xffff_ffff         -- decimal value -1
 
    0c40                -- decimal value 32 = 4*8
    0c77                -- decimal value 63 = 7*8 + 7
 
    0b1_0000_0000       -- decimal value 256 = 2^8
    0b1111              -- decimal value 15  = 2^4 - 1
 

Since integer constants must be of type INTEGER (usually INTEGER_32) they must be in that range. If INTEGER is a synonym for INTEGER_32, numbers exceeding that range cannot be represented. In order to represent them properly, they have to be prefixed with the type which has an appropriate range. E.g.

    {INTEGER_64} -9223372036854775808 -- value exceeds INTEGER_32 range
    {NATURAL}    4294967295           -- value exceeds INTEGER_32 range
    {NATURAL_64} 18446744073709551615 -- value exceeds even INTEGER_64 range
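
Such typed constants can be used wherever a plain constant is allowed, for example in constant attributes. The following is a sketch only: the class and feature names are invented, and it assumes that INTEGER is a synonym for INTEGER_32.

```eiffel
class
   LIMITS          -- hypothetical class, for illustration only
feature
   Mask:           INTEGER    = 0xFF          -- plain constants of type INTEGER
   Million:        INTEGER    = 1_000_000

      -- constants outside the INTEGER_32 range need a type prefix
   Smallest_64:    INTEGER_64 = {INTEGER_64} -9223372036854775808
   Largest_nat_64: NATURAL_64 = {NATURAL_64} 18446744073709551615
end
```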
 

A CHARACTER constant is written as one printable character within single quotes, e.g.

    'a'
    '#'
    '"'
    '@'
    '0'
 

Non-printable and special characters can be represented by the escape sequences

    '%A'   -- At-sign         @
    '%B'   -- Backspace       BS   *
    '%C'   -- Circumflex      ^
    '%D'   -- Dollar          $
    '%F'   -- Formfeed        FF   *
    '%H'   -- BackslasH       \
    '%L'   -- TiLde           ~
    '%N'   -- Newline         NL   *
    '%Q'   -- BackQuote       `
    '%R'   -- CarriageReturn  CR   *
    '%S'   -- Sharp           #
    '%T'   -- HorizontalTab   HT   *
    '%U'   -- NUll            NUL  *
    '%V'   -- Vertical bar    |
    '%%'   -- Percent         %    *
    '%''   -- Single quote    '    *
    '%"'   -- Double quote    "
    '%('   -- Opening bracket [
    '%)'   -- Closing bracket ]
    '%<'   -- Opening brace   {
    '%>'   -- Closing brace   }
 

It is also possible to define a character constant by its character code in the form '%/code/'. The character code can be given in decimal, hexadecimal, octal or binary form. E.g.

   '%/32/'        -- character 32, i.e. blank in decimal,
   '%/0x20/'      -- in hexadecimal,
   '%/0c40/'      -- in octal,
   '%/0b10_0000/' -- and in binary notation
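
That the different notations denote the same character can be checked with a few assertions. This is a minimal sketch with an invented class name; it assumes the usual ASCII codes (32 for the blank, 65 for the letter A).

```eiffel
class
   CHARACTER_CODE_DEMO    -- hypothetical class, for illustration only
create
   make
feature
   make
      do
         -- three spellings of the blank
         check '%/32/'   = ' ' end
         check '%/0x20/' = ' ' end
         check '%/0c40/' = ' ' end

         -- two spellings of the letter A (code 65 = 0x41)
         check '%/65/'   = 'A' end
         check '%/0x41/' = 'A' end
      end
end
```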
 

Valid REAL constants are

   1.
   1.0
   1e4
   .5
   0.5
 

A string constant, or string literal, is a sequence of zero or more characters surrounded by double quotes, as in

   "I am a string"
 

or

   ""        -- the empty string
 

The quotes are not part of the string, but serve only to delimit it. The same escape sequences used in character constants apply in strings; %" represents the double quote character. Example of a string with embedded escape sequences:

   "A string with double quote %" and non printables like %T"
 

A long string can be line wrapped across several source lines, e.g.

   "hello, %
   %world"
 

is equivalent to

   "hello, world"
 

Another possibility is to use verbatim strings. The verbatim string

   "{
      Hello, 
      world.
      Don't forget me!
   }"
 

is equivalent to

   "      Hello,%N      world.%N      Don't forget me!%N"
 

Since the line sequence delimited by "{ and }" is taken verbatim, the blanks in front of the text on the lines are taken verbatim as well. This is sometimes not wanted. There is a variant, using the delimiters "[ and ]", which strips off any common initial blanks and tabs. The verbatim string

   "[
      Hello, 
      world.
      Don't forget me!
    ]"
 
 

is equivalent to

   "Hello,%Nworld.%NDon't forget me!%N"
 

Only the indentation common to all lines is stripped off. If one or more lines are indented relative to the others, that indentation is kept. E.g.

   "[
      Hello, 
          world.
      Don't forget me!
    ]"
 
 

is equivalent to

   "Hello,%N    world.%NDon't forget me!%N"
 

Declarations

All entities (except Result and Current) must be declared. The following example shows typical declarations.

  class
     C
  ...
  feature
 
     Pi: REAL = 3.14159265358979323846  -- a real constant
 
     Name: STRING = "Joe Cartwright"    -- a string constant
 
     ival1, ival2, ival3: INTEGER       -- variable attributes
     rval:                REAL
 
     some_function (i,j,k: INTEGER; r: REAL): INTEGER
        local
           m,n: INTEGER
           s:   REAL
        do
           -- some_function has access to all class level entities
           -- (Pi, Name, ival1, ival2, ival3, rval)
           -- to all formal arguments (i,j,k,r)
           -- to all local variables (m,n,s)
           -- to the entity Result (of type INTEGER)
           -- and to the entity Current (of type C)
        end
 
     ...
  end
 

The order of the feature declarations is not relevant. The attributes Pi, Name, ival1, ... could have been declared before or after the routines using them.

Nearly all semicolons in Eiffel are optional. They are inserted for better readability if more than one declaration or statement is placed on one line. The routine declaration

     ...
     some_function (i,j,k: INTEGER r: REAL): INTEGER
        ...
     ...
 

is legal without the semicolon. But this is not the recommended style.

Symbolic constants can be declared only at the class level (constant attributes). There are no local and no global symbolic constants. If a class wants access to symbolic constants it either has to declare them as constant attributes in its class text or inherit them as constant attributes from a parent class.
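
Inheriting constants from a parent class could look like the following sketch. Both classes and all feature names are invented for illustration; in a real system each class would be written in its own class file.

```eiffel
class
   MATH_CONSTANTS        -- hypothetical class collecting symbolic constants
feature
   Pi: REAL = 3.14159265358979323846
end

class
   CIRCLE                -- hypothetical heir; sees Pi as its own constant attribute
inherit
   MATH_CONSTANTS
feature
   radius: REAL

   circumference: REAL
         -- Circumference of a circle with the current radius.
      do
         Result := 2.0 * Pi * radius
      end
end
```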

A class is a namespace. All features in a class must have different names. The names of formal arguments and local variables have routine scope. They must be different from the names of all features (attributes or routines) and different from each other. Since the scope of formal arguments and local variables is local to a routine, their names can be reused in another routine.
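
The scope rules can be illustrated with a short sketch (class and feature names are invented). Both routines use a formal argument n and a local variable i; a local variable named total would be rejected in either routine because it would clash with the attribute total.

```eiffel
class
   SCOPE_DEMO            -- hypothetical class, for illustration only
feature
   total: INTEGER

   add_up_to (n: INTEGER)
         -- Add 1 + 2 + ... + n to total.
      local
         i: INTEGER
      do
         from i := 1 until i > n loop
            total := total + i
            i := i + 1
         end
      end

   factorial (n: INTEGER): INTEGER
         -- Product 1 * 2 * ... * n.
         -- n and i are new entities here, unrelated to those of add_up_to.
      local
         i: INTEGER
      do
         Result := 1
         from i := 2 until i > n loop
            Result := Result * i
            i := i + 1
         end
      end
end
```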

The features of a class are the features declared in the class and the inherited features. Therefore it is not good practice to give features very short names (like "i" or "n"), because they are likely to clash with the names of formal arguments and local variables (which are usually short). The style guide is to give features descriptive names (e.g. count, capacity, put, etc.).

Operators

In Eiffel operators are just aliases for feature names. The expression

  a + b * c
 

is a shorthand for

  a.plus (b.product (c))
 

An operator alias is declared like

  class INTEGER feature
     ...
     plus alias "+" (other: like Current): like Current
         do ... end
 
     product alias "*" (other: like Current): like Current
         do ... end
     ...
  end
 

Operators allow us to write expressions in a more natural manner. Furthermore operators have precedences which allow us to avoid a lot of parentheses and make the source code more readable.

Any class can use any operator for an alias of its features as long as there is no name clash (i.e. different features must have different names and different aliases). The precedence of the operators cannot be changed, the precedence is defined by the language.
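
As a sketch of operator aliases in a user-defined class, consider a two-dimensional vector. The class VECTOR_2D and its features are invented for illustration and are not part of any library.

```eiffel
class
   VECTOR_2D             -- hypothetical class, for illustration only
create
   make
feature
   x, y: INTEGER

   make (a_x, a_y: INTEGER)
      do
         x := a_x
         y := a_y
      end

   plus alias "+" (other: VECTOR_2D): VECTOR_2D
         -- Component-wise sum of Current and other.
      do
         create Result.make (x + other.x, y + other.y)
      end

   dot alias "*" (other: VECTOR_2D): INTEGER
         -- Scalar product of Current and other.
      do
         Result := x * other.x + y * other.y
      end
end
```

With two vectors u and v, the expression u + v is then a shorthand for u.plus (v), and u * v is a shorthand for u.dot (v). Since the precedence is fixed by the language, u + v * w would be read as u + (v * w) and therefore rejected here, because v * w yields an INTEGER and not a VECTOR_2D.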

In the following we discuss the use of operators in the basic types.

Arithmetic operators

The class INTEGER uses the binary arithmetic operators +, -, *, the integer division //, the remainder \\, the real division / and the power operator ^.

Integer division truncates the fractional part (i.e. 5//2 = 2). The expression

   x \\ y
 

produces the remainder when x is divided by y, and thus is zero when y divides x exactly.

E.g., a year is a leap year if it is divisible by 4 but not by 100, except that years divisible by 400 are leap years. Therefore

   local
      year: INTEGER
   do
      ...
      if year \\ 4 = 0 and year \\ 100 /= 0 or year \\ 400 = 0 then
         print ( year.out + " is a leap year%N" )
      else
         print ( year.out + " is not a leap year%N" )
      end
      ...
   end
 

This example already shows that the binary arithmetic operators have precedence over the relational operators (=, /=, ~, /~, ...). The relational operators have precedence over the boolean binary operators (and, or, ...), and the operator and takes precedence over or (see the detailed precedence table below).

Real division / applied to INTEGERs returns a REAL (i.e. 1/2 = 0.5).

Division by zero (all numeric types) results in an exception.

For negative operands the direction of the truncation of the integer division a//b and the sign of the result of a\\b are undefined. However the consistency relation

   a = a//b * b   +   a\\b
 

is guaranteed.
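
A worked example with positive operands, where the results are well defined: 17 // 5 is 3 and 17 \\ 5 is 2, and indeed 3 * 5 + 2 = 17. The sketch below (with an invented class name) checks the relation in code.

```eiffel
class
   DIVISION_DEMO         -- hypothetical class, for illustration only
create
   make
feature
   make
      local
         q, r: INTEGER
      do
         q := 17 // 5     -- integer division: 3
         r := 17 \\ 5     -- remainder: 2
         check q * 5 + r = 17 end
         print (q.out + " remainder " + r.out + "%N")   -- prints "3 remainder 2"
      end
end
```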

Overflow during arithmetic operations is not detected by the runtime. Addition and subtraction are done with circular arithmetic (i.e. Largest_integer + 1 = Smallest_integer). Multiplication of n-bit INTEGERs/NATURALs is done as if it were done with 2n bits and the result truncated to n bits (i.e. the most significant n bits are removed).
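
The following sketch illustrates these rules under the assumption that INTEGER is a synonym for INTEGER_32; the class name is invented. According to the rules above the first result should be -2147483648 and the second one 0, because 2^32 has only zeros in its 32 least significant bits.

```eiffel
class
   OVERFLOW_DEMO         -- hypothetical class, for illustration only
create
   make
feature
   make
      local
         i, k: INTEGER     -- assumed to be INTEGER_32
      do
         i := 2147483647   -- largest INTEGER_32
         i := i + 1        -- circular arithmetic: wraps to the smallest value
         print (i.out + "%N")

         k := 65536        -- 2^16
         k := k * k        -- 2^32 does not fit; only the low 32 bits are kept
         print (k.out + "%N")
      end
end
```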

The INTEGERs/NATURALs have a power operator ^ to do the exponentiation a^b. The exponent must not be negative. The exponentiation a^b returns the same result as the repeated multiplication a*a*...*a (b times, with b>=0).

The REALs have an exponentiation operator as well. The exponentiation a^b with REALs evaluates to a^b = exp(b*log(a)), where exp(x) is the exponential function and log(x) is the natural logarithm. For a <= 0 the runtime throws an arithmetic exception.

The operators // and \\ are not defined for REALs.

Relational operators

The relational operators are

  <  <=  >  >=  =  /=  ~  /~
 

They all have the same precedence and are not associative. Expressions like

   a < b < c    -- invalid expression
 

or

   a = b = c    -- invalid expression
 

are invalid and rejected by the parser.

If you want to test whether a=b and c=d are either both True or both False, you have to write

   ( a = b ) = ( c = d )
 

Boolean operators

The boolean operators are

   not                        -- unary   
   and   or  xor              -- binary strict
   and then  or else  implies -- binary semistrict 
 

The binary operators and, or and xor are strict. In exp1 and exp2 both operands exp1 and exp2 are evaluated first, and then the boolean value of exp1 and exp2 is computed.

The operators and then, or else and implies are semistrict. Evaluation stops as soon as the truth or falsehood of the result is known. Therefore in some cases only the first operand will be evaluated by the runtime. We get the semantics

   a and then b    -- evaluate a; if a is false the result is false
                   -- if a is true, the result is the value of b
 
   a or else b     -- evaluate a; if a is true the result is true
                   -- if a is false, the result is the value of b
 
   a implies b     -- evaluate a; if a is false the result is true
                   -- if a is true, the result is the value of b
 

You may have already noted the equivalence

   ( a implies b )   =  ( not a or else b ) -- definition of implication
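
A typical use of the semistrict operators is to guard an operation which is only valid under some condition. In the sketch below (class and feature names are invented) the second operand is never evaluated for d = 0, so no division by zero can occur. With the strict operator and instead of and then, both operands would be evaluated and d = 0 would result in an exception.

```eiffel
class
   GUARD_DEMO            -- hypothetical class, for illustration only
feature
   divides (d, n: INTEGER): BOOLEAN
         -- Does d divide n exactly?  False for d = 0.
      do
         Result := d /= 0 and then n \\ d = 0
      end

   quotient_is_small (d, n: INTEGER): BOOLEAN
         -- True if d = 0 (nothing to check) or if n // d < 10.
      do
         Result := d /= 0 implies n // d < 10
      end
end
```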
 

The relative precedence of the boolean operators is

     not                            -- highest
     and     and then
     or      xor        or else
     implies                        -- lowest
 

All binary boolean operators associate left to right. This is in line with general practice in most modern programming languages. The only unusual thing might be that

     a implies b implies c
 

is equivalent to

     ( a implies b ) implies c
 

because implies is an operator which is not available in most other programming languages.

 

Interval operator

The interval operator is

  ..

TBD

Free operators

In Eiffel you can define free operators, e.g.

    !-!    
    @
    |>    
    <|
    -|->
    <-|-
    ==>
    <==
    ++
 

You can form free operators by a sequence of the operator symbols

  : \ ? = ~ / ! # $ % & * + - / < > @ ^ ` |

but you are not allowed to clash with sequences which already have a defined meaning. Some examples of invalid free operators:

   --      -- -- initiates a comment
   -->     -- -- initiates a comment
   ?       -- ? alone is a placeholder for agents, combinations ?/? are valid 
   +       -- + is a standard operator and not a free operator
   <=      -- <= is a standard operator for "less equal"
   =       -- = is the standard identity operator
   /=      -- /= is the standard not identity operator
   ->      -- -> already used for constraints of formal generics
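
A free operator is attached to a feature in the same way as a standard operator. The sketch below uses the free operator ++ from the list of valid examples; the class LABEL and its features are invented for illustration, and the sketch assumes that STRING offers + for concatenation.

```eiffel
class
   LABEL                 -- hypothetical class, for illustration only
create
   make
feature
   text: STRING

   make (a_text: STRING)
      do
         text := a_text
      end

   joined alias "++" (other: LABEL): LABEL
         -- New label with the text of Current followed by the text of other.
      do
         create Result.make (text + other.text)
      end
end
```

With two labels a and b, the expression a ++ b is a shorthand for a.joined (b).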
 
 

Precedence

The following table summarizes all precedence and associativity rules. Note that the rules are not complicated and in line with common practice. In order to minimize parentheses and maximize readability it is worthwhile to know these rules.

  precedence  associativity    operators
  10                           old not + - (unary) all free unary operators
   9                           all free binary operators
   8          right to left    ^
   7          left to right    * / // \\
   6          left to right    + - (binary)
   5                           ..
   4                           = /= ~ /~ < > <= >=
   3          left to right    and  and then
   2          left to right    or  xor  or else
   1          left to right    implies
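
The effect of the table can be made visible by comparing some expressions with their fully parenthesized counterparts. This is a sketch with an invented class name; all assertions are expected to hold.

```eiffel
class
   PRECEDENCE_DEMO       -- hypothetical class, for illustration only
create
   make
feature
   make
      local
         a, b, c: INTEGER
         p, q, r1, r2: BOOLEAN
      do
         a := 2; b := 3; c := 4
         p := True

         -- * binds tighter than binary +
         check a + b * c = a + (b * c) end      -- 14, not 20

         -- binary - associates left to right
         check c - b - a = (c - b) - a end      -- -1, not 3

         -- arithmetic before relational, relational before and, and before or
         r1 := a + 1 < b * c and p or q
         r2 := (((a + 1) < (b * c)) and p) or q
         check r1 = r2 end

         -- implies has the lowest precedence
         r1 := p and q implies q or p
         r2 := (p and q) implies (q or p)
         check r1 = r2 end
      end
end
```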