October 30, 2009: New Rules for Conditional Expressions and Coercion

A gap in the definition of Fortress so far has to do with the interaction of union types and coercion:

a. The type of certain conditional (or "branching") expressions such as if and case and typecase is defined to be a union type, namely the union of the types of the block expressions in its clauses ([ Fortress 1.0 beta], pages 110-113).

b. Coercion is implicitly defined for tuple types and arrow types, but not for union types ([ Fortress 1.0 beta], pages 150-151).

Coercion is currently not implemented in the Fortress interpreter. Some language features that were originally intended to be implemented through coercion are supported in the interpreter by contorting the type hierarchy in unfortunate ways; for example, it isn't quite right for ZZ32 to be considered a subtype of ZZ64, nor for ZZ64 to be a subtype of ZZ. In the future, these types, as well as the type of integer literals, will be disjoint from each other, and coercions will handle the conventional conversions among them. This is what we plan to implement in the Fortress compiler.

Another problem arises from the interaction of union types with variable declarations that do not explicitly specify a type for the variable. In the code fragment

do
  x: ZZ32 = 5
  y = if p then x else 7 end
  f(y)
end

the type of the variable y is not ZZ32, but the union of ZZ32 and the type of the integer literal 7, which is a special literal type and not any of ZZ, ZZ32, ZZ64, etc. This in turn can have surprising consequences when calling the method f.

The Sun Labs Fortress team discussed these problems this week and decided to make three changes to the definition of Fortress:

1. The rule for the type of a "branching" expression is revised to read: The type of an if expression (or other similar construct) is determined in one of two ways: If the block expression in any one clause has a type for which the types of all other clauses are substitutable (that is, either a subtype or convertible by coercion), then that type is also the type of the if expression, and coercions are performed as necessary on the values of all the other block expressions; otherwise, the type of the if expression is the union of the types of the block expressions in its clauses.

(Note that if there is a clause with a type for which the types of all other clauses are substitutable, then that clause is necessarily unique; this follows from the rules that forbid circularity of coercions and subtypes. This is a fairly conservative rule, but we think it is adequate to handle most situations of practical interest. If necessary, it could be extended compatibly by allowing the compiler to search for a unique minimal type that is not the type of any clause's block expression but for which the types of clauses are substitutable.)

2. It is a static error if a variable declaration (either top-level or local) binds a variable without specifying a type for the variable and the type of the expression that provides the value for that variable is a union type.

(Readers familiar with the Haskell programming language may find this new rule vaguely reminiscent of that language's "monomorphism restriction".)

3. If (i) type C provides a coercion from type A, and (ii) type B is substitutable for type C, and (iii) type B excludes type A, then a coercion is implicitly provided from type (A union B) to type C.

(This coercion is implemented by using the coercion from A to C if the object is of type A, and by using the coercion from B to C if B is not a subtype of C. In situations where the union type arises from a conditional expression, it is typically not necessary to perform a run-time test to determine whether the object is of type A or type B if a simple optimization is performed; see Example 3 below.)

The intent of these specific choices is to make certain common kinds of code easy to write, while trying to limit confusion in more complicated situations. If the second rule produces a static error, there are two possible workarounds: either insert one or more explicit coercions, or provide an explicit type in the variable declaration.

Example 1: The classic factorial operator

opr (n: NN)! : NN = if n = 0 then 1 else n (n-1)! end

This is technically forbidden by the Fortress 1.0 beta specification, because the type of the if expression is the union of the type NN and the integer literal type; while a return type NN is given in the code, there is no rule allowing the union type to be coerced to NN.

Under the first new rule given above, the type of the if expression is NN because the type NN provides a coercion from (nonnegative) integer literals, and under the rule such a coercion will be implicitly applied to the integer literal 1 in the then clause.

Example 2: Local variable binding, simple case

do
  x: ZZ32 = 43
  z = (if p then 17 else x)
  z+2
end

Under the Fortress 1.0 beta specification, the local variable declaration of z is permitted, but the addition z+2 will likely fail because there is no overloading of + that is explicitly defined to accept an argument of union type, there is no numeric type of which that union type is a subtype that provides a concrete definition of +, and there is no rule allowing coercion of the union type.

Under the first new rule, the type of the if expression is ZZ32 because ZZ32 provides a coercion from the integer literal 17, and there is no further difficulty.

Example 3: Local variable binding, complicated case

do
  x: ZZ32 = 43
  y: NN32 = 2^31
  z = (if p then 17 elif q then x else y)
  z+2
end

Under the Fortress 1.0 beta specification, this example likely fails for reasons similar to those in the previous example.

Under the first new rule, the type of the if expression is still a union type because neither of ZZ32 nor NN32 provides a coercion from the other. Therefore a static error always occurs according to the second new rule. We believe that it is desirable for this example to produce a static error.

One way to make this code correct is to provide an appropriate type for the binding of z:

do
  x: ZZ32 = 43
  y: NN32 = 2^31
  z: ZZ64 = (if p then 17 elif q then x else y)
  z+2
end

Then an implicit coercion can be provided under the third new rule. In practice, the compiler will then rewrite the code from something like

do
  x: ZZ32 = 43
  y: NN32 = 2^31
  z: ZZ64 = coerce[\ZZ64\](if p then 17 elif q then x else y)
  z+2
end

to something like

do
  x: ZZ32 = 43
  y: NN32 = 2^31
  z: ZZ64 = (if p then coerce[\ZZ64\](17)
           elif q then coerce[\ZZ64\](x)
                  else coerce[\ZZ64\](y))
  z+2
end

This rewriting is similar to the rewritings associated with widening coercions ([ Fortress 1.0 beta], pages 151-153).

Comments

1. dmitrey -- 2009-11-01 01:08

For simple cases I would prefer shorter Python-like definition x = a if someCondition else b

Let me note - your mail lists mentioned in http://projectfortress.sun.com/Projects/Community don't work yet, links to them are dead (as well as that one mentioned in wikipedia.org entry).

2. gls -- 2009-11-03 12:11

Hmm! The syntax "a if p else b" is an interesting possibility. We will have to look into whether it can be handled compatibly with the current grammar, or perhaps handled by the syntax extension mechanism.

The email lists should be working now.