In my last blog post I introduced DOT, a minimal calculus that underlies much of Scala. DOT is much more than an academic exercise, because it gives us guidelines on how to design a sound type system for full Scala.
As was argued in the previous blog post, the danger a path-dependent type system like Scala's faces is inconsistent bounds or aliases. For instance, you might have a type alias
type T = String
in scope in some part of the program, but in another part the same
T is known as
type T = Int
If you connect the two parts, you end up allowing assigning a
Int and vice versa, which is unsound - it will crash at
runtime with a
ClassCastException. The problem is that there
is no obvious, practical, compile time analysis for DOT or
Scala that ensures that all types have good bounds. Types can contain
abstract type members with bounds that can be refined elsewhere and
several independent refinements might lead together to a bad bound
problem. Barring a whole program analysis there is no specific
point in the program where we can figure this out straightforwardly.
In DOT, the problem is resolved by insisting that every path prefix
of a type
p.T is at runtime a concrete value. That way, we only have
to check for good bounds when objects are created with
that check is easy: When objects are created, we know their class and
we can insist that all nested types in that class are aliases or
have consistent bounds. So far so good.
But if we want to scale up the DOT result for full Scala, several
loopholes open up. These come all down to the fact that the prefix of
a type selection might not be a value that's constructed with a
new at run time. The loopholes can be classified into three
The prefix value might be lazy, and never instantiated to anything, as in:
lazy val p: S = p ... p.T ...
Note that trying to access the lazy value
p would result in an infinite loop. But using
p in a type does not force its evaluation, so we might never evaluate
p is not initialized with a
new, bad bounds for
T would go undetected.
The prefix value might be initialized to
null, as in
val p: S = null ... p.T ...
The problem here is similar to the first one.
p is not initialized
new so we know nothing about the bounds of
Tin a type projection
T # A, where
Tis not associated with a runtime value.
We can in fact construct soundness issues in all of these cases. Look
at the discussion for issues #50
and #1050 in the
on GitHub. All issues work fundamentally in the same way: Construct a type
which has a type member
T with bad bounds, say
Any <: T <: Nothing
Then, use the left subtyping to turn an expression of type
an expression of type
T and use the right subtyping to turn that
expression into an expression of type
def f(x: Any): p.T = x def g(x: p.T): Nothing = x
g(f(x)) will convert every expression into an
expression of type
Nothing is a subtype of every
other type, this means you can convert an arbitrary expression to have
any type you choose. Such a feat is an impossible promise, of
course. The promise is usually broken at run-time by failing with a
To get back to soundness we need to plug the loopholes. Some of the necessary measures are taken in pull request #1051. That pull request
T # Aonly if
Tis a concrete type with known consistent bounds.
It looks like this is sufficient to plug soundness problems (1) and (3). To plug (2), we need to make the type system track nullability in more detail than we do it now. Nullability tracking is a nice feature in its own right, but now we have an added incentive for implementing it: it would help to ensure type soundness.
There's one sub-case of nullability checking which is much harder to do
than the others. An object reference
x.f might be
null at run time
because the field
f is not yet initialized. This can lead to a
soundness problem, but in a more roundabout way than the other issues
we have identified. In fact, Scala guarantees that in a program that
runs to completion without aborting, every field will eventually be
initialized, so every non-null field will have good bounds. Therefore,
the only way an initialized field
f could cause a soundness problem
is if the program in question would never get to initialize
either because it goes into an infinite loop or because it aborts with
an exception or
System.exit call before reaching the initialization
f. It's a valid question whether type soundness guarantees
should extend to this class of "strange" programs. We might want to
draw the line here and resort to runtime checks or exclude "strange"
programs from any soundness guarantees we can give. The research community
has coined the term soundiness for
this kind of approach and has advocated for it.
The necessary restrictions on type projection
T # A are problematic
because they invalidate some idioms in type-level programming. For
instance, the cute trick of making Scala's type system Turing complete
by having it simulate SK
would no longer work since that one relies on unrestricted type
projections. The same holds for some of the encodings of type-level
To ease the transition, we will continue for a while to allow unrestricted type
projections under a flag, even though they are potentially
unsound. In the current dotty compiler, that flag is a language import
-language:Scala2, but it could be something different for other
-unsafe. Maybe we can find rules that are less
restrictive than the ones we have now, and are still sound. But one
aspect should be non-negotiable: Any fundamental deviations from the
principles laid down by DOT needs to be proven mechanically correct
just like DOT was. We have achieved a lot with the DOT proofs, so we
should make sure not to back-slide. And if the experience of the past
10 years has taught us one thing, it is that the meta theory of type
systems has many more surprises in store than one might think. That's
why mechanical proofs are essential.