Safe Initialization
Dotty implements experimental safe initialization check, which can be enabled by the compiler option -Ycheck-init
.
A Quick Glance
To get a feel of how it works, we first show several examples below.
Parent-Child Interaction
Given the following code snippet:
abstract class AbstractFile {
def name: String
val extension: String = Path.extension(name)
}
class RemoteFile(url: String) extends AbstractFile {
val localFile: String = url.hashCode + ".tmp" // error: usge of `localFile` before it's initialized
def name: String = localFile
}
The checker will report:
-- Warning: tests/init/neg/AbstractFile.scala:7:4 ------------------------------
7 | val localFile: String = url.hashCode + ".tmp" // error
| ^
| Access non-initialized field value localFile. Calling trace:
| -> val extension: String = name.substring(4) [ AbstractFile.scala:3 ]
| -> def name: String = localFile [ AbstractFile.scala:8 ]
Inner-Outer Interaction
Given the code below:
object Trees {
class ValDef { counter += 1 }
class EmptyValDef extends ValDef
val theEmptyValDef = new EmptyValDef
private var counter = 0 // error
}
The checker will report:
-- Warning: tests/init/neg/trees.scala:5:14 ------------------------------------
5 | private var counter = 0 // error
| ^
| Access non-initialized field variable counter. Calling trace:
| -> val theEmptyValDef = new EmptyValDef [ trees.scala:4 ]
| -> class EmptyValDef extends ValDef [ trees.scala:3 ]
| -> class ValDef { counter += 1 } [ trees.scala:2 ]
Functions
Given the code below:
abstract class Parent {
val f: () => String = () => this.message
def message: String
}
class Child extends Parent {
val a = f()
val b = "hello" // error
def message: String = b
}
The checker reports:
-- Warning: tests/init/neg/features-high-order.scala:7:6 -----------------------
7 | val b = "hello" // error
| ^
|Access non-initialized field value b. Calling trace:
| -> val a = f() [ features-high-order.scala:6 ]
| -> val f: () => String = () => this.message [ features-high-order.scala:2 ]
| -> def message: String = b [ features-high-order.scala:8 ]
Design Goals
We establish the following design goals:
- Sound: checking always terminates, and is sound for common and reasonable usage (over-approximation)
- Expressive: support common and reasonable initialization patterns
- Friendly: simple rules, minimal syntactic overhead, informative error messages
- Modular: modular checking, no analysis beyond project boundary
- Fast: instant feedback
- Simple: no changes to core type system, explainable by a simple theory
By reasonable usage, we include the following use cases (but not restricted to them):
- Access fields on
this
and outerthis
during initialization - Call methods on
this
and outerthis
during initialization - Instantiate inner class and call methods on such instances during initialization
- Capture fields in functions
Principles and Rules
To achieve the goals, we uphold three fundamental principles: stackability, monotonicity and scopability.
Stackability means that objects are initialized in stack order: if the
object b
is created during the initialization of object a
, then
all fields of b
should become initialized before or at the same time
as a
. Scala enforces this property in syntax by demanding that all
fields are initialized at the end of the primary constructor, except
for the language feature below:
var x: T = _
Control effects such as exceptions may break this property, as the following example shows:
class MyException(val b: B) extends Exception("")
class A {
val b = try { new B } catch { case myEx: MyException => myEx.b }
println(b.a)
}
class B {
throw new MyException(this)
val a: Int = 1
}
In the code above, the control effect teleport the uninitialized value wrapped in an exception. In the implementation, we avoid the problem by ensuring that the values that are thrown must be transitively initialized.
Monotonicity means that the initialization status of an object should not go backward: initialized fields continue to be initialized, a field points to an initialized object may not later point to an object under initialization. As an example, the following code will be rejected:
trait Reporter { def report(msg: String): Unit }
class FileReporter(ctx: Context) extends Reporter {
ctx.typer.reporter = this // ctx now reaches an uninitialized object
val file: File = new File("report.txt")
def report(msg: String) = file.write(msg)
}
In the code above, suppose ctx
points to a transitively initialized
object. Now the assignment at line 3 makes this
, which is not fully
initialized, reachable from ctx
. This makes field usage dangerous,
as it may indirectly reach uninitialized fields.
Monotonicity is based on a well-known technique called heap monotonic typestate to ensure soundness in the presence of aliasing [1]. Otherwise, either soundness will be compromised or we have to disallow the usage of already initialized fields.
Scopability means that an expression may only access existing objects via formal
parameters and this
. More precisely, given any environment ρ
(which are the
value bindings for method parameters and this
) and heap σ
for evaluating an expression
e
, if the resulting value reaches an object o
pre-existent in σ
, then o
is reachable from ρ
in σ
. Control effects like coroutines, delimited
control, resumable exceptions may break the property, as they can transport a
value upper in the stack (not in scope) to be reachable from the current scope.
Static fields can also serve as a teleport thus breaks this property. In the
implementation, we need to enforce that teleported values are transitively
initialized.
With the established principles and design goals, following rules are imposed:
-
In an assignment
o.x = e
, the expressione
may only point to transitively initialized objects.This is how monotonicity is enforced in the system. Note that in an initialization
val f: T = e
, the expressione
may point to an object under initialization. This requires a distinction between mutation and initialization in order to enforce different rules. Scala has different syntax for them, it thus is not an issue. -
References to objects under initialization may not be passed as arguments to method calls or constructors.
Escape of
this
in the constructor is commonly regarded as an anti-pattern, and it's rarely used in practice. This rule is simple for the programmer to reason about initialization and it simplifies implementation. The theory supports safe escape ofthis
with the help of annotations, we delay the extension until there is a strong need. -
Local definitions may only refer to transitively initialized objects.
It means that in a local definition
val x: T = e
, the expressione
may only evaluate to transitively initialized objects. The same goes for local lazy variables and methods. This rule is again motivated for simplicity in reasoning about initialization: programmers may safely assume that all local definitions only point to transitively initialized objects.
Modularity
For modularity, we forbid subtle initialization interaction beyond project boundaries. For example, the following code passes the check when the two classes are defined in the same project:
class Base {
private val map: mutable.Map[Int, String] = mutable.Map.empty
def enter(k: Int, v: String) = map(k) = v
}
class Child extends Base {
enter(1, "one")
enter(2, "two")
}
However, when the class Base
and Child
are defined in two different
projects, the check will emit a warning for the calls to enter
in the class
Child
. This restricts subtle initialization within project boundaries,
and avoids accidental violation of contracts across library versions.
We impose the following rules to enforce modularity:
-
A class or trait that may be extended in another project should not call virtual methods on
this
in its template/mixin evaluation, directly or indirectly. -
The method call
o.m(args)
is forbidden ifo
is not transitively initialized and the target ofm
is defined in an external project. - The expression
new p.C(args)
is forbidden, ifp
is not transitively initialized andC
is defined in an external project.
Theoretically, we may analyze across project boundaries based on tasty. However, from our experience with Dotty community projects, most subtle initialization patterns are restricted in the same project. As the rules only report warnings instead of errors, we think it is good to first impose more strict rules, The feedback from the community is welcome.
Theory
The theory is based on type-and-effect systems [2]. We introduce two concepts, effects and potentials:
π = C.this | Warm(C, π) | π.f | π.m | π.super[D] | Cold | Fun(Π, Φ) | Outer(C, π)
ϕ = π↑ | π.f! | π.m!
Potentials (π) represent values that are possibly under initialization.
C.this
: current objectWarm(C, π)
: an object of typeC
where all its fields are assigned, and the potential forthis
of its enclosing class isπ
.π.f
: the potential of the fieldf
in the potentialπ
π.m
: the potential of the fieldf
in the potentialπ
π.super[D]
: essentially the object π, used for virtual method resolutionCold
: an object with unknown initialization statusFun(Π, Φ)
: a function, when called produce effects Φ and return potentials Π.Outer(C, π)
: the potential ofthis
for the enclosing class ofC
whenC.this
isπ
.
Effects are triggered from potentials:
π↑
: promote the object pointed to by the potentialπ
to fully-initializedπ.f!
: access fieldf
on the potentialπ
π.m!
: call the methodm
on the potentialπ
To ensure that the checking always terminate and for better performance, we restrict the length of potentials to be finite (by default 2). If the potential is too long, the checker stops tracking it by checking that the potential is actually transitively initialized.
For an expression e
, it may be summarized by the pair (Π, Φ)
,
which means evaluation of e
may produce the effects Φ and return the
potentials Π. Each field and method is associated with such a pair.
We call such a pair summary. The expansion of proxy potentials and effects,
such as π.f
, π.m
and π.m!
, will take advantage of the summaries.
Depending on the potential π
for this
, the summaries need to be rebased (asSeenFrom
) before usage.
The checking treats the templates of concrete classes as entry points.
It maintains the set of initialized fields as initialization
progresses, and check that only initialized fields are accessed during
the initialization and there is no leaking of values under initialization.
Virtual method calls on this
is not a problem,
as they can always be resolved statically.
More details can be found in a forthcoming paper.
Back Doors
Occasionally you may want to suppress warnings reported by the
checker. You can either write e: @unchecked
to tell the checker to
skip checking for the expression e
, or you may use the old trick:
mark some fields as lazy.
Caveats
The system cannot handle static fields, nor does it provide safety guarantee when extending Java or Scala 2 classes. Calling methods of Java or Scala 2 is always safe.
References
- Fähndrich, M. and Leino, K.R.M., 2003, July. Heap monotonic typestates. In International Workshop on Aliasing, Confinement and Ownership in object-oriented programming (IWACO).
- Lucassen, J.M. and Gifford, D.K., 1988, January. Polymorphic effect systems. In Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages (pp. 47-57). ACM.