Capture Checking Basics
Introduction
Capture checking can be enabled by the language import
import language.experimental.captureChecking
At present, capture checking is still highly experimental and unstable, and it evolves quickly. Before trying it out, make sure you have the latest version of Scala.
To get an idea what capture checking can do, let's start with a small example:
def usingLogFile[T](op: FileOutputStream => T): T =
val logFile = FileOutputStream("log")
val result = op(logFile)
logFile.close()
result
The usingLogFile
method invokes a given operation with a fresh log file as parameter. Once the operation has ended, the log file is closed and the operation's result is returned. This is a typical try-with-resources pattern, similar to many other such patterns which are often supported by special language constructs in other languages.
The problem is that usingLogFile
's implementation is not entirely safe. One can undermine it by passing an operation that performs the logging at some later point after it has terminated. For instance:
val later = usingLogFile { file => () => file.write(0) }
later() // crash
When later
is executed it tries to write to a file that is already closed, which results in an uncaught IOException
.
Capture checking gives us the mechanism to prevent such errors statically. To prevent unsafe usages of usingLogFile
, we can declare it like this:
def usingLogFile[T](op: FileOutputStream^ => T): T =
// same body as before
The only thing that's changed is that the FileOutputStream
parameter of op
is now followed by ^
. We'll see that this turns the parameter into a capability whose lifetime is tracked.
If we now try to define the problematic value later
, we get a static error:
| val later = usingLogFile { f => () => f.write(0) }
| ^^^^^^^^^^^^^^^^^^^^^^^^^
| Found: (f: java.io.FileOutputStream^'s1) ->'s2 () ->{f} Unit
| Required: java.io.FileOutputStream^ => () ->'s3 Unit
|
| Note that capability f cannot be included in outer capture set 's3.
In this case, it was easy to see that the logFile
capability escapes in the closure passed to usingLogFile
. But capture checking also works for more complex cases. For instance, capture checking is able to distinguish between the following safe code:
val xs = usingLogFile { f =>
List(1, 2, 3).map { x => f.write(x); x * x }
}
and the following unsafe one:
val xs = usingLogFile { f =>
LzyList(1, 2, 3).map { x => f.write(x); x * x }
}
An error would be issued in the second case, but not the first one (this assumes a capture-aware formulation LzyList
of lazily evaluated lists, which we will present later in this page).
It turns out that capture checking has very broad applications. Besides the various try-with-resources patterns, it can also be a key part to the solutions of many other long standing problems in programming languages. Among them:
- How to have a simple and flexible system for checked exceptions. We show later how capture checking enables a clean and fully safe system for checked exceptions in Scala.
- How to address the problem of effect polymorphism in general.
- How to solve the "what color is your function?" problem of mixing synchronous and asynchronous computations.
- How to do region-based allocation, safely.
- How to reason about capabilities associated with memory locations.
The following sections explain in detail how capture checking works in Scala 3.
Capabilities and Capturing Types
The capture checker extension introduces a new kind of types and it enforces some rules for working with these types.
Capture checking is done in terms of capturing types of the form T^{c₁, ..., cᵢ}
. Here T
is a type, and {c₁, ..., cᵢ}
is a capture set consisting of references to capabilities c₁, ..., cᵢ
.
An object capability is syntactically a method- or class-parameter, a local variable, or the this
of an enclosing class. The type of a capability must be a capturing type with a non-empty capture set. We also say that variables that are capabilities are tracked.
In a sense, every capability gets its authority from some other, more sweeping capability which it captures. The recursion stops with a universal capability, written cap
, from which all other capabilities are ultimately derived. If T
is a type, then T^
is a shorthand for T^{cap}
, meaning T
can capture arbitrary capabilities.
Here is an example:
class FileSystem
class Logger(fs: FileSystem^):
def log(s: String): Unit = ... // Write to a log file, using `fs`
def test(fs: FileSystem^) =
val l: Logger^{fs} = Logger(fs)
l.log("hello world!")
val xs: LzyList[Int]^{l} =
LzyList.from(1)
.map { i =>
l.log(s"computing elem # $i")
i * i
}
xs
Here, the test
method takes a FileSystem
as a parameter. fs
is a capability since its type has a non-empty capture set. The capability is passed to the Logger
constructor and retained as a field in class Logger
. Hence, the local variable l
has type Logger^{fs}
: it is a Logger
which retains the fs
capability.
The second variable defined in test
is xs
, a lazy list that is obtained from LzyList.from(1)
by logging and mapping consecutive numbers. Since the list is lazy, it needs to retain the reference to the logger l
for its computations. Hence, the type of the list is LzyList[Int]^{l}
. On the other hand, since xs
only logs but does not do other file operations, it retains the fs
capability only indirectly. That's why fs
does not show up in the capture set of xs
.
Capturing types come with a subtype relation where types with "smaller" capture sets are subtypes of types with larger sets (the subcapturing relation is defined in more detail below). If a type T
does not have a capture set, it is called pure, and is a subtype of any capturing type that adds a capture set to T
.
Function Types
The usual function type A => B
now stands for a function that can capture arbitrary capabilities. We call such functions impure. By contrast, the new single arrow function type A -> B
stands for a function that cannot capture any capabilities, or otherwise said, is pure. One can add a capture set after the arrow of an otherwise pure function. For instance, A ->{c, d} B
would be a function that can capture capabilities c
and d
, but no others. This type is a shorthand for (A -> B)^{c, d}
, i.e. the function type A -> B
with possible captures {c, d}
.
The impure function type A => B
is treated as an alias for A ->{cap} B
. That is, impure functions are functions that can capture anything.
A capture annotation ^
binds more strongly than a function arrow. So A -> B^{c}
is read as A -> (B^{c})
and A -> B^
is read as A -> (B^)
.
Analogous conventions apply to context function types. A ?=> B
is an impure context function, with A ?-> B
as its pure complement.
Note 1: The identifiers ->
and ?->
are now treated as soft keywords when used as infix type operators. They are still available as regular identifiers for terms. For instance, the mapping syntax Map("x" -> 1, "y" -> 2)
is still supported since it only applies to terms.
Note 2: The distinctions between pure vs impure function types do not apply to methods. In fact, since methods are not values they never capture anything directly. References to capabilities in a method are instead counted in the capture set of the enclosing object.
By-Name Parameter Types
A convention analogous to function types also extends to by-name parameters. In
def f(x: => Int): Int
the actual argument can refer to arbitrary capabilities. So the following would be OK:
f(if p(y) then throw Ex() else 1)
On the other hand, if f
was defined like this
def f(x: -> Int): Int
the actual argument to f
could not refer to any capabilities, so the call above would be rejected. One can also allow specific capabilities like this:
def f(x: ->{c} Int): Int
Here, the actual argument to f
is allowed to use the c
capability but no others.
Subtyping and Subcapturing
Capturing influences subtyping. As usual we write T₁ <: T₂
to express that the type T₁
is a subtype of the type T₂
, or equivalently, that T₁
conforms to T₂
. An analogous subcapturing relation applies to capture sets. If C₁
and C₂
are capture sets, we write C₁ <: C₂
to express that C₁
is covered by C₂
, or, swapping the operands, that C₂
covers C₁
.
Subtyping extends as follows to capturing types:
- Pure types are subtypes of capturing types. That is,
T <: T ^ C
, for any typeT
, capturing setC
. - For capturing types, smaller capturing sets produce subtypes:
T₁ ^ C₁ <: T₂ ^ C₂
ifC₁ <: C₂
andT₁ <: T₂
.
A subcapturing relation C₁ <: C₂
holds if C₂
accounts for every element c
in C₁
. This means one of the following three conditions must be true:
c ∈ C₂
,c
refers to a parameter of some classCls
andC₂
containsCls.this
,c
's type has capturing setC
andC₂
accounts for every element ofC
(that is,C <: C₂
).
Example 1. Given
fs: FileSystem^
ct: CanThrow[Exception]^
l : Logger^{fs}
we have
{l} <: {fs} <: {cap}
{fs} <: {fs, ct} <: {cap}
{ct} <: {fs, ct} <: {cap}
The set consisting of the root capability {cap}
covers every other capture set. This is a consequence of the fact that, ultimately, every capability is created from cap
.
Example 2. Consider again the FileSystem/Logger example from before. LzyList[Int]
is a proper subtype of LzyList[Int]^{l}
. So if the test
method in that example was declared with a result type LzyList[Int]
, we'd get a type error. Here is the error message:
11 |def test(using fs: FileSystem^): LzyList[Int] = {
| ^
| Found: LzyList[Int]^{fs}
| Required: LzyList[Int]
Why does it say LzyList[Int]^{fs}
and not LzyList[Int]^{l}
, which is, after all, the type of the returned value xs
? The reason is that l
is a local variable in the body of test
, so it cannot be referred to in a type outside that body. What happens instead is that the type is widened to the smallest supertype that does not mention l
. Since l
has capture set fs
, we have that {fs}
covers {l}
, and {fs}
is acceptable in a result type of test
, so {fs}
is the result of that widening. This widening is called avoidance; it is not specific to capture checking but applies to all variable references in Scala types.
Capability Classes
Classes like CanThrow
or FileSystem
have the property that their values are always intended to be capabilities. We can make this intention explicit and save boilerplate by letting these classes extend the SharedCapability
class defined in object cap
.
The capture set of type extending SharedCapability
is always {cap}
. This means we could equivalently express the FileSystem
and Logger
classes as follows:
import caps.SharedCapability
class FileSystem extends SharedCapability
class Logger(using FileSystem):
def log(s: String): Unit = ???
def test(using fs: FileSystem) =
val l: Logger^{fs} = Logger()
...
In this version, FileSystem
is a capability class, which means that the {cap}
capture set is implied on the parameters of Logger
and test
.
Another, unrelated change in the version of the last example here is that the FileSystem
capability is now passed as an implicit parameter. It is quite natural to model capabilities with implicit parameters since it greatly reduces the wiring overhead once multiple capabilities are in play.
Escape Checking
Capabilities follow the usual scoping discipline, which means that capture sets can contain only capabilities that are visible at the point where the set is defined.
We now reconstruct how this principle produced the error in the introductory example, where usingLogFile
was declared like this:
def usingLogFile[T](op: FileOutputStream^ => T): T = ...
The error message was:
| val later = usingLogFile { f => () => f.write(0) }
| ^^^^^^^^^^^^^^^^^^^^^^^^^
| Found: (f: java.io.FileOutputStream^'s1) ->'s2 () ->{f} Unit
| Required: java.io.FileOutputStream^ => () ->'s3 Unit
|
| Note that capability f cannot be included in outer capture set 's3.
This error message was produced by the following logic:
- The
f
parameter has typeFileOutputStream^
, which makes it a capability. - Therefore, the type of the expression
() => f.write(0)
is() ->{f} Unit
. - This makes the type of the whole closure passed to
usingLogFile
the dependent function type(f: FileOutputStream^'s1) ->'s2 () ->{f} Unit
, for some as yet uncomputed capture sets's1
and's2
. - The expected type of the closure is a simple, parametric, impure function type
FileOutputStream^ => T
, for some instantiation of the type variableT
. - Matching with the found type,
T
must have the shape() ->'s3 Unit
, for some capture set's3
defined at the level of valuelater
. - That capture set cannot include the capability
f
sincef
is locally bound. This causes the error.
An analogous restriction applies to the type of a mutable variable. Another way one could try to undermine capture checking would be to assign a closure with a local capability to a global variable. Maybe like this:
var loophole: () => Unit = () => ()
usingLogFile { f =>
loophole = () => f.write(0)
}
loophole()
But this will not compile either, since the capture set of the mutable variable loophole
cannot refer to variable f
, which is not visible where loophole
is defined.
Looking at object graphs, we observe a monotonicity property: The capture set of an object x
covers the capture sets of all objects reachable through x
. This property is reflected in the type system by the following monotonicity rule:
- In a class
C
with a fieldf
, the capture set{this}
covers the capture set{this.f}
as well as the capture set of any application ofthis.f
to pure arguments.