TastyFormat
********************************************************** Notation:
********************************************************** Notation:
We use BNF notation. Terminal symbols start with at least two consecutive upper case letters. Each terminal is represented as a single byte tag. Non-terminals are mixed case. Prefixes of the form lower case letter*_ are for explanation of semantic content only, they can be dropped without changing the grammar.
Micro-syntax:
LongInt = Digit* StopDigit -- big endian 2's complement, value fits in a Long w/o overflow Int = LongInt -- big endian 2's complement, fits in an Int w/o overflow Nat = LongInt -- non-negative value, fits in an Int without overflow Digit = 0 | ... | 127 StopDigit = 128 | ... | 255 -- value = digit - 128
Macro-format:
File = Header majorVersion_Nat minorVersion_Nat experimentalVersion_Nat VersionString UUID nameTable_Length Name* Section* Header = 0x5CA1AB1F UUID = Byte*16 -- random UUID VersionString = Length UTF8-CodePoint* -- string that represents the compiler that produced the TASTy
Section = NameRef Length Bytes Length = Nat -- length of rest of entry in bytes
Name = UTF8 Length UTF8-CodePoint* QUALIFIED Length qualified_NameRef selector_NameRef -- A.B EXPANDED Length qualified_NameRef selector_NameRef -- A$$B, semantically a NameKinds.ExpandedName EXPANDPREFIX Length qualified_NameRef selector_NameRef -- A$B, prefix of expanded name, see NamedKinds.ExpandPrefixName
UNIQUE Length separator_NameRef uniqid_Nat underlying_NameRef? -- Unique name A
SUPERACCESSOR Length underlying_NameRef -- super$A INLINEACCESSOR Length underlying_NameRef -- inline$A OBJECTCLASS Length underlying_NameRef -- A$ (name of the module class for module A)
SIGNED Length original_NameRef resultSig_NameRef ParamSig* -- name + signature TARGETSIGNED Length original_NameRef target_NameRef resultSig_NameRef ParamSig*
ParamSig = Int // If negative, the absolute value represents the length of a type parameter section // If positive, this is a NameRef for the fully qualified name of a term parameter.
NameRef = Nat // ordinal number of name in name table, starting from 1.
Note: Unqualified names in the name table are strings. The context decides whether a name is a type-name or a term-name. The same string can represent both.
Standard-Section: "ASTs" TopLevelStat*
TopLevelStat = PACKAGE Length Path TopLevelStat* -- package path { topLevelStats } Stat
Stat = Term ValOrDefDef TYPEDEF Length NameRef (type_Term | Template) Modifier* -- modifiers type name (= type | bounds) | modifiers class name template IMPORT Length qual_Term Selector* -- import qual selectors EXPORT Length qual_Term Selector* -- export qual selectors ValOrDefDef = VALDEF Length NameRef type_Term rhs_Term? Modifier* -- modifiers val name : type (= rhs)? DEFDEF Length NameRef Param* returnType_Term rhs_Term? Modifier* -- modifiers def name [typeparams] paramss : returnType (= rhs)? Selector = IMPORTED name_NameRef -- name, "_" for normal wildcards, "" for given wildcards RENAMED to_NameRef -- => name BOUNDED type_Term -- type bound
TypeParam = TYPEPARAM Length NameRef type_Term Modifier* -- modifiers name bounds
TermParam = PARAM Length NameRef type_Term rhs_Term? Modifier* -- modifiers name : type (= rhs_Term)?. rhsTerm
is present in the case of an aliased class parameter
EMPTYCLAUSE -- an empty parameter clause ()
SPLITCLAUSE -- splits two non-empty parameter clauses of the same kind
Param = TypeParam
TermParam
Template = TEMPLATE Length TypeParam* TermParam* parent_Term* Self?
Stat* -- [typeparams] paramss extends parents { self => stats }, where Stat* always starts with the primary constructor.
Self = SELFDEF selfName_NameRef selfType_Term -- selfName : selfType
Term = Path -- Paths represent both types and terms
IDENT NameRef Type -- Used when term ident’s type is not a TermRef
SELECT possiblySigned_NameRef qual_Term -- qual.name
SELECTin Length possiblySigned_NameRef qual_Term owner_Type -- qual.name, referring to a symbol declared in owner that has the given signature (see note below)
QUALTHIS typeIdent_Tree -- id.this, different from THIS in that it contains a qualifier ident with position.
NEW clsType_Term -- new cls
THROW throwableExpr_Term -- throw throwableExpr
NAMEDARG paramName_NameRef arg_Term -- paramName = arg
APPLY Length fn_Term arg_Term* -- fn(args)
TYPEAPPLY Length fn_Term arg_Type* -- fn[args]
SUPER Length this_Term mixinTypeIdent_Tree? -- super[mixin]
TYPED Length expr_Term ascriptionType_Term -- expr: ascription
ASSIGN Length lhs_Term rhs_Term -- lhs = rhs
BLOCK Length expr_Term Stat* -- { stats; expr }
INLINED Length expr_Term call_Term? ValOrDefDef* -- Inlined code from call, with given body expr
and given bindings
LAMBDA Length meth_Term target_Type? -- Closure over method f
of type target
(omitted id target
is a function type)
IF Length [INLINE] cond_Term then_Term else_Term -- inline? if cond then thenPart else elsePart
MATCH Length (IMPLICIT | [INLINE] sel_Term) CaseDef* -- (inline? sel | implicit) match caseDefs
TRY Length expr_Term CaseDef* finalizer_Term? -- try expr catch {casdeDef} (finally finalizer)?
RETURN Length meth_ASTRef expr_Term? -- return expr?, methASTRef
is method from which is returned
WHILE Length cond_Term body_Term -- while cond do body
REPEATED Length elem_Type elem_Term* -- Varargs argument of type elem
SELECTouter Length levels_Nat qual_Term underlying_Type -- Follow levels
outer links, starting from qual
, with given underlying
type
-- patterns:
BIND Length boundName_NameRef patType_Type pat_Term -- name @ pat, wherev patType
is the type of the bound symbol
ALTERNATIVE Length alt_Term* -- alt1 | ... | altn as a pattern
UNAPPLY Length fun_Term ImplicitArg* pat_Type pat_Term* -- Unapply node fun(_: pat_Type)(implicitArgs)
flowing into patterns pat
.
-- type trees:
IDENTtpt NameRef Type -- Used for all type idents
SELECTtpt NameRef qual_Term -- qual.name
SINGLETONtpt ref_Term -- ref.type
REFINEDtpt Length underlying_Term refinement_Stat* -- underlying {refinements}
APPLIEDtpt Length tycon_Term arg_Term* -- tycon [args]
LAMBDAtpt Length TypeParam* body_Term -- [TypeParams] => body
TYPEBOUNDStpt Length low_Term high_Term? -- >: low <: high
ANNOTATEDtpt Length underlying_Term fullAnnotation_Term -- underlying @ annotation
MATCHtpt Length bound_Term? sel_Term CaseDef* -- sel match { CaseDef } where bound
is optional upper bound of all rhs
BYNAMEtpt underlying_Term -- => underlying
SHAREDterm term_ASTRef -- Link to previously serialized term
HOLE Length idx_Nat arg_Tree* -- Hole where a splice goes with sequence number idx, splice is applied to arguments arg
s
CaseDef = CASEDEF Length pat_Term rhs_Tree guard_Tree? -- case pat if guard => rhs ImplicitArg = IMPLICITARG arg_Term -- implicit unapply argument
ASTRef = Nat -- Byte position in AST payload
Path = Constant
TERMREFdirect sym_ASTRef -- A reference to a local symbol (without a prefix). Reference is to definition node of symbol.
TERMREFsymbol sym_ASTRef qual_Type -- A reference qual.sym
to a local member with prefix qual
TERMREFpkg fullyQualified_NameRef -- A reference to a package member with given fully qualified name
TERMREF possiblySigned_NameRef qual_Type -- A reference qual.name
to a non-local member
TERMREFin Length possiblySigned_NameRef qual_Type owner_Type -- A reference qual.name
referring to a non-local symbol declared in owner that has the given signature (see note below)
THIS clsRef_Type -- cls.this
RECthis recType_ASTRef -- The this
in a recursive refined type recType
.
SHAREDtype path_ASTRef -- link to previously serialized path
Constant = UNITconst -- () FALSEconst -- false TRUEconst -- true BYTEconst Int -- A byte number SHORTconst Int -- A short number CHARconst Nat -- A character INTconst Int -- An int number LONGconst LongInt -- A long number FLOATconst Int -- A float number DOUBLEconst LongInt -- A double number STRINGconst NameRef -- A string literal NULLconst -- null CLASSconst Type -- classOf[Type]
Type = Path -- Paths represent both types and terms
TYPEREFdirect sym_ASTRef -- A reference to a local symbol (without a prefix). Reference is to definition node of symbol.
TYPEREFsymbol sym_ASTRef qual_Type -- A reference qual.sym
to a local member with prefix qual
TYPEREFpkg fullyQualified_NameRef -- A reference to a package member with given fully qualified name
TYPEREF NameRef qual_Type -- A reference qual.name
to a non-local member
TYPEREFin Length NameRef qual_Type namespace_Type -- A reference qual.name
to a non-local member that's private in namespace
.
RECtype parent_Type -- A wrapper for recursive refined types
SUPERtype Length this_Type underlying_Type -- A super type reference to underlying
REFINEDtype Length underlying_Type refinement_NameRef info_Type -- underlying { refinement_name : info }
APPLIEDtype Length tycon_Type arg_Type* -- tycon[args]
TYPEBOUNDS Length lowOrAlias_Type high_Type? Variance* -- = alias or >: low <: high, possibly with variances of lambda parameters
ANNOTATEDtype Length underlying_Type annotation_Term -- underlying @ annotation
ANDtype Length left_Type right_Type -- left & right
ORtype Length left_Type right_Type -- lefgt | right
MATCHtype Length bound_Type sel_Type case_Type* -- sel match {cases} with optional upper bound
MATCHCASEtype Length pat_type rhs_Type -- match cases are MATCHCASEtypes or TYPELAMBDAtypes over MATCHCASEtypes
BIND Length boundName_NameRef bounds_Type Modifier* -- boundName @ bounds, for type-variables defined in a type pattern
BYNAMEtype underlying_Type -- => underlying
PARAMtype Length binder_ASTRef paramNum_Nat -- A reference to parameter # paramNum in lambda type binder
POLYtype Length result_Type TypesNames -- A polymorphic method type [TypesNames]result
, used in refinements
METHODtype Length result_Type TypesNames Modifier* -- A method type (Modifier* TypesNames)result
, needed for refinements, with optional modifiers for the parameters
TYPELAMBDAtype Length result_Type TypesNames -- A type lambda [TypesNames] => result
SHAREDtype type_ASTRef -- link to previously serialized type
TypesNames = TypeName*
TypeName = typeOrBounds_ASTRef paramName_NameRef -- (termName
: type
) or (typeName
bounds
)
Modifier = PRIVATE -- private
INTERNAL -- package private (not yet used)
PROTECTED -- protected
PRIVATEqualified qualifier_Type -- private[qualifier] (to be dropped(?)
PROTECTEDqualified qualifier_Type -- protecred[qualifier] (to be dropped(?)
ABSTRACT -- abstract
FINAL -- final
SEALED -- sealed
CASE -- case (for classes or objects)
IMPLICIT -- implicit
GIVEN -- given
ERASED -- erased
LAZY -- lazy
OVERRIDE -- override
OPAQUE -- opaque, also used for classes containing opaque aliases
INLINE -- inline
MACRO -- Inline method containing toplevel splices
INLINEPROXY -- Symbol of binding with an argument to an inline method as rhs (TODO: do we still need this?)
STATIC -- Mapped to static Java member
OBJECT -- An object or its class
TRAIT -- A trait
ENUM -- A enum class or enum case
LOCAL -- private[this] or protected[this], used in conjunction with PRIVATE or PROTECTED
SYNTHETIC -- Generated by Scala compiler
ARTIFACT -- To be tagged Java Synthetic
MUTABLE -- A var
FIELDaccessor -- A getter or setter (note: the corresponding field is not serialized)
CASEaccessor -- A getter for a case class parameter
COVARIANT -- A type parameter marked “+”
CONTRAVARIANT -- A type parameter marked “-”
HASDEFAULT -- Parameter with default arg; method with default parameters (default arguments are separate methods with DEFAULTGETTER names)
STABLE -- Method that is assumed to be stable, i.e. its applications are legal paths
EXTENSION -- An extension method
PARAMsetter -- The setter part x_=
of a var parameter x
which itself is pickled as a PARAM
PARAMalias -- Parameter is alias of a superclass parameter
EXPORTED -- An export forwarder
OPEN -- an open class
INVISIBLE -- invisible during typechecking
Annotation
Variance = STABLE -- invariant | COVARIANT | CONTRAVARIANT
Annotation = ANNOTATION Length tycon_Type fullAnnotation_Term -- An annotation, given (class) type of constructor, and full application tree
Note: The signature of a SELECTin or TERMREFin node is the signature of the selected symbol, not the signature of the reference. The latter undergoes an asSeenFrom but the former does not.
Note: Tree tags are grouped into 5 categories that determine what follows, and thus allow to compute the size of the tagged tree in a generic way.
Category 1 (tags 1-59) : tag
Category 2 (tags 60-89) : tag Nat
Category 3 (tags 90-109) : tag AST
Category 4 (tags 110-127): tag Nat AST
Category 5 (tags 128-255): tag Length
Standard-Section: "Positions" LinesSizes Assoc*
LinesSizes = Nat Nat* // Number of lines followed by the size of each line not counting the trailing \n
Assoc = Header offset_Delta? offset_Delta? point_Delta? | SOURCE nameref_Int Header = addr_Delta + // in one Nat: difference of address to last recorded node << 3 + hasStartDiff + // one bit indicating whether there follows a start address delta << 2 hasEndDiff + // one bit indicating whether there follows an end address delta << 1 hasPoint // one bit indicating whether the new position has a point (i.e ^ position) // Nodes which have the same positions as their parents are omitted. // offset_Deltas give difference of start/end offset wrt to the // same offset in the previously recorded node (or 0 for the first recorded node) Delta = Int // Difference between consecutive offsets, SOURCE = 4 // Impossible as header, since addr_Delta = 0 implies that we refer to the // same tree as the previous one, but then hasStartDiff = 1 implies that // the tree's range starts later than the range of itself.
All elements of a position section are serialized as Ints
Standard Section: "Comments" Comment*
Comment = Length Bytes LongInt // Raw comment's bytes encoded as UTF-8, followed by the comment's coordinates.
************************************************************************************
Type members
Classlikes
Tags used to serialize names, should update TastyFormat$.nameTagToString if a new constant is added
Tags used to serialize names, should update TastyFormat$.nameTagToString if a new constant is added
- Companion
- object
Value members
Concrete methods
This method implements a binary relation (<:<
) between two TASTy versions.
This method implements a binary relation (<:<
) between two TASTy versions.
We label the lhs file
and rhs compiler
.
if file <:< compiler
then the TASTy file is valid to be read.
A TASTy version, e.g. v := 28.0-3
is composed of three fields:
- v.major == 28
- v.minor == 0
- v.experimental == 3
TASTy versions have a partial order, for example,
a <:< b
and b <:< a
are both false if
a
andb
have differentmajor
fields.a
andb
have differentexperimental
fields, both non-zero.
A zero value for an experimental
field is considered
to be a stable TASTy version, and has higher precedence
than a TASTy version with the same major
/minor
fields
but a non-zero experimental
field.
We follow the given algorithm:
if file.major != compiler.major then
return incompatible
if compiler.experimental == 0 then
if file.experimental != 0 then
return incompatible
if file.minor > compiler.minor then
return incompatible
else
return compatible
else invariant[compiler.experimental != 0]
if file.experimental == compiler.experimental then
if file.minor == compiler.minor then
return compatible (all fields equal)
else
return incompatible
else if file.experimental == 0,
if file.minor < compiler.minor then
return compatible (an experimental version can read a previous released version)
else
return incompatible (an experimental version cannot read its own minor version or any later version)
else invariant[file.experimental is non-0 and different than compiler.experimental]
return incompatible
Concrete fields
Natural Number.
Natural Number. The ExperimentalVersion
allows for
experimentation with changes to TASTy without committing
to any guarantees of compatibility.
A zero value indicates that the TASTy version is from a stable, final release.
A strictly positive value indicates that the TASTy
version is experimental. An experimental TASTy file
can only be read by a tool with the same version.
However, tooling with an experimental TASTy version
is able to read final TASTy documents if the file's
MinorVersion
is strictly less than the current value.
Natural number.
Natural number. Each increment of the MajorVersion
begins a
new series of backward compatible TASTy versions.
A TASTy file in either the preceeding or succeeding series is incompatible with the current value.
Natural number.
Natural number. Each increment of the MinorVersion
, within
a series declared by the MajorVersion
, breaks forward
compatibility, but remains backwards compatible, with all
preceeding MinorVersion
.
The first four bytes of a TASTy file, followed by four values:
- MajorVersion: Int
- see definition in TastyFormat
- MinorVersion: Int
- see definition in TastyFormat
- ExperimentalVersion: Int
- see definition in TastyFormat
- ToolingVersion: String
- arbitrary length string representing the tool that produced the TASTy.
The first four bytes of a TASTy file, followed by four values:
- MajorVersion: Int
- see definition in TastyFormat
- MinorVersion: Int
- see definition in TastyFormat
- ExperimentalVersion: Int
- see definition in TastyFormat
- ToolingVersion: String
- arbitrary length string representing the tool that produced the TASTy.