TastyFormat

object TastyFormat

********************************************************** Notation:

********************************************************** Notation:

We use BNF notation. Terminal symbols start with at least two consecutive upper case letters. Each terminal is represented as a single byte tag. Non-terminals are mixed case. Prefixes of the form lower case letter*_ are for explanation of semantic content only, they can be dropped without changing the grammar.

Micro-syntax:

LongInt = Digit* StopDigit -- big endian 2's complement, value fits in a Long w/o overflow Int = LongInt -- big endian 2's complement, fits in an Int w/o overflow Nat = LongInt -- non-negative value, fits in an Int without overflow Digit = 0 | ... | 127 StopDigit = 128 | ... | 255 -- value = digit - 128

Macro-format:

File = Header majorVersion_Nat minorVersion_Nat experimentalVersion_Nat VersionString UUID nameTable_Length Name* Section* Header = 0x5CA1AB1F UUID = Byte*16 -- random UUID VersionString = Length UTF8-CodePoint* -- string that represents the compiler that produced the TASTy

Section = NameRef Length Bytes Length = Nat -- length of rest of entry in bytes

Name = UTF8 Length UTF8-CodePoint* QUALIFIED Length qualified_NameRef selector_NameRef -- A.B EXPANDED Length qualified_NameRef selector_NameRef -- A$$B, semantically a NameKinds.ExpandedName EXPANDPREFIX Length qualified_NameRef selector_NameRef -- A$B, prefix of expanded name, see NamedKinds.ExpandPrefixName

UNIQUE Length separator_NameRef uniqid_Nat underlying_NameRef? -- Unique name A DEFAULTGETTER Length underlying_NameRef index_Nat -- DefaultGetter$

SUPERACCESSOR Length underlying_NameRef -- super$A INLINEACCESSOR Length underlying_NameRef -- inline$A OBJECTCLASS Length underlying_NameRef -- A$ (name of the module class for module A)

SIGNED Length original_NameRef resultSig_NameRef ParamSig* -- name + signature TARGETSIGNED Length original_NameRef target_NameRef resultSig_NameRef ParamSig*

ParamSig = Int // If negative, the absolute value represents the length of a type parameter section // If positive, this is a NameRef for the fully qualified name of a term parameter.

NameRef = Nat // ordinal number of name in name table, starting from 1.

Note: Unqualified names in the name table are strings. The context decides whether a name is a type-name or a term-name. The same string can represent both.

Standard-Section: "ASTs" TopLevelStat*

TopLevelStat = PACKAGE Length Path TopLevelStat* -- package path { topLevelStats } Stat

Stat = Term ValOrDefDef TYPEDEF Length NameRef (type_Term | Template) Modifier* -- modifiers type name (= type | bounds) | modifiers class name template IMPORT Length qual_Term Selector* -- import qual selectors EXPORT Length qual_Term Selector* -- export qual selectors ValOrDefDef = VALDEF Length NameRef type_Term rhs_Term? Modifier* -- modifiers val name : type (= rhs)? DEFDEF Length NameRef Param* returnType_Term rhs_Term? Modifier* -- modifiers def name [typeparams] paramss : returnType (= rhs)? Selector = IMPORTED name_NameRef -- name, "_" for normal wildcards, "" for given wildcards RENAMED to_NameRef -- => name BOUNDED type_Term -- type bound

TypeParam = TYPEPARAM Length NameRef type_Term Modifier* -- modifiers name bounds TermParam = PARAM Length NameRef type_Term rhs_Term? Modifier* -- modifiers name : type (= rhs_Term)?. rhsTerm is present in the case of an aliased class parameter EMPTYCLAUSE -- an empty parameter clause () SPLITCLAUSE -- splits two non-empty parameter clauses of the same kind Param = TypeParam TermParam Template = TEMPLATE Length TypeParam* TermParam* parent_Term* Self? Stat* -- [typeparams] paramss extends parents { self => stats }, where Stat* always starts with the primary constructor. Self = SELFDEF selfName_NameRef selfType_Term -- selfName : selfType

Term = Path -- Paths represent both types and terms IDENT NameRef Type -- Used when term ident’s type is not a TermRef SELECT possiblySigned_NameRef qual_Term -- qual.name SELECTin Length possiblySigned_NameRef qual_Term owner_Type -- qual.name, referring to a symbol declared in owner that has the given signature (see note below) QUALTHIS typeIdent_Tree -- id.this, different from THIS in that it contains a qualifier ident with position. NEW clsType_Term -- new cls THROW throwableExpr_Term -- throw throwableExpr NAMEDARG paramName_NameRef arg_Term -- paramName = arg APPLY Length fn_Term arg_Term* -- fn(args) TYPEAPPLY Length fn_Term arg_Type* -- fn[args] SUPER Length this_Term mixinTypeIdent_Tree? -- super[mixin] TYPED Length expr_Term ascriptionType_Term -- expr: ascription ASSIGN Length lhs_Term rhs_Term -- lhs = rhs BLOCK Length expr_Term Stat* -- { stats; expr } INLINED Length expr_Term call_Term? ValOrDefDef* -- Inlined code from call, with given body expr and given bindings LAMBDA Length meth_Term target_Type? -- Closure over method f of type target (omitted id target is a function type) IF Length [INLINE] cond_Term then_Term else_Term -- inline? if cond then thenPart else elsePart MATCH Length (IMPLICIT | [INLINE] sel_Term) CaseDef* -- (inline? sel | implicit) match caseDefs TRY Length expr_Term CaseDef* finalizer_Term? -- try expr catch {casdeDef} (finally finalizer)? RETURN Length meth_ASTRef expr_Term? -- return expr?, methASTRef is method from which is returned WHILE Length cond_Term body_Term -- while cond do body REPEATED Length elem_Type elem_Term* -- Varargs argument of type elem SELECTouter Length levels_Nat qual_Term underlying_Type -- Follow levels outer links, starting from qual, with given underlying type -- patterns: BIND Length boundName_NameRef patType_Type pat_Term -- name @ pat, wherev patType is the type of the bound symbol ALTERNATIVE Length alt_Term* -- alt1 | ... | altn as a pattern UNAPPLY Length fun_Term ImplicitArg* pat_Type pat_Term* -- Unapply node fun(_: pat_Type)(implicitArgs) flowing into patterns pat. -- type trees: IDENTtpt NameRef Type -- Used for all type idents SELECTtpt NameRef qual_Term -- qual.name SINGLETONtpt ref_Term -- ref.type REFINEDtpt Length underlying_Term refinement_Stat* -- underlying {refinements} APPLIEDtpt Length tycon_Term arg_Term* -- tycon [args] LAMBDAtpt Length TypeParam* body_Term -- [TypeParams] => body TYPEBOUNDStpt Length low_Term high_Term? -- >: low <: high ANNOTATEDtpt Length underlying_Term fullAnnotation_Term -- underlying @ annotation MATCHtpt Length bound_Term? sel_Term CaseDef* -- sel match { CaseDef } where bound is optional upper bound of all rhs BYNAMEtpt underlying_Term -- => underlying SHAREDterm term_ASTRef -- Link to previously serialized term HOLE Length idx_Nat arg_Tree* -- Hole where a splice goes with sequence number idx, splice is applied to arguments args

CaseDef = CASEDEF Length pat_Term rhs_Tree guard_Tree? -- case pat if guard => rhs ImplicitArg = IMPLICITARG arg_Term -- implicit unapply argument

ASTRef = Nat -- Byte position in AST payload

Path = Constant TERMREFdirect sym_ASTRef -- A reference to a local symbol (without a prefix). Reference is to definition node of symbol. TERMREFsymbol sym_ASTRef qual_Type -- A reference qual.sym to a local member with prefix qual TERMREFpkg fullyQualified_NameRef -- A reference to a package member with given fully qualified name TERMREF possiblySigned_NameRef qual_Type -- A reference qual.name to a non-local member TERMREFin Length possiblySigned_NameRef qual_Type owner_Type -- A reference qual.name referring to a non-local symbol declared in owner that has the given signature (see note below) THIS clsRef_Type -- cls.this RECthis recType_ASTRef -- The this in a recursive refined type recType. SHAREDtype path_ASTRef -- link to previously serialized path

Constant = UNITconst -- () FALSEconst -- false TRUEconst -- true BYTEconst Int -- A byte number SHORTconst Int -- A short number CHARconst Nat -- A character INTconst Int -- An int number LONGconst LongInt -- A long number FLOATconst Int -- A float number DOUBLEconst LongInt -- A double number STRINGconst NameRef -- A string literal NULLconst -- null CLASSconst Type -- classOf[Type]

Type = Path -- Paths represent both types and terms TYPEREFdirect sym_ASTRef -- A reference to a local symbol (without a prefix). Reference is to definition node of symbol. TYPEREFsymbol sym_ASTRef qual_Type -- A reference qual.sym to a local member with prefix qual TYPEREFpkg fullyQualified_NameRef -- A reference to a package member with given fully qualified name TYPEREF NameRef qual_Type -- A reference qual.name to a non-local member TYPEREFin Length NameRef qual_Type namespace_Type -- A reference qual.name to a non-local member that's private in namespace. RECtype parent_Type -- A wrapper for recursive refined types SUPERtype Length this_Type underlying_Type -- A super type reference to underlying REFINEDtype Length underlying_Type refinement_NameRef info_Type -- underlying { refinement_name : info } APPLIEDtype Length tycon_Type arg_Type* -- tycon[args] TYPEBOUNDS Length lowOrAlias_Type high_Type? Variance* -- = alias or >: low <: high, possibly with variances of lambda parameters ANNOTATEDtype Length underlying_Type annotation_Term -- underlying @ annotation ANDtype Length left_Type right_Type -- left & right ORtype Length left_Type right_Type -- lefgt | right MATCHtype Length bound_Type sel_Type case_Type* -- sel match {cases} with optional upper bound MATCHCASEtype Length pat_type rhs_Type -- match cases are MATCHCASEtypes or TYPELAMBDAtypes over MATCHCASEtypes BIND Length boundName_NameRef bounds_Type Modifier* -- boundName @ bounds, for type-variables defined in a type pattern BYNAMEtype underlying_Type -- => underlying PARAMtype Length binder_ASTRef paramNum_Nat -- A reference to parameter # paramNum in lambda type binder POLYtype Length result_Type TypesNames -- A polymorphic method type [TypesNames]result, used in refinements METHODtype Length result_Type TypesNames Modifier* -- A method type (Modifier* TypesNames)result, needed for refinements, with optional modifiers for the parameters TYPELAMBDAtype Length result_Type TypesNames -- A type lambda [TypesNames] => result SHAREDtype type_ASTRef -- link to previously serialized type TypesNames = TypeName* TypeName = typeOrBounds_ASTRef paramName_NameRef -- (termName: type) or (typeName bounds)

Modifier = PRIVATE -- private INTERNAL -- package private (not yet used) PROTECTED -- protected PRIVATEqualified qualifier_Type -- private[qualifier] (to be dropped(?) PROTECTEDqualified qualifier_Type -- protecred[qualifier] (to be dropped(?) ABSTRACT -- abstract FINAL -- final SEALED -- sealed CASE -- case (for classes or objects) IMPLICIT -- implicit GIVEN -- given ERASED -- erased LAZY -- lazy OVERRIDE -- override OPAQUE -- opaque, also used for classes containing opaque aliases INLINE -- inline MACRO -- Inline method containing toplevel splices INLINEPROXY -- Symbol of binding with an argument to an inline method as rhs (TODO: do we still need this?) STATIC -- Mapped to static Java member OBJECT -- An object or its class TRAIT -- A trait ENUM -- A enum class or enum case LOCAL -- private[this] or protected[this], used in conjunction with PRIVATE or PROTECTED SYNTHETIC -- Generated by Scala compiler ARTIFACT -- To be tagged Java Synthetic MUTABLE -- A var FIELDaccessor -- A getter or setter (note: the corresponding field is not serialized) CASEaccessor -- A getter for a case class parameter COVARIANT -- A type parameter marked “+” CONTRAVARIANT -- A type parameter marked “-” HASDEFAULT -- Parameter with default arg; method with default parameters (default arguments are separate methods with DEFAULTGETTER names) STABLE -- Method that is assumed to be stable, i.e. its applications are legal paths EXTENSION -- An extension method PARAMsetter -- The setter part x_= of a var parameter x which itself is pickled as a PARAM PARAMalias -- Parameter is alias of a superclass parameter EXPORTED -- An export forwarder OPEN -- an open class INVISIBLE -- invisible during typechecking Annotation

Variance = STABLE -- invariant | COVARIANT | CONTRAVARIANT

Annotation = ANNOTATION Length tycon_Type fullAnnotation_Term -- An annotation, given (class) type of constructor, and full application tree

Note: The signature of a SELECTin or TERMREFin node is the signature of the selected symbol, not the signature of the reference. The latter undergoes an asSeenFrom but the former does not.

Note: Tree tags are grouped into 5 categories that determine what follows, and thus allow to compute the size of the tagged tree in a generic way.

Category 1 (tags 1-59) : tag Category 2 (tags 60-89) : tag Nat Category 3 (tags 90-109) : tag AST Category 4 (tags 110-127): tag Nat AST Category 5 (tags 128-255): tag Length

Standard-Section: "Positions" LinesSizes Assoc*

LinesSizes = Nat Nat* // Number of lines followed by the size of each line not counting the trailing \n

Assoc = Header offset_Delta? offset_Delta? point_Delta? | SOURCE nameref_Int Header = addr_Delta + // in one Nat: difference of address to last recorded node << 3 + hasStartDiff + // one bit indicating whether there follows a start address delta << 2 hasEndDiff + // one bit indicating whether there follows an end address delta << 1 hasPoint // one bit indicating whether the new position has a point (i.e ^ position) // Nodes which have the same positions as their parents are omitted. // offset_Deltas give difference of start/end offset wrt to the // same offset in the previously recorded node (or 0 for the first recorded node) Delta = Int // Difference between consecutive offsets, SOURCE = 4 // Impossible as header, since addr_Delta = 0 implies that we refer to the // same tree as the previous one, but then hasStartDiff = 1 implies that // the tree's range starts later than the range of itself.

All elements of a position section are serialized as Ints

Standard Section: "Comments" Comment*

Comment = Length Bytes LongInt // Raw comment's bytes encoded as UTF-8, followed by the comment's coordinates.

************************************************************************************

class Object
trait Matchable
class Any

Type members

Classlikes

class NameTags

Tags used to serialize names, should update TastyFormat$.nameTagToString if a new constant is added

Tags used to serialize names, should update TastyFormat$.nameTagToString if a new constant is added

Companion
object
object NameTags extends NameTags
Companion
class

Value members

Concrete methods

def isLegalTag(tag: Int): Boolean

Useful for debugging

Useful for debugging

def isParamTag(tag: Int): Boolean
def isVersionCompatible(fileMajor: Int, fileMinor: Int, fileExperimental: Int, compilerMajor: Int, compilerMinor: Int, compilerExperimental: Int): Boolean

This method implements a binary relation (<:<) between two TASTy versions.

This method implements a binary relation (<:<) between two TASTy versions.

We label the lhs file and rhs compiler. if file <:< compiler then the TASTy file is valid to be read.

A TASTy version, e.g. v := 28.0-3 is composed of three fields:

  • v.major == 28
  • v.minor == 0
  • v.experimental == 3

TASTy versions have a partial order, for example, a <:< b and b <:< a are both false if

  • a and b have different major fields.
  • a and b have different experimental fields, both non-zero.

A zero value for an experimental field is considered to be a stable TASTy version, and has higher precedence than a TASTy version with the same major/minor fields but a non-zero experimental field.

We follow the given algorithm:

if file.major != compiler.major then
 return incompatible
if compiler.experimental == 0 then
 if file.experimental != 0 then
   return incompatible
 if file.minor > compiler.minor then
   return incompatible
 else
   return compatible
else invariant[compiler.experimental != 0]
 if file.experimental == compiler.experimental then
   if file.minor == compiler.minor then
     return compatible (all fields equal)
   else
     return incompatible
 else if file.experimental == 0,
   if file.minor < compiler.minor then
     return compatible (an experimental version can read a previous released version)
   else
     return incompatible (an experimental version cannot read its own minor version or any later version)
 else invariant[file.experimental is non-0 and different than compiler.experimental]
   return incompatible

Should be kept in sync with NameTags.

Should be kept in sync with NameTags. Converts constants to a String representing their identifier, or NotANameTag(tag) if unrecognised.

For debugging purposes when unpickling names in a TASTy file.

def numRefs(tag: Int): Int
Returns

If non-negative, the number of leading references (represented as nats) of a length/trees entry. If negative, minus the number of leading non-reference trees.

Concrete fields

final val ABSTRACT: 9
final val ALTERNATIVE: 151
final val ANDtype: 165
final val ANNOTATEDtpt: 154
final val ANNOTATEDtype: 153
final val ANNOTATION: 173
final val APPLIEDtpt: 162
final val APPLIEDtype: 161
final val APPLY: 136
final val ARTIFACT: 24
final val ASSIGN: 139
final val ASTsSection: "ASTs"
final val BIND: 150
final val BLOCK: 140
final val BOUNDED: 102
final val BYNAMEtpt: 94
final val BYNAMEtype: 93
final val BYTEconst: 67
final val CASE: 12
final val CASEDEF: 155
final val CASEaccessor: 27
final val CHARconst: 69
final val CLASSconst: 92
final val CONTRAVARIANT: 29
final val COVARIANT: 28
final val CommentsSection: "Comments"
final val DEFDEF: 130
final val DOUBLEconst: 73
final val EMPTYCLAUSE: 45
final val ENUM: 21
final val ERASED: 34
final val EXPORT: 177
final val EXPORTED: 39
final val EXTENSION: 36

Natural Number.

Natural Number. The ExperimentalVersion allows for experimentation with changes to TASTy without committing to any guarantees of compatibility.

A zero value indicates that the TASTy version is from a stable, final release.

A strictly positive value indicates that the TASTy version is experimental. An experimental TASTy file can only be read by a tool with the same version. However, tooling with an experimental TASTy version is able to read final TASTy documents if the file's MinorVersion is strictly less than the current value.

final val FALSEconst: 3
final val FIELDaccessor: 26
final val FINAL: 10
final val FLOATconst: 72
final val GIVEN: 37
final val HASDEFAULT: 31
final val HOLE: 255
final val IDENT: 110
final val IDENTtpt: 111
final val IF: 141
final val IMPLICIT: 13
final val IMPLICITarg: 97
final val IMPORT: 132
final val IMPORTED: 75
final val INFIX: 43
final val INLINE: 17
final val INLINED: 147
final val INLINEPROXY: 16
final val INTERNAL: 7
final val INTconst: 70
final val INVISIBLE: 44
final val LAMBDA: 142
final val LAMBDAtpt: 171
final val LAZY: 14
final val LOCAL: 22
final val LONGconst: 71
final val MACRO: 33
final val MATCH: 143
final val MATCHCASEtype: 192
final val MATCHtpt: 191
final val MATCHtype: 190
final val METHODtype: 180
final val MUTABLE: 25
final val MajorVersion: Int

Natural number.

Natural number. Each increment of the MajorVersion begins a new series of backward compatible TASTy versions.

A TASTy file in either the preceeding or succeeding series is incompatible with the current value.

final val MinorVersion: Int

Natural number.

Natural number. Each increment of the MinorVersion, within a series declared by the MajorVersion, breaks forward compatibility, but remains backwards compatible, with all preceeding MinorVersion.

final val NAMEDARG: 119
final val NEW: 95
final val NULLconst: 5
final val OBJECT: 19
final val OPAQUE: 35
final val OPEN: 40
final val ORtype: 167
final val OVERRIDE: 15
final val PACKAGE: 128
final val PARAM: 134
final val PARAMalias: 41
final val PARAMsetter: 38
final val PARAMtype: 172
final val POLYtype: 169
final val PRIVATE: 6
final val PRIVATEqualified: 98
final val PROTECTED: 8
final val PROTECTEDqualified: 99
final val PositionsSection: "Positions"
final val QUALTHIS: 91
final val RECthis: 66
final val RECtype: 100
final val REFINEDtpt: 160
final val REFINEDtype: 159
final val RENAMED: 76
final val REPEATED: 149
final val RETURN: 144
final val SEALED: 11
final val SELECT: 112
final val SELECTin: 176
final val SELECTouter: 148
final val SELECTtpt: 113
final val SELFDEF: 118
final val SHAREDterm: 60
final val SHAREDtype: 61
final val SHORTconst: 68
final val SINGLETONtpt: 101
final val SOURCE: 4
final val SPLITCLAUSE: 46
final val STABLE: 32
final val STATIC: 18
final val STRINGconst: 74
final val SUPER: 157
final val SUPERtype: 158
final val SYNTHETIC: 23
final val TEMPLATE: 156
final val TERMREF: 115
final val TERMREFdirect: 62
final val TERMREFin: 174
final val TERMREFpkg: 64
final val TERMREFsymbol: 114
final val THIS: 90
final val THROW: 96
final val TRAIT: 20
final val TRANSPARENT: 42
final val TRUEconst: 4
final val TRY: 146
final val TYPEAPPLY: 137
final val TYPEBOUNDS: 163
final val TYPEBOUNDStpt: 164
final val TYPED: 138
final val TYPEDEF: 131
final val TYPELAMBDAtype: 170
final val TYPEPARAM: 133
final val TYPEREF: 117
final val TYPEREFdirect: 63
final val TYPEREFin: 175
final val TYPEREFpkg: 65
final val TYPEREFsymbol: 116
final val UNAPPLY: 152
final val UNITconst: 2
final val VALDEF: 129
final val WHILE: 145
final val firstASTTreeTag: 90
final val firstLengthTreeTag: 128
final val firstNatASTTreeTag: 110
final val firstNatTreeTag: 60
final val firstSimpleTreeTag: 2
final val header: Array[Int]

The first four bytes of a TASTy file, followed by four values: - MajorVersion: Int - see definition in TastyFormat - MinorVersion: Int - see definition in TastyFormat - ExperimentalVersion: Int - see definition in TastyFormat - ToolingVersion: String - arbitrary length string representing the tool that produced the TASTy.

The first four bytes of a TASTy file, followed by four values: - MajorVersion: Int - see definition in TastyFormat - MinorVersion: Int - see definition in TastyFormat - ExperimentalVersion: Int - see definition in TastyFormat - ToolingVersion: String - arbitrary length string representing the tool that produced the TASTy.