class TastyBuffer(initialSize: Int)

A byte array buffer that can be filled with bytes or natural numbers in TASTY format, and that supports reading and patching addresses represented as natural numbers.

Notation:

We use BNF notation. Terminal symbols start with at least two consecutive upper case letters. Each terminal is represented as a single byte tag. Non-terminals are mixed case. Prefixes of the form lower case letter*_ only explain semantic content; they can be dropped without changing the grammar.


LongInt       = Digit* StopDigit  -- big endian 2's complement, value fits in a Long w/o overflow
Int           = LongInt           -- big endian 2's complement, fits in an Int w/o overflow
Nat           = LongInt           -- non-negative value, fits in an Int without overflow
Digit         = 0 | ... | 127
StopDigit     = 128 | ... | 255   -- value = digit - 128
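The Nat rule above can be sketched as a small codec. This is a minimal illustration, not the library's implementation: `NatCodec`, `writeNat` and `readNat` are hypothetical names. It assumes each byte carries 7 payload bits, most significant group first, and that the final byte has its high bit set (value = digit - 128).

```scala
object NatCodec {
  /** Encode a non-negative Int as Digit* StopDigit (big endian, 7 bits per byte). */
  def writeNat(x: Int): Array[Byte] = {
    require(x >= 0, "a Nat must be non-negative")
    // The least significant 7 bits form the stop digit (high bit set).
    var groups = List((x & 0x7f) | 0x80)
    var rest = x >>> 7
    // Prepend the remaining 7-bit groups so the most significant comes first.
    while (rest != 0) {
      groups = (rest & 0x7f) :: groups
      rest = rest >>> 7
    }
    groups.map(_.toByte).toArray
  }

  /** Decode a Nat starting at `start`; returns (value, index after the stop digit). */
  def readNat(bytes: Array[Byte], start: Int): (Int, Int) = {
    var i = start
    var acc = 0
    var done = false
    while (!done) {
      val b = bytes(i) & 0xff
      i += 1
      acc = (acc << 7) | (b & 0x7f)
      done = (b & 0x80) != 0 // a byte in 128..255 ends the number
    }
    (acc, i)
  }
}
```

Under these assumptions, 300 = 2 * 128 + 44 is written as the two bytes 2 and 128 + 44 = 172.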


File          = Header majorVersion_Nat minorVersion_Nat experimentalVersion_Nat VersionString UUID
                nameTable_Length Name* Section*
Header        = 0x5CA1AB1F
UUID          = Byte*16                 -- random UUID
VersionString = Length UTF8-CodePoint*  -- string that represents the compiler that produced the TASTy

Section       = NameRef Length Bytes
Length        = Nat                     -- length of rest of entry in bytes
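As a small illustration of the Header rule, the sketch below checks whether a byte array starts with the magic number 0x5CA1AB1F. `MagicCheck` and `hasMagic` are hypothetical names, and the code assumes the magic number is stored as four bytes in the order written (0x5C first).

```scala
object MagicCheck {
  // Assumption: the header is the four bytes 5C A1 AB 1F, in that order.
  private val Magic: Array[Int] = Array(0x5C, 0xA1, 0xAB, 0x1F)

  /** True if `bytes` is long enough and begins with the magic number. */
  def hasMagic(bytes: Array[Byte]): Boolean =
    bytes.length >= Magic.length &&
      Magic.indices.forall(i => (bytes(i) & 0xff) == Magic(i))
}
```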

Name          = UTF8 Length UTF8-CodePoint*
                QUALIFIED Length qualified_NameRef selector_NameRef  -- A.B
                EXPANDED Length qualified_NameRef selector_NameRef  -- A$$B, semantically a NameKinds.ExpandedName
                EXPANDPREFIX Length qualified_NameRef selector_NameRef  -- A$B, prefix of expanded name, see NameKinds.ExpandPrefixName

                UNIQUE Length separator_NameRef uniqid_Nat underlying_NameRef?  -- Unique name A
                DEFAULTGETTER Length underlying_NameRef index_Nat  -- DefaultGetter$

                SUPERACCESSOR Length underlying_NameRef  -- super$A
                INLINEACCESSOR Length underlying_NameRef  -- inline$A
                OBJECTCLASS Length underlying_NameRef  -- A$ (name of the module class for module A)
                BODYRETAINER Length underlying_NameRef  -- A$retainedBody

                SIGNED Length original_NameRef resultSig_NameRef ParamSig*  -- name + signature
                TARGETSIGNED Length original_NameRef target_NameRef resultSig_NameRef ParamSig*

ParamSig      = Int  // If negative, the absolute value represents the length of a type parameter section
                     // If positive, this is a NameRef for the fully qualified name of a term parameter.
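The two cases of ParamSig can be made explicit with a small decoder. This is only a sketch: `ParamSigs`, `TypeParamSection` and `TermParamName` are hypothetical names, and treating zero as a NameRef is an assumption, since the rule above only covers negative and positive values.

```scala
object ParamSigs {
  sealed trait ParamSig
  /** A type parameter section with the given number of parameters. */
  final case class TypeParamSection(length: Int) extends ParamSig
  /** A term parameter, identified by a NameRef to its fully qualified name. */
  final case class TermParamName(nameRef: Int) extends ParamSig

  /** Interpret a raw Int read from a SIGNED or TARGETSIGNED name. */
  def decode(raw: Int): ParamSig =
    if (raw < 0) TypeParamSection(-raw) // negative: absolute value is a length
    else TermParamName(raw)             // assumption: zero is handled like a positive NameRef
}
```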

NameRef = Nat // ordinal number of name in name table, starting from 1.

Note: Unqualified names in the name table are strings. The context decides whether a name is a type-name or a term-name. The same string can represent both.

Standard-Section: "ASTs" TopLevelStat*

TopLevelStat  = PACKAGE Length Path TopLevelStat*  -- package path { topLevelStats }
                Stat

Stat          = Term
                ValOrDefDef
                TYPEDEF Length NameRef (type_Term | Template) Modifier*  -- modifiers type name (= type | bounds) | modifiers class name template
                IMPORT Length qual_Term Selector*  -- import qual selectors
                EXPORT Length qual_Term Selector*  -- export qual selectors
ValOrDefDef   = VALDEF Length NameRef type_Term rhs_Term? Modifier*  -- modifiers val name : type (= rhs)?
                DEFDEF Length NameRef Param* returnType_Term rhs_Term? Modifier*  -- modifiers def name [typeparams] paramss : returnType (= rhs)?
Selector      = IMPORTED name_NameRef  -- name, "_" for normal wildcards, "" for given wildcards
                RENAMED to_NameRef  -- => name
                BOUNDED type_Term  -- type bound

TypeParam     = TYPEPARAM Length NameRef type_Term Modifier*  -- modifiers name bounds
TermParam     = PARAM Length NameRef type_Term Modifier*  -- modifiers name : type.
                EMPTYCLAUSE  -- an empty parameter clause ()
                SPLITCLAUSE  -- splits two non-empty parameter clauses of the same kind
Param         = TypeParam
                TermParam
Template      = TEMPLATE Length TypeParam* TermParam* parent_Term* Self? Stat*  -- [typeparams] paramss extends parents { self => stats }, where Stat* always starts with the primary constructor.
Self          = SELFDEF selfName_NameRef selfType_Term  -- selfName : selfType

Term          = Path  -- Paths represent both types and terms
                IDENT NameRef Type  -- Used when term ident’s type is not a TermRef
                SELECT possiblySigned_NameRef qual_Term  -- qual.name
                SELECTin Length possiblySigned_NameRef qual_Term owner_Type  -- qual.name, referring to a symbol declared in owner that has the given signature (see note below)
                QUALTHIS typeIdent_Tree  -- id.this, different from THIS in that it contains a qualifier ident with position.
                NEW clsType_Term  -- new cls
                THROW throwableExpr_Term  -- throw throwableExpr
                NAMEDARG paramName_NameRef arg_Term  -- paramName = arg
                APPLY Length fn_Term arg_Term*  -- fn(args)
                TYPEAPPLY Length fn_Term arg_Type*  -- fn[args]
                SUPER Length this_Term mixinTypeIdent_Tree?  -- super[mixin]
                TYPED Length expr_Term ascriptionType_Term  -- expr: ascription
                ASSIGN Length lhs_Term rhs_Term  -- lhs = rhs
                BLOCK Length expr_Term Stat*  -- { stats; expr }
                INLINED Length expr_Term call_Term? ValOrDefDef*  -- Inlined code from call, with given body expr and given bindings
                LAMBDA Length meth_Term target_Type?  -- Closure over method f of type target (omitted if target is a function type)
                IF Length [INLINE] cond_Term then_Term else_Term  -- inline? if cond then thenPart else elsePart
                MATCH Length (IMPLICIT | [INLINE] sel_Term) CaseDef*  -- (inline? sel | implicit) match caseDefs
                TRY Length expr_Term CaseDef* finalizer_Term?  -- try expr catch {caseDef} (finally finalizer)?
                RETURN Length meth_ASTRef expr_Term?  -- return expr?, methASTRef is the method being returned from
                WHILE Length cond_Term body_Term  -- while cond do body
                REPEATED Length elem_Type elem_Term*  -- Varargs argument of type elem
                SELECTouter Length levels_Nat qual_Term underlying_Type  -- Follow levels outer links, starting from qual, with given underlying type
  -- patterns:
                BIND Length boundName_NameRef patType_Type pat_Term  -- name @ pat, where patType is the type of the bound symbol
                ALTERNATIVE Length alt_Term*  -- alt1 | ... | altn as a pattern
                UNAPPLY Length fun_Term ImplicitArg* pat_Type pat_Term*  -- Unapply node fun(_: pat_Type)(implicitArgs) flowing into patterns pat.
  -- type trees:
                IDENTtpt NameRef Type  -- Used for all type idents
                SELECTtpt NameRef qual_Term  -- qual.name
                SINGLETONtpt ref_Term  -- ref.type
                REFINEDtpt Length underlying_Term refinement_Stat*  -- underlying {refinements}
                APPLIEDtpt Length tycon_Term arg_Term*  -- tycon [args]
                LAMBDAtpt Length TypeParam* body_Term  -- [TypeParams] => body
                TYPEBOUNDStpt Length low_Term high_Term?  -- >: low <: high
                ANNOTATEDtpt Length underlying_Term fullAnnotation_Term  -- underlying @ annotation
                MATCHtpt Length bound_Term? sel_Term CaseDef*  -- sel match { CaseDef } where bound is optional upper bound of all rhs
                BYNAMEtpt underlying_Term  -- => underlying
                SHAREDterm term_ASTRef  -- Link to previously serialized term
                HOLE Length idx_Nat arg_Tree*  -- Hole where a splice goes with sequence number idx, splice is applied to arguments args

CaseDef       = CASEDEF Length pat_Term rhs_Tree guard_Tree?  -- case pat if guard => rhs
ImplicitArg   = IMPLICITARG arg_Term  -- implicit unapply argument

ASTRef = Nat -- Byte position in AST payload

Path          = Constant
                TERMREFdirect sym_ASTRef  -- A reference to a local symbol (without a prefix). Reference is to definition node of symbol.
                TERMREFsymbol sym_ASTRef qual_Type  -- A reference qual.sym to a local member with prefix qual
                TERMREFpkg fullyQualified_NameRef  -- A reference to a package member with given fully qualified name
                TERMREF possiblySigned_NameRef qual_Type  -- A reference to a non-local member
                TERMREFin Length possiblySigned_NameRef qual_Type owner_Type  -- A reference to a non-local symbol declared in owner that has the given signature (see note below)
                THIS clsRef_Type  -- cls.this
                RECthis recType_ASTRef  -- The this in a recursive refined type recType.
                SHAREDtype path_ASTRef  -- link to previously serialized path

Constant      = UNITconst  -- ()
                FALSEconst  -- false
                TRUEconst  -- true
                BYTEconst Int  -- A byte number
                SHORTconst Int  -- A short number
                CHARconst Nat  -- A character
                INTconst Int  -- An int number
                LONGconst LongInt  -- A long number
                FLOATconst Int  -- A float number
                DOUBLEconst LongInt  -- A double number
                STRINGconst NameRef  -- A string literal
                NULLconst  -- null
                CLASSconst Type  -- classOf[Type]

Type          = Path  -- Paths represent both types and terms
                TYPEREFdirect sym_ASTRef  -- A reference to a local symbol (without a prefix). Reference is to definition node of symbol.
                TYPEREFsymbol sym_ASTRef qual_Type  -- A reference qual.sym to a local member with prefix qual
                TYPEREFpkg fullyQualified_NameRef  -- A reference to a package member with given fully qualified name
                TYPEREF NameRef qual_Type  -- A reference to a non-local member
                TYPEREFin Length NameRef qual_Type namespace_Type  -- A reference to a non-local member that's private in namespace.
                RECtype parent_Type  -- A wrapper for recursive refined types
                SUPERtype Length this_Type underlying_Type  -- A super type reference to underlying
                REFINEDtype Length refinement_NameRef underlying_Type info_Type  -- underlying { refinement_name : info }
                APPLIEDtype Length tycon_Type arg_Type*  -- tycon[args]
                TYPEBOUNDS Length lowOrAlias_Type high_Type? Variance*  -- = alias or >: low <: high, possibly with variances of lambda parameters
                ANNOTATEDtype Length underlying_Type annotation_Term  -- underlying @ annotation
                ANDtype Length left_Type right_Type  -- left & right
                ORtype Length left_Type right_Type  -- left | right
                MATCHtype Length bound_Type sel_Type case_Type*  -- sel match {cases} with optional upper bound
                MATCHCASEtype Length pat_Type rhs_Type  -- match cases are MATCHCASEtypes or TYPELAMBDAtypes over MATCHCASEtypes
                BIND Length boundName_NameRef bounds_Type Modifier*  -- boundName @ bounds, for type-variables defined in a type pattern
                BYNAMEtype underlying_Type  -- => underlying
                PARAMtype Length binder_ASTRef paramNum_Nat  -- A reference to parameter # paramNum in lambda type binder
                POLYtype Length result_Type TypesNames  -- A polymorphic method type [TypesNames]result, used in refinements
                METHODtype Length result_Type TypesNames Modifier*  -- A method type (Modifier* TypesNames)result, needed for refinements, with optional modifiers for the parameters
                TYPELAMBDAtype Length result_Type TypesNames  -- A type lambda [TypesNames] => result
                SHAREDtype type_ASTRef  -- link to previously serialized type
TypesNames    = TypeName*
TypeName      = typeOrBounds_ASTRef paramName_NameRef  -- (termName: type) or (typeName bounds)

Modifier      = PRIVATE  -- private
                PROTECTED  -- protected
                PRIVATEqualified qualifier_Type  -- private[qualifier] (to be dropped(?)
                PROTECTEDqualified qualifier_Type  -- protected[qualifier] (to be dropped(?)
                ABSTRACT  -- abstract
                FINAL  -- final
                SEALED  -- sealed
                CASE  -- case (for classes or objects)
                IMPLICIT  -- implicit
                GIVEN  -- given
                ERASED  -- erased
                LAZY  -- lazy
                OVERRIDE  -- override
                OPAQUE  -- opaque, also used for classes containing opaque aliases
                INLINE  -- inline
                MACRO  -- Inline method containing toplevel splices
                INLINEPROXY  -- Symbol of binding with an argument to an inline method as rhs (TODO: do we still need this?)
                STATIC  -- Mapped to static Java member
                OBJECT  -- An object or its class
                TRAIT  -- A trait
                ENUM  -- An enum class or enum case
                LOCAL  -- private[this] or protected[this], used in conjunction with PRIVATE or PROTECTED
                SYNTHETIC  -- Generated by Scala compiler
                ARTIFACT  -- To be tagged Java Synthetic
                MUTABLE  -- A var
                FIELDaccessor  -- A getter or setter (note: the corresponding field is not serialized)
                CASEaccessor  -- A getter for a case class parameter
                COVARIANT  -- A type parameter marked “+”
                CONTRAVARIANT  -- A type parameter marked “-”
                HASDEFAULT  -- Parameter with default arg; method with default parameters (default arguments are separate methods with DEFAULTGETTER names)
                STABLE  -- Method that is assumed to be stable, i.e. its applications are legal paths
                EXTENSION  -- An extension method
                PARAMsetter  -- The setter part x_= of a var parameter x which itself is pickled as a PARAM
                PARAMalias  -- Parameter is alias of a superclass parameter
                EXPORTED  -- An export forwarder
                OPEN  -- an open class
                INVISIBLE  -- invisible during typechecking
                Annotation

Variance      = STABLE  -- invariant
              | COVARIANT
              | CONTRAVARIANT

Annotation = ANNOTATION Length tycon_Type fullAnnotation_Term -- An annotation, given (class) type of constructor, and full application tree

Note: The signature of a SELECTin or TERMREFin node is the signature of the selected symbol, not the signature of the reference. The latter undergoes an asSeenFrom but the former does not.

Note: Tree tags are grouped into 5 categories that determine what follows, and thus allow the size of the tagged tree to be computed in a generic way.

Category 1 (tags 1-59)   : tag
Category 2 (tags 60-89)  : tag Nat
Category 3 (tags 90-109) : tag AST
Category 4 (tags 110-127): tag Nat AST
Category 5 (tags 128-255): tag Length
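The category of a tag follows directly from the ranges listed above. The sketch below is illustrative only; `TagCategory` and `category` are hypothetical names, not part of the documented API.

```scala
object TagCategory {
  /** Map a tree tag (1 to 255) to its category, using the ranges listed above. */
  def category(tag: Int): Int = {
    require(tag >= 1 && tag <= 255, s"not a valid tag: $tag")
    if (tag <= 59) 1        // tag
    else if (tag <= 89) 2   // tag Nat
    else if (tag <= 109) 3  // tag AST
    else if (tag <= 127) 4  // tag Nat AST
    else 5                  // tag Length
  }
}
```

A generic tree walker could dispatch on this value to decide how many bytes to skip, without knowing the meaning of the individual tag.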

Standard-Section: "Positions" LinesSizes Assoc*

LinesSizes = Nat Nat* // Number of lines followed by the size of each line not counting the trailing \n

Assoc         = Header offset_Delta? offset_Delta? point_Delta?
              | SOURCE nameref_Int
Header        = addr_Delta +    // in one Nat: difference of address to last recorded node << 3 +
                hasStartDiff +  // one bit indicating whether there follows a start address delta << 2
                hasEndDiff +    // one bit indicating whether there follows an end address delta << 1
                hasPoint        // one bit indicating whether the new position has a point (i.e. ^ position)
                                // Nodes which have the same positions as their parents are omitted.
                                // offset_Deltas give difference of start/end offset with respect to the
                                // same offset in the previously recorded node (or 0 for the first recorded node)
Delta         = Int             // Difference between consecutive offsets
SOURCE        = 4               // Impossible as header, since addr_Delta = 0 implies that we refer to the
                                // same tree as the previous one, but then hasStartDiff = 1 implies that
                                // the tree's range starts later than the range of itself.
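The bit layout of a position Header can be sketched as a pack/unpack pair. `PositionHeader`, `Header`, `pack` and `unpack` are hypothetical names used for illustration; the layout assumed is the one described above (addr_Delta shifted left by 3, followed by the three flag bits).

```scala
object PositionHeader {
  final case class Header(addrDelta: Int, hasStartDiff: Boolean, hasEndDiff: Boolean, hasPoint: Boolean)

  /** The reserved value that marks a SOURCE entry instead of a header. */
  val Source: Int = 4

  /** Combine the address delta and the three flags into one number. */
  def pack(h: Header): Int =
    (h.addrDelta << 3) |
      (if (h.hasStartDiff) 4 else 0) |
      (if (h.hasEndDiff) 2 else 0) |
      (if (h.hasPoint) 1 else 0)

  /** Split a header number back into its four components. */
  def unpack(n: Int): Header =
    Header(n >>> 3, (n & 4) != 0, (n & 2) != 0, (n & 1) != 0)
}
```

With this layout, a header with addr_Delta = 0 and only hasStartDiff set would pack to 4, which is why that value is free to act as the SOURCE marker.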

All elements of a position section are serialized as Ints

Standard-Section: "Comments" Comment*

Comment = Length Bytes LongInt // Raw comment's bytes encoded as UTF-8, followed by the comment's coordinates.


sealed abstract case class TastyHeader(uuid: UUID, majorVersion: Int, minorVersion: Int, experimentalVersion: Int, toolingVersion: String)

The Tasty Header consists of five fields:

  • uuid: contains a hash of the sections of the TASTy file

  • majorVersion: matching the TASTy format version that last broke backwards compatibility

  • minorVersion: matching the TASTy format version that last broke forward compatibility

  • experimentalVersion:

      • 0 for a final compiler version

      • positive for a version between minor versions, where forward compatibility has been broken since the previous stable version

  • toolingVersion: arbitrary string representing the tooling that produced the TASTy

class TastyReader(val bytes: Array[Byte], start: Int, end: Int, val base: Int)

A byte array reader that reads bytes, natural numbers, and addresses in TASTY format from a given range of an existing array.

Value parameters:

base
  The index referenced by the logical zero address Addr(0)

bytes
  The array containing data

end
  The position one greater than the last byte to be read

start
  The position from which to read
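How the four constructor parameters relate can be sketched with a much smaller reader. `MiniReader` and its methods are hypothetical and do not reproduce the TastyReader API. The sketch only assumes that a logical address is the array index minus `base`, and that reading stops at `end`.

```scala
/** A toy reader over bytes(start until end); logical address 0 maps to index `base`. */
final class MiniReader(bytes: Array[Byte], start: Int, end: Int, base: Int) {
  private var bp = start // index of the next byte to read

  /** The logical address of the next byte, relative to `base`. */
  def currentAddr: Int = bp - base

  /** True once every byte before `end` has been consumed. */
  def isAtEnd: Boolean = bp == end

  /** Read one byte as an unsigned value and advance. */
  def readByte(): Int = {
    require(bp < end, "read past end")
    val b = bytes(bp) & 0xff
    bp += 1
    b
  }

  /** Jump to a logical address. */
  def goto(addr: Int): Unit = bp = base + addr
}
```

For example, `new MiniReader(Array[Byte](10, 20, 30, 40), 1, 3, 1)` starts at logical address 0, reads the values 20 and 30, and is then at its end.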