From 2db602c2092dc8aaf4e5100aff7c059a382b52cf Mon Sep 17 00:00:00 2001 From: Mario Carneiro Date: Fri, 3 May 2024 07:47:23 -0400 Subject: [PATCH] doc: layout algorithm (#3915) The layout algorithm, while somewhat finicky, is (unfortunately) necessary for C code to interface with lean structures. This adds a (AFAIK) complete description of the layout algorithm, including a worked example large enough to make it possible to reconstruct the whole decision diagram. --------- Co-authored-by: Sebastian Ullrich --- doc/dev/ffi.md | 51 +++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 50 insertions(+), 1 deletion(-) diff --git a/doc/dev/ffi.md b/doc/dev/ffi.md index 632965f216..790d88556e 100644 --- a/doc/dev/ffi.md +++ b/doc/dev/ffi.md @@ -53,10 +53,59 @@ In the case of `@[extern]` all *irrelevant* types are removed first; see next se Its runtime value is either a pointer to an opaque bignum object or, if the lowest bit of the "pointer" is 1 (`lean_is_scalar`), an encoded unboxed natural number (`lean_box`/`lean_unbox`). * A universe `Sort u`, type constructor `... → Sort u`, or proposition `p : Prop` is *irrelevant* and is either statically erased (see above) or represented as a `lean_object *` with the runtime value `lean_box(0)` * Any other type is represented by `lean_object *`. - Its runtime value is a pointer to an object of a subtype of `lean_object` (see respective declarations in `lean.h`) or the unboxed value `lean_box(cidx)` for the `cidx`th constructor of an inductive type if this constructor does not have any relevant parameters. + Its runtime value is a pointer to an object of a subtype of `lean_object` (see the "Inductive types" section below) or the unboxed value `lean_box(cidx)` for the `cidx`th constructor of an inductive type if this constructor does not have any relevant parameters. Example: the runtime value of `u : Unit` is always `lean_box(0)`. +#### Inductive types + +For inductive types which are in the fallback `lean_object *` case above and not trivial constructors, the type is stored as a `lean_ctor_object`, and `lean_is_ctor` will return true. A `lean_ctor_object` stores the constructor index in the header, and the fields are stored in the `m_objs` portion of the object. + +The memory order of the fields is derived from the types and order of the fields in the declaration. They are ordered as follows: + +* Non-scalar fields stored as `lean_object *` +* Fields of type `USize` +* Other scalar fields, in decreasing order by size + +Within each group the fields are ordered in declaration order. **Warning**: Trivial wrapper types still count toward a field being treated as non-scalar for this purpose. + +* To access fields of the first kind, use `lean_ctor_get(val, i)` to get the `i`th non-scalar field. +* To access `USize` fields, use `lean_ctor_get_usize(val, n+i)` to get the `i`th usize field and `n` is the total number of fields of the first kind. +* To access other scalar fields, use `lean_ctor_get_uintN(val, off)` or `lean_ctor_get_usize(val, off)` as appropriate. Here `off` is the byte offset of the field in the structure, starting at `n*sizeof(void*)` where `n` is the number of fields of the first two kinds. + +For example, a structure such as +```lean +structure S where + ptr_1 : Array Nat + usize_1 : USize + sc64_1 : UInt64 + ptr_2 : { x : UInt64 // x > 0 } -- wrappers don't count as scalars + sc64_2 : Float -- `Float` is 64 bit + sc8_1 : Bool + sc16_1 : UInt16 + sc8_2 : UInt8 + sc64_3 : UInt64 + usize_2 : USize + ptr_3 : Char -- trivial wrapper around `UInt32` + sc32_1 : UInt32 + sc16_2 : UInt16 +``` +would get re-sorted into the following memory order: + +* `S.ptr_1` - `lean_ctor_get(val, 0)` +* `S.ptr_2` - `lean_ctor_get(val, 1)` +* `S.ptr_3` - `lean_ctor_get(val, 2)` +* `S.usize_1` - `lean_ctor_get_usize(val, 3)` +* `S.usize_2` - `lean_ctor_get_usize(val, 4)` +* `S.sc64_1` - `lean_ctor_get_uint64(val, sizeof(void*)*5)` +* `S.sc64_2` - `lean_ctor_get_float(val, sizeof(void*)*5 + 8)` +* `S.sc64_3` - `lean_ctor_get_uint64(val, sizeof(void*)*5 + 16)` +* `S.sc32_1` - `lean_ctor_get_uint32(val, sizeof(void*)*5 + 24)` +* `S.sc16_1` - `lean_ctor_get_uint16(val, sizeof(void*)*5 + 28)` +* `S.sc16_2` - `lean_ctor_get_uint16(val, sizeof(void*)*5 + 30)` +* `S.sc8_1` - `lean_ctor_get_uint8(val, sizeof(void*)*5 + 32)` +* `S.sc8_2` - `lean_ctor_get_uint8(val, sizeof(void*)*5 + 33)` + ### Borrowing By default, all `lean_object *` parameters of an `@[extern]` function are considered *owned*, i.e. the external code is passed a "virtual RC token" and is responsible for passing this token along to another consuming function (exactly once) or freeing it via `lean_dec`.