doc: layout algorithm (#3915)

The layout algorithm, while somewhat finicky, is (unfortunately)
necessary for C code to interface with lean structures. This adds a
(AFAIK) complete description of the layout algorithm, including a worked
example large enough to make it possible to reconstruct the whole
decision diagram.

---------

Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
This commit is contained in:
Mario Carneiro 2024-05-03 07:47:23 -04:00 committed by GitHub
parent 00cf5771f3
commit 2db602c209
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -53,10 +53,59 @@ In the case of `@[extern]` all *irrelevant* types are removed first; see next se
Its runtime value is either a pointer to an opaque bignum object or, if the lowest bit of the "pointer" is 1 (`lean_is_scalar`), an encoded unboxed natural number (`lean_box`/`lean_unbox`).
* A universe `Sort u`, type constructor `... → Sort u`, or proposition `p : Prop` is *irrelevant* and is either statically erased (see above) or represented as a `lean_object *` with the runtime value `lean_box(0)`
* Any other type is represented by `lean_object *`.
Its runtime value is a pointer to an object of a subtype of `lean_object` (see respective declarations in `lean.h`) or the unboxed value `lean_box(cidx)` for the `cidx`th constructor of an inductive type if this constructor does not have any relevant parameters.
Its runtime value is a pointer to an object of a subtype of `lean_object` (see the "Inductive types" section below) or the unboxed value `lean_box(cidx)` for the `cidx`th constructor of an inductive type if this constructor does not have any relevant parameters.
Example: the runtime value of `u : Unit` is always `lean_box(0)`.
#### Inductive types
For inductive types which are in the fallback `lean_object *` case above and not trivial constructors, the type is stored as a `lean_ctor_object`, and `lean_is_ctor` will return true. A `lean_ctor_object` stores the constructor index in the header, and the fields are stored in the `m_objs` portion of the object.
The memory order of the fields is derived from the types and order of the fields in the declaration. They are ordered as follows:
* Non-scalar fields stored as `lean_object *`
* Fields of type `USize`
* Other scalar fields, in decreasing order by size
Within each group the fields are ordered in declaration order. **Warning**: Trivial wrapper types still count toward a field being treated as non-scalar for this purpose.
* To access fields of the first kind, use `lean_ctor_get(val, i)` to get the `i`th non-scalar field.
* To access `USize` fields, use `lean_ctor_get_usize(val, n+i)` to get the `i`th usize field and `n` is the total number of fields of the first kind.
* To access other scalar fields, use `lean_ctor_get_uintN(val, off)` or `lean_ctor_get_usize(val, off)` as appropriate. Here `off` is the byte offset of the field in the structure, starting at `n*sizeof(void*)` where `n` is the number of fields of the first two kinds.
For example, a structure such as
```lean
structure S where
ptr_1 : Array Nat
usize_1 : USize
sc64_1 : UInt64
ptr_2 : { x : UInt64 // x > 0 } -- wrappers don't count as scalars
sc64_2 : Float -- `Float` is 64 bit
sc8_1 : Bool
sc16_1 : UInt16
sc8_2 : UInt8
sc64_3 : UInt64
usize_2 : USize
ptr_3 : Char -- trivial wrapper around `UInt32`
sc32_1 : UInt32
sc16_2 : UInt16
```
would get re-sorted into the following memory order:
* `S.ptr_1` - `lean_ctor_get(val, 0)`
* `S.ptr_2` - `lean_ctor_get(val, 1)`
* `S.ptr_3` - `lean_ctor_get(val, 2)`
* `S.usize_1` - `lean_ctor_get_usize(val, 3)`
* `S.usize_2` - `lean_ctor_get_usize(val, 4)`
* `S.sc64_1` - `lean_ctor_get_uint64(val, sizeof(void*)*5)`
* `S.sc64_2` - `lean_ctor_get_float(val, sizeof(void*)*5 + 8)`
* `S.sc64_3` - `lean_ctor_get_uint64(val, sizeof(void*)*5 + 16)`
* `S.sc32_1` - `lean_ctor_get_uint32(val, sizeof(void*)*5 + 24)`
* `S.sc16_1` - `lean_ctor_get_uint16(val, sizeof(void*)*5 + 28)`
* `S.sc16_2` - `lean_ctor_get_uint16(val, sizeof(void*)*5 + 30)`
* `S.sc8_1` - `lean_ctor_get_uint8(val, sizeof(void*)*5 + 32)`
* `S.sc8_2` - `lean_ctor_get_uint8(val, sizeof(void*)*5 + 33)`
### Borrowing
By default, all `lean_object *` parameters of an `@[extern]` function are considered *owned*, i.e. the external code is passed a "virtual RC token" and is responsible for passing this token along to another consuming function (exactly once) or freeing it via `lean_dec`.