doc: replace ffi.md with links to the reference manual (#11737)
This PR replaces `ffi.md` with links to the corresponding sections of the manual, so we don't have to keep two documents up to date. A corresponding reference manual PR re-synchronizes them: https://github.com/leanprover/reference-manual/pull/714
This commit is contained in:
parent
cee149cc1f
commit
73ff198d11
1 changed files with 5 additions and 185 deletions
190
doc/dev/ffi.md
190
doc/dev/ffi.md
|
|
@ -1,189 +1,9 @@
|
|||
# Foreign Function Interface
|
||||
|
||||
NOTE: The current interface was designed for internal use in Lean and should be considered **unstable**.
|
||||
It will be refined and extended in the future.
|
||||
The Lean FFI documentation is now part of the [Lean language reference](https://lean-lang.org/doc/reference/latest/).
|
||||
|
||||
As Lean is written partially in Lean itself and partially in C++, it offers efficient interoperability between the two languages (or rather, between Lean and any language supporting C interfaces).
|
||||
This support is however currently limited to transferring Lean data types; in particular, it is not possible yet to pass or return compound data structures such as C `struct`s by value from or to Lean.
|
||||
* [General FFI](https://lean-lang.org/doc/reference/latest/find/?domain=Verso.Genre.Manual.section&name=ffi)
|
||||
* [Representation of inductive types](https://lean-lang.org/doc/reference/latest/find/?domain=Verso.Genre.Manual.section&name=inductive-types-ffi)
|
||||
* [String](https://lean-lang.org/doc/reference/latest/find/?domain=Verso.Genre.Manual.section&name=string-ffi)
|
||||
* [Array](https://lean-lang.org/doc/reference/latest/find/?domain=Verso.Genre.Manual.section&name=array-ffi)
|
||||
|
||||
There are two primary attributes for interoperating with other languages:
|
||||
* `@[extern "sym"] constant leanSym : ...` binds a Lean declaration to the external symbol `sym`.
|
||||
It can also be used with `def` to provide an internal definition, but ensuring consistency of both definitions is up to the user.
|
||||
* `@[export sym] def leanSym : ...` exports `leanSym` under the unmangled symbol name `sym`.
|
||||
|
||||
For simple examples of how to call foreign code from Lean and vice versa, see <https://github.com/leanprover/lean4/blob/master/src/lake/examples/ffi> and <https://github.com/leanprover/lean4/blob/master/src/lake/examples/reverse-ffi>, respectively.
|
||||
|
||||
## The Lean ABI
|
||||
|
||||
The Lean Application Binary Interface (ABI) describes how the signature of a Lean declaration is encoded as a native calling convention.
|
||||
It is based on the standard C ABI and calling convention of the target platform.
|
||||
For a Lean declaration marked with either `@[extern "sym"]` or `@[export sym]` for some symbol name `sym`, let `α₁ → ... → αₙ → β` be the normalized declaration's type.
|
||||
If `n` is 0, the corresponding C declaration is
|
||||
```c
|
||||
extern s sym;
|
||||
```
|
||||
where `s` is the C translation of `β` as specified in the next section.
|
||||
In the case of an `@[extern]` definition, the symbol's value is guaranteed to be initialized only after calling the Lean module's initializer or that of an importing module; see [Initialization](#initialization).
|
||||
|
||||
If `n` is greater than 0, the corresponding C declaration is
|
||||
```c
|
||||
s sym(t₁, ..., tₘ);
|
||||
```
|
||||
where the parameter types `tᵢ` are the C translation of the `αᵢ` as in the next section.
|
||||
In the case of `@[extern]` all *irrelevant* types are removed first; see next section.
|
||||
|
||||
### Translating Types from Lean to C
|
||||
|
||||
* The integer types `UInt8`, ..., `UInt64`, `USize` are represented by the C types `uint8_t`, ..., `uint64_t`, `size_t`, respectively
|
||||
* `Char` is represented by `uint32_t`
|
||||
* `Float` is represented by `double`
|
||||
* An *enum* inductive type of at least 2 and at most 2^32 constructors, each of which with no parameters, is represented by the first type of `uint8_t`, `uint16_t`, `uint32_t` that is sufficient to represent all constructor indices.
|
||||
|
||||
For example, the type `Bool` is represented as `uint8_t` with values `0` for `false` and `1` for `true`.
|
||||
* `Decidable α` is represented the same way as `Bool`
|
||||
* An inductive type with a *trivial structure*, that is,
|
||||
* it is none of the types described above
|
||||
* it is not marked `unsafe`
|
||||
* it has a single constructor with a single parameter of *relevant* type
|
||||
|
||||
is represented by the representation of that parameter's type.
|
||||
|
||||
For example, `{ x : α // p }`, the `Subtype` structure of a value of type `α` and an irrelevant proof, is represented by the representation of `α`.
|
||||
Similarly, the signed integer types `Int8`, ..., `Int64`, `ISize` are also represented by the unsigned C types `uint8_t`, ..., `uint64_t`, `size_t`, respectively, because they have a trivial structure.
|
||||
* `Nat` and `Int` are represented by `lean_object *`.
|
||||
Their runtime values is either a pointer to an opaque bignum object or, if the lowest bit of the "pointer" is 1 (`lean_is_scalar`), an encoded unboxed natural number or integer (`lean_box`/`lean_unbox`).
|
||||
* A universe `Sort u`, type constructor `... → Sort u`, `Void α` or proposition `p : Prop` is *irrelevant* and is either statically erased (see above) or represented as a `lean_object *` with the runtime value `lean_box(0)`
|
||||
* Any other type is represented by `lean_object *`.
|
||||
Its runtime value is a pointer to an object of a subtype of `lean_object` (see the "Inductive types" section below) or the unboxed value `lean_box(cidx)` for the `cidx`th constructor of an inductive type if this constructor does not have any relevant parameters.
|
||||
|
||||
Example: the runtime value of `u : Unit` is always `lean_box(0)`.
|
||||
|
||||
#### Inductive types
|
||||
|
||||
For inductive types which are in the fallback `lean_object *` case above and not trivial constructors, the type is stored as a `lean_ctor_object`, and `lean_is_ctor` will return true. A `lean_ctor_object` stores the constructor index in the header, and the fields are stored in the `m_objs` portion of the object.
|
||||
|
||||
The memory order of the fields is derived from the types and order of the fields in the declaration. They are ordered as follows:
|
||||
|
||||
* Non-scalar fields stored as `lean_object *`
|
||||
* Fields of type `USize`
|
||||
* Other scalar fields, in decreasing order by size
|
||||
|
||||
Within each group the fields are ordered in declaration order. Trivial wrapper types count as their underlying wrapped type for this purpose.
|
||||
|
||||
* To access fields of the first kind, use `lean_ctor_get(val, i)` to get the `i`th non-scalar field.
|
||||
* To access `USize` fields, use `lean_ctor_get_usize(val, n+i)` to get the `i`th usize field and `n` is the total number of fields of the first kind.
|
||||
* To access other scalar fields, use `lean_ctor_get_uintN(val, off)` or `lean_ctor_get_usize(val, off)` as appropriate. Here `off` is the byte offset of the field in the structure, starting at `n*sizeof(void*)` where `n` is the number of fields of the first two kinds.
|
||||
|
||||
For example, a structure such as
|
||||
```lean
|
||||
structure S where
|
||||
ptr_1 : Array Nat
|
||||
usize_1 : USize
|
||||
sc64_1 : UInt64
|
||||
sc64_2 : { x : UInt64 // x > 0 } -- wrappers of scalars count as scalars
|
||||
sc64_3 : Float -- `Float` is 64 bit
|
||||
sc8_1 : Bool
|
||||
sc16_1 : UInt16
|
||||
sc8_2 : UInt8
|
||||
sc64_4 : UInt64
|
||||
usize_2 : USize
|
||||
sc32_1 : Char -- trivial wrapper around `UInt32`
|
||||
sc32_2 : UInt32
|
||||
sc16_2 : UInt16
|
||||
```
|
||||
would get re-sorted into the following memory order:
|
||||
|
||||
* `S.ptr_1` - `lean_ctor_get(val, 0)`
|
||||
* `S.usize_1` - `lean_ctor_get_usize(val, 1)`
|
||||
* `S.usize_2` - `lean_ctor_get_usize(val, 2)`
|
||||
* `S.sc64_1` - `lean_ctor_get_uint64(val, sizeof(void*)*3)`
|
||||
* `S.sc64_2` - `lean_ctor_get_uint64(val, sizeof(void*)*3 + 8)`
|
||||
* `S.sc64_3` - `lean_ctor_get_float(val, sizeof(void*)*3 + 16)`
|
||||
* `S.sc64_4` - `lean_ctor_get_uint64(val, sizeof(void*)*3 + 24)`
|
||||
* `S.sc32_1` - `lean_ctor_get_uint32(val, sizeof(void*)*3 + 32)`
|
||||
* `S.sc32_2` - `lean_ctor_get_uint32(val, sizeof(void*)*3 + 36)`
|
||||
* `S.sc16_1` - `lean_ctor_get_uint16(val, sizeof(void*)*3 + 40)`
|
||||
* `S.sc16_2` - `lean_ctor_get_uint16(val, sizeof(void*)*3 + 42)`
|
||||
* `S.sc8_1` - `lean_ctor_get_uint8(val, sizeof(void*)*3 + 44)`
|
||||
* `S.sc8_2` - `lean_ctor_get_uint8(val, sizeof(void*)*3 + 45)`
|
||||
|
||||
### Borrowing
|
||||
|
||||
By default, all `lean_object *` parameters of an `@[extern]` function are considered *owned*, i.e. the external code is passed a "virtual RC token" and is responsible for passing this token along to another consuming function (exactly once) or freeing it via `lean_dec`.
|
||||
To reduce reference counting overhead, parameters can be marked as *borrowed* by prefixing their type with `@&`.
|
||||
Borrowed objects must only be passed to other non-consuming functions (arbitrarily often) or converted to owned values using `lean_inc`.
|
||||
In `lean.h`, the `lean_object *` aliases `lean_obj_arg` and `b_lean_obj_arg` are used to mark this difference on the C side.
|
||||
|
||||
Return values and `@[export]` parameters are always owned at the moment.
|
||||
|
||||
## Initialization
|
||||
|
||||
When including Lean code as part of a larger program, modules must be *initialized* before accessing any of their declarations.
|
||||
Module initialization entails
|
||||
* initialization of all "constants" (nullary functions), including closed terms lifted out of other functions
|
||||
* execution of all `[init]` functions
|
||||
* execution of all `[builtin_init]` functions, if the `builtin` parameter of the module initializer has been set
|
||||
|
||||
The module initializer is automatically run with the `builtin` flag for executables compiled from Lean code and for "plugins" loaded with `lean --plugin`.
|
||||
For all other modules imported by `lean`, the initializer is run without `builtin`.
|
||||
Thus `[init]` functions are run iff their module is imported, regardless of whether they have native code available or not, while `[builtin_init]` functions are only run for native executable or plugins, regardless of whether their module is imported or not.
|
||||
`lean` uses built-in initializers for e.g. registering basic parsers that should be available even without importing their module (which is necessary for bootstrapping).
|
||||
|
||||
The initializer for module `A.B` in a package `foo` is called `initialize_foo_A_B`. For modules in the Lean core (e.g., `Init.Prelude`), the initializer is called `initialize_Init_Prelude`. Module initializers will automatically initialize any imported modules. They are also idempotent (when run with the same `builtin` flag), but not thread-safe.
|
||||
|
||||
**Important for process-related functionality**: If your application needs to use process-related functions from libuv, such as `Std.Internal.IO.Process.getProcessTitle` and `Std.Internal.IO.Process.setProcessTitle`, you must call `lean_setup_args(argc, argv)` (which returns a potentially modified `argv` that must be used in place of the original) **before** calling `lean_initialize()` or `lean_initialize_runtime_module()`. This sets up process handling capabilities correctly, which is essential for certain system-level operations that Lean's runtime may depend on.
|
||||
|
||||
Together with initialization of the Lean runtime, you should execute code like the following exactly once before accessing any Lean declarations:
|
||||
|
||||
```c
|
||||
void lean_initialize_runtime_module();
|
||||
void lean_initialize();
|
||||
char ** lean_setup_args(int argc, char ** argv);
|
||||
|
||||
lean_object * initialize_A_B(uint8_t builtin);
|
||||
lean_object * initialize_C(uint8_t builtin);
|
||||
...
|
||||
|
||||
argv = lean_setup_args(argc, argv); // if using process-related functionality
|
||||
lean_initialize_runtime_module();
|
||||
//lean_initialize(); // necessary (and replaces `lean_initialize_runtime_module`) if you (indirectly) access the `Lean` package
|
||||
|
||||
lean_object * res;
|
||||
// use same default as for Lean executables
|
||||
uint8_t builtin = 1;
|
||||
res = initialize_A_B(builtin);
|
||||
if (lean_io_result_is_ok(res)) {
|
||||
lean_dec_ref(res);
|
||||
} else {
|
||||
lean_io_result_show_error(res);
|
||||
lean_dec(res);
|
||||
return ...; // do not access Lean declarations if initialization failed
|
||||
}
|
||||
res = initialize_C(builtin);
|
||||
if (lean_io_result_is_ok(res)) {
|
||||
...
|
||||
|
||||
//lean_init_task_manager(); // necessary if you (indirectly) use `Task`
|
||||
lean_io_mark_end_initialization();
|
||||
```
|
||||
|
||||
In addition, any other thread not spawned by the Lean runtime itself must be initialized for Lean use by calling
|
||||
```c
|
||||
void lean_initialize_thread();
|
||||
```
|
||||
and should be finalized in order to free all thread-local resources by calling
|
||||
```c
|
||||
void lean_finalize_thread();
|
||||
```
|
||||
|
||||
## `@[extern]` in the Interpreter
|
||||
|
||||
The interpreter can run Lean declarations for which symbols are available in loaded shared libraries, which includes `@[extern]` declarations.
|
||||
Thus to e.g. run `#eval` on such a declaration, you need to
|
||||
1. compile (at least) the module containing the declaration and its dependencies into a shared library, and then
|
||||
1. pass this library to `lean --load-dynlib=` to run code `import`ing this module.
|
||||
|
||||
Note that it is not sufficient to load the foreign library containing the external symbol because the interpreter depends on code that is emitted for each `@[extern]` declaration.
|
||||
Thus it is not possible to interpret an `@[extern]` declaration in the same file.
|
||||
|
||||
See [`tests/compiler/foreign`](https://github.com/leanprover/lean4/tree/master/tests/compiler/foreign/) for an example.
|
||||
|
|
|
|||
Loading…
Add table
Reference in a new issue