# Foreign Function Interface NOTE: The current interface was designed for internal use in Lean and should be considered **unstable**. It will be refined and extended in the future. As Lean is written partially in Lean itself and partially in C++, it offers efficient interoperability between the two languages (or rather, between Lean and any language supporting C interfaces). This support is however currently limited to transferring Lean data types; in particular, it is not possible yet to pass or return compound data structures such as C `struct`s by value from or to Lean. There are two primary attributes for interoperating with other languages: * `@[extern "sym"] constant leanSym : ...` binds a Lean declaration to the external symbol `sym`. It can also be used with `def` to provide an internal definition, but ensuring consistency of both definitions is up to the user. * `@[export sym] def leanSym : ...` exports `leanSym` under the unmangled symbol name `sym`. For simple examples of how to call foreign code from Lean and vice versa, see and , respectively. ## The Lean ABI The Lean Application Binary Interface (ABI) describes how the signature of a Lean declaration is encoded as a native calling convention. It is based on the standard C ABI and calling convention of the target platform. For a Lean declaration marked with either `@[extern "sym"]` or `@[export sym]` for some symbol name `sym`, let `α₁ → ... → αₙ → β` be the normalized declaration's type. If `n` is 0, the corresponding C declaration is ```c extern s sym; ``` where `s` is the C translation of `β` as specified in the next section. In the case of an `@[extern]` definition, the symbol's value is guaranteed to be initialized only after calling the Lean module's initializer or that of an importing module; see [Initialization](#initialization). If `n` is greater than 0, the corresponding C declaration is ```c s sym(t₁, ..., tₘ); ``` where the parameter types `tᵢ` are the C translation of the `αᵢ` as in the next section. In the case of `@[extern]` all *irrelevant* types are removed first; see next section. ### Translating Types from Lean to C * The integer types `UInt8`, ..., `UInt64`, `USize` are represented by the C types `uint8_t`, ..., `uint64_t`, `size_t`, respectively * `Char` is represented by `uint32_t` * `Float` is represented by `double` * An *enum* inductive type of at least 2 and at most 2^32 constructors, each of which with no parameters, is represented by the first type of `uint8_t`, `uint16_t`, `uint32_t` that is sufficient to represent all constructor indices. For example, the type `Bool` is represented as `uint8_t` with values `0` for `false` and `1` for `true`. * `Decidable α` is represented the same way as `Bool` * An inductive type with a *trivial structure*, that is, * it is none of the types described above * it is not marked `unsafe` * it has a single constructor with a single parameter of *relevant* type is represented by the representation of that parameter's type. For example, `{ x : α // p }`, the `Subtype` structure of a value of type `α` and an irrelevant proof, is represented by the representation of `α`. Similarly, the signed integer types `Int8`, ..., `Int64`, `ISize` are also represented by the unsigned C types `uint8_t`, ..., `uint64_t`, `size_t`, respectively, because they have a trivial structure. * `Nat` and `Int` are represented by `lean_object *`. Their runtime values is either a pointer to an opaque bignum object or, if the lowest bit of the "pointer" is 1 (`lean_is_scalar`), an encoded unboxed natural number or integer (`lean_box`/`lean_unbox`). * A universe `Sort u`, type constructor `... → Sort u`, or proposition `p : Prop` is *irrelevant* and is either statically erased (see above) or represented as a `lean_object *` with the runtime value `lean_box(0)` * Any other type is represented by `lean_object *`. Its runtime value is a pointer to an object of a subtype of `lean_object` (see the "Inductive types" section below) or the unboxed value `lean_box(cidx)` for the `cidx`th constructor of an inductive type if this constructor does not have any relevant parameters. Example: the runtime value of `u : Unit` is always `lean_box(0)`. #### Inductive types For inductive types which are in the fallback `lean_object *` case above and not trivial constructors, the type is stored as a `lean_ctor_object`, and `lean_is_ctor` will return true. A `lean_ctor_object` stores the constructor index in the header, and the fields are stored in the `m_objs` portion of the object. The memory order of the fields is derived from the types and order of the fields in the declaration. They are ordered as follows: * Non-scalar fields stored as `lean_object *` * Fields of type `USize` * Other scalar fields, in decreasing order by size Within each group the fields are ordered in declaration order. Trivial wrapper types count as their underlying wrapped type for this purpose. * To access fields of the first kind, use `lean_ctor_get(val, i)` to get the `i`th non-scalar field. * To access `USize` fields, use `lean_ctor_get_usize(val, n+i)` to get the `i`th usize field and `n` is the total number of fields of the first kind. * To access other scalar fields, use `lean_ctor_get_uintN(val, off)` or `lean_ctor_get_usize(val, off)` as appropriate. Here `off` is the byte offset of the field in the structure, starting at `n*sizeof(void*)` where `n` is the number of fields of the first two kinds. For example, a structure such as ```lean structure S where ptr_1 : Array Nat usize_1 : USize sc64_1 : UInt64 sc64_2 : { x : UInt64 // x > 0 } -- wrappers of scalars count as scalars sc64_3 : Float -- `Float` is 64 bit sc8_1 : Bool sc16_1 : UInt16 sc8_2 : UInt8 sc64_4 : UInt64 usize_2 : USize sc32_1 : Char -- trivial wrapper around `UInt32` sc32_2 : UInt32 sc16_2 : UInt16 ``` would get re-sorted into the following memory order: * `S.ptr_1` - `lean_ctor_get(val, 0)` * `S.usize_1` - `lean_ctor_get_usize(val, 1)` * `S.usize_2` - `lean_ctor_get_usize(val, 2)` * `S.sc64_1` - `lean_ctor_get_uint64(val, sizeof(void*)*3)` * `S.sc64_2` - `lean_ctor_get_uint64(val, sizeof(void*)*3 + 8)` * `S.sc64_3` - `lean_ctor_get_float(val, sizeof(void*)*3 + 16)` * `S.sc64_4` - `lean_ctor_get_uint64(val, sizeof(void*)*3 + 24)` * `S.sc32_1` - `lean_ctor_get_uint32(val, sizeof(void*)*3 + 32)` * `S.sc32_2` - `lean_ctor_get_uint32(val, sizeof(void*)*3 + 36)` * `S.sc16_1` - `lean_ctor_get_uint16(val, sizeof(void*)*3 + 40)` * `S.sc16_2` - `lean_ctor_get_uint16(val, sizeof(void*)*3 + 42)` * `S.sc8_1` - `lean_ctor_get_uint8(val, sizeof(void*)*3 + 44)` * `S.sc8_2` - `lean_ctor_get_uint8(val, sizeof(void*)*3 + 45)` ### Borrowing By default, all `lean_object *` parameters of an `@[extern]` function are considered *owned*, i.e. the external code is passed a "virtual RC token" and is responsible for passing this token along to another consuming function (exactly once) or freeing it via `lean_dec`. To reduce reference counting overhead, parameters can be marked as *borrowed* by prefixing their type with `@&`. Borrowed objects must only be passed to other non-consuming functions (arbitrarily often) or converted to owned values using `lean_inc`. In `lean.h`, the `lean_object *` aliases `lean_obj_arg` and `b_lean_obj_arg` are used to mark this difference on the C side. Return values and `@[export]` parameters are always owned at the moment. ## Initialization When including Lean code as part of a larger program, modules must be *initialized* before accessing any of their declarations. Module initialization entails * initialization of all "constants" (nullary functions), including closed terms lifted out of other functions * execution of all `[init]` functions * execution of all `[builtin_init]` functions, if the `builtin` parameter of the module initializer has been set The module initializer is automatically run with the `builtin` flag for executables compiled from Lean code and for "plugins" loaded with `lean --plugin`. For all other modules imported by `lean`, the initializer is run without `builtin`. Thus `[init]` functions are run iff their module is imported, regardless of whether they have native code available or not, while `[builtin_init]` functions are only run for native executable or plugins, regardless of whether their module is imported or not. `lean` uses built-in initializers for e.g. registering basic parsers that should be available even without importing their module (which is necessary for bootstrapping). The initializer for module `A.B` is called `initialize_A_B` and will automatically initialize any imported modules. Module initializers are idempotent (when run with the same `builtin` flag), but not thread-safe. **Important for process-related functionality**: If your application needs to use process-related functions from libuv, such as `Std.Internal.IO.Process.getProcessTitle` and `Std.Internal.IO.Process.setProcessTitle`, you must call `lean_setup_args(argc, argv)` (which returns a potentially modified `argv` that must be used in place of the original) **before** calling `lean_initialize()` or `lean_initialize_runtime_module()`. This sets up process handling capabilities correctly, which is essential for certain system-level operations that Lean's runtime may depend on. Together with initialization of the Lean runtime, you should execute code like the following exactly once before accessing any Lean declarations: ```c void lean_initialize_runtime_module(); void lean_initialize(); char ** lean_setup_args(int argc, char ** argv); lean_object * initialize_A_B(uint8_t builtin, lean_object *); lean_object * initialize_C(uint8_t builtin, lean_object *); ... argv = lean_setup_args(argc, argv); // if using process-related functionality lean_initialize_runtime_module(); //lean_initialize(); // necessary (and replaces `lean_initialize_runtime_module`) if you (indirectly) access the `Lean` package lean_object * res; // use same default as for Lean executables uint8_t builtin = 1; res = initialize_A_B(builtin, lean_io_mk_world()); if (lean_io_result_is_ok(res)) { lean_dec_ref(res); } else { lean_io_result_show_error(res); lean_dec(res); return ...; // do not access Lean declarations if initialization failed } res = initialize_C(builtin, lean_io_mk_world()); if (lean_io_result_is_ok(res)) { ... //lean_init_task_manager(); // necessary if you (indirectly) use `Task` lean_io_mark_end_initialization(); ``` In addition, any other thread not spawned by the Lean runtime itself must be initialized for Lean use by calling ```c void lean_initialize_thread(); ``` and should be finalized in order to free all thread-local resources by calling ```c void lean_finalize_thread(); ``` ## `@[extern]` in the Interpreter The interpreter can run Lean declarations for which symbols are available in loaded shared libraries, which includes `@[extern]` declarations. Thus to e.g. run `#eval` on such a declaration, you need to 1. compile (at least) the module containing the declaration and its dependencies into a shared library, and then 1. pass this library to `lean --load-dynlib=` to run code `import`ing this module. Note that it is not sufficient to load the foreign library containing the external symbol because the interpreter depends on code that is emitted for each `@[extern]` declaration. Thus it is not possible to interpret an `@[extern]` declaration in the same file. See [`tests/compiler/foreign`](https://github.com/leanprover/lean4/tree/master/tests/compiler/foreign/) for an example.