lean4-htt/tests/elab/compile_recursive_array_access.lean
Henrik Böving f9c8b5e93d
fix: potential Array.get!Internal leaks part 1 (#13147)
This PR fixes theoretical leaks in the handling of `Array.get!Internal` in the code generator.
Currently, the code generator assumes that the value returned by `get!Internal` is derived from the
`Array` argument. However, this does not hold in general, because an out-of-bounds access returns
the `Inhabited` value instead (recall that we continue execution after panics by default). As a
result, we sometimes convert an `Array.get!Internal` to `Array.get!InternalBorrowed` when we are
not allowed to do so: in the panic case the `Inhabited` instance is returned, and if it is an owned
value it is going to leak.
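
As a small illustration of the panic case (a sketch only; the name `names` is made up for this
example and is not part of the PR), the panicking index notation `xs[i]!` yields the `Inhabited`
default when the index is out of bounds, so the result is not always derived from the array:

```lean
-- Hypothetical example data, not taken from the PR.
def names : Array String := #["a", "b"]

-- In bounds: the result is read out of the array.
#eval names[1]!

-- Out of bounds: a panic is reported, execution continues, and the result
-- comes from the `Inhabited String` instance rather than from `names`.
#eval names[7]!
```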

The fix consists of adapting several components to this change:
1. `PropagateBorrow` will only mark the derived value as forcibly borrowed if both the `Inhabited`
   and `Array` arguments are forcibly borrowed.
2. `InferBorrow` will do the same for its data flow analysis.
3. The derived value analysis of `ExplicitRC` is extended from a derived value tree to a derived
   value graph where a value may have more than one parent. We only consider a value borrowed if
   all of its parents are still accessible. `get!Internal` is then equipped with both its
   `Inhabited` and its `Array` parent.
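
The "all parents must be accessible" rule can be sketched as a toy model. This is not the
compiler's actual data structure: `DerivedValue`, `isBorrowed`, and the numeric identifiers are
assumptions made purely for illustration.

```lean
/-- Hypothetical model of one node in a derived value graph. -/
structure DerivedValue where
  /-- Identifiers of the values this one may have been derived from. -/
  parents : List Nat

/-- A derived value counts as borrowed only if it has at least one parent and
every parent is still accessible. -/
def DerivedValue.isBorrowed (v : DerivedValue) (accessible : Nat → Bool) : Bool :=
  match v.parents with
  | [] => false
  | ps => ps.all accessible

-- Model of `get!Internal inst xs i`: parents are `inst` (id 0) and `xs` (id 1).
def getResult : DerivedValue := { parents := [0, 1] }

#eval getResult.isBorrowed (fun _ => true)   -- true: both parents accessible
#eval getResult.isBorrowed (fun p => p == 1) -- false: `inst` is not accessible
```

With a tree (a single parent) only the array would be tracked, which is exactly the case in which
the `Inhabited` value could go unaccounted for.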

These changes are sufficient for correctness on their own. However, they are going to break
`get!Internal` to `get!InternalBorrowed` conversion in most places. This happens because almost
all `Inhabited` instances are going to be constants. Currently, reads from constants yield
semantically owned values and thus block the `get!InternalBorrowed` conversion. We would thus
prefer for these constants to be treated as borrows instead.

The owned return is implemented in two ways at the moment:
1. In the C code emitter we do not need to do anything, as constants are marked persistent to
   begin with.
2. In the interpreter, whenever a constant is pulled from the constant cache it is `inc`-ed and
   then later `dec`-ed somewhere (potentially using a `dec[persistent]`, which is a no-op in C).

This PR changes the semantics of constant reads to instead be borrows from the constant (they can
be cutely interpreted as "being borrowed from the world"). This allows many `get!Internal` calls
to have both of their arguments marked as borrowed and thus still be converted to
`get!InternalBorrowed`. Note that this PR does not yet change the semantics of the interpreter to
account for this (that will be done in a part 2) and thus temporarily introduces (very minor)
leaks.

Furthermore, we observed code with signatures such as the following:
```lean
@[specialize]
def foo {a : Type} [inst : Inhabited a] (xs : Array a) (f : a -> a -> Bool) ... :=
  ...
  let x := Array.get!Internal inst xs i
  ...
```
being instantiated with `a := UInt32`. This poses a challenge because `Inhabited` is currently
marked as `nospecialize`, meaning that we are sometimes going to end up with code such as:
```
def foo._spec (inst : UInt32) (xs : @&Array UInt32) ... :=
  ...
  let inst := box inst
  let x := Array.get!Internal inst xs i
  dec inst
  ...
```
Here `xs` itself was inferred as borrowed. However, the `UInt32` `Inhabited` instance was not
specialized (as `Inhabited` is marked `nospecialize`) and thus needs to be boxed. This causes the
`inst` argument of `get!Internal` to be owned, and thus the `get!InternalBorrowed` conversion
fails. This PR marks `Inhabited` as `weak_specialize`, which causes it to be specialized in this
case, yielding code such as:

```
def foo._spec (xs : @&Array UInt32) ... :=
  ...
  let inst := instInhabitedUInt32
  let inst := box inst
  let x := Array.get!Internal inst xs i
  dec inst
  ...
```
Fortunately, the closed term extractor supports precisely this pattern and thus produces:

```
def inst.boxed_const :=
  let inst := instInhabitedUInt32
  let inst := box inst
  return inst

def foo._spec (xs : @&Array UInt32) ... :=
  ...
  let inst := inst.boxed_const
  let x := Array.get!Internal inst xs i
  ...
```
As described above, reads from constants are now interpreted as borrows, and thus the conversion
to `get!InternalBorrowed` becomes legal again.
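
A possible way to observe this on a concrete instance is sketched below. The names
`firstMatches` and `firstMatchesUInt32` are hypothetical and chosen only for illustration; the
trace option is the one used by the test in this file. The exact IR printed depends on the
compiler version, so no output is shown.

```lean
-- Hypothetical generic function following the shape of `foo` above.
@[specialize]
def firstMatches {α : Type} [Inhabited α] (xs : Array α) (f : α → α → Bool) (i : Nat) : Bool :=
  f xs[i]! xs[0]!

-- Instantiating at `UInt32` asks the compiler for a specialization; the trace
-- can then be inspected for `Array.get!Internal` vs. `Array.get!InternalBorrowed`.
set_option trace.Compiler.explicitRc true in
def firstMatchesUInt32 (xs : Array UInt32) (i : Nat) : Bool :=
  firstMatches xs (· == ·) i
```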
2026-03-27 00:13:17 +00:00


/-!
This test asserts that the compiler is able to handle compilation of functions that recurse through
nested arrays in a way that does not unnecessarily remove borrow annotations.
-/
inductive NAryTree where
  | tip (x : String)
  | node (ys : Array NAryTree)
  deriving Inhabited
/--
trace: [Compiler.explicitRc] size: 21
    def followPath @&tree @&path : obj :=
      cases tree : obj
      | NAryTree.tip =>
        cases path : obj
        | List.nil =>
          let x.1 := oproj[0] tree;
          inc[ref] x.1;
          return x.1
        | _ =>
          let _x.2 := instInhabitedNAryTree.default._closed_0;
          inc[persistent][ref] _x.2;
          return _x.2
      | NAryTree.node =>
        cases path : obj
        | List.cons =>
          let ys.3 := oproj[0] tree;
          let head.4 := oproj[0] path;
          let tail.5 := oproj[1] path;
          let _x.6 := instInhabitedNAryTree.default;
          let _x.7 := Array.get!InternalBorrowed ◾ _x.6 ys.3 head.4;
          let _x.8 := followPath _x.7 tail.5;
          return _x.8
        | _ =>
          let _x.9 := instInhabitedNAryTree.default._closed_0;
          inc[persistent][ref] _x.9;
          return _x.9
[Compiler.explicitRc] size: 3
    def followPath._boxed tree path : obj :=
      let res := followPath tree path;
      dec path;
      dec[ref] tree;
      return res
-/
#guard_msgs in
set_option trace.Compiler.explicitRc true in
def followPath (tree : NAryTree) (path : List Nat) : String :=
  match tree, path with
  | .tip x, [] => x
  | .node ys, idx :: path => followPath ys[idx]! path
  | _, _ => default