Commit graph

740 commits

Author SHA1 Message Date
Henrik Böving
23e49eb519 perf: add prelude to all Lean modules 2024-02-18 14:55:17 -08:00
Sebastian Ullrich
90a516de09
chore: avoid libleanshared symbol limit (#3346) 2024-02-15 11:39:44 +00:00
Henrik Böving
50d661610d
perf: LLVM backend, put all allocas in the first BB to enable mem2reg (#3244)
Again co-developed with @bollu.

Based on top of: #3225 

While hunting down the performance discrepancy on qsort.lean between C
and LLVM we noticed there was a single, trivially optimizeable, alloca
(LLVM's stack memory allocation instruction) that had load/stores in the
hot code path. We then found:
https://groups.google.com/g/llvm-dev/c/e90HiFcFF7Y.

TLDR: `mem2reg`, the pass responsible for getting rid of allocas if
possible, only triggers on an alloca if it is in the first BB. The
allocas of the current implementation get put right at the location
where they are needed -> they are ignored by mem2reg.

Thus we decided to add functionality that allows us to push all allocas
up into the first BB.
We initially wanted to write `buildPrologueAlloca` in a `withReader`
style so:
1. get the current position of the builder
2. jump to first BB and do the thing
3. revert position to the original

However the LLVM C API does not expose an option to obtain the current
position of an IR builder. Thus we ended up at the current
implementation which resets the builder position to the end of the BB
that the function was called from. This is valid because we never
operate anywhere but the end of the current BB in the LLVM emitter.

The numbers on the qsort benchmark got improved by the change as
expected, however we are not fully there yet:
```
C:
Benchmark 1: ./qsort.lean.out 400
  Time (mean ± σ):      2.005 s ±  0.013 s    [User: 1.996 s, System: 0.003 s]
  Range (min … max):    1.993 s …  2.036 s    10 runs

LLVM before aligning the types
Benchmark 1: ./qsort.lean.out 400
  Time (mean ± σ):      2.151 s ±  0.007 s    [User: 2.146 s, System: 0.001 s]
  Range (min … max):    2.142 s …  2.161 s    10 runs

LLVM after aligning the types
Benchmark 1: ./qsort.lean.out 400
  Time (mean ± σ):      2.073 s ±  0.011 s    [User: 2.067 s, System: 0.002 s]
  Range (min … max):    2.060 s …  2.097 s    10 runs

LLVM after this
Benchmark 1: ./qsort.lean.out 400
  Time (mean ± σ):      2.038 s ±  0.009 s    [User: 2.032 s, System: 0.001 s]
  Range (min … max):    2.027 s …  2.052 s    10 runs
```

Note: If you wish to merge this PR independently from its predecessor,
there is no technical dependency between the two, I'm merely stacking
them so we can see the performance impacts of each more clearly.
2024-02-13 14:54:40 +00:00
Henrik Böving
06f73d621b
fix: type mismatches in the LLVM backend (#3225)
Debugged and authored in collaboration with @bollu.

This PR fixes several performance regressions of the LLVM backend
compared to the C backend
as described in #3192. We are now at the point where some benchmarks
from `tests/bench` achieve consistently equal and sometimes ever so
slightly better performance when using LLVM instead of C. However there
are still a few testcases where we are lacking behind ever so slightly.

The PR contains two changes:
1. Using the same types for `lean.h` runtime functions in the LLVM
backend as in `lean.h` it turns out that:
a) LLVM does not throw an error if we declare a function with a
different type than it actually has. This happened on multiple occasions
here, in particular when the function used `unsigned`, as it was
wrongfully assumed to be `size_t` sized.
b) Refuses to inline a function to the call site if such a type mismatch
occurs. This means that we did not inline important functionality such
as `lean_ctor_set` and were thus slowed down compared to the C backend
which did this correctly.
2. While developing this change we noticed that LLVM does treat the
following as invalid: Having a function declared with a certain type but
called with integers of a different type. However this will manifest in
completely nonsensical errors upon optimizing the bitcode file through
`leanc` such as:
```
error: Invalid record (Producer: 'LLVM15.0.7' Reader: 'LLVM 15.0.7')
```
Presumably because the generate .bc file is invalid in the first place.
Thus we added a call to `LLVMVerifyModule` before serializing the module
into a bitcode file. This ended producing the expected type errors from
LLVM an aborting the bitcode file generation as expected.

We manually checked each function in `lean.h` that is mentioned in
`EmitLLVM.lean` to make sure that all of their types align correctly
now.

Quick overview of the fast benchmarks as measured on my machine, 2 runs
of LLVM and 2 runs of C to get a feeling for how far the averages move:
- binarytrees: basically equal performance
- binarytrees.st: basically equal performance
- const_fold: equal if not slightly better for LLVM
- deriv: LLVM has 8% more instructions than C but same wall clock time
- liasolver: basically equal performance
- qsort: LLVM is slower by 7% instructions, 4% time. We have identified
why the generated code is slower (there is a store/load in a hot loop in
LLVM that is not in C) but not figured out why that happens/how to
address it.
- rbmap: LLVM has 3% less instructions and 13% less wall-clock time than
C (woop woop)
- rbmap_1 and rbmap_10 show similar behavior
- rbmap_fbip: LLVM has 2% more instructions but 2% better wall time
- rbmap_library: equal if not slightly better for LLVM
- unionfind: LLVM has 5% more instructions but 4% better wall time

Leaving out benchmarks related to the compiler itself as I was too lazy
to keep recompiling it from scratch until we are on a level with C.

Summing things up, it appears that LLVM has now caught up or surpassed
the C backend in the microbenchmarks for the most part. Next steps from
our side are:
- trying to win the qsort benchmark
- figuring out why/how LLVM runs more instructions for less wall-clock
time. My current guesses would be measurement noise and/or better use of
micro architecture?
- measuring the larger benchmarks as well
2024-02-13 10:57:35 +00:00
Joachim Breitner
368ead54b2 refactor: termination_by changes in stdlib 2024-01-10 17:27:35 +01:00
Kyle Miller
a2226a43ac
feat: encode let_fun using a letFun function (#2973)
Switches from encoding `let_fun` using an annotated `(fun x : t => b) v`
expression to a function application `letFun v (fun x : t => b)`.

---------

Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
2023-12-18 09:01:42 +00:00
Joachim Breitner
b1f2fcf758 fix: Escape ? in C literal strings to avoid trigraphs
This fixes #3829
2023-11-06 16:25:00 +01:00
Siddharth Bhat
0b37bad2cb feat: split bitcode optimization and object file building to be outside lean 2023-11-02 23:21:47 +01:00
Mauricio Collares
cfe5a5f188
chore: change simp default to decide := false (#2722) 2023-11-02 10:06:38 +11:00
int-y1
8d7520b36f chore: fix typos in comments 2023-10-08 10:46:05 +02:00
Sebastian Ullrich
4e52283728 fix: FFI signature mismatches 2023-08-18 19:34:21 +02:00
Sebastian Ullrich
f22695fdc5 fix: interpret module initializer at most once 2023-08-16 10:11:50 -07:00
Henrik
35aa2c91a2 feat: LLVM backend: implement the equivalent of -fstack-clash-protection 2023-08-15 14:45:58 +02:00
Siddharth Bhat
0054f6bfac feat: link 'llvm.h.bc' and then set linkage to internal
This obviates the need to play weak linkage games when
we build `lean.h.bc` from `lean.h`. We perform the following steps:

1. We remove the `static` modifier from all definitions in `lean.h`.
   This makes all definitions have `extern` linkage. Thus, when we build
   a `lean.h.bc` using `clang`, we will actually get definitions
   (instead of an empty file)
2. We build `lean.h.bc` from `lean.h` using `clang`.
3. When it comes time to link, we link
   `current_module.bc := LLVMLinkModules2(current_module.bc, lean.h.bc)`.
4. We loop over every symbol that arrived from `lean.h.bc`
   in `current_module.bc` and we then set this symbol to have
   `internal` linkage. This simulates the effect of
   `#include <lean.h>` where every definition in `lean.h`
   has internal linkage.

This yajna, one hopes, pleases the linker gods.
2023-08-14 13:33:46 +02:00
Tobias Grosser
736a21cd5a chore: remove trailing whitespaces in EmitLLVM
For some reason, these two were missed in the last commit.
2023-08-13 16:18:23 +02:00
Tobias Grosser
a0c0c486fd chore: remove trailing whitespace in EmitLLVM
This patch should not result in any functional changes, but
will reduce the diff of an upcoming PR.
2023-08-13 11:07:14 +02:00
Siddharth Bhat
0eddc167b9 feat: LLVM linkage bindings 2023-08-12 16:51:58 +02:00
Siddharth
b9ec36d089
chore: get rid of all inline C annotations for LLVM (#2363) 2023-07-30 10:39:40 +02:00
Siddharth Bhat
073c8fed86 feat: LLVM backend: support for visibility Style & DLL storage
Changes peeled from:
https://github.com/leanprover/lean4/pull/2340

to allow a `stage0` bump on master before merging in the
changes that allow LLVM to build in stage1+.
2023-07-25 11:03:16 +02:00
Mario Carneiro
76023a7c6f fix: don't run [builtin_init] when builtin = false 2023-07-10 08:58:02 -07:00
Leonardo de Moura
4036be4f50 fix: add missing check at IR checker 2023-06-23 08:43:39 -07:00
Mario Carneiro
e0893b70e5 fix: incorrect type for SpecInfo.argKinds 2023-06-21 22:50:29 -07:00
Sebastian Ullrich
90e2288187 fix: interpret initializers in order 2023-06-05 15:46:35 -07:00
Leonardo de Moura
ebcab266c6 chore: remove empty line 2023-05-05 12:18:36 -07:00
Henrik Böving
0e042d8ef6 fix: LCNF simp forgot to mark normalized decls as simplified 2023-05-05 12:17:26 -07:00
Henrik Böving
36f0acfc51 feat: add timing profiling to the new compiler 2023-04-18 12:20:27 +02:00
Siddharth
5349a089e5
feat: add --target flag for LLVM backend to build objects of a different architecture (#2034)
* feat: add --target flag for LLVM backend to build objects of a different architecture

* chore: remove dead comment

* Update src/Lean/Compiler/IR/EmitLLVM.lean

Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>

* chore: normalize indentation in src/util/shell.cpp

* chore: strip trailing whitespace

Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
2023-01-15 12:00:10 -08:00
Siddharth Bhat
26edfc33f5 chore: remove unused isTaggedPtr from IR.
This reduces the surface area of `unimplemented` in the LLVM backend,
and also removes dead code in the compiler.
2023-01-15 09:24:41 -08:00
Siddharth Bhat
ae4f2de951 fix: metadata codegen for LLVM 2023-01-12 17:39:56 +01:00
Siddharth Bhat
0900aa1348 feat: implement unreachable codegen for LLVM
Also add a test case that exercises `unreachable` code
generation.
2023-01-12 09:17:41 +01:00
Siddharth
5615ba6606
feat: implement uset for LLVM (#2025)
Fixes #1958.
2023-01-09 12:25:37 +00:00
Siddharth
a0a0463451
fix: codegen initUnboxed correctly in LLVM backend (#2015)
Closes #2004.

In porting the bugfix from
6eb852e28f,
I noticed that the LLVM backend was incorrectly generating declaration
initializers (in `callIODeclInitFn`), by assuming the return type of the
initializer is the return type of the declaration. Rather, it must be be
`lean_object`, since the initializer returns an `IO a` value which must be unpacked.

TODO: stop using the `getOrCreateFunction` pattern pervasively.
  perform the `create` at the right location, and the `get`
  at the correct location.
2023-01-07 16:26:53 +01:00
Leonardo de Moura
069f08e3a3 chore: move getUnboxOpName 2023-01-04 08:07:32 -08:00
Leonardo de Moura
6eb852e28f fix: fixes #1998 2023-01-03 16:01:27 -08:00
Siddharth
b6eb780144
feat: LLVM backend (#1837) 2022-12-30 12:45:30 +01:00
Leonardo de Moura
0a031fc9bb chore: replace Expr.forEach with Expr.forEachWhere 2022-12-01 05:19:32 -08:00
Leonardo de Moura
7c5d91ebc3 fix: avoid hygienic ++ hygienic at Specialize.lean 2022-11-30 06:31:03 -08:00
Henrik Böving
5286c2b5aa feat: optimize mul/div into shift operations 2022-11-29 01:05:06 +01:00
Siddharth
4d47c8abc6
feat: add LLVM C API bindings (#1497)
Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
Co-authored-by: Gabriel Ebner <gebner@gebner.org>
2022-11-21 09:50:01 +01:00
Sebastian Ullrich
a4abbf07b8 chore: remove remnants of C++ format 2022-11-18 06:11:24 -08:00
Henrik Böving
6fe52cac41 doc: explain some decisions in ElimDeadBranches 2022-11-16 08:17:13 -08:00
Henrik Böving
66009a5cd3 feat: constant folding for decision procedures 2022-11-16 08:17:13 -08:00
Henrik Böving
5a397c8525 feat: ElimDeadBranches for LCNF 2022-11-16 08:17:13 -08:00
Leonardo de Moura
5654d8465d fix: fixes #1822 2022-11-13 18:13:29 -08:00
Leonardo de Moura
a87f0e25de feat: add compiler.checkTypes for sanity checking 2022-11-13 17:45:21 -08:00
Leonardo de Moura
854e655940 chore: document implementedBy := true use at phase 1 2022-11-13 15:47:13 -08:00
Leonardo de Moura
c147059fd7 fix: fixes #1812 2022-11-10 08:58:09 -08:00
Leonardo de Moura
9b02f982e2 chore: add ppCode' 2022-11-10 08:07:35 -08:00
Leonardo de Moura
b4d13a8946 refactor: LetExpr => LetValue
We use "let value" in many other places in the code base.
2022-11-07 18:51:07 -08:00
Leonardo de Moura
e8b4a8875c fix: type argument that is a fvar at FixedParams.lean 2022-11-07 18:18:07 -08:00