lambda-rust issueshttps://gitlab.mpi-sws.org/iris/lambda-rust/-/issues2020-04-17T07:32:27Zhttps://gitlab.mpi-sws.org/iris/lambda-rust/-/issues/6Investigate performance difference caused by merging BI and SBI2020-04-17T07:32:27ZRalf Jungjung@mpi-sws.orgInvestigate performance difference caused by merging BI and SBI@robbertkrebbers, in an experiment, [merged the BI and SBI canonical structures](https://gitlab.mpi-sws.org/iris/iris/-/tree/ci/robbert/merge_sbi) to see if removing the inheritance and coercion between the two makes any difference. Turn...@robbertkrebbers, in an experiment, [merged the BI and SBI canonical structures](https://gitlab.mpi-sws.org/iris/iris/-/tree/ci/robbert/merge_sbi) to see if removing the inheritance and coercion between the two makes any difference. Turns out it does! The [weak_mem branch becomes 37% faster](https://coq-speed.mpi-sws.org/d/1QE_dqjiz/coq-compare?orgId=1&var-project=lambda-rust&var-branch1=masters%2Fweak_mem&var-commit1=bcc7c0be5ba2def0989d727c97d94a8492b9e40b&var-config1=build-coq.8.11.0&var-branch2=ci%2Frobbert%2Fmerge_sbi_weak&var-commit2=4ea92744abcd644f385696c398206a20a2cab7cf&var-config2=build-iris.dev&var-group=().*&var-metric=instructions). Iron, another `monPred` user, [gets 23% faster](https://coq-speed.mpi-sws.org/d/1QE_dqjiz/coq-compare?orgId=1&var-project=iron&var-branch1=master&var-commit1=4dcc142742d846037be4cd94678ff8759465e6c1&var-config1=build-coq.8.11.0&var-branch2=ci%2Frobbert%2Fmerge_sbi&var-commit2=37f90dd22db3ca477377d16ae72d761cb617412c&var-config2=build-iris.dev&var-group=().*&var-metric=instructions). Even lambdaRust [master gains 5%](https://coq-speed.mpi-sws.org/d/1QE_dqjiz/coq-compare?orgId=1&var-project=lambda-rust&var-branch1=master&var-commit1=fe35f4cd910f3f3235fd3c6b6c46fc142d3ce13f&var-config1=build-coq.8.11.0&var-branch2=ci%2Frobbert%2Fmerge_sbi&var-commit2=bea9f4a02efe48a38894fe1286add7ab25a2d2de&var-config2=build-iris.dev&var-group=().*&var-metric=instructions).
This has triggered a [flurry of discussion in Mattermost](https://mattermost.mpi-sws.org/iris/pl/49pr68ips3yuifgbd9qow6u81o), and @janno entered the rabbit hole that is called "unification". In this issue, I hope to collect some of the conclusions and the other data we gathered in that discussion, in case we want to look at this again in the future. Feel free to edit this initial post like a Wiki page to keep it up-to-date.
Cc https://gitlab.mpi-sws.org/iris/lambda-rust/-/issues/3
## `RefCell::try_borrow`
A lot of time was spent looking at `refcell_try_borrow_type`, which went down in type-checking time from 50s to less than 2s.
```
Lemma refcell_try_borrow_type ty `{!TyWf ty} :
typed_val refcell_try_borrow
(fn(∀ α, ∅; &shr{α}(refcell ty)) → option (ref α ty)).
```
### Why was it so slow?
The culprit seems to be Coq trying to unify `ref` with `refmut` during typeclass search.
@janno spent lot of time trying to figure out what Coq is *doing* in these 50s.
(TODO: insert conclusion here)
### When did it become so slow?
After some digging, we narrowed the regression down to [this commit range](https://gitlab.mpi-sws.org/iris/lambda-rust/compare/89e194ca4f17...9424332a485cd6232c3e5091d9b3ad93cd31094f) (confirmed by building first and last commit in Coq 8.8.2 and timing the relevant lemma statement). The final commit here introduces a hack by @jjourdan that temporarily masked the problem:
```
(* FIXME : HACK *)
Local Hint Opaque ref : typeclass_instances.
```
That hack [got removed soon thereafter](https://gitlab.mpi-sws.org/iris/lambda-rust/-/commit/749b6092bbb8c77dcabd3a8ea5925368413a4011), for unclear reasons, as the slowdown was still present. So starting with this commit (confirmed with Coq 8.8.2), type-checking the `refcell_try_borrow_type` statement took 60s (going down to 50s for later Coq versions).
I think this demonstrates that the refcell RA and invariant changes in that range caused the problem (and not some other change in the wider ecosystem).
## `type_sum.v` rewrite
@janno also took a closer look [at this rewrite in `type_sum.v`](https://gitlab.mpi-sws.org/iris/lambda-rust/-/blob/222401db0f6323a35a936b09326e8ed9af57b365/theories/typing/type_sum.v#L249), concluding:
> We start with the task `bi_car (sbi_bi ?M6108) ~= bi_car (monPredI view_bi_index (uPredI (iResUR Σ)))`.
> Here's a rundown of the steps of the version with `sbi`.
> * 1. `bi_car (sbi_bi X) ~= bi_car <something monPredI>`
> * 2. (try to unify head & arguments) `bi_car ~= bi_car` :white_check_mark: That leaves the argument `sbi_bi X ~= monPredI _ _`
> * 3. (reduce) `BI <many terms of the form (sbi_.. X)> ~= BI <many terms with monPred>`.
> * 4. (unify head & arguments) `BI ~= BI` :white_check_mark: & a bunch of arguments. The following steps are repeated for every argument! The first one is `sbi_car X ~= monPred view_bi_index (uPredI (iResUR Σ))`
> * 4.1. (CS search finds `monPredSI`) Sadly, `uPredI (iResUR Σ)` is a `bi`, not an `sbi`. To apply `monPredSI`, we need to find the right `sbi`. Here is where the fun begins! Our new unification task is: `uPredI (iResUR Σ) ~= sbi_bi ?Z`.
> * 4.2 (After running around aimlessly for a bit Coq reduces both sides): `BI <many terms with uPredI > ~= BI <many terms with sbi_ projections>`. EASY! Let's compare all fields separately! The following steps are repeated (this is an inner loop!!) for every argument of `BI`. Let's do the first one for `bi_car` (the projection is not present on the left-hand side).
> * 4.2.1. `uPred (iResUR Σ) ~= sbi_car ?Z`: solved by CS search, unifies `?Z ~= uPredSI _`
> * 4.2.2. repeat for every `bi` field
> * 4.3. repeat for every `bi` field
Note that this unification task pops up as part of a type class search for `Proper`, spawned by one of the lemmas used in the rewrite. Note also that the explanation above applies to weak memory lambda-rust *with* `sbi`. The *SC version* with `sbi` does not have the inner loop. The SC version *without* `sbi` performs only constant work; i.e. the unification task does not scale with the number of fields in `bi`.https://gitlab.mpi-sws.org/iris/lambda-rust/-/issues/5Make lambdaRust language more like MIR2019-06-04T17:41:23ZRalf Jungjung@mpi-sws.orgMake lambdaRust language more like MIRI mostly want to write this down somewhere so that I know where to find it. ;)
### Places / Objects
The key thing that MIR has that we don't is a distinction between "places" and what Rust calls "values". There is unfortunately a clash...I mostly want to write this down somewhere so that I know where to find it. ;)
### Places / Objects
The key thing that MIR has that we don't is a distinction between "places" and what Rust calls "values". There is unfortunately a clash here because in lambdaRust we also use "value" in the sense of "the result of a WP", which is not the same thing. So I'll call Rust's values "objects" here, and our values "primitive values".
Places are a location in memory, and objects are things you can store in a place. Local variables denote places. There are two ways to turn a place into an object: `&place` denotes an object that's just the memory location of the place, and `copy(place)` denotes an object that is the *content* of the place. So if these were typed, then `&` takes `Place<T>` to `Object<&T>`, and `copy` takes `Place<T>` to `Object<T>`. The deref operator `*obj` takes `Object<&T>` to `Place<T>`. Of these, only `copy` actually accesses memory! It is the only "read" operation of the core calculus. The only "write" operation is assignment, `place <- object`. Both are non-atomic. (Of course there's also be atomic operations, but these are not part of the type system anyway.)
Objects include things like n-ary tuples, so they can be larger than a single location. You can think of them as a list of primitive values. Or maybe it is better to represent them as closures that take a place and then "write themselves into that place". I am not sure yet what works better.
This goes together with consistently having all local variables in the "surface language" be mutable and stored on the heap. We currently don't; only local variables with type `own` are actually mutable; we currently also have local variables with type `int` that represent immutable integers. I think we should instead have basically an implicit `own` as part of the "type ascription", so `y: int` means "`y` is a stack slot containing an integer".
As a consequence, the "surface language" (the language on which the type system works, also the language which MIR would be compiled to) moves further away from "lambda-rust core". For example, `let x = a + b` would be `let x = new_place; x <- copy(a) + copy(b)`, writing out the place-to-object conversion explicitly. And `+` would now act on objects, not primitive values.
It might be a good idea to make these "objects" explicit primitives in lambda-rust, so that the elaboration has to do less work? In that case they should likely be represented as lists of primitive values. "places" can also be primitive objects, basically isomorphic to locations -- but "bare primitive values" such as locations would just not arise in a program any more, everything would be either an object or a place.
@haidang has some formalization of this, but I am not sure which choices he made in terms of representations.
### Ownership predicate
Once everything is on the heap anyway, it might make sense to change `ty_own` to take a location (or rather: a place) instead of a list of values (aka an object). That should allow us to get rid of all the hacks that we currently need to make recursive types work. It also prepares for some other possible simplifications / refactorings, like having a `ty_bor` that lets a type define what it means to be mutably borrowed, or pinning.
The main reason why I did not do this to begin with was that I did not have a good idea for how to formulate the fact that you can copy stuff around. But I think now I found something that should actually work:
```
forall loc, ty_own loc -* (exists vs, loc |->* vs * forall loc', loc' |->* vs -* ty_own loc')
```
This basically says that you can turn a `ty_own` into ownership of some memory with some data, and you can move that data to any other location to obtain a `ty_own` for that other location. It is still annoying that you have to prove this (whereas it holds by construction currently), but I feel in an entirely place-based word this will work better.
For `Copy` types, the `forall` part could be persistent.
### Closures
One thing we have that MIR does not is closures on the heap. To fix this we should move to a world where the global state contains not just a heap but also an (immutable) mapping from "function names" to (closed) code. Then we can remove closures from the things you can store on the heap, and instead only allow "function names" (corresponding to C function pointers) to be stored on the heap.https://gitlab.mpi-sws.org/iris/lambda-rust/-/issues/3Investigate performance issues of weak_mem2020-04-16T13:58:25ZJacques-Henri JourdanInvestigate performance issues of weak_memThe weak_mem branch is much slower than master, even though the size of the development is not really much larger. There might be some low-hanging performance improvement that we could do.The weak_mem branch is much slower than master, even though the size of the development is not really much larger. There might be some low-hanging performance improvement that we could do.