Make lambdaRust language more like MIR
I mostly want to write this down somewhere so that I know where to find it. ;)
Places / Objects
The key thing that MIR has that we don't is a distinction between "places" and what Rust calls "values". There is unfortunately a clash here because in lambdaRust we also use "value" in the sense of "the result of a WP", which is not the same thing. So I'll call Rust's values "objects" here, and our values "primitive values".
Places are a location in memory, and objects are things you can store in a place. Local variables denote places. There are two ways to turn a place into an object: &place
denotes an object that's just the memory location of the place, and copy(place)
denotes an object that is the content of the place. So if these were typed, then &
takes Place<T>
to Object<&T>
, and copy
takes Place<T>
to Object<T>
. The deref operator *obj
takes Object<&T>
to Place<T>
. Of these, only copy
actually accesses memory! It is the only "read" operation of the core calculus. The only "write" operation is assignment, place <- object
. Both are non-atomic. (Of course there's also be atomic operations, but these are not part of the type system anyway.)
Objects include things like n-ary tuples, so they can be larger than a single location. You can think of them as a list of primitive values. Or maybe it is better to represent them as closures that take a place and then "write themselves into that place". I am not sure yet what works better.
This goes together with consistently having all local variables in the "surface language" be mutable and stored on the heap. We currently don't; only local variables with type own
are actually mutable; we currently also have local variables with type int
that represent immutable integers. I think we should instead have basically an implicit own
as part of the "type ascription", so y: int
means "y
is a stack slot containing an integer".
As a consequence, the "surface language" (the language on which the type system works, also the language which MIR would be compiled to) moves further away from "lambda-rust core". For example, let x = a + b
would be let x = new_place; x <- copy(a) + copy(b)
, writing out the place-to-object conversion explicitly. And +
would now act on objects, not primitive values.
It might be a good idea to make these "objects" explicit primitives in lambda-rust, so that the elaboration has to do less work? In that case they should likely be represented as lists of primitive values. "places" can also be primitive objects, basically isomorphic to locations -- but "bare primitive values" such as locations would just not arise in a program any more, everything would be either an object or a place.
@haidang has some formalization of this, but I am not sure which choices he made in terms of representations.
Ownership predicate
Once everything is on the heap anyway, it might make sense to change ty_own
to take a location (or rather: a place) instead of a list of values (aka an object). That should allow us to get rid of all the hacks that we currently need to make recursive types work. It also prepares for some other possible simplifications / refactorings, like having a ty_bor
that lets a type define what it means to be mutably borrowed, or pinning.
The main reason why I did not do this to begin with was that I did not have a good idea for how to formulate the fact that you can copy stuff around. But I think now I found something that should actually work:
forall loc, ty_own loc -* (exists vs, loc |->* vs * forall loc', loc' |->* vs -* ty_own loc')
This basically says that you can turn a ty_own
into ownership of some memory with some data, and you can move that data to any other location to obtain a ty_own
for that other location. It is still annoying that you have to prove this (whereas it holds by construction currently), but I feel in an entirely place-based word this will work better.
For Copy
types, the forall
part could be persistent.
Closures
One thing we have that MIR does not is closures on the heap. To fix this we should move to a world where the global state contains not just a heap but also an (immutable) mapping from "function names" to (closed) code. Then we can remove closures from the things you can store on the heap, and instead only allow "function names" (corresponding to C function pointers) to be stored on the heap.