explain simulation in README + more comments for ex1

d96a7447 · Hai Dang · 259b030d · d96a7447 · d96a7447
Commit d96a7447 authored 5 years ago by Hai Dang
--- a/README.md
+++ b/README.md
@@ -156,6 +156,71 @@ The directory structure is as follows.
    `example3` did not appear in the paper but we verified it in
    [opt/ex3.v](theories/opt/ex3.v).
+### SIMULATION FRAMEWORK
+The simulation framework developed in this project is a standard simulation via coinduction, and is in fact a simplified version of [Hur et al.'s POPL'12 work](https://doi.org/10.1145/2103621.2103666).
+In that sense it is not novel.
+The only difference is that our framework depends on the *Iris* Coq library.
+However, note that this kind of simulation proofs has not been done in
+Iris before, and here we in fact do **not** use the Iris logic to build our simulation.
+We only depend on the Iris Coq library for its *resource
+algebras* in order to obtain some modularity in our reasoning.
+#### Behaviors/Observations
+Since we formalize the Stacked Borrows semantics with a rather limited language without external observations (for
+example, without `syscall`'s), a program in our language only has the
+following observable behaviors:
+1. The program gets *stuck* (undefined behavior) after a finite number of steps.
+2. The program terminates to a value after finite number of steps.
+3. The program diverges with infinite number of silent steps ($\tau$-steps).
+The definition `behave_` (in [sim/behavior.v](theories/sim/behavior.v)) lists these cases, and
+then we take the greatest fixpoint of `behave_` (a co-induction definition) as
+`behave`, using the [paco] library.
+The greatest fixpoint is taken so that when the program diverges, its observation actually
+has infinite number of steps.
+This can be seen in the definition of `behmatch`, where we only make a
+corecursive call of `behave` in the `inftau` case.
+The theorem `sim_prog_sim_classical` (in [sim/program.v](theories/sim/program.v)) proves that,
+if the program `prog_src` simulates `prog_tgt`, i.e. every function of `prog_src`
+is in the simulation relation with the corresponding function of `prog_tgt`
+(where "corresponding" means they have the same function id in both programs),
+then the possible behaviors of `prog_tgt` are *included* in the possible
+behaviors of `prog_src`.
+As undefined behavior subsumes any behavior, `sim_prog_sim_classical` states
+that if `prog_src` does not get stuck (no undefined behavior), then any
+observation of `prog_tgt` is also an observation of `prog_src`.
+More specifically,
+* if `prog_tgt` can terminate to a value, `prog_src` must also be
+able to terminate to the same value, and
+* if `prog_tgt` can diverge, `prog_src` must also be able to diverge, and
+* `prog_tgt` in fact does not get stuck, because that is not an observation of `prog_src`.
+#### Simulation relations
+The main simulation relation is defined (in the standard way of [paco] proofs) for language expressions in `sim_local_body` (in [sim/local.v](theories/sim/local.v)), and is then lifted to the *function simulation* relation `sim_local_fun`.
+The function simulation is then lifted again to the *program simulation* relation (`sim_local_funs`), as defined above in the description of `sim_prog_sim_classical`.
+The main simulation relation `sim_local_body` is explained with further comments in
+[sim/local.v](theories/sim/local.v).
+Every pair of expression `(src,tgt)` in the relation is indexed with a
+*pre-condition*, which is some resource *r* (whose type is defined in our
+resource algebra *Res*---see Section 2 of the [appendix] for more detail). They
+also have a post-condition.
+The resource *r* allows us to state abstractly which part of the global state
+the pair `(src,tgt)` focuses on.
+It is a rely-guarantee setup that allows us to reason locally about functions' bodies, as long as we maintain the invariant on the global state.
+#### `FnEntry` or `Default` Retagging?
+In the optimization directory ([opt](theories/opt)), one can find different examples, some of which use `FnEntry` retagging, while others use `Default` retagging.
+Here, `FnEntry` is used to retag arguments passed into a *Rust* function.
+Note that "functions" in our language do **not** always correspond to Rust functions.
+Functions in our language can also be used to capture some arbitrary expressions in Rust.
+In those cases, when creating new pointers, the `Default` retag is used.
+#### Concurrency support
+We do not target concurrency at all. This is left for future work.
 ## Publicly available repositories
 ===============================
@@ -164,3 +229,5 @@ The repository is BSD-licensed.
 The relevant commit hashes (used when generating the artifact) can be found
 in the file [generation_data.txt](generation_data.txt).
+[paco]: https://github.com/snu-sf/paco
--- a/theories/opt/ex1.v
+++ b/theories/opt/ex1.v
@@ -45,6 +45,8 @@ Definition ex1_opt : function :=
    "v"
  .
+(** A sketch of this proof is given in Section 3.4 of the Stacked Borrows paper.
+  Here the expressions are calling unknown functions "f" and "g". *)
 Lemma ex1_sim_fun :
  ⊨ᶠ ex1_unopt ≥ ex1_opt.
 Proof.
@@ -58,6 +60,8 @@ Proof.
  intros sarg ->.
  sim_apply sim_simple_let=>/=.
  apply: sim_simple_result.
+  (* Step (1) in the proof sketch:
+      retagging with a fresh tag `tg_n` on top of "x"'s stack. *)
  (* Retag a local place *)
  sim_apply sim_simple_let=>/=.
  apply Forall2_cons_inv in AREL as [AREL _].
@@ -74,7 +78,7 @@ Proof.
    (* Write local *)
    sim_apply sim_simple_write_local; [solve_sim..|].
    intros s ?. simplify_eq.
-  (* Call unknown function *)
+  (* Call unknown function "f" *)
  sim_apply sim_simple_let=>/=.
  sim_apply (sim_simple_call 10 [] [] ε); [solve_sim..|].
  intros rf frs frt ??? ? _ _ FREL. simplify_eq/=.
@@ -89,6 +93,12 @@ Proof.
  sim_apply sim_body_write_unique_1; [solve_sim..|].
  intros ?? Htop. simplify_eq/=.
  sim_apply sim_body_let. simplify_eq/=.
+  (* Step (2) in the proof sketch:
+      If "x" had been accessed by "f", then "f" must have use a tag different
+      from `tg_n`, which in turn implies that `tg_n` would have been popped off
+      "x"'s stack. However, since the write `*{int} "x" <- #[42]` does not have
+      undefined behavior, "f" must not have accessed "x", and "x"'s stack must
+      have remained unchanged. *)
  (* Copy local (right) *)
  sim_apply_r sim_body_copy_local_r; [solve_sim..|].
  apply: sim_body_result=>_/=.
@@ -101,11 +111,13 @@ Proof.
  (* We can go back to simple mode! *)
  eapply (sim_simplify (fun_post_simple c)). { by eauto. }
  simplify_eq/=. clear- AREL FREL LOOK.
-  (* Call unknown function *)
+  (* Call unknown function "g" *)
  sim_apply (sim_simple_call 10 [] [] ε); [solve_sim..|].
  intros rf' frs' frt' ??? ? _ _ FREL'. simplify_eq/=.
  apply: sim_simple_result. simpl.
  sim_apply sim_simple_let=>/=.
+  (* Step (3) in the proof sketch:
+    With similar reasoning to Step (2), "g" must not have accessed "x". *)
  (* Copy local (left). We drop to complex just because simple does not support this yet. *)
  intros σs σt.
  sim_apply_l sim_body_copy_local_l; [solve_sim..|].
@@ -135,13 +147,21 @@ Corollary ex1 (prog: fn_env) :
  let prog_tgt := <["ex1":=ex1_opt]> prog in
  behave_prog prog_tgt <1= behave_prog prog_src.
 Proof.
-  intros Hdec Hwf. apply sim_prog_sim_classical.
+  intros Hdec Hwf.
+  (* To show behavior inclusion, we show that the programs are in the program
+    simulation relation. *)
+  apply sim_prog_sim_classical.
  { apply Hdec. }
  { apply has_main_insert, Hwf; done. }
+  (* Program simulation means that every function in one program simulates the
+    function in the other program with the same function id. *)
  apply sim_mod_funs_local.
  apply sim_mod_funs_insert; first done.
-  - exact: ex1_sim_fun.
+  - (* the unopt version can simulate the opt version of ex1. *)
-  - exact: sim_mod_funs_refl.
+    exact: ex1_sim_fun.
+  - (* for other functions, they simulate themselves. That is, the simulation
+      relation is reflexive. *)
+    exact: sim_mod_funs_refl.
 Qed.
 Print Assumptions ex1.