diff --git a/HeapLang.md b/HeapLang.md
new file mode 100644
index 0000000000000000000000000000000000000000..0ac3a23b94776714c0537cc66155bc46b9251e56
--- /dev/null
+++ b/HeapLang.md
@@ -0,0 +1,130 @@
+# HeapLang overview
+
+HeapLang is the example language that gets shipped with Iris.  It is not the
+only language you can reason about with Iris, but meant as a reasonable demo
+language for simple examples.
+
+## Language
+
+HeapLang is a lambda-calculus with operations to allocate individual locations,
+`load`, `store`, `CAS` (compare-and-swap) and `FAA` (fetch-and-add). Moreover,
+it has a `fork` construct to spawn new threads.  In terms of values, we have
+integers, booleans, unit, heap locations, as well as (binary) sums and products.
+
+Recursive functions are the only binders, so the sum elimination (`Case`)
+expects both branches to be of function type and passes them the data component
+of the sum.
+
+For technical reasons, the only terms that are considered values are those that
+begin with the `Val` expression former.  This means that, for example, `Pair
+(Val a) (Val b)` is *not* a value -- it reduces to `Val (PairV a b)`, which is.
+This leads to some administrative redexes, and to a distinction between "value
+pairs", "value sums", "value closures" and their "expression" counterparts.
+
+However, this also makes values syntactically uniform, which we exploit in the
+definition of substitution which just skips over `Val` terms, because values
+should be closed and hence not affected by substitution.  As a consequence, we
+can entirely avoid even talking about "closed terms", that notion just does not
+have to come up anywhere.  We also exploit this when writing specifications,
+because we can just write lemmas involving terms of the form `Val ?v` and Coq
+can determine `?v` by unification (because all values start with the `Val`
+constructor).
+
+## Notation
+
+Notation for writing HeapLang terms is defined in
+[`notation.v`](theories/heap_lang/notation.v).  There are two scopes, `%E` for
+expressions and `%V` for values.  For example, `(a, b)%E` is an expression pair
+and `(a, b)%V` a value pair.  The `e` of a `WP e {{ Q }}` is implicitly in `%E`
+scope.
+
+We define a whole lot of short-hands, such as non-recursive functions (`λ:`),
+let-bindings, sequential composition, and a more conventional `match:` that has
+binders in both branches.
+
+Noteworthy is the fact that functions (`rec:`, `λ:`) in the value scope (`%V`)
+are *locked*.  This is to prevent them from being unfolded and reduced too
+eagerly.
+
+## Tactics
+
+HeapLang comes with a bunch of tactics that facilitate stepping through HeapLang
+programs as part of proving a weakest precondition.  All of these tactics assume
+that the current goal is of the shape `WP e @ E {{ Q }}`.
+
+Tactics to take one or more pure program steps:
+
+- `wp_pure`: Perform one pure reduction step.  Pure steps are defined by the
+  `PureExec` typeclass and include beta reduction, projections, constructors, as
+  well as unary and binary arithmetic operators.
+- `wp_pures`: Perform as many pure reduction steps as possible.
+- `wp_rec`, `wp_lam`: Perform a beta reduction.  Unlike `wp_pure`, this will
+  also reduce locked lambdas.
+- `wp_let`, `wp_seq`: Reduce a let-binding or a sequential composition.
+- `wp_proj`: Reduce a projection.
+- `wp_if_true`, `wp_if_false`, `wp_if`: Reduce a conditional expression. The
+  discriminant must already be `true` or `false`.
+- `wp_unop`, `wp_binop`, `wp_op`: Reduce a unary, binary or either kind of
+  arithmetic operator.
+- `wp_case`, `wp_match`: Reduce `Case`/`match:` constructs.
+- `wp_inj`, `wp_pair`, `wp_closure`: Reduce constructors that turn expression
+  sums/pairs/closures into their value counterpart.
+
+Tactics for the heap:
+
+- `wp_alloc l as "H"`: Reduce an allocation instruction and call the new
+  location `l` (in the Coq context) and the points-to assertion `H` (in the
+  spatial context).  You can leave away the `as "H"` to introduce it as an
+  anonymous assertion, i.e., that is equivalent to `as "?"`.
+- `wp_load`: Reduce a load operation.  This automatically finds the points-to
+  assertion in the spatial context, and fails if it cannot be found.
+- `wp_store`: Reduce a store operation.  This automatically finds the points-to
+  assertion in the spatial context, and fails if it cannot be found.
+- `wp_cas_suc`, `wp_cas_fail`: Reduce a succeeding/failing CAS.  This
+  automatically finds the points-to assertion.  It also automatically tries to
+  solve the (in)equality to show that the CAS succeeds/fails, and opens a new
+  goal if it cannot prove this goal.
+- `wp_cas as H1 | H2`: Reduce a CAS, performing a case distinction over whether
+  it succeeds or fails.  This automatically finds the points-to assertion.  The
+  proof of equality in the first new subgoal will be called `H1`, and the proof
+  of the inequality in the second new subgoal will be called `H2`.
+- `wp_faa`: Reduce a FAA.  This automatically finds the points-to assertion.
+
+Further tactics:
+
+- `wp_bind pat`: Apply the bind rule to "focus" the term matching `pat`.  For
+  example, `wp_bind (!_)%E` focuses a load operation.  This is useful in
+  particular when accessing invariants, which is only possible when the `WP` in
+  the goal is for a single, atomic operation -- `wp_bind` can be used to bring
+  the goal into the right shape.
+- `wp_apply pm_trm`: Apply a lemma whose conclusion is a `WP`, automatically
+  applying `wp_bind` as needed.  See the [ProofMode docs](ProofMode.md) for an
+  explanation of `pm_trm`.
+
+There is no tactic for `Fork`, just do `wp_apply wp_fork`.
+
+## Notation and lemmas for derived notions involving a thunk
+
+Sometimes, it is useful to define a derived notion in HeapLang that involves
+thunks.  For example, the parallel composition `e1 ||| e2` is defineable in
+HeapLang, but that requires thunking `e1` and `e2` before passing them to
+`par`. (This is defined in [`par.v`](theories/heap_lang/lib/par.v).)  However,
+this is somewhat subtle because of the distinction between expression lambdas
+and value lambdas.
+
+The normal `e1 ||| e2` notation uses expression lambdas, because clearly we want
+`e1` and `e2` to behave normal under substitution (which they would not in a
+value lambda).  However, the *specification* for parallel composition should use
+value lambdas, because prior to applying it the term will be reduced as much as
+possible to achieve a normal form.  To facilitate this, we define a copy of the
+`e1 ||| e2` notation in the value scope that uses value lambdas.  This is not
+actually a value, but we still but it in the value scope to differentiate from
+the other notation that uses expression lambdas.  (In the future, we might
+decide to add a separate scope for this.)  Then, we write the canonical
+specification using the notation in the value scope.
+
+This works very well for non-recursive notions.  For `while` loops, the
+situation is unfortunately more complex and proving the desired specification
+will likely be more involved than expected, see this [discussion].
+
+[discussion]: https://gitlab.mpi-sws.org/iris/iris/merge_requests/210#note_32842
diff --git a/ProofMode.md b/ProofMode.md
index a6446801c0da9fd3bf6a65f46d96ed9e421af9cd..b165c01a5cd895cf2126ffd4aff53844e8849312 100644
--- a/ProofMode.md
+++ b/ProofMode.md
@@ -335,3 +335,8 @@ Proof mode terms can be written down using the following shorthand syntaxes, too
         (H $! t1 ... tn)
         H
 
+HeapLang tactics
+================
+
+If you came here looking for the `wp_` tactics, those are described in the
+[HeapLang documentation](HeapLang.md).
diff --git a/README.md b/README.md
index aa81a7c3e51aa8a6cef3f59e2d093904130c9816..b99d813d25eaf9c88d573e28171380fa00d406bc 100644
--- a/README.md
+++ b/README.md
@@ -114,6 +114,8 @@ that should be compatible with this version:
 * Information on how to set up your editor for unicode input and output is
   collected in [Editor.md](Editor.md).
 * The Iris Proof Mode (IPM) / MoSeL is documented at [ProofMode.md](ProofMode.md).
+* HeapLang (the Iris example language) and its tactics are documented at
+  [HeapLang.md](HeapLang.md).
 * Naming conventions are documented at [Naming.md](Naming.md).
 * The generated coqdoc is [available online](https://plv.mpi-sws.org/coqdoc/iris/).
 * Discussion about the Iris Coq development happens on the mailing list
diff --git a/theories/heap_lang/lib/coin_flip.v b/theories/heap_lang/lib/coin_flip.v
index 57b085436cbc529a968614e52528d67b8272b304..7e3569b1eb12564cb7d097c37992a63ecffeb34e 100644
--- a/theories/heap_lang/lib/coin_flip.v
+++ b/theories/heap_lang/lib/coin_flip.v
@@ -1,7 +1,7 @@
 From iris.base_logic.lib Require Export invariants.
 From iris.program_logic Require Export atomic.
 From iris.proofmode Require Import tactics.
-From iris.heap_lang Require Import proofmode notation par.
+From iris.heap_lang Require Import proofmode notation.
 Set Default Proof Using "Type".
 
 (** Nondeterminism and Speculation:
diff --git a/theories/heap_lang/lib/increment.v b/theories/heap_lang/lib/increment.v
index 79f600e20796460be3f67970f9e6e9f11fd0b463..5b4c6129f7c3060b4e37c4a77f76cb8ebd2e8e62 100644
--- a/theories/heap_lang/lib/increment.v
+++ b/theories/heap_lang/lib/increment.v
@@ -155,7 +155,7 @@ Section increment_client.
       (* The continuation: From after the atomic triple to the postcondition of the WP *)
       done.
     }
-    wp_apply par_spec; wp_pures.
+    wp_apply wp_par.
     - iAssumption.
     - iAssumption.
     - iIntros (??) "_ !>". done.
diff --git a/theories/heap_lang/lib/par.v b/theories/heap_lang/lib/par.v
index 5fd18ec3f008159b067604eca73d700595c6bf18..558853f4f48d15d4eae8f61cffd88981686e4abd 100644
--- a/theories/heap_lang/lib/par.v
+++ b/theories/heap_lang/lib/par.v
@@ -12,6 +12,7 @@ Definition par : val :=
     let: "v1" := join "handle" in
     ("v1", "v2").
 Notation "e1 ||| e2" := (par (λ: <>, e1)%E (λ: <>, e2)%E) : expr_scope.
+Notation "e1 ||| e2" := (par (λ: <>, e1)%V (λ: <>, e2)%V) : val_scope.
 
 Section proof.
 Local Set Default Proof Using "Type*".
@@ -37,7 +38,7 @@ Qed.
 Lemma wp_par (Ψ1 Ψ2 : val → iProp Σ) (e1 e2 : expr) (Φ : val → iProp Σ) :
   WP e1 {{ Ψ1 }} -∗ WP e2 {{ Ψ2 }} -∗
   (∀ v1 v2, Ψ1 v1 ∗ Ψ2 v2 -∗ ▷ Φ (v1,v2)%V) -∗
-  WP e1 ||| e2 {{ Φ }}.
+  WP (e1 ||| e2)%V {{ Φ }}.
 Proof.
   iIntros "H1 H2 H".
   wp_apply (par_spec Ψ1 Ψ2 with "[H1] [H2] [H]"); [by wp_lam..|auto].