wp_store/wp_load run slowly with multiple points to assumptions in context

The wp_store and wp_load tactics appear to be very sensitive to the order of points to assumptions in the context. Consider the following snippet, placed after heap_e_spec in tests/heap_lang.v

  Definition heap_e2  : expr [] :=
    let: "x" := ref #1 in
    let: "y" := ref #1 in
    '"x" <- !'"x" + #1 ;; !'"x".
  Lemma heap_e2_spec E N :
     nclose N ⊆ E → heap_ctx N ⊢ WP heap_e2 @ E {{ v, v = #2 }}.
  Proof.
    iIntros {HN} "#?". rewrite /heap_e2. iApply (wp_mask_weaken N); first done.
    wp_alloc l. wp_let.
    wp_alloc l'. wp_let.
    time wp_load.
  Abort.

The wp_load takes 109 seconds to complete on my machine. The problem appears to be in the call to iAssumptionCore in wp_load, which tries to find an assumption l |-> ?v in the context. Because of the order of assumptions, it will first try to unify with l' |-> 1. For some reason, this does not fail instantaneously. You can check that this is the source of delay with the following example:

Remark bad_unify (l l': loc): True.
Proof.
(evar (v: Z );
  let v0 := eval unfold v in v in
unify (l ↦ #v0)%I (l' ↦ #1)%I).
Abort.

The obvious work-around is to just re-order the two points to facts in the context, or use "remember" to "hide" one of them temporarily, but this is obviously brittle. I am not an expert on how unification works under the hood, so it is hard for me to diagnose why it's so slow here.

First, does this occur for others? I am using coq 8.5pl1 and ssreflect ad273277ab38b. If so, is there some more robust work-around?