More efficient `encode`/`decode` for `string`/`ascii`
When experimenting with the benchmarks in https://github.com/xavierleroy/canonical-binary-tries I noticed that our encode
function for Countable
of string
is slow: it creates a list
for each char which is then converted to a positive
and appended to the result.
https://github.com/xavierleroy/canonical-binary-tries/blob/v2/lib/String2pos.v contains a better version, which I have adapted to std++ style. (That repository is BSD licensed so we can use it, our comment contains a link to give credit, of course.).
I also changed the coding of ascii
by reusing the one for string
; this one is also faster as the benchmark below shows.
The decode
function still converts to list
, but I don't know how to avoid that without writing 2^8 = 256 cases.
From stdpp Require Import strings countable.
Import Ascii.
Definition long := "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec euismod aliquam dui, non ultrices nisl rhoncus pretium. Donec tristique at diam eget commodo. Ut elit diam, maximus ac metus at, commodo malesuada diam. Morbi vehicula tortor et mauris malesuada, ac congue mauris ultrices. Donec id faucibus tellus. Vestibulum vulputate molestie urna, in sagittis tortor. Pellentesque placerat orci ut velit mattis consectetur. Mauris non nulla sit amet nisi egestas pellentesque. Aenean tincidunt id leo et placerat. Nam fermentum lectus sit amet pretium egestas. Lorem ipsum dolor sit amet, consectetur adipiscing elit.".
Time Eval vm_compute in Pos.iter (λ s, default "" $ decode (encode s)) long 10000.
(* old: 4.759s *)
(* new encode: 1.876s *)
(* new encode/decode: 0.916s *)
Time Eval vm_compute in (Pos.iter (λ s, default "b" $ decode (encode s)) "a" 1000000)%char.
(* old: 2.636s *)
(* new encode: 0.493s *)
(* newer encode/decode: 0.308s *)
Edited by Robbert Krebbers