Skip to content

More efficient `encode`/`decode` for `string`/`ascii`

Robbert Krebbers requested to merge robbert/string_ascii_countable into master

When experimenting with the benchmarks in https://github.com/xavierleroy/canonical-binary-tries I noticed that our encode function for Countable of string is slow: it creates a list for each char which is then converted to a positive and appended to the result.

https://github.com/xavierleroy/canonical-binary-tries/blob/v2/lib/String2pos.v contains a better version, which I have adapted to std++ style. (That repository is BSD licensed so we can use it, our comment contains a link to give credit, of course.).

I also changed the coding of ascii by reusing the one for string; this one is also faster as the benchmark below shows.

The decode function still converts to list, but I don't know how to avoid that without writing 2^8 = 256 cases.

From stdpp Require Import strings countable.
Import Ascii.

Definition long := "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec euismod aliquam dui, non ultrices nisl rhoncus pretium. Donec tristique at diam eget commodo. Ut elit diam, maximus ac metus at, commodo malesuada diam. Morbi vehicula tortor et mauris malesuada, ac congue mauris ultrices. Donec id faucibus tellus. Vestibulum vulputate molestie urna, in sagittis tortor. Pellentesque placerat orci ut velit mattis consectetur. Mauris non nulla sit amet nisi egestas pellentesque. Aenean tincidunt id leo et placerat. Nam fermentum lectus sit amet pretium egestas. Lorem ipsum dolor sit amet, consectetur adipiscing elit.".

Time Eval vm_compute in Pos.iter (λ s, default "" $ decode (encode s)) long 10000.
(* old: 4.759s *)
(* new encode: 1.876s *)
(* new encode/decode: 0.916s *)

Time Eval vm_compute in (Pos.iter (λ s, default "b" $ decode (encode s)) "a" 1000000)%char.
(* old: 2.636s *)
(* new encode: 0.493s *)
(* newer encode/decode: 0.308s *)
Edited by Robbert Krebbers

Merge request reports

Loading