Huffman with two queues
Huffman coding with two queues instead of a priority queue.
Authors: Jean-Christophe Filliâtre
Topics: Algorithms
Tools: Why3
see also the index (by topic, by tool, by reference, by year)
Huffman coding with two queues (instead of a priority queue)
Huffman coding is an algorithm to build a prefix code given a frequency for each symbol. See https://en.wikipedia.org/wiki/Huffman_coding
Huffman coding is typically implemented with a priority queue. We pick up the two trees with the smallest frequencies frequencies, we build a new tree with the sum of the frequencies, we put it back in the priority queue, until there is a single tree.
However, when the initial list of frequencies is sorted, we can implement the algorithm in a simpler (and more efficient way), using two (regular) queues instead of a priority queue. See http://www.staff.science.uu.nl/~leeuw112/huffman.pdf
In the following, we implement and prove the core of this algorithm, using a sorted list of integers as input and its sum as output (we do not build Huffman trees here). The key here is to prove that the two queues remain sorted.
Author: Jean-Christophe Filliâtre (CNRS) Thanks to Judicaël Courant for pointing this paper out.
use int.Int use seq.Seq use seq.SortedInt use seq.Sum function last (s: seq int) : int = s[length s - 1] lemma add_last: forall s: seq int, x. length s > 0 -> sorted s -> last s <= x -> sorted (snoc s x) lemma sorted_tail: forall s. sorted s -> length s >= 1 -> sorted s[1 .. ] lemma sorted_tail_tail: forall s. sorted s -> length s >= 2 -> sorted s[2 .. ] let huffman_coding (s: seq int) : int requires { length s > 0 } requires { sorted s } ensures { result = sum s } = let ref x = s in let ref y = empty in while length x + length y >= 2 do invariant { length x + length y > 0 } invariant { sum x + sum y = sum s } invariant { sorted x } invariant { sorted y } invariant { length x >= 2 -> length y >= 1 -> last y <= x[0] + x[1] } invariant { length x >= 1 -> length y >= 2 -> last y <= x[0] + y[0] } invariant { length y >= 2 -> last y <= y[0] + y[1] } variant { length x + length y } if length y = 0 then begin y <- snoc y (x[0] + x[1]); x <- x[2 .. ] end else if length x = 0 then begin y <- snoc y[2 .. ] (y[0] + y[1]); end else begin (* both non-empty *) if x[0] <= y[0] then if length x >= 2 && x[1] <= y[0] then begin y <- snoc y (x[0] + x[1]); x <- x[2 .. ] end else begin y <- snoc y[1 .. ] (x[0] + y[0]); x <- x[1 .. ] end else if length y >= 2 && y[1] <= x[0] then begin y <- snoc y[2 .. ] (y[0] + y[1]); end else begin y <- snoc y[1 .. ] (x[0] + y[0]); x <- x[1 .. ] end end done; if length x > 0 then x[0] else y[0]
download ZIP archive
Why3 Proof Results for Project "huffman_with_two_queues"
Theory "huffman_with_two_queues.Top": fully verified
Obligations | Alt-Ergo 2.3.3 | CVC4 1.8 | Z3 4.8.10 | ||||
add_last | 0.02 | --- | --- | ||||
sorted_tail | 0.07 | 0.69 | --- | ||||
sorted_tail_tail | 0.06 | 0.58 | --- | ||||
VC for huffman_coding | --- | --- | --- | ||||
split_vc | |||||||
loop invariant init | 0.01 | 0.06 | --- | ||||
loop invariant init | 0.02 | 0.08 | --- | ||||
loop invariant init | 0.01 | 0.06 | --- | ||||
loop invariant init | 0.02 | 0.08 | --- | ||||
loop invariant init | 0.01 | 0.06 | --- | ||||
loop invariant init | 0.01 | 0.06 | --- | ||||
loop invariant init | 0.01 | 0.06 | --- | ||||
precondition | 0.01 | 0.08 | --- | ||||
loop variant decrease | 0.03 | 0.13 | --- | ||||
loop invariant preservation | 0.02 | 0.08 | --- | ||||
loop invariant preservation | 0.01 | 0.10 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | 0.02 | 0.09 | --- | ||||
loop invariant preservation | 3.12 | --- | --- | ||||
loop invariant preservation | 0.01 | 0.06 | --- | ||||
loop invariant preservation | 0.01 | 0.06 | --- | ||||
precondition | 0.01 | 0.07 | --- | ||||
loop variant decrease | 0.03 | 0.12 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | 0.01 | 0.06 | --- | ||||
loop invariant preservation | --- | --- | --- | ||||
assert (sorted y1[2..]) | |||||||
asserted formula | --- | --- | --- | ||||
assert (length y1 >= 2) | |||||||
asserted formula | 0.01 | 0.07 | --- | ||||
asserted formula | --- | --- | --- | ||||
split_all_full | |||||||
asserted formula | 0.02 | 0.10 | --- | ||||
loop invariant preservation | 0.24 | --- | --- | ||||
loop invariant preservation | 0.01 | 0.06 | --- | ||||
loop invariant preservation | 0.01 | 0.06 | --- | ||||
loop invariant preservation | 2.28 | 1.62 | --- | ||||
precondition | 0.01 | 0.06 | --- | ||||
loop variant decrease | 0.04 | 0.13 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | 2.91 | --- | --- | ||||
loop invariant preservation | --- | --- | 0.21 | ||||
loop invariant preservation | 0.42 | 0.32 | --- | ||||
precondition | 0.01 | 0.09 | --- | ||||
precondition | 0.01 | 0.09 | --- | ||||
loop variant decrease | 0.09 | 0.14 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | --- | --- | --- | ||||
split_all_full | |||||||
loop invariant preservation | --- | --- | --- | ||||
inline_goal | |||||||
loop invariant preservation | 0.03 | 0.11 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | --- | --- | --- | ||||
split_all_full | |||||||
loop invariant preservation | 1.20 | 1.54 | --- | ||||
loop invariant preservation | 0.96 | --- | --- | ||||
loop invariant preservation | 0.68 | 2.00 | --- | ||||
loop invariant preservation | 2.72 | 2.13 | --- | ||||
precondition | 0.01 | 0.06 | --- | ||||
loop variant decrease | 0.04 | 0.12 | --- | ||||
loop invariant preservation | 0.02 | 0.10 | --- | ||||
loop invariant preservation | --- | --- | --- | ||||
split_all_full | |||||||
loop invariant preservation | 0.02 | 0.09 | --- | ||||
loop invariant preservation | 0.01 | 0.06 | --- | ||||
loop invariant preservation | --- | --- | --- | ||||
split_all_full | |||||||
loop invariant preservation | 0.30 | --- | --- | ||||
loop invariant preservation | 0.31 | --- | --- | ||||
loop invariant preservation | 1.59 | 1.19 | --- | ||||
loop invariant preservation | 2.21 | 1.30 | --- | ||||
precondition | 0.01 | 0.09 | --- | ||||
precondition | 0.02 | 0.09 | --- | ||||
loop variant decrease | 0.10 | 0.13 | --- | ||||
loop invariant preservation | 0.02 | 0.09 | --- | ||||
loop invariant preservation | --- | --- | --- | ||||
split_all_full | |||||||
loop invariant preservation | 0.02 | 0.09 | --- | ||||
loop invariant preservation | 0.02 | 0.08 | --- | ||||
loop invariant preservation | --- | --- | --- | ||||
split_all_full | |||||||
loop invariant preservation | 0.53 | 1.74 | --- | ||||
loop invariant preservation | 1.88 | --- | --- | ||||
loop invariant preservation | 0.62 | 1.80 | --- | ||||
loop invariant preservation | 0.96 | --- | --- | ||||
postcondition | 0.19 | 0.14 | --- |