8

Go实现Huffman Tree

 2 years ago
source link: https://allenwind.github.io/blog/3626/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client
Mr.Feng Blog

NLP、深度学习、机器学习、Python、Go

Go实现Huffman Tree

关于哈夫曼树的知识见旧文哈夫曼树和哈夫曼编码的原理和实现。本文讲解Huffman Tree和Huffman Code的Go语言实现。对比旧文中Python的实现,可以体会到Go语言具有Python一样的简洁和效率。Go语言内置了map数据结构,我们不自己实现。

接下来的实现,使用了Go语言的map、slice、struct,尽量使用Go语言标准库自带的元素,一方面代码简洁,另外一方面标准库是经得起性能考验的。

首先我们定义树的结点

type Node struct {
value string
weigth int
left *Node
right *Node
index int
}

实现优先队列,优先队列通常使用堆实现,Go语言的标准库container/heap中有Heap的接口,我们根据该接口实现优先队列

type PriorityQueue []*Node

func (pq PriorityQueue) Len() int {
return len(pq)
}

func (pq PriorityQueue) Swap(i, j int) {
pq[i], pq[j] = pq[j], pq[i]
pq[i].index = i
pq[j].index = j
}

func (pq *PriorityQueue) Push(x interface{}) {
n := len(*pq)
node := x.(*Node)
node.index = n
*pq = append(*pq, node)
}

func (pq *PriorityQueue) Pop() interface{} {
old := *pq
n := len(old)
node := old[n-1]
node.index = -1
*pq = old[0 : n-1]
return node
}

我们也可以自己实现,实现堆的关键操作是上浮下沉

func (pq *PriorityQueue) Swim(k int) {
for k > 1 && pq.Less(k/2, k) {
pq.Exch(k/2, k)
k = k / 2
}
}

func (pq *PriorityQueue) Sink(k int) {
for 2*k <= pq.length {
j := 2 * k
if j < pq.length && pq.Less(j, j+1) {
j += 1
}
if !pq.Less(k, j) {
break
}
pq.Exch(k, j)
k = j
}
}

对于输入的字符串,我们需要统计它们出现的频次,以此作为权值

func Counter(s string) map[string]int {
c := make(map[string]int)
for i := 0; i < len(s); i++ {
c[string(s[i])] = c[string(s[i])] + 1
}
return c
}

实现哈夫曼树

func BuildHuffmanTree(nodes map[string]int) *Node {
size := len(nodes) + 1
queue := NewPriorityQueue(size)
for value, weight := range nodes {
node := &Node{
value: value,
weight: weight,
}
queue.Push(node)
}

for queue.Len() > 1 {
leftChild := queue.Pop()
rightChild := queue.Pop()
root := &Node{
weight: leftChild.weight + rightChild.weight,
left: leftChild,
right: rightChild,
}
queue.Push(root)
}

return queue.Pop()
}

实现哈夫曼编码

func huffmanCodes(node *Node, prefix string, codes map[string]string) {
if node.value != "" {
codes[node.value] = prefix
} else {
leftPrefix := prefix + "0"
huffmanCodes(node.left, leftPrefix, codes)
rightPrefix := prefix + "1"
huffmanCodes(node.right, rightPrefix, codes)
}
}

func GenerateHuffmanCodes(s string) map[string]string {
c := Counter(s)
tree := BuildHuffmanTree(c)
codes := make(map[string]string)
huffmanCodes(tree, "", codes)
return codes
}

输入”littlefeng”,试结果

map[i:101 g:1100 n:1101 e:111 l:00 t:01 f:100]

哈夫曼编码结果是不唯一的,因为同层权值一样的结点是可以交换。在Go语言语言中,导致不唯一的直接原因是遍历map的顺序是不固定。


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK