作者公众号:一角钱技术(org_yijiaoqian)
Heap:可以迅速找到一堆数中的最大或者最小值的数据结构。
将根节点最大的堆叫做大顶堆或大根堆,根节点最小的堆叫做小顶堆或小根堆。
常见的堆有二叉堆、裴波那契堆等。
堆本身是一个相对比较抽象的数据结构,那么它有具体的实现就分为二叉堆(二项堆、Binary)、裴波那契堆(基于树的)。那么要实现的话一般面试来说或者经常会用的话就是二叉堆来实现,当然在工业级比较牛逼的应用都是裴波那契以及非常严格裴波那契。裴波那契堆因为相对比较复杂,你可以不知道去怎么实现,但是你可以看它的时空复杂度更好,它的实现也是基于树,但是并不少二叉树而是多叉的。
假设是大顶堆,则常见操作(API):
“堆” 这种数据结构为什么会有用? 在很多线上中的情形经常会在使用,比如说经常一个数一个数的过来,同时还有从另外一边的化经常去删除一些数据,问你这些数据里面它最大值是多少?这里的最大值可能就代表优先级最高的结点或者是那个数需要你先处理的,比如说你的任务流里面随时你要拿出优先级最高的任务优先处理,那么这种数据结构就更加的好用。
有些人可能会想我用数组来实现可不可以?或者是我直接排序可不可以?这里我们可以思考一下: 假设维护一个数组,每次有个新元素插入进来,你就需要把整个数组进行一次所谓的排序,这样的话时间复杂度就会是 nlogn,这样的话就是不够高效的,另外删除也可能比较麻烦,假设删除的是最大值或最小值在最后面的话还好,可以是 O(1) 的时间复杂度,但是如果在最前面的话,你就必须把整个数组最前面元素删除掉之后,把整个数组往前挪,这样时间复杂度又变差了。
不同的实现比较:
https://en.wikipedia.org/wiki/Heap_(data_structure)
当你提到“堆” 的时候,不要默认认为是二叉堆,同时你要知道堆的实现又很多种,而二叉堆本身的话只是因为它相对比较容易实现,它的时间效率是堆里面算比较差的。
通过完全二叉树来实现(注意:不是二叉搜索树);
什么是完全二叉树? 就是它的根和每一级结点都是满的,除了最下面一层叶子结点可能不满之外,其他上面结点都是满的。
二叉堆(大顶)它满足下列性质:
由于性质2,就可以保证根结点是最大的结点,所以我们访问最大值就直接返回这个根结点值。这里思考一个问题:为什么要做成树的结构呢?其实就要方便,因为找最大值是O(1)了是满足的,但是问题是后续你要删除一个元素或者你要添加一个元素,怎么让这个操作高效,至少是 logn 的。
因为把一个完全二叉树依次放到一维数组里面去,那么它的孩子和父亲之间的关系,就可以直接用下班的数学关系表示了。
当这个堆要进行维护操作,也就是插入元素的时候以及你要删除元素的时候要怎么做?
例子:85 添加到二叉堆中
操作的细节分为两步:
85 按照上面讲的先插入到堆的尾部,也就是一维数组的尾部,一维数组的尾部的话就上图的位置,因为这是一个完全二叉树,所以它的尾部就是50后面这个结点。插进来之后这个时候就破坏了堆,它的每一个结点都要大于它的儿子的这种属性了,接下来要做的事情就是要把 85 依次地向上浮动,怎么浮动?就是 85 大于它的父亲结点,那么就和父亲结点进行交换,直到走到根如果大于根的话,就和根也进行交换。
85 再继续往前走之后,它要和 80 再进行比较,同理可得:也就是说这个结点每次和它的父亲比,如果它大于它的父亲的话就交换,直到它不再大于它的父亲。
例子:90 从二叉堆中删除
// Java
import java.util.Arrays;
import java.util.NoSuchElementException;
public class BinaryHeap {
private static final int d = 2;
private int[] heap;
private int heapSize;
/**
* This will initialize our heap with default size.
*/
public BinaryHeap(int capacity) {
heapSize = 0;
heap = new int[capacity + 1];
Arrays.fill(heap, -1);
}
public boolean isEmpty() {
return heapSize == 0;
}
public boolean isFull() {
return heapSize == heap.length;
}
private int parent(int i) {
return (i - 1) / d;
}
private int kthChild(int i, int k) {
return d * i + k;
}
/**
* Inserts new element in to heap
* Complexity: O(log N)
* As worst case scenario, we need to traverse till the root
*/
public void insert(int x) {
if (isFull()) {
throw new NoSuchElementException("Heap is full, No space to insert new element");
}
heap[heapSize] = x;
heapSize ++;
heapifyUp(heapSize - 1);
}
/**
* Deletes element at index x
* Complexity: O(log N)
*/
public int delete(int x) {
if (isEmpty()) {
throw new NoSuchElementException("Heap is empty, No element to delete");
}
int maxElement = heap[x];
heap[x] = heap[heapSize - 1];
heapSize--;
heapifyDown(x);
return maxElement;
}
/**
* Maintains the heap property while inserting an element.
*/
private void heapifyUp(int i) {
int insertValue = heap[i];
while (i > 0 && insertValue > heap[parent(i)]) {
heap[i] = heap[parent(i)];
i = parent(i);
}
heap[i] = insertValue;
}
/**
* Maintains the heap property while deleting an element.
*/
private void heapifyDown(int i) {
int child;
int temp = heap[i];
while (kthChild(i, 1) < heapSize) {
child = maxChild(i);
if (temp >= heap[child]) {
break;
}
heap[i] = heap[child];
i = child;
}
heap[i] = temp;
}
private int maxChild(int i) {
int leftChild = kthChild(i, 1);
int rightChild = kthChild(i, 2);
return heap[leftChild] > heap[rightChild] ? leftChild : rightChild;
}
/**
* Prints all elements of the heap
*/
public void printHeap() {
System.out.print("nHeap = ");
for (int i = 0; i < heapSize; i++)
System.out.print(heap[i] + " ");
System.out.println();
}
/**
* This method returns the max element of the heap.
* complexity: O(1)
*/
public int findMax() {
if (isEmpty())
throw new NoSuchElementException("Heap is empty.");
return heap[0];
}
public static void main(String[] args) {
BinaryHeap maxHeap = new BinaryHeap(10);
maxHeap.insert(10);
maxHeap.insert(4);
maxHeap.insert(9);
maxHeap.insert(1);
maxHeap.insert(7);
maxHeap.insert(5);
maxHeap.insert(3);
maxHeap.printHeap();
maxHeap.delete(5);
maxHeap.printHeap();
maxHeap.delete(2);
maxHeap.printHeap();
}
}
C/C++
#include <IOStream>
using namespace std;
class BinaryHeap {
public:
BinaryHeap(int capacity);
void insert(int x);
int erase(int x);
int findMax();
void printHeap();
bool isEmpty() { return heapSize == 0; }
bool isFull() { return heapSize == capacity; }
~BinaryHeap() { delete[] heap; }
private:
void heapifyUp(int i);
void heapifyDown(int i);
int maxChild(int i);
int parent(int i) { return (i - 1) / 2; }
int kthChild(int i, int k) { return 2 * i + k; }
private:
int *heap;
int heapSize;
int capacity;
};
/**
* This will initialize our heap with default size.
*/
BinaryHeap::BinaryHeap(int capacity) {
this->heapSize = 0;
this->capacity = capacity;
this->heap = new int[capacity + 5];
}
/**
* Inserts new element in to heap
* Complexity: O(log N)
* As worst case scenario, we need to traverse till the root
*/
void BinaryHeap::insert(int x) {
try {
if (isFull())
throw -1;
heap[heapSize] = x;
heapSize ++;
heapifyUp(heapSize - 1);
return ;
} catch (int e) {
cout << "Heap is full, No space to insert new element" << endl;
exit(-1);
}
}
/**
* Deletes element at index x
* Complexity: O(log N)
*/
int BinaryHeap::erase(int x) {
try {
if (isEmpty())
throw -1;
int maxElement = heap[x];
heap[x] = heap[heapSize - 1];
heapSize--;
heapifyDown(x);
return maxElement;
} catch (int e) {
cout << "Heap is empty, No element to delete" << endl;
exit(-1);
}
}
/**
* Maintains the heap property while inserting an element.
*/
void BinaryHeap::heapifyUp(int i) {
int insertValue = heap[i];
while (i > 0 && insertValue > heap[parent(i)]) {
heap[i] = heap[parent(i)];
i = parent(i);
}
heap[i] = insertValue;
}
/**
* Maintains the heap property while deleting an element.
*/
void BinaryHeap::heapifyDown(int i) {
int child;
int temp = heap[i];
while (kthChild(i, 1) < heapSize) {
child = maxChild(i);
if (temp >= heap[child]) {
break;
}
heap[i] = heap[child];
i = child;
}
heap[i] = temp;
}
int BinaryHeap::maxChild(int i) {
int leftChild = kthChild(i, 1);
int rightChild = kthChild(i, 2);
return heap[leftChild] > heap[rightChild] ? leftChild : rightChild;
}
/**
* This method returns the max element of the heap.
* complexity: O(1)
*/
int BinaryHeap::findMax() {
try {
if (isEmpty())
throw -1;
return heap[0];
} catch (int e) {
cout << "Heap is empty." << endl;
exit(-1);
}
}
/**
* Prints all elements of the heap
*/
void BinaryHeap::printHeap() {
cout << "nHeap = ";
for (int i = 0; i < heapSize; i++)
cout << heap[i] << " ";
cout << endl;
return ;
}
int main() {
BinaryHeap maxHeap(10);
maxHeap.insert(10);
maxHeap.insert(4);
maxHeap.insert(9);
maxHeap.insert(1);
maxHeap.insert(7);
maxHeap.insert(5);
maxHeap.insert(3);
maxHeap.printHeap();
maxHeap.erase(5);
maxHeap.printHeap();
maxHeap.erase(2);
maxHeap.printHeap();
return 0;
}
// JavaScript
class BinaryHeap {
constructor(compare) {
this.data = [];
this.compare = compare;
}
insert(value) {
this.insertAt(this.data.length, value);
}
insertAt(index, value) {
this.data[index] = value;
// 对比当前节点与其父节点,如果当前节点更小就交换它们
while (index > 0 && this.compare(value, this.data[Math.floor((index - 1) / 2)]) < 0) {
this.data[index] = this.data[Math.floor((index - 1) / 2)];
this.data[Math.floor((index - 1) / 2)] = value;
index = Math.floor((index - 1) / 2);
}
}
delete(index) {
if (this.data.length === 0) return;
let value = this.data[index];
let i = index;
// fix heap
while (i < this.data.length) {
let left = i * 2 + 1;
let right = i * 2 + 2;
// 没有左子节点
if (left >= this.data.length) break;
// 没有右子节点
if (right >= this.data.length) {
this.data[i] = this.data[left];
i = left;
break;
}
// 比较左右子节点的大小,更小的补到父节点
if (this.compare(this.data[left], this.data[right]) < 0) {
this.data[i] = this.data[left];
i = left;
} else {
this.data[i] = this.data[right];
i = right;
}
}
// 查看最后的空位是不是最后的叶子节点
if (i < this.data.length - 1) {
this.insertAt(i, this.data.pop());
} else {
this.data.pop();
}
return value;
}
printHeap() {
console.log("nHeap = ");
console.log(this.data);
}
}
let maxHeap = new BinaryHeap((a, b) => b - a);
maxHeap.insert(10);
maxHeap.insert(4);
maxHeap.insert(9);
maxHeap.insert(1);
maxHeap.insert(7);
maxHeap.insert(5);
maxHeap.insert(3);
maxHeap.printHeap();
maxHeap.delete(5);
maxHeap.printHeap();
maxHeap.delete(2);
maxHeap.printHeap();
Python
import sys
class BinaryHeap:
def __init__(self, capacity):
self.capacity = capacity
self.size = 0
self.Heap = [0]*(self.capacity + 1)
self.Heap[0] = -1 * sys.maxsize
self.FRONT = 1
def parent(self, pos):
return pos//2
def leftChild(self, pos):
return 2 * pos
def rightChild(self, pos):
return (2 * pos) + 1
def isLeaf(self, pos):
if pos >= (self.size//2) and pos <= self.size:
return True
return False
def swap(self, fpos, spos):
self.Heap[fpos], self.Heap[spos] = self.Heap[spos], self.Heap[fpos]
def heapifyDown(self, pos):
if not self.isLeaf(pos):
if (self.Heap[pos] > self.Heap[self.leftChild(pos)] or
self.Heap[pos] > self.Heap[self.rightChild(pos)]):
if self.Heap[self.leftChild(pos)] < self.Heap[self.rightChild(pos)]:
self.swap(pos, self.leftChild(pos))
self.heapifyDown(self.leftChild(pos))
else:
self.swap(pos, self.rightChild(pos))
self.heapifyDown(self.rightChild(pos))
def insert(self, element):
if self.size >= self.capacity :
return
self.size+= 1
self.Heap[self.size] = element
current = self.size
while self.Heap[current] < self.Heap[self.parent(current)]:
self.swap(current, self.parent(current))
current = self.parent(current)
def Print(self):
for i in range(1, (self.size//2)+1):
print(" PARENT : "+ str(self.Heap[i])+" LEFT CHILD : "+
str(self.Heap[2 * i])+" RIGHT CHILD : "+
str(self.Heap[2 * i + 1]))
def minHeap(self):
for pos in range(self.size//2, 0, -1):
self.heapifyDown(pos)
def delete(self):
popped = self.Heap[self.FRONT]
self.Heap[self.FRONT] = self.Heap[self.size]
self.size-= 1
self.heapifyDown(self.FRONT)
return popped
def isEmpty(self):
return self.size == 0
def isFull(self):
return self.size == self.capacity
if __name__ == "__main__":
print('The minHeap is ')
minHeap = BinaryHeap(5)
minHeap.insert(5)
minHeap.insert(3)
minHeap.insert(17)
minHeap.insert(10)
minHeap.insert(84)
minHeap.insert(19)
minHeap.insert(6)
minHeap.insert(22)
minHeap.insert(9)
minHeap.minHeap()
minHeap.Print()
print("The Min val is " + str(minHeap.delete()))
如果只是说堆什么?那么它是一个抽象的数据结构,表示可以非常迅速地拿到一堆数里面的最大值或者最小值,它并不少二叉堆,那么二叉堆和其他的各种堆,维基百科里面有详细说明:
https://en.wikipedia.org/wiki/Heap_(data_structure)
注意:二叉堆只是堆的一种实现形式,二叉堆为什么出现得多?是因为它较为常见且简单,但是它并不是最优的实现,正是因为这个原因,所以二叉堆很多时候并不是完全的那么实用,那么如果在工程的代码里面我们可以直接调 优先队列(priority_queue) 就行了。