Heap

Heap is a special tree structure in which each parent node is less than or equal to its child node. Then it is called a Min Heap.

If each parent node is greater than or equal to its child node then it is called a max heap. It is very useful is implementing priority queues where the queue item with higher weightage is given more priority in processing.

heap

Heap Representation

Heap is a complete binary tree. A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible.

Since a Heap is a complete binary tree, it can be easily represented as array and array based representation is space efficient.

If the parent node is stored at index I, the left child can be calculated by 2 * I and right child by 2 * I + 1 (assuming the indexing starts at 1).

Let’s visualize: Heap Visualization

class MinHeap:
    def __init__(self, size):
        self.heap = [0] * (size + 1)
        self.heap_size = 0

    def left(self, i):
        return 2 * i

    def right(self, i):
        return 2 * i + 1

    def parent(self, i):
        return i // 2

    def insert(self, key):
        if self.heap_size >= len(self.heap) - 1:
            return "Heap is full"
        
        self.heap_size += 1
        i = self.heap_size
        self.heap[i] = key

        while i > 1 and self.heap[self.parent(i)] > self.heap[i]:
            self.heap[i], self.heap[self.parent(i)] = self.heap[self.parent(i)], self.heap[i]
            i = self.parent(i)

    def min_heapify(self, i):
        left = self.left(i)
        right = self.right(i)
        smallest = i

        if left <= self.heap_size and self.heap[left] < self.heap[i]:
            smallest = left

        if right <= self.heap_size and self.heap[right] < self.heap[smallest]:
            smallest = right

        if smallest != i:
            self.heap[i], self.heap[smallest] = self.heap[smallest], self.heap[i]
            self.min_heapify(smallest)

    def extract_min(self):
        if self.heap_size == 0:
            return "Heap is empty"

        root = self.heap[1]
        self.heap[1] = self.heap[self.heap_size]
        self.heap_size -= 1
        self.min_heapify(1)

        return root

    def decrease_key(self, i, new_val):
        self.heap[i] = new_val
        while i > 1 and self.heap[self.parent(i)] > self.heap[i]:
            self.heap[i], self.heap[self.parent(i)] = self.heap[self.parent(i)], self.heap[i]
            i = self.parent(i)

    def delete_key(self, i):
        self.decrease_key(i, float('-inf'))
        self.extract_min()
    
    def draw(self):
        level = 0
        next_level = 2 ** level
        output = ""
        
        for i in range(1, self.heap_size + 1):
            if i == next_level:
                output += "\n"
                level += 1
                next_level = 2 ** level

            output += str(self.heap[i]) + " "

        return output.strip()
heap = MinHeap(15)
heap.insert(5)
heap.insert(3)
heap.insert(17)
heap.insert(10)
heap.insert(84)

# print(heap.extract_min())
# print(heap.extract_min())
# print(heap.extract_min())

print("Draw")
print(heap.draw())
print()

print("Extract Min")
print(heap.extract_min())
print()

print("Draw")
print(heap.draw())
print()

print("Extract Min")
print(heap.extract_min())
print()

print("Draw")
print(heap.draw())
print()
Draw
3 
5 17 
10 84

Extract Min
3

Draw
5 
10 17 
84

Extract Min
5

Draw
10 
84 17

Complexity

Operation Complexity
Insert O(log(n))
Delete O(log(n))
Peek O(1)
Extract O(log(n))
Find O(n)

The extract can be done in O(log(n))!

High Level Library

In python, we can use heapq library to implement heap. It is a min heap implementation. We can use heapq to implement max heap by negating the values.

# using heapq

import heapq

heap = []
data = [5, 3, 17, 10, 84]
for item in data:
    heapq.heappush(heap, item)

print(heapq.heappop(heap))
print(heapq.heappop(heap))
print(heapq.heappop(heap))
3
5
10
# max heap

import heapq

heap = []
data = [5, 3, 17, 10, 84]
for item in data:
    heapq.heappush(heap, -item)

print(-heapq.heappop(heap))
print(-heapq.heappop(heap))
print(-heapq.heappop(heap))
84
17
10

Real World Usage

Priority Queue

Heap can be used to implement Priority Queue, i.e. a queue but not FIFO. The item with higher priority is processed first.

Let’s say we want to model a queue where the older person is served first irrespective of the age of person who came after, then we can use heap to implement this.

from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

    def __lt__(self, other):
        return self.age > other.age
    
heap = []
persons = [Person("John", 25), Person("Jane", 20), Person("Dave", 30), Person("Mike", 15), Person("Alex", 25)]

for person in persons:
    heapq.heappush(heap, person)

print(heapq.heappop(heap))
print(heapq.heappop(heap))
print(heapq.heappop(heap))
print(heapq.heappop(heap))
Person(name='Dave', age=30)
Person(name='John', age=25)
Person(name='Alex', age=25)
Person(name='Jane', age=20)

Cron Job

Cron job is a job that is scheduled to run at a particular time or after a particular interval. Cron jobs are used in many places like backup, log rotation, etc.

Cron jobs can be implemented using heap. We can store the cron jobs in a min heap and the top of the heap will be the next job to be executed.

from dataclasses import dataclass

@dataclass
class Job:
    start: int
    name: str

    def __lt__(self, other):
        return self.start < other.start
    
heap = []
jobs = [Job(3, "A"), Job(1, "B"), Job(2, "C")]

for job in jobs:
    heapq.heappush(heap, job)

print(heapq.heappop(heap))
print(heapq.heappop(heap))
Job(start=1, name='B')
Job(start=2, name='C')

Heap Sort

Sorting using heap is pretty simple. We can insert all the elements in the heap and then extract the elements one by one. The extracted elements will be sorted.

This is O(nlog(n)) algorithm.

import random
import heapq
unsorted_data = [random.randint(0, 100) for i in range(20)]
print(unsorted_data)

heapq.heapify(unsorted_data)
print(unsorted_data)

unsorted_data = [random.randint(0, 100) for i in range(20)]
print(unsorted_data)

heap = []
for data in unsorted_data:
    heapq.heappush(heap, data)

sorted_data = []
for i in range(len(heap)):
    sorted_data.append(heapq.heappop(heap))

print(sorted_data)
[87, 12, 14, 9, 62, 25, 63, 26, 41, 40, 78, 24, 6, 42, 1, 50, 2, 80, 43, 1]
[1, 1, 6, 2, 12, 24, 14, 9, 41, 40, 78, 87, 25, 42, 63, 50, 26, 80, 43, 62]
[24, 52, 39, 7, 32, 71, 8, 58, 50, 74, 35, 47, 53, 83, 38, 68, 74, 26, 54, 57]
[7, 8, 24, 26, 32, 35, 38, 39, 47, 50, 52, 53, 54, 57, 58, 68, 71, 74, 74, 83]