
Heap Sort

Introduction to Heap Sort

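Heap Sort is a comparison-based sorting algorithm that uses a binary heap data structure to sort elements. It has an average and worst-case time complexity of O(n log n), and it can sort the array in place without a second copy of the data, making it an efficient sorting algorithm for large data sets. Note that Heap Sort is not stable: records with equal keys may not keep their original relative order.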

A binary heap is a complete binary tree that satisfies the heap property: each parent node is either less than or equal to its child nodes (min-heap) or greater than or equal to its child nodes (max-heap). The sorting process involves building a heap from the input data and then extracting the elements from the heap in sorted order.
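A binary heap is usually stored in a flat array, with the children of index i at indices 2i + 1 and 2i + 2. As a minimal sketch of the min-heap property under that layout (the helper name is_min_heap is hypothetical and used only for illustration):

```python
def is_min_heap(values):
    """Return True if `values`, read as an array-backed binary tree, is a min-heap."""
    n = len(values)
    for i in range(n):
        left, right = 2 * i + 1, 2 * i + 2
        # A parent must never be greater than either of its children.
        if left < n and values[i] > values[left]:
            return False
        if right < n and values[i] > values[right]:
            return False
    return True


print(is_min_heap([1, 3, 2, 7, 4]))  # True
print(is_min_heap([5, 3, 8]))        # False: 5 is greater than its child 3
```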

In this lesson, we'll explore the heap sort algorithm in detail and see how it can be used to solve real-world problems.

Real-World Examples and Scenarios of Heap Sort

Heap Sort can be applied in various real-world scenarios, such as:

Sorting large datasets: Heap Sort is efficient for sorting large datasets, such as a database of user information or a collection of scientific data. Its O(n log n) worst-case time complexity keeps the running time predictable as the data grows.

Priority Queues: The binary heap that Heap Sort is built on is a common way to implement priority queues, which are data structures that allow efficient access to the highest or lowest priority element. This is useful in scheduling tasks based on their priority, such as in operating systems or network packet scheduling.

Selection Algorithms: The same heap operations can be used to find the kth largest or smallest element in a dataset, by building a heap and extracting k times instead of sorting everything. This can be useful in applications like finding the top k performers in a competition or finding the k nearest neighbors in machine learning.
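The priority queue and selection scenarios above can be sketched with Python's standard heapq module, which maintains a min-heap inside a plain list. The task names and priorities here are made up for illustration.

```python
import heapq

# Hypothetical tasks as (priority, name); a lower number means higher priority.
tasks = [(3, "write report"), (1, "fix outage"), (2, "review patch")]

queue = []
for task in tasks:
    heapq.heappush(queue, task)  # O(log n) per insertion

first = heapq.heappop(queue)     # removes the entry with the smallest priority value
print(first)                     # (1, 'fix outage')

# Selection: the two lowest and two highest grades, without a full sort.
grades = [85, 90, 78, 92, 74]
print(heapq.nsmallest(2, grades))  # [74, 78]
print(heapq.nlargest(2, grades))   # [92, 90]
```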

Real-World Scenario: Sorting Student Records

Consider a university that has a large number of student records. The university wants to sort the student records based on their grades. In this scenario, we can use the Heap Sort algorithm to sort the records efficiently.

Problem Statement and Formal Definition

Given an array of student records, where each record contains the student's name and grade, sort the records in ascending order based on their grades.

Input: An array of student records, where each record is a tuple (name, grade), and 0 <= grade <= 100.

Output: The sorted array of student records in ascending order based on their grades.

Tying the Problem Statement with the Real-World Scenario

In our university example, we have a list of student records that need to be sorted based on their grades. We can use the Heap Sort algorithm to build a min-heap from the input records and then extract the elements in sorted order.

Solution to the Problem

To solve the problem, we'll follow these steps:

  1. Build a min-heap using the input student records.
  2. Extract the elements from the min-heap in ascending order.

Step 1: Build a Min-Heap

First, we'll build a min-heap from the input student records. In a min-heap, every parent node has a grade less than or equal to the grades of its child nodes, so the record with the lowest grade sits at the root. We'll create helper functions to manipulate the heap and maintain the heap property.
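To get a feel for what this step produces, here is a small sketch that uses the standard library's heapq.heapify rather than hand-written helpers. It reorders each record to (grade, name) so that tuple comparison looks at the grade first; that reordering is an assumption of this example only.

```python
import heapq

records = [("Alice", 85), ("Bob", 90), ("Charlie", 78), ("David", 92), ("Eva", 74)]

# Put the grade first so tuples compare by grade.
heap = [(grade, name) for name, grade in records]
heapq.heapify(heap)  # builds the min-heap in place, in linear time

print(heap[0])  # (74, 'Eva'): the root holds the lowest grade

# Heap property: every parent (index (i - 1) // 2) is <= its child at index i.
heap_property_holds = all(heap[(i - 1) // 2] <= heap[i] for i in range(1, len(heap)))
print(heap_property_holds)  # True
```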

Step 2: Extract Elements from the Min-Heap

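Once the min-heap is built, we'll extract the elements from the heap in ascending order. Each extraction swaps the root node (the current minimum) with the last node of the heap, shrinks the heap by one so that node is no longer part of it, and then heapifies the remaining heap from the root. We'll repeat this process until only one element is left. Because every extracted minimum is parked at the end of the array, an in-place implementation collects the records from the back in descending order, so one final reversal is needed to present them in ascending order.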
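The extraction order is easiest to see when the extracted values are collected into a separate list. The following sketch does that with heapq.heappop on plain grades; unlike an in-place heap sort, it uses extra memory for the result list.

```python
import heapq

grades = [85, 90, 78, 92, 74]

heap = list(grades)   # work on a copy so the input stays untouched
heapq.heapify(heap)

extracted = []
while heap:
    # heappop returns the smallest item and restores the heap property.
    extracted.append(heapq.heappop(heap))

print(extracted)  # [74, 78, 85, 90, 92]
```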

Code Solution with High-Level Comments

Here's the Python code to implement the Heap Sort algorithm for our student records problem:

def min_heapify(arr, n, i):
    # Find the smallest among the root, left child, and right child
    smallest = i
    left = 2 * i + 1
    right = 2 * i + 2

    if left < n and arr[left][1] < arr[smallest][1]:
        smallest = left

    if right < n and arr[right][1] < arr[smallest][1]:
        smallest = right

    # Swap and continue heapifying if the root is not the smallest
    if smallest != i:
        arr[i], arr[smallest] = arr[smallest], arr[i]
        min_heapify(arr, n, smallest)

def build_min_heap(arr, n):
    # Build a min-heap by heapifying each non-leaf node
    for i in range(n // 2 - 1, -1, -1):
        min_heapify(arr, n, i)

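def heap_sort(arr):
    n = len(arr)

    # Build a min-heap from the input records
    build_min_heap(arr, n)

    # Repeatedly move the current minimum to the end of the shrinking heap.
    # After this loop the records are in descending order of grade.
    for i in range(n - 1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]
        min_heapify(arr, i, 0)

    # Reverse in place so the records end up in ascending order
    arr.reverse()

    return arr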

Calling Functions with Actual Values

Now, let's use the heap_sort function to sort a list of student records based on their grades:

student_records = [
    ("Alice", 85),
    ("Bob", 90),
    ("Charlie", 78),
    ("David", 92),
    ("Eva", 74)
]

sorted_records = heap_sort(student_records)
print(sorted_records)

Output:

[('Eva', 74), ('Charlie', 78), ('Alice', 85), ('Bob', 90), ('David', 92)]

As we can see, the student records are sorted in ascending order based on their grades.
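One way to cross-check a result like this is to compare it against Python's built-in sorted function with a key on the grade. This is a sketch for verification only; the helper is_ascending_by_grade is a hypothetical name introduced here.

```python
from operator import itemgetter

student_records = [
    ("Alice", 85),
    ("Bob", 90),
    ("Charlie", 78),
    ("David", 92),
    ("Eva", 74),
]

# sorted() with a key on the grade (index 1) gives a reference ordering.
expected = sorted(student_records, key=itemgetter(1))
print(expected)
# [('Eva', 74), ('Charlie', 78), ('Alice', 85), ('Bob', 90), ('David', 92)]


def is_ascending_by_grade(records):
    """Return True if each grade is <= the grade that follows it."""
    return all(a[1] <= b[1] for a, b in zip(records, records[1:]))


print(is_ascending_by_grade(expected))  # True
```

When several students share a grade, a heap sort may order the tied records differently from sorted(), because Heap Sort is not stable. In that case the ascending-order check is the more reliable comparison.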

Explanation of the Code Solution

The heap_sort function first builds a min-heap from the input student records using the build_min_heap function. The build_min_heap function iterates through each non-leaf node and calls the min_heapify function to maintain the heap property.

The min_heapify function compares the node at index i with its left and right children by grade, and swaps it with the smaller child if that child has a lower grade. The same step is then applied recursively to the subtree rooted at that child, so the heap property is restored all the way down.

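After building the min-heap, the heap_sort function extracts the elements from the heap in ascending order by swapping the root node with the last node, shrinking the heap by one, and heapifying the remaining heap. Since each extracted minimum ends up at the back of the array, this in-place pass on its own leaves the records in descending order; reversing the array afterwards gives the ascending order the problem asks for.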
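The recursive sift-down can also be written as a loop, which avoids nested calls when the heap is deep. This is a sketch of an equivalent variant, not a replacement the lesson prescribes; the name min_heapify_iterative is hypothetical.

```python
def min_heapify_iterative(arr, n, i):
    """Sift arr[i] down within arr[:n], comparing records by grade (index 1)."""
    while True:
        smallest = i
        left, right = 2 * i + 1, 2 * i + 2

        if left < n and arr[left][1] < arr[smallest][1]:
            smallest = left
        if right < n and arr[right][1] < arr[smallest][1]:
            smallest = right

        # Stop once the current node is no larger than both children.
        if smallest == i:
            return

        arr[i], arr[smallest] = arr[smallest], arr[i]
        i = smallest  # continue from the child that received the larger record


records = [("Bob", 90), ("Eva", 74), ("Charlie", 78)]
min_heapify_iterative(records, len(records), 0)
print(records[0])  # ('Eva', 74)
```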

Applying the Solution to Other Real-World Problems

The Heap Sort algorithm can be applied to other real-world problems that involve sorting large datasets, such as:

  1. Sorting a list of products based on their prices.
  2. Sorting a list of cities based on their populations.
  3. Sorting a list of books based on their publication dates.

In each of these cases, the Heap Sort algorithm can efficiently sort the data in ascending or descending order based on the desired attribute.
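As a rough sketch of how the same idea carries over to other attributes, the function below sorts any list by a caller-supplied key using a heap from the standard heapq module. The function name heap_sort_by, its parameters, and the product data are all assumptions made for this example; it returns a new list rather than sorting in place.

```python
import heapq


def heap_sort_by(items, key, reverse=False):
    """Return a new list sorted by key(item), using a binary min-heap."""
    # The index breaks ties so the items themselves are never compared.
    heap = [(key(item), index, item) for index, item in enumerate(items)]
    heapq.heapify(heap)

    result = [heapq.heappop(heap)[2] for _ in range(len(heap))]
    if reverse:
        result.reverse()  # descending order
    return result


products = [("laptop", 999.0), ("mouse", 19.5), ("monitor", 249.0)]

print(heap_sort_by(products, key=lambda p: p[1]))
# [('mouse', 19.5), ('monitor', 249.0), ('laptop', 999.0)]

print(heap_sort_by(products, key=lambda p: p[1], reverse=True))
# [('laptop', 999.0), ('monitor', 249.0), ('mouse', 19.5)]
```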