Sorting Algorithms

Detailed analysis of Sorting Algorithms, including QuickSort, MergeSort, HeapSort, and Linear Sorts

Fundamental Concepts

Stability

A sorting algorithm is stable if two objects with equal keys appear in the same order in sorted output as they appear in the input array to be sorted.

  • Stable: Merge Sort, Insertion Sort, Bubble Sort, Counting Sort.
  • Unstable: Quick Sort, Heap Sort, Selection Sort.
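
Python's built-in sorted (Timsort) is stable; a quick illustration with hypothetical sample data:

```python
# Timsort (Python's built-in sort) is stable: items with equal keys
# keep their original relative order. Sample data is illustrative.
records = [("cherry", 2), ("banana", 1), ("apple", 2), ("date", 1)]
by_count = sorted(records, key=lambda r: r[1])
# Count 1: banana stays before date; count 2: cherry stays before apple,
# exactly as they appeared in the input.
```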

In-Place vs Out-of-Place

  • In-Place: Uses O(1) (or O(log n) stack) extra space. (Quick, Heap, Insertion).
  • Out-of-Place: Uses O(n) extra space. (Merge, Counting).

Comparison-Based Sorts

1. Quick Sort

Strategy: Divide and Conquer. Partition the array around a pivot such that elements < pivot are on the left and elements > pivot are on the right. Complexity:

  • Avg: O(n log n)
  • Worst: O(n^2) (e.g., a sorted array with a bad pivot choice).
  • Space: O(log n) (recursion stack).

Optimizations: Randomized pivot, median-of-3 pivot, tail-call optimization.

Pseudocode:

QUICKSORT(A, p, r):
    if p < r:
        q = PARTITION(A, p, r)
        QUICKSORT(A, p, q-1)
        QUICKSORT(A, q+1, r)

PARTITION(A, p, r):
    x = A[r]  // pivot
    i = p - 1
    for j = p to r-1:
        if A[j] <= x:
            i = i + 1
            swap A[i] and A[j]
    swap A[i+1] and A[r]
    return i + 1

Python:

import random

def quick_sort(arr, low=0, high=None):
    if high is None:
        high = len(arr) - 1
    
    if low < high:
        pivot_idx = partition(arr, low, high)
        quick_sort(arr, low, pivot_idx - 1)
        quick_sort(arr, pivot_idx + 1, high)

def partition(arr, low, high):
    # Randomized pivot
    pivot_idx = random.randint(low, high)
    arr[pivot_idx], arr[high] = arr[high], arr[pivot_idx]
    
    pivot = arr[high]
    i = low - 1
    
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

2. Merge Sort

Strategy: Divide and Conquer. Split the array in half, sort each half, merge the sorted halves. Complexity: O(n log n) always. Space: O(n). Use: Linked Lists (merging can be done with O(1) extra space), External Sorting (disk).

Pseudocode:

MERGESORT(A, p, r):
    if p < r:
        q = floor((p + r) / 2)
        MERGESORT(A, p, q)
        MERGESORT(A, q+1, r)
        MERGE(A, p, q, r)

MERGE(A, p, q, r):
    n1 = q - p + 1
    n2 = r - q
    L[1..n1+1] = A[p..q]
    R[1..n2+1] = A[q+1..r]
    L[n1+1] = infinity
    R[n2+1] = infinity
    i = 1, j = 1
    for k = p to r:
        if L[i] <= R[j]:
            A[k] = L[i]
            i = i + 1
        else:
            A[k] = R[j]
            j = j + 1

Python:

def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    
    result.extend(left[i:])
    result.extend(right[j:])
    return result

3. Heap Sort

Strategy: Build a Max-Heap. Swap the root (max) with the last element, reduce the heap size, heapify the root; repeat. Complexity: O(n log n). Space: O(1). Pros: no O(n^2) worst case like Quick Sort; in-place. Cons: poor cache locality compared to Quick Sort; unstable.

Pseudocode:

HEAPSORT(A):
    BUILD-MAX-HEAP(A)
    for i = A.length downto 2:
        swap A[1] and A[i]
        A.heap-size = A.heap-size - 1
        MAX-HEAPIFY(A, 1)

BUILD-MAX-HEAP(A):
    A.heap-size = A.length
    for i = floor(A.length/2) downto 1:
        MAX-HEAPIFY(A, i)

MAX-HEAPIFY(A, i):
    l = LEFT(i)
    r = RIGHT(i)
    if l <= A.heap-size and A[l] > A[i]:
        largest = l
    else:
        largest = i
    if r <= A.heap-size and A[r] > A[largest]:
        largest = r
    if largest != i:
        swap A[i] and A[largest]
        MAX-HEAPIFY(A, largest)

Python:

def heap_sort(arr):
    n = len(arr)
    
    # Build max heap
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    
    # Extract elements from heap
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]
        heapify(arr, i, 0)

def heapify(arr, n, i):
    largest = i
    left = 2 * i + 1
    right = 2 * i + 2
    
    if left < n and arr[left] > arr[largest]:
        largest = left
    
    if right < n and arr[right] > arr[largest]:
        largest = right
    
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

4. Insertion Sort

Strategy: Build a sorted subarray arr[0..i]; insert arr[i+1] into its correct position. Complexity: O(n^2). Best case O(n) (already sorted). Use: Small arrays (n < 50), nearly sorted data.

Pseudocode:

INSERTION-SORT(A):
    for j = 2 to A.length:
        key = A[j]
        i = j - 1
        while i > 0 and A[i] > key:
            A[i+1] = A[i]
            i = i - 1
        A[i+1] = key

Python:

def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        
        arr[j + 1] = key

5. Bubble Sort

Strategy: Repeatedly swap adjacent elements that are in the wrong order. Complexity: O(n^2). Best case O(n) (already sorted, with an early-exit swapped flag). Use: Educational purposes; rarely used in practice.

Pseudocode:

BUBBLE-SORT(A):
    for i = 1 to A.length - 1:
        for j = A.length downto i + 1:
            if A[j] < A[j-1]:
                swap A[j] and A[j-1]

Python:

def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        swapped = False
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break

6. Selection Sort

Strategy: Find the minimum element and swap it into the first position; repeat for the remaining subarray. Complexity: O(n^2) always. Use: Educational purposes.

Pseudocode:

SELECTION-SORT(A):
    for i = 1 to A.length - 1:
        min_idx = i
        for j = i + 1 to A.length:
            if A[j] < A[min_idx]:
                min_idx = j
        swap A[i] and A[min_idx]

Python:

def selection_sort(arr):
    n = len(arr)
    for i in range(n):
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        arr[i], arr[min_idx] = arr[min_idx], arr[i]

Non-Comparison Sorts

1. Counting Sort

Strategy: Count the frequency of each element. Compute prefix sums to find positions. Constraint: keys must be integers in the range [0, k]. Complexity: O(n + k). Space: O(k).

Pseudocode:

COUNTING-SORT(A, k):
    C[0..k] = new array initialized to 0
    B[1..n] = new array
    for j = 1 to A.length:
        C[A[j]] = C[A[j]] + 1
    // C[i] now contains number of elements equal to i
    for i = 1 to k:
        C[i] = C[i] + C[i-1]
    // C[i] now contains number of elements <= i
    for j = A.length downto 1:
        B[C[A[j]]] = A[j]
        C[A[j]] = C[A[j]] - 1
    return B

Python:

def counting_sort(arr):
    if not arr: return arr
    k = max(arr)
    count = [0] * (k + 1)
    output = [0] * len(arr)
    
    for x in arr: count[x] += 1
    
    # Accumulate
    for i in range(1, k + 1):
        count[i] += count[i-1]
        
    # Build output (Backwards for stability)
    for x in reversed(arr):
        output[count[x] - 1] = x
        count[x] -= 1
        
    return output

2. Radix Sort

Strategy: Sort digit by digit from Least Significant (LSD) to Most Significant (MSD), using a stable subroutine (Counting Sort). Complexity: O(d · (n + k)), where d is the number of digits and k the digit range. Use: Integers, Strings.

Pseudocode:

RADIX-SORT(A, d):
    for i = 1 to d:
        use a stable sort to sort array A on digit i

Python:

def radix_sort(arr):
    if not arr:
        return
    max_val = max(arr)
    exp = 1
    
    while max_val // exp > 0:
        counting_sort_by_digit(arr, exp)
        exp *= 10

def counting_sort_by_digit(arr, exp):
    n = len(arr)
    output = [0] * n
    count = [0] * 10
    
    for i in range(n):
        index = (arr[i] // exp) % 10
        count[index] += 1
    
    for i in range(1, 10):
        count[i] += count[i - 1]
    
    for i in range(n - 1, -1, -1):
        index = (arr[i] // exp) % 10
        output[count[index] - 1] = arr[i]
        count[index] -= 1
    
    for i in range(n):
        arr[i] = output[i]

3. Bucket Sort

Strategy: Distribute elements into buckets (ranges). Sort each bucket (Insertion Sort). Concatenate. Complexity: Avg O(n + k). Worst O(n^2) (all elements in one bucket). Use: Uniformly distributed floats in [0, 1).

Pseudocode:

BUCKET-SORT(A):
    n = A.length
    B[0..n-1] = empty arrays
    for i = 1 to n:
        insert A[i] into B[floor(n * A[i])]
    for i = 0 to n-1:
        sort B[i] with insertion sort
    concatenate B[0], B[1], ..., B[n-1]

Python:

def bucket_sort(arr):
    n = len(arr)
    buckets = [[] for _ in range(n)]
    
    # Distribute elements into buckets
    for num in arr:
        bucket_idx = int(n * num)
        buckets[bucket_idx].append(num)
    
    # Sort each bucket (using insertion sort)
    for bucket in buckets:
        insertion_sort(bucket)
    
    # Concatenate buckets
    result = []
    for bucket in buckets:
        result.extend(bucket)
    
    return result

Advanced Concepts

1. Topological Sort

See Graph Algorithms. Ordering of DAG nodes.

2. Dutch National Flag Problem (3-Way Partition)

Sort an array of 0s, 1s, and 2s (Red, White, Blue). Algorithm: one pass, O(n) time, O(1) space.

  • Pointers: low (end of 0s), mid (curr), high (start of 2s).
  • If 0: swap mid, low; low++, mid++.
  • If 1: mid++.
  • If 2: swap mid, high; high--.

Pseudocode:

DUTCH-FLAG-SORT(A):
    low = 0, mid = 0, high = A.length - 1
    while mid <= high:
        if A[mid] == 0:
            swap A[low] and A[mid]
            low = low + 1
            mid = mid + 1
        else if A[mid] == 1:
            mid = mid + 1
        else:
            swap A[mid] and A[high]
            high = high - 1

Python:

def dutch_flag_sort(arr):
    low = mid = 0
    high = len(arr) - 1
    
    while mid <= high:
        if arr[mid] == 0:
            arr[low], arr[mid] = arr[mid], arr[low]
            low += 1
            mid += 1
        elif arr[mid] == 1:
            mid += 1
        else:
            arr[mid], arr[high] = arr[high], arr[mid]
            high -= 1

3. Cyclic Sort

Sort an array containing the numbers 1 to n in O(n) time and O(1) space. Algorithm: iterate; while nums[i] is not at index nums[i] - 1, swap nums[i] with nums[nums[i] - 1].

Pseudocode:

CYCLIC-SORT(A):
    i = 0
    while i < A.length:
        correct_pos = A[i] - 1
        if A[i] != A[correct_pos]:
            swap A[i] and A[correct_pos]
        else:
            i = i + 1

Python:

def cyclic_sort(arr):
    i = 0
    while i < len(arr):
        correct_pos = arr[i] - 1
        if arr[i] != arr[correct_pos]:
            arr[i], arr[correct_pos] = arr[correct_pos], arr[i]
        else:
            i += 1

Interview Problem Types

Type 1: Kth Element

  • Given an unsorted array, find the Kth largest/smallest: QuickSelect (partitioning as in Quick Sort). Avg O(n).
  • Given an unsorted array, find the Kth largest: Min-Heap of size k. O(n log k).
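
A QuickSelect sketch for the k-th smallest element; the function name is illustrative and the randomized Lomuto partition mirrors the Quick Sort code above:

```python
import random

def quickselect(arr, k):
    """Return the k-th smallest element (1-indexed). Average O(n)."""
    arr = list(arr)          # work on a copy; the caller's list is untouched
    low, high = 0, len(arr) - 1
    k -= 1                   # target index in sorted order
    while True:
        if low == high:
            return arr[low]
        # Randomized pivot avoids the O(n^2) worst case on sorted input.
        p = random.randint(low, high)
        arr[p], arr[high] = arr[high], arr[p]
        pivot, i = arr[high], low - 1
        for j in range(low, high):
            if arr[j] <= pivot:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[high] = arr[high], arr[i + 1]
        p = i + 1
        # Recurse (iteratively) into only the side containing index k.
        if p == k:
            return arr[p]
        elif p < k:
            low = p + 1
        else:
            high = p - 1
```

For the Kth largest, select the (n - k + 1)-th smallest instead.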

Type 2: Interval Merging

  • Given a set of intervals, merge overlaps: sort by start time, then iterate and merge. O(n log n).
  • Given a set of intervals, find the non-overlapping count: sort by end time, greedy.
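
The merge-overlaps approach can be sketched as follows (function name is illustrative):

```python
def merge_intervals(intervals):
    # Sort by start time; then a single sweep either extends the last
    # merged interval (on overlap) or starts a new one.
    intervals = sorted(intervals, key=lambda iv: iv[0])
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged
```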

Type 3: Anagrams

  • Given a list of strings, group anagrams: use each string's sorted characters as a key ("eat" -> "aet") in a HashMap.
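
A minimal sketch of the sorted-characters-as-key idea (function name is illustrative):

```python
from collections import defaultdict

def group_anagrams(words):
    # Anagrams share the same multiset of characters, so the sorted
    # string is a canonical key for each group.
    groups = defaultdict(list)
    for w in words:
        groups["".join(sorted(w))].append(w)
    return list(groups.values())
```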

Type 4: Custom Sort

  • Given logs/versions, sort by specific rules: custom comparator (key=lambda x: (...)).
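
The composite-key pattern looks like this in Python (sample data is hypothetical):

```python
# Sort by age descending, then name ascending.
people = [("carol", 30), ("bob", 25), ("alice", 30)]
# Numeric fields can be negated to reverse just that field
# inside a composite tuple key.
ordered = sorted(people, key=lambda p: (-p[1], p[0]))
```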

Common Pitfalls

Pitfall 1: Worst Case Quick Sort

Wrong: Using deterministic pivot (first/last) on sorted data. Correct: Random shuffle or random pivot.

Pitfall 2: String Sorting Complexity

Wrong: Assuming sorting an array of N strings is O(N log N). Correct: each comparison takes O(L) (string length), so the total is O(N · L · log N).

Pitfall 3: Unstable Sort for Multiple Criteria

Wrong: Using Quick Sort to sort by Age then Name. Correct: use a stable sort (Merge Sort / Python's Timsort), sorting by the secondary key first.
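
One way to do it, relying on the stability of Python's list.sort: one stable pass per key, least significant first (sample data is hypothetical):

```python
# Sort by Age (primary), then Name (secondary).
people = [("bob", 25), ("alice", 30), ("dave", 25), ("carol", 30)]
people.sort(key=lambda p: p[0])  # secondary key first (name)
people.sort(key=lambda p: p[1])  # then primary key (age); stability
                                 # preserves the name order within ties
```

Equivalently, a single pass with a tuple key: people.sort(key=lambda p: (p[1], p[0])).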

Quick Reference

Algo        Time (Avg)   Time (Worst)   Space    Stable   Best For
Quick       n log n      n^2            log n    No       Arrays (cache friendly)
Merge       n log n      n log n        n        Yes      Linked Lists, Stability
Heap        n log n      n log n        1        No       Low memory
Insertion   n^2          n^2            1        Yes      Small/Nearly sorted
Bubble      n^2          n^2            1        Yes      Educational
Selection   n^2          n^2            1        No       Educational
Counting    n + k        n + k          k        Yes      Small integer range
Radix       d·n          d·n            n + k    Yes      Fixed-length ints/strings
Bucket      n + k        n^2            n + k    Yes      Uniform distribution

Practice Problem Categories

  • Partitioning: Sort Colors, Move Zeroes, Partition Array.
  • Intervals: Merge Intervals, Insert Interval, Meeting Rooms II.
  • Sorting Tricks: H-Index, Maximum Gap (Radix/Bucket idea), Wiggle Sort.
  • Selection: Kth Largest Element in Array.