Sorting Algorithms
Detailed analysis of Sorting Algorithms, including QuickSort, MergeSort, HeapSort, and Linear Sorts
Fundamental Concepts
Stability
A sorting algorithm is stable if two objects with equal keys appear in the same order in sorted output as they appear in the input array to be sorted.
- Stable: Merge Sort, Insertion Sort, Bubble Sort, Counting Sort.
- Unstable: Quick Sort, Heap Sort, Selection Sort.
In-Place vs Out-of-Place
- In-Place: Uses (or stack) extra space. (Quick, Heap, Insertion).
- Out-of-Place: Uses extra space. (Merge, Counting).
Comparison-Based Sorts
1. Quick Sort
Strategy: Divide and Conquer. Partition array around a pivot such that elements < pivot are left, > pivot are right. Complexity:
- Avg:
- Worst: (sorted array with bad pivot).
- Space: (stack). Optimization: Randomized pivot, Median-of-3 pivot, Tail call optimization.
Pseudocode:
QUICKSORT(A, p, r):
if p < r:
q = PARTITION(A, p, r)
QUICKSORT(A, p, q-1)
QUICKSORT(A, q+1, r)
PARTITION(A, p, r):
x = A[r] // pivot
i = p - 1
for j = p to r-1:
if A[j] <= x:
i = i + 1
swap A[i] and A[j]
swap A[i+1] and A[r]
return i + 1import random
def quick_sort(arr, low=0, high=None):
if high is None:
high = len(arr) - 1
if low < high:
pivot_idx = partition(arr, low, high)
quick_sort(arr, low, pivot_idx - 1)
quick_sort(arr, pivot_idx + 1, high)
def partition(arr, low, high):
# Randomized pivot
pivot_idx = random.randint(low, high)
arr[pivot_idx], arr[high] = arr[high], arr[pivot_idx]
pivot = arr[high]
i = low - 1
for j in range(low, high):
if arr[j] <= pivot:
i += 1
arr[i], arr[j] = arr[j], arr[i]
arr[i + 1], arr[high] = arr[high], arr[i + 1]
return i + 12. Merge Sort
Strategy: Divide and Conquer. Split in half, sort halves, merge sorted halves. Complexity: always. Space . Use: Linked Lists (can be space), External Sorting (disk).
Pseudocode:
MERGESORT(A, p, r):
if p < r:
q = floor((p + r) / 2)
MERGESORT(A, p, q)
MERGESORT(A, q+1, r)
MERGE(A, p, q, r)
MERGE(A, p, q, r):
n1 = q - p + 1
n2 = r - q
L[1..n1+1] = A[p..q]
R[1..n2+1] = A[q+1..r]
L[n1+1] = infinity
R[n2+1] = infinity
i = 1, j = 1
for k = p to r:
if L[i] <= R[j]:
A[k] = L[i]
i = i + 1
else:
A[k] = R[j]
j = j + 1def merge_sort(arr):
if len(arr) <= 1:
return arr
mid = len(arr) // 2
left = merge_sort(arr[:mid])
right = merge_sort(arr[mid:])
return merge(left, right)
def merge(left, right):
result = []
i = j = 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
result.extend(left[i:])
result.extend(right[j:])
return result3. Heap Sort
Strategy: Build Max-Heap. Swap root (max) with end, reduce size, heapify root. Complexity: . Space . Pros: No worst-case like Quick Sort. In-place. Cons: Poor cache locality compared to Quick Sort. Unstable.
Pseudocode:
HEAPSORT(A):
BUILD-MAX-HEAP(A)
for i = A.length downto 2:
swap A[1] and A[i]
A.heap-size = A.heap-size - 1
MAX-HEAPIFY(A, 1)
BUILD-MAX-HEAP(A):
A.heap-size = A.length
for i = floor(A.length/2) downto 1:
MAX-HEAPIFY(A, i)
MAX-HEAPIFY(A, i):
l = LEFT(i)
r = RIGHT(i)
if l <= A.heap-size and A[l] > A[i]:
largest = l
else:
largest = i
if r <= A.heap-size and A[r] > A[largest]:
largest = r
if largest != i:
swap A[i] and A[largest]
MAX-HEAPIFY(A, largest)def heap_sort(arr):
n = len(arr)
# Build max heap
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
# Extract elements from heap
for i in range(n - 1, 0, -1):
arr[0], arr[i] = arr[i], arr[0]
heapify(arr, i, 0)
def heapify(arr, n, i):
largest = i
left = 2 * i + 1
right = 2 * i + 2
if left < n and arr[left] > arr[largest]:
largest = left
if right < n and arr[right] > arr[largest]:
largest = right
if largest != i:
arr[i], arr[largest] = arr[largest], arr[i]
heapify(arr, n, largest)4. Insertion Sort
Strategy: Build sorted subarray arr[0..i]. Insert arr[i+1] into correct position.
Complexity: . Best case (already sorted).
Use: Small arrays (), Nearly sorted data.
Pseudocode:
INSERTION-SORT(A):
for j = 2 to A.length:
key = A[j]
i = j - 1
while i > 0 and A[i] > key:
A[i+1] = A[i]
i = i - 1
A[i+1] = keydef insertion_sort(arr):
for i in range(1, len(arr)):
key = arr[i]
j = i - 1
while j >= 0 and arr[j] > key:
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key5. Bubble Sort
Strategy: Repeatedly swap adjacent elements if they are in wrong order. Complexity: . Best case (already sorted). Use: Educational purposes, rarely in practice.
Pseudocode:
BUBBLE-SORT(A):
for i = 1 to A.length - 1:
for j = A.length downto i + 1:
if A[j] < A[j-1]:
swap A[j] and A[j-1]def bubble_sort(arr):
n = len(arr)
for i in range(n):
swapped = False
for j in range(0, n - i - 1):
if arr[j] > arr[j + 1]:
arr[j], arr[j + 1] = arr[j + 1], arr[j]
swapped = True
if not swapped:
break6. Selection Sort
Strategy: Find minimum element, swap with first position. Repeat for remaining subarray. Complexity: always. Use: Educational purposes.
Pseudocode:
SELECTION-SORT(A):
for i = 1 to A.length - 1:
min_idx = i
for j = i + 1 to A.length:
if A[j] < A[min_idx]:
min_idx = j
swap A[i] and A[min_idx]def selection_sort(arr):
n = len(arr)
for i in range(n):
min_idx = i
for j in range(i + 1, n):
if arr[j] < arr[min_idx]:
min_idx = j
arr[i], arr[min_idx] = arr[min_idx], arr[i]Non-Comparison Sorts
1. Counting Sort
Strategy: Count frequency of each element. Compute prefix sums to find positions. Constraint: Keys must be integers in range . Complexity: . Space .
Pseudocode:
COUNTING-SORT(A, k):
C[0..k] = new array initialized to 0
B[1..n] = new array
for j = 1 to A.length:
C[A[j]] = C[A[j]] + 1
// C[i] now contains number of elements equal to i
for i = 1 to k:
C[i] = C[i] + C[i-1]
// C[i] now contains number of elements <= i
for j = A.length downto 1:
B[C[A[j]]] = A[j]
C[A[j]] = C[A[j]] - 1
return Bdef counting_sort(arr):
if not arr: return arr
k = max(arr)
count = [0] * (k + 1)
output = [0] * len(arr)
for x in arr: count[x] += 1
# Accumulate
for i in range(1, k + 1):
count[i] += count[i-1]
# Build output (Backwards for stability)
for x in reversed(arr):
output[count[x] - 1] = x
count[x] -= 1
return output2. Radix Sort
Strategy: Sort digit by digit from Least Significant (LSD) to Most Significant (MSD). Uses a stable subroutine (Counting Sort). Complexity: where is digits. Use: Integers, Strings.
Pseudocode:
RADIX-SORT(A, d):
for i = 1 to d:
use a stable sort to sort array A on digit idef radix_sort(arr):
max_val = max(arr)
exp = 1
while max_val // exp > 0:
counting_sort_by_digit(arr, exp)
exp *= 10
def counting_sort_by_digit(arr, exp):
n = len(arr)
output = [0] * n
count = [0] * 10
for i in range(n):
index = (arr[i] // exp) % 10
count[index] += 1
for i in range(1, 10):
count[i] += count[i - 1]
for i in range(n - 1, -1, -1):
index = (arr[i] // exp) % 10
output[count[index] - 1] = arr[i]
count[index] -= 1
for i in range(n):
arr[i] = output[i]3. Bucket Sort
Strategy: Distribute elements into buckets (ranges). Sort buckets (Insertion sort). Concat. Complexity: Avg . Worst (all in one bucket). Use: Uniformly distributed floats in .
Pseudocode:
BUCKET-SORT(A):
n = A.length
B[0..n-1] = empty arrays
for i = 1 to n:
insert A[i] into B[floor(n * A[i])]
for i = 0 to n-1:
sort B[i] with insertion sort
concatenate B[0], B[1], ..., B[n-1]def bucket_sort(arr):
n = len(arr)
buckets = [[] for _ in range(n)]
# Distribute elements into buckets
for num in arr:
bucket_idx = int(n * num)
buckets[bucket_idx].append(num)
# Sort each bucket (using insertion sort)
for bucket in buckets:
insertion_sort(bucket)
# Concatenate buckets
result = []
for bucket in buckets:
result.extend(bucket)
return resultAdvanced Concepts
1. Topological Sort
See Graph Algorithms. Ordering of DAG nodes.
2. Dutch National Flag Problem (3-Way Partition)
Sort array of 0s, 1s, 2s (Red, White, Blue). Algorithm: One pass , space.
- Pointers:
low(end of 0s),mid(curr),high(start of 2s). - If 0: swap
mid,low;low++,mid++. - If 1:
mid++. - If 2: swap
mid,high;high--.
Pseudocode:
DUTCH-FLAG-SORT(A):
low = 0, mid = 0, high = A.length - 1
while mid <= high:
if A[mid] == 0:
swap A[low] and A[mid]
low = low + 1
mid = mid + 1
else if A[mid] == 1:
mid = mid + 1
else:
swap A[mid] and A[high]
high = high - 1def dutch_flag_sort(arr):
low = mid = 0
high = len(arr) - 1
while mid <= high:
if arr[mid] == 0:
arr[low], arr[mid] = arr[mid], arr[low]
low += 1
mid += 1
elif arr[mid] == 1:
mid += 1
else:
arr[mid], arr[high] = arr[high], arr[mid]
high -= 13. Cyclic Sort
Sort array of numbers to in time, space.
Algorithm: Iterate. While nums[i] is not at index nums[i]-1, swap nums[i] with nums[nums[i]-1].
Pseudocode:
CYCLIC-SORT(A):
i = 0
while i < A.length:
correct_pos = A[i] - 1
if A[i] != A[correct_pos]:
swap A[i] and A[correct_pos]
else:
i = i + 1def cyclic_sort(arr):
i = 0
while i < len(arr):
correct_pos = arr[i] - 1
if arr[i] != arr[correct_pos]:
arr[i], arr[correct_pos] = arr[correct_pos], arr[i]
else:
i += 1Interview Problem Types
Type 1: Kth Element
| Given | Find | Approach |
|---|---|---|
| Unsorted array | Kth largest/smallest | QuickSelect (Partitioning like QuickSort). Avg . |
| Unsorted array | Kth largest | Min-Heap of size K. . |
Type 2: Interval Merging
| Given | Find | Approach |
|---|---|---|
| Set of intervals | Merge overlaps | Sort by start time. Iterate and merge. . |
| Set of intervals | Non-overlapping count | Sort by end time. Greedy. |
Type 3: Anagrams
| Given | Find | Approach |
|---|---|---|
| List of strings | Group anagrams | Sort string chars as key: "eat" -> "aet". HashMap. |
Type 4: Custom Sort
| Given | Find | Approach |
|---|---|---|
| Logs/Versions | Sort by specific rules | Custom Comparator (key=lambda x: (...)). |
Common Pitfalls
Pitfall 1: Worst Case Quick Sort
Wrong: Using deterministic pivot (first/last) on sorted data. Correct: Random shuffle or random pivot.
Pitfall 2: String Sorting Complexity
Wrong: Assuming sorting string array is . Correct: Comparison takes (string length). Total .
Pitfall 3: Unstable Sort for Multiple Criteria
Wrong: Using QuickSort to sort by Age then Name. Correct: Use Stable sort (Merge/Python's Timsort).
Quick Reference
| Algo | Time (Avg) | Time (Worst) | Space | Stable | Best For |
|---|---|---|---|---|---|
| Quick | No | Arrays (Cache friendly) | |||
| Merge | Yes | Linked Lists, Stability | |||
| Heap | No | Low memory | |||
| Insertion | Yes | Small/Nearly sorted | |||
| Bubble | Yes | Educational | |||
| Selection | No | Educational | |||
| Counting | Yes | Small integer range | |||
| Radix | Yes | Fixed length ints/strings | |||
| Bucket | Yes | Uniform distribution |
Practice Problem Categories
- Partitioning: Sort Colors, Move Zeroes, Partition Array.
- Intervals: Merge Intervals, Insert Interval, Meeting Rooms II.
- Sorting Tricks: H-Index, Maximum Gap (Radix/Bucket idea), Wiggle Sort.
- Selection: Kth Largest Element in Array.