Summary

Takuya Kitazawa’s concise explanation and from-scratch Python implementation of the array bisection algorithm (binary search for insertion point). Covers the naive O(N) approach, the O(log N) bisection approach, and a practical application using timestamped records.

Takuya Kitazawa 對數組二等分算法(查找插入點的二分搜索)的簡明解釋和 Python 從頭實現。涵蓋樸素的 O(N) 方法、O(log N) 二等分方法以及使用時間戳記錄的實際應用。

Key Points

  • Problem: find insertion point i in sorted array such that arr[i-1] < val <= arr[i]
  • Naive approach: linear scan O(N)
  • Bisection approach: repeatedly split into halves, narrow range while maintaining arr[lo] < val <= arr[hi]; O(log N)
  • Python standard library: bisect module implements this; tutorial reimplements from scratch for understanding
  • Implementation details: edge cases for val < arr[0] (return 0) and val > arr[-1] (return len(arr)); equality checks at current lo/hi/mid before narrowing
  • Practical application: finding the last tweet before a given date from a sorted tweet array; bisect on (timestamp, '') tuple returns insertion point

Insights

The “find last item before timestamp” application is the most useful real-world pattern: bisection on sorted historical records (logs, time-series, events) is far more efficient than linear scan. The key invariant — maintain arr[lo] < val <= arr[hi] through each iteration — is the insight that makes the implementation correct and helps reason about edge cases. Python’s bisect module (bisect_left, bisect_right) handles the difference between left/right insertion points; understanding the from-scratch version makes it clear which to use when.

Connections

Raw Excerpt

Binary search finds out an insertion point in O(log N) time complexity. The basic idea of the method is to repeatedly split arr into two chunks… until a dividing point mid reaches the target value val.