CMSA Computer Science for Mathematicians: Point Location and Active Learning – Learning Halfspaces Almost Optimally
Max Hopkins - UCSD
The point location problem is a fundamental question in computational geometry. It asks: given a known partition of R^d into cells by n hyperplanes and an unknown input point x, which cell contains x? More formally, given access to x via linear queries (i.e., for any hyperplane h in R^d we may ask: "is x above or below h?"), our goal is to locate the cell containing x in as few queries as possible.
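To make the query model concrete, here is a minimal sketch of the setup (the function names and the representation of hyperplanes as normal/offset pairs are illustrative assumptions, not from the talk). Each linear query reveals one sign, and the naive strategy of querying every hyperplane identifies the cell by its full sign vector, using n queries rather than the near-optimal number discussed in the talk.

```python
import numpy as np

def linear_query(x, a, b):
    """One linear query: is x above (+1), on (0), or below (-1) the hyperplane a.x = b?"""
    return int(np.sign(np.dot(a, x) - b))

def locate_cell(x, hyperplanes):
    """Naively locate the cell containing x by querying every hyperplane.

    The cell of the arrangement is identified by its sign vector; this costs
    n queries, far above the O(d log n)-type bounds the talk is after.
    """
    return tuple(linear_query(x, a, b) for (a, b) in hyperplanes)

# Example: three lines partition R^2 into cells.
hyperplanes = [(np.array([1.0, 0.0]), 0.0),   # x1 = 0
               (np.array([0.0, 1.0]), 0.0),   # x2 = 0
               (np.array([1.0, 1.0]), 1.0)]   # x1 + x2 = 1
cell = locate_cell(np.array([2.0, 0.5]), hyperplanes)
# cell = (1, 1, 1): the point lies above all three hyperplanes
```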
In its dual formulation, point location is equivalent to a well-studied problem in machine learning: active learning halfspaces. In this version of the problem, we are given a known set of data points X in R^d, and an unknown hyperplane h. The goal is to label all points in X by asking as few linear queries as possible, where in this formulation a linear query simply corresponds to asking for the label of some point in R^d.
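The dual view can be sketched with a toy oracle (the names and interface below are illustrative assumptions). The unknown hyperplane is hidden behind a label oracle, and a query asks for the label of any chosen point in R^d; the baseline of querying each data point directly costs |X| queries, which is exactly what active learning tries to beat by choosing queries cleverly.

```python
import numpy as np

def make_label_oracle(w, t):
    """Oracle hiding the unknown halfspace w.x >= t; also counts queries made."""
    counter = {"queries": 0}
    def query(q):
        counter["queries"] += 1
        return int(np.sign(np.dot(w, q) - t))
    return query, counter

def label_all_naively(X, query):
    """Label every data point by querying it directly: |X| queries.

    This baseline just shows the interface; an active learner would instead
    infer most labels from a much smaller number of well-chosen queries.
    """
    return [query(x) for x in X]

X = [np.array([1.0, 2.0]), np.array([-1.0, 0.5]), np.array([3.0, -1.0])]
query, counter = make_label_oracle(np.array([1.0, 1.0]), 0.0)
labels = label_all_naively(X, query)
# labels = [1, -1, 1]; counter["queries"] = 3
```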
It has long been known that solving the point location problem on n hyperplanes in d dimensions requires at least Omega(d log n) queries. Over the past 40 years, a series of works in the computational geometry literature lowered the corresponding upper bound to O(d^2 log(d) log n). In this talk, we show how taking a learning-theoretic approach to the problem allows us to reach near-linear dependence on d. The proof combines classical margin-based learning and vector-scaling techniques with a novel structural decomposition used for dimensionality reduction.
Based on joint work with Daniel Kane, Shachar Lovett, and Gaurav Mahajan.