CMSA Computer Science for Mathematicians: Rank-Based Independence Testing in Near Linear Time
Chaim Even-Zohar - Alan Turing Institute, London
In 1948 Hoeffding proposed a nonparametric test that detects dependence between two continuous random variables (X,Y), based on the ranking of n paired samples (Xi,Yi). The computation of this commonly-used test statistic requires O(n log n) time. Hoeffding's test is consistent against any dependent probability density f(x,y), but can be fooled by other bivariate distributions with continuous margins. Variants of this test with stronger consistency have been considered in works by Blum, Kiefer, and Rosenblatt, Yanagimoto, and Bergsma and Dassios, and others. The so far best known algorithms to compute them have required quadratic time.
We present an algorithm that computes these improved tests in time O(n log n). It is based on a new combinatorial approach for counting pattern occurrences in a given permutation, which we call corner tree formulas, and will be explained in the talk.
Joint work with Calvin Leng.