CMSA Computer Science for Mathematicians: Counting cliques in real-world graphs

CMSA EVENTS

View Calendar
February 2, 2021 11:30 am - 12:30 pm
via Zoom Video Conferencing
Speaker:

Shweta Jain - University of Illinois, Urbana-Champaign

Cliques are important structures in network science that have been used in numerous applications including spam detection, graph analysis, graph modeling, community detection among others. Obtaining the counts of k-cliques in real-world graphs with millions of nodes and edges is a challenging problem due to combinatorial explosion. Essentially, as k increases, the number of k-cliques goes up exponentially. Existing techniques are (typically) able to count k-cliques for only up to k=5.

In this talk, I will present two algorithms for obtaining k-clique counts that improve the state of the art, both in theory and in practice. The first method is a randomized algorithm that gives a (1+ε)-approximation for the number of k-cliques in a graph. Its running time is proportional to the size of an object called the Turán Shadow, which, for real-world graphs is found to be small. In practice, this algorithm works well for k<=10 and gives orders of magnitude improvement over existing methods. This paper won the Best Paper Award at WWW, 2017.

The second method, a somewhat surprising result, is a simple but powerful algorithm called Pivoter that gives the exact k-clique counts for all k and runs in O(n3^{n/3}) time in the worst case. It uses a classic technique called pivoting that drastically cuts down the search space for cliques and builds a structure called the Succinct Clique Tree from which global and local (per-vertex and per-edge) k-clique counts can be easily read off. In practice, the algorithm is orders of magnitude faster than even other parallel algorithms and makes clique counting feasible for a number of graphs for which clique counting was infeasible before. This paper won the Best Paper Award at WSDM, 2020.

Zoom: https://harvard.zoom.us/j/98231541450