# Statistical Significance of Modular Structure Detection

Yu-Teng Chang (Signal and Image Processing Institute, University of Southern California, Los Angeles, CA), Dimitrios Pantazis (McGovern Institute for Brain Research, MIT, Boston, MA, USA), Richard Leahy (Signal and Image Processing Institute, University of Southern California, Los Angeles, CA)

Functional modules in the human brain are organized hierarchically, and their structure changes with factors such as adolescent development, normal aging, and certain diseases. Many methods have been proposed to identify natural divisions of networks into groups. Perhaps the most popular is modularity [1], which compares the network against a null model and favors within-module connections whose edges are stronger than their expected values under that model. Divisions that increase modularity are preferred because they yield modules with more pronounced community structure.
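To make the definition concrete, the following minimal sketch computes modularity for a given partition. It assumes the common degree-preserving null model, in which the expected weight between nodes i and j is k_i k_j / (2m); the abstract does not state which null model is used, so this choice, the function name `modularity`, and the toy graph are illustrative assumptions.

```python
import numpy as np

def modularity(A, labels):
    """Modularity Q of a partition, under an assumed degree-preserving null model."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                 # node degrees (or strengths, if weighted)
    two_m = k.sum()                   # twice the total edge weight
    P = np.outer(k, k) / two_m        # expected edge weights under the null model
    same_module = labels[:, None] == labels[None, :]
    # Sum the excess of observed over expected weight within modules.
    return float(((A - P) * same_module).sum() / two_m)

# Toy graph (hypothetical): two triangles joined by a single edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

q_good = modularity(A, [0, 0, 0, 1, 1, 1])  # split along the bridge: 5/14
q_bad = modularity(A, [0, 1, 0, 1, 0, 1])   # split that ignores the triangles: -3/14
print(q_good, q_bad)
```

The partition that follows the two triangles scores higher than the one that cuts across them, which is the sense in which modularity favors within-module connections.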

Random networks can exhibit high modularity because of incidental concentration of edges, even though they have no underlying organizational structure [2]. This is even more evident in large networks, where the number of possible divisions increases rapidly with network size [3]. Therefore, a division of a network should be considered significant only if its modularity is higher than that obtained from random graphs [2, 4].

We propose a statistical procedure to test the significance of a community structure based on its modularity value. As a surrogate for modularity, we use the largest eigenvalue of the difference between the affinity matrices of the network and its null model. Based on previous work on null models [5], we show that the distribution of the largest eigenvalue can be well approximated by a Gamma distribution (Fig. 1a). We derive an empirical formula for the parameters of the Gamma distribution as a function of the size of the network and the variance of its edges (Fig. 1b). Based on this distribution, we compute a p-value for the community structure, which can be used as a threshold criterion when partitioning a graph. We demonstrate our method with simulated networks and structural brain networks (Fig. 2).
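The testing procedure can be illustrated with a rough sketch. The abstract does not give the empirical formula for the Gamma parameters, so the sketch substitutes a Monte Carlo estimate: it simulates difference matrices under a hypothetical null (symmetric, zero diagonal, i.i.d. Gaussian off-diagonal entries with a given standard deviation) and fits the Gamma shape and scale by the method of moments. The null model, all function names, and the planted two-module example are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from scipy import stats

def largest_eigenvalue(B):
    """Largest (algebraic) eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(B)[-1])

def sample_null_difference(n, sigma, rng):
    """Hypothetical null: network minus null model is symmetric Gaussian noise."""
    upper = np.triu(rng.normal(0.0, sigma, size=(n, n)), k=1)
    return upper + upper.T

def fit_gamma_to_null(n, sigma, n_samples=500, seed=0):
    """Method-of-moments Gamma fit to simulated largest eigenvalues.

    Stands in for the empirical formula mentioned in the text, which is not given.
    """
    rng = np.random.default_rng(seed)
    lam = np.array([largest_eigenvalue(sample_null_difference(n, sigma, rng))
                    for _ in range(n_samples)])
    mean, var = lam.mean(), lam.var(ddof=1)
    return mean ** 2 / var, var / mean   # (shape, scale)

def p_value(lam_obs, shape, scale):
    """Probability under the fitted Gamma of an eigenvalue at least as large."""
    return float(stats.gamma.sf(lam_obs, a=shape, scale=scale))

n, sigma = 40, 1.0
shape, scale = fit_gamma_to_null(n, sigma)

rng = np.random.default_rng(1)
# Unstructured case: a fresh draw from the same null.
B_null = sample_null_difference(n, sigma, rng)
# Structured case: add a planted two-module pattern to the noise.
s = np.where(np.arange(n) < n // 2, 1.0, -1.0)
planted = np.outer(s, s)
np.fill_diagonal(planted, 0.0)
B_struct = sample_null_difference(n, sigma, rng) + planted

p_null = p_value(largest_eigenvalue(B_null), shape, scale)
p_struct = p_value(largest_eigenvalue(B_struct), shape, scale)
print(p_null, p_struct)
```

Under these assumptions, the planted structure pushes the largest eigenvalue far into the tail of the fitted distribution, so its p-value falls well below that of the unstructured draw. Used as a stopping rule, a partition step would be accepted only when the p-value falls below a chosen threshold.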