We propose a sequential importance sampling strategy to estimate subgraph frequencies and detect network motifs. The method is developed by sampling subgraphs sequentially node by node using a carefully chosen proposal distribution. Viewing the subgraphs as rooted trees, we propose a recursive formula that approximates the number of subgraphs containing a particular node or set of nodes. The proposal used to sample nodes is proportional to this estimated number of subgraphs. The method generates subgraphs from a distribution close to uniform, and performs better than competing methods. We apply the method to four real-world networks and demonstrate outstanding performance in practical examples. Supplemental materials for the article are available online.
- Motif detection
- Sequential importance sampling
- Subgraph concentration
ASJC Scopus subject areas
- Statistics and Probability
- Discrete Mathematics and Combinatorics
- Statistics, Probability and Uncertainty