Abstract
We propose a sequential importance sampling strategy to estimate subgraph frequencies and detect network motifs. The method is developed by sampling subgraphs sequentially node by node using a carefully chosen proposal distribution. Viewing the subgraphs as rooted trees, we propose a recursive formula that approximates the number of subgraphs containing a particular node or set of nodes. The proposal used to sample nodes is proportional to this estimated number of subgraphs. The method generates subgraphs from a distribution close to uniform, and performs better than competing methods. We apply the method to four real-world networks and demonstrate outstanding performance in practical examples. Supplemental materials for the article are available online.
Original language | English (US) |
---|---|
Pages (from-to) | 503-515 |
Number of pages | 13 |
Journal | Journal of Computational and Graphical Statistics |
Volume | 27 |
Issue number | 3 |
DOIs | |
State | Published - Jul 3 2018 |
Keywords
- Motif detection
- Sequential importance sampling
- Subgraph concentration
ASJC Scopus subject areas
- Statistics and Probability
- Discrete Mathematics and Combinatorics
- Statistics, Probability and Uncertainty