Abstract
We propose a sequential importance sampling strategy to estimate subgraph frequencies and detect network motifs. The method is developed by sampling subgraphs sequentially node by node using a carefully chosen proposal distribution. Viewing the subgraphs as rooted trees, we propose a recursive formula that approximates the number of subgraphs containing a particular node or set of nodes. The proposal used to sample nodes is proportional to this estimated number of subgraphs. The method generates subgraphs from a distribution close to uniform, and performs better than competing methods. We apply the method to four real-world networks and demonstrate outstanding performance in practical examples. Supplemental materials for the article are available online.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 503-515 |
| Number of pages | 13 |
| Journal | Journal of Computational and Graphical Statistics |
| Volume | 27 |
| Issue number | 3 |
| DOIs | |
| State | Published - Jul 3 2018 |
Keywords
- Motif detection
- Sequential importance sampling
- Subgraph concentration
ASJC Scopus subject areas
- Statistics and Probability
- Discrete Mathematics and Combinatorics
- Statistics, Probability and Uncertainty