Detection and Analysis of Spikes in a Random Sequence

Anirban Dasgupta, Bo Li

Research output: Contribution to journal (Article)

Abstract

Motivated by the more frequent natural and anthropogenic hazards, we revisit the problem of assessing whether an apparent temporal clustering in a sequence of randomly occurring events is a genuine surprise and should call for an examination. We study the problem in both discrete and continuous time formulation. In the discrete formulation, the problem reduces to deriving the probability that p independent people all have birthdays within d days of each other. We provide an analytical expression for a warning limit such that if a subset of p people among n are observed to have birthdays within d days of each other and d is smaller than our warning limit, then it should be treated as a surprising cluster. In the continuous time framework, three different sets of results are given. First, we provide an asymptotic analysis of the problem by embedding it into an extreme value problem for high order spacings of iid samples from the U[0, 1] density. Second, a novel analytical nonasymptotic bound is derived by using certain tools of empirical process theory. Finally, the required probability is approximated by using various bounds and asymptotic results on the supremum of the scanning process of a one dimensional stationary Poisson process. We apply the theories to climate change related datasets, datasets on temperatures, and mass shooting records in the United States. These real data applications of our theoretical methods lead to supporting evidence for climate change and recent spikes in gun violence.
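The discrete formulation above can be explored numerically. The following is a minimal Monte Carlo sketch, not the authors' analytical warning limit: it estimates the probability that some p of n people have birthdays within d days of each other. It assumes birthdays are independent and uniform over a 365-day year and treats the calendar as linear (no wrap-around from December to January). The function names `has_cluster` and `cluster_probability` and all parameter defaults are illustrative choices, not taken from the paper.

```python
import random


def has_cluster(birthdays, p, d):
    """Return True if some p of the birthdays span at most d days.

    Assumption: the calendar is linear, so day 364 and day 0 are far apart.
    After sorting, a cluster of p exists iff some p consecutive order
    statistics differ by at most d.
    """
    b = sorted(birthdays)
    return any(b[i + p - 1] - b[i] <= d for i in range(len(b) - p + 1))


def cluster_probability(n, p, d, days=365, trials=20000, seed=0):
    """Monte Carlo estimate of P(some p of n birthdays lie within d days).

    Assumption: birthdays are iid uniform on {0, ..., days - 1}.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.randrange(days) for _ in range(n)]
        hits += has_cluster(sample, p, d)
    return hits / trials


if __name__ == "__main__":
    # Classic birthday problem as a sanity check: n = 23, p = 2, d = 0.
    print(cluster_probability(23, 2, 0))
    # A wider window: three of 30 people within 2 days of each other.
    print(cluster_probability(30, 3, 2))
```

With p = 2 and d = 0 the estimate should land near the familiar birthday-problem value of about one half for n = 23. An observed cluster would be called surprising when the estimated probability for its n, p and d is small.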

Original language: English (US)
Pages (from-to): 1429-1451
Number of pages: 23
Journal: Methodology and Computing in Applied Probability
Volume: 20
Issue number: 4
DOI: 10.1007/s11009-018-9637-0
State: Published - Dec 1 2018

Fingerprint

Random Sequence
Spike
Climate Change
Continuous Time
Formulation
Shooting
Empirical Process
Extreme Values
Stationary Process
Supremum
Poisson process
Asymptotic Analysis
Hazard
Spacing
Scanning
Clustering
Higher Order
Subset

Keywords

  • Poisson process
  • Probability
  • Random sequence
  • Scan statistic

ASJC Scopus subject areas

  • Statistics and Probability
  • Mathematics (all)

Cite this

Detection and Analysis of Spikes in a Random Sequence. / Dasgupta, Anirban; Li, Bo.

In: Methodology and Computing in Applied Probability, Vol. 20, No. 4, 01.12.2018, p. 1429-1451.

@article{4738b6d161374bccbe16b20934647a74,
title = "Detection and Analysis of Spikes in a Random Sequence",
abstract = "Motivated by the more frequent natural and anthropogenic hazards, we revisit the problem of assessing whether an apparent temporal clustering in a sequence of randomly occurring events is a genuine surprise and should call for an examination. We study the problem in both discrete and continuous time formulation. In the discrete formulation, the problem reduces to deriving the probability that p independent people all have birthdays within d days of each other. We provide an analytical expression for a warning limit such that if a subset of p people among n are observed to have birthdays within d days of each other and d is smaller than our warning limit, then it should be treated as a surprising cluster. In the continuous time framework, three different sets of results are given. First, we provide an asymptotic analysis of the problem by embedding it into an extreme value problem for high order spacings of iid samples from the U[0, 1] density. Second, a novel analytical nonasymptotic bound is derived by using certain tools of empirical process theory. Finally, the required probability is approximated by using various bounds and asymptotic results on the supremum of the scanning process of a one dimensional stationary Poisson process. We apply the theories to climate change related datasets, datasets on temperatures, and mass shooting records in the United States. These real data applications of our theoretical methods lead to supporting evidence for climate change and recent spikes in gun violence.",
keywords = "Poisson process, Probability, Random sequence, Scan statistic",
author = "Anirban Dasgupta and Bo Li",
year = "2018",
month = "12",
day = "1",
doi = "10.1007/s11009-018-9637-0",
language = "English (US)",
volume = "20",
pages = "1429--1451",
journal = "Methodology and Computing in Applied Probability",
issn = "1387-5841",
publisher = "Springer Netherlands",
number = "4",
}

TY - JOUR

T1 - Detection and Analysis of Spikes in a Random Sequence

AU - Dasgupta, Anirban

AU - Li, Bo

PY - 2018/12/1

Y1 - 2018/12/1

N2 - Motivated by the more frequent natural and anthropogenic hazards, we revisit the problem of assessing whether an apparent temporal clustering in a sequence of randomly occurring events is a genuine surprise and should call for an examination. We study the problem in both discrete and continuous time formulation. In the discrete formulation, the problem reduces to deriving the probability that p independent people all have birthdays within d days of each other. We provide an analytical expression for a warning limit such that if a subset of p people among n are observed to have birthdays within d days of each other and d is smaller than our warning limit, then it should be treated as a surprising cluster. In the continuous time framework, three different sets of results are given. First, we provide an asymptotic analysis of the problem by embedding it into an extreme value problem for high order spacings of iid samples from the U[0, 1] density. Second, a novel analytical nonasymptotic bound is derived by using certain tools of empirical process theory. Finally, the required probability is approximated by using various bounds and asymptotic results on the supremum of the scanning process of a one dimensional stationary Poisson process. We apply the theories to climate change related datasets, datasets on temperatures, and mass shooting records in the United States. These real data applications of our theoretical methods lead to supporting evidence for climate change and recent spikes in gun violence.

AB - Motivated by the more frequent natural and anthropogenic hazards, we revisit the problem of assessing whether an apparent temporal clustering in a sequence of randomly occurring events is a genuine surprise and should call for an examination. We study the problem in both discrete and continuous time formulation. In the discrete formulation, the problem reduces to deriving the probability that p independent people all have birthdays within d days of each other. We provide an analytical expression for a warning limit such that if a subset of p people among n are observed to have birthdays within d days of each other and d is smaller than our warning limit, then it should be treated as a surprising cluster. In the continuous time framework, three different sets of results are given. First, we provide an asymptotic analysis of the problem by embedding it into an extreme value problem for high order spacings of iid samples from the U[0, 1] density. Second, a novel analytical nonasymptotic bound is derived by using certain tools of empirical process theory. Finally, the required probability is approximated by using various bounds and asymptotic results on the supremum of the scanning process of a one dimensional stationary Poisson process. We apply the theories to climate change related datasets, datasets on temperatures, and mass shooting records in the United States. These real data applications of our theoretical methods lead to supporting evidence for climate change and recent spikes in gun violence.

KW - Poisson process

KW - Probability

KW - Random sequence

KW - Scan statistic

UR - http://www.scopus.com/inward/record.url?scp=85045926425&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045926425&partnerID=8YFLogxK

U2 - 10.1007/s11009-018-9637-0

DO - 10.1007/s11009-018-9637-0

M3 - Article

AN - SCOPUS:85045926425

VL - 20

SP - 1429

EP - 1451

JO - Methodology and Computing in Applied Probability

JF - Methodology and Computing in Applied Probability

SN - 1387-5841

IS - 4

ER -