Frequency domain predictive modelling with aggregated data

Avradeep Bhowmik, Joydeep Ghosh, Oluwasanmi Oluseye Koyejo

Research output: Contribution to conference > Paper

Abstract

Existing work in spatio-temporal data analysis invariably assumes data available as individual measurements with localised estimates. However, for many applications like econometrics, financial forecasting and climate science, data is often obtained as aggregates. Data aggregation presents severe mathematical challenges to learning and inference, and application of standard techniques is susceptible to ecological fallacy. In this manuscript we investigate the problem of predictive linear modelling in the scenario where data is aggregated in a non-uniform manner across targets and features. We introduce a novel formulation of the problem in the frequency domain, and develop algorithmic techniques that exploit the duality properties of Fourier analysis to bypass the inherent structural challenges of this setting. We provide theoretical guarantees for generalisation error for our estimation procedure and extend our analysis to capture approximation effects arising from aliasing. Finally, we perform empirical evaluation to demonstrate the efficacy of our algorithmic approach in predictive modelling on synthetic data, and on three real datasets from agricultural studies, ecological surveys and climate science.
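
The frequency-domain idea summarised above can be illustrated with a minimal sketch. This is not the authors' estimator. It assumes circular moving-average aggregation with known window widths, no subsampling, and noise-free targets, and the function names (`window_response`, `aggregate`, `fit_from_aggregates`) are hypothetical. It relies on two standard facts: the Fourier transform is linear, so a linear model y = x·beta also holds between Fourier coefficients, and averaging over a window multiplies each Fourier coefficient by that window's frequency response.

```python
import numpy as np


def window_response(n, width):
    """DFT of a box filter with `width` equal weights (1/width) on n samples."""
    kernel = np.zeros(n)
    kernel[:width] = 1.0 / width
    return np.fft.fft(kernel)


def aggregate(signal, width):
    """Circular trailing moving average: out[t] = mean(signal[t-width+1 .. t])."""
    return sum(np.roll(signal, k, axis=0) for k in range(width)) / width


def fit_from_aggregates(x_agg, y_agg, width_x, width_y, n_freq):
    """Least-squares fit of beta using only the n_freq lowest DFT coefficients."""
    n = x_agg.shape[0]
    low = np.arange(n_freq)
    # Undo each window's (assumed known) frequency response at the kept frequencies.
    x_hat = np.fft.fft(x_agg, axis=0)[low] / window_response(n, width_x)[low, None]
    y_hat = np.fft.fft(y_agg)[low] / window_response(n, width_y)[low]
    # Stack real and imaginary parts so beta is estimated as a real vector.
    design = np.vstack([x_hat.real, x_hat.imag])
    target = np.concatenate([y_hat.real, y_hat.imag])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return beta


rng = np.random.default_rng(0)
n, d = 512, 3
beta_true = np.array([1.5, -2.0, 0.5])   # illustrative weights, not from the paper
x = rng.standard_normal((n, d))
y = x @ beta_true                        # noise-free linear model

x_agg = aggregate(x, width=8)            # features averaged over 8 samples
y_agg = aggregate(y, width=16)           # targets averaged over 16 samples

beta_hat = fit_from_aggregates(x_agg, y_agg, 8, 16, n_freq=20)
print(np.round(beta_hat, 6))
```

Only the 20 lowest frequencies are used because the 16-sample window's response first vanishes at frequency index 512 / 16 = 32, and dividing by a response near zero would be unstable. Features and targets are deliberately aggregated with different widths to mimic the non-uniform aggregation the abstract describes.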

Original language: English (US)
State: Published - Jan 1 2017
Event: 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017 - Fort Lauderdale, United States
Duration: Apr 20 2017 to Apr 22 2017

Conference

Conference: 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017
Country: United States
City: Fort Lauderdale
Period: 4/20/17 to 4/22/17

Fingerprint

Predictive Modeling
Frequency Domain
Climate
Fourier analysis
Generalization Error
Spatio-temporal Data
Data Aggregation
Agglomeration
Aliasing
Synthetic Data
Econometrics
Forecasting
Efficacy
Data analysis
Duality
Scenarios
Target
Formulation
Evaluation

ASJC Scopus subject areas

  • Artificial Intelligence
  • Statistics and Probability

Cite this

Bhowmik, A., Ghosh, J., & Koyejo, O. O. (2017). Frequency domain predictive modelling with aggregated data. Paper presented at 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, Fort Lauderdale, United States.

Frequency domain predictive modelling with aggregated data. / Bhowmik, Avradeep; Ghosh, Joydeep; Koyejo, Oluwasanmi Oluseye.

2017. Paper presented at 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, Fort Lauderdale, United States.

Bhowmik, A, Ghosh, J & Koyejo, OO 2017, 'Frequency domain predictive modelling with aggregated data', Paper presented at 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, Fort Lauderdale, United States, 4/20/17 - 4/22/17.
Bhowmik A, Ghosh J, Koyejo OO. Frequency domain predictive modelling with aggregated data. 2017. Paper presented at 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, Fort Lauderdale, United States.
Bhowmik, Avradeep ; Ghosh, Joydeep ; Koyejo, Oluwasanmi Oluseye. / Frequency domain predictive modelling with aggregated data. Paper presented at 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, Fort Lauderdale, United States.
@conference{106708e6a2d54f18846f41318cfac666,
title = "Frequency domain predictive modelling with aggregated data",
abstract = "Existing work in spatio-temporal data analysis invariably assumes data available as individual measurements with localised estimates. However, for many applications like econometrics, financial forecasting and climate science, data is often obtained as aggregates. Data aggregation presents severe mathematical challenges to learning and inference, and application of standard techniques is susceptible to ecological fallacy. In this manuscript we investigate the problem of predictive linear modelling in the scenario where data is aggregated in a non-uniform manner across targets and features. We introduce a novel formulation of the problem in the frequency domain, and develop algorithmic techniques that exploit the duality properties of Fourier analysis to bypass the inherent structural challenges of this setting. We provide theoretical guarantees for generalisation error for our estimation procedure and extend our analysis to capture approximation effects arising from aliasing. Finally, we perform empirical evaluation to demonstrate the efficacy of our algorithmic approach in predictive modelling on synthetic data, and on three real datasets from agricultural studies, ecological surveys and climate science.",
author = "Avradeep Bhowmik and Joydeep Ghosh and Koyejo, {Oluwasanmi Oluseye}",
year = "2017",
month = "1",
day = "1",
language = "English (US)",
note = "20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017 ; Conference date: 20-04-2017 Through 22-04-2017",

}

TY - CONF

T1 - Frequency domain predictive modelling with aggregated data

AU - Bhowmik, Avradeep

AU - Ghosh, Joydeep

AU - Koyejo, Oluwasanmi Oluseye

PY - 2017/1/1

Y1 - 2017/1/1

N2 - Existing work in spatio-temporal data analysis invariably assumes data available as individual measurements with localised estimates. However, for many applications like econometrics, financial forecasting and climate science, data is often obtained as aggregates. Data aggregation presents severe mathematical challenges to learning and inference, and application of standard techniques is susceptible to ecological fallacy. In this manuscript we investigate the problem of predictive linear modelling in the scenario where data is aggregated in a non-uniform manner across targets and features. We introduce a novel formulation of the problem in the frequency domain, and develop algorithmic techniques that exploit the duality properties of Fourier analysis to bypass the inherent structural challenges of this setting. We provide theoretical guarantees for generalisation error for our estimation procedure and extend our analysis to capture approximation effects arising from aliasing. Finally, we perform empirical evaluation to demonstrate the efficacy of our algorithmic approach in predictive modelling on synthetic data, and on three real datasets from agricultural studies, ecological surveys and climate science.

UR - http://www.scopus.com/inward/record.url?scp=85066067155&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066067155&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85066067155

ER -