Abstract

In the Big Data era, informational systems involving humans and machines are being deployed in multifarious societal settings. Many use data analytics as subcomponents for descriptive, predictive, and prescriptive tasks, often trained using machine learning. Yet when analytics components are placed in large-scale sociotechnical systems, it is often difficult to characterize how well the systems will act, measured with criteria relevant in the world. Here, we propose a system modeling technique that treats data analytics components as "noisy black boxes" or stochastic kernels, which together with elementary stochastic analysis provides insight into fundamental performance limits. An example application is helping prioritize people's limited attention, where learning algorithms rank tasks using noisy features and people sequentially select from the ranked list. This paper demonstrates the general technique by developing a stochastic model of analytics-enabled sequential selection, derives fundamental limits using concomitants of order statistics, and assesses limits in terms of system-wide performance metrics, such as screening cost and value of objects selected. Connections to sample complexity for bipartite ranking are also made.

Original languageEnglish (US)
Article number2
JournalFrontiers in ICT
Volume3
Issue numberFEB
DOIs
StatePublished - 2016

Keywords

  • Concomitants of order statistics
  • Data analytics
  • Fundamental limits
  • Sequential selection
  • Sociotechnical systems
  • Stochastic kernels

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture
  • Computer Networks and Communications
  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Fundamental limits of data analytics in sociotechnical systems'. Together they form a unique fingerprint.

Cite this