Identification and characterization of information-networks in long-tail data collections

Mostafa M. Elag, Praveen Kumar, Luigi Marini, James D. Myers, Margaret Hedstrom, Beth A. Plale

Research output: Contribution to journal; Article; peer-review


Scientists' ability to synthesize and reuse long-tail scientific data lags far behind their ability to collect and produce these data. Many Earth Science Cyberinfrastructures enable sharing and publishing their data over the web using metadata standards. While profiling data attributes advances the Linked Data approach, it has become clear that building information-networks among distributed data silos is essential to increase their integration and reusability. In this research, we developed a Long-Tail Information-Network (LTIN) model, which uses a metadata-driven approach to build semantic information-networks among datasets published over the web and aggregate them around environmental events. The model identifies and characterizes the spatial and temporal contextual association links and dependencies among datasets. This paper presents the design and application of the LTIN model, and an evaluation of its performance. The model capabilities were demonstrated by inferring the information-network of a stream discharge located at the downstream end of the Illinois River.

Original language: English (US)
Pages (from-to): 100-111
Number of pages: 12
Journal: Environmental Modelling and Software
State: Published - 2017


  • Cyberinfrastructure
  • Data-intensive science
  • Environmental data
  • Information-networks
  • Linked-data
  • Long-tail data

ASJC Scopus subject areas

  • Software
  • Environmental Engineering
  • Ecological Modeling

