### Abstract

This paper addresses the problem of choosing the right sources to solicit data from in sensing applications involving broadcast channels, such as those crowdsensing applications where sources share their observations on social media. The goal is to select sources such that expected fusion error is minimized. We assume that soliciting data from a source incurs a cost and that the cost budget is limited. Contrary to other formulations of this problem, we focus on the case where some sources influence others. Hence, asking a source to make a claim affects the behavior of other sources as well, according to an influence model. The paper makes two contributions. First, we develop an analytic model for estimating expected fusion error, given a particular influence graph and solution to the source selection problem. Second, we use that model to search for a solution that minimizes expected fusion error, formulating it as a zero-one integer non-linear programming (INLP) problem. To scale the approach, the paper further proposes a novel reliability-based pruning heuristic (RPH) and a similarity-based lossy estimation (SLE) algorithm that significantly reduce the complexity of the INLP algorithm at the cost of a modest approximation. The analytically computed expected fusion error is validated using both simulations and real-world data from Twitter, demonstrating a good match between analytic predictions and empirical measurements. It is also shown that our method outperforms baselines in terms of resulting fusion error.
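To make the optimization concrete, the following is a minimal sketch of a zero-one source selection under a cost budget, not the paper's algorithm. It assumes a single binary fact with a uniform prior and sources that report independently, so it deliberately omits the influence graph as well as the RPH and SLE speedups. The names `bayes_error` and `select_sources`, the reliabilities, and the costs are all hypothetical.

```python
from itertools import product


def bayes_error(reliabilities):
    """Exact error of MAP fusion for one binary fact with a uniform prior.

    Assumption (not from the paper): each source independently reports
    the true value with probability equal to its reliability.
    """
    error = 0.0
    # Enumerate every possible vector of binary claims.
    for claims in product((0, 1), repeat=len(reliabilities)):
        p_if_one = 1.0   # P(claims | fact = 1)
        p_if_zero = 1.0  # P(claims | fact = 0)
        for claim, r in zip(claims, reliabilities):
            p_if_one *= r if claim == 1 else 1 - r
            p_if_zero *= 1 - r if claim == 1 else r
        # MAP fusion picks the likelier value; the other branch is the error mass.
        error += 0.5 * min(p_if_one, p_if_zero)
    return error  # equals 0.5 when no source is selected


def select_sources(reliabilities, costs, budget):
    """Exhaustive zero-one search: minimize expected error subject to the budget."""
    n = len(reliabilities)
    best_x, best_err = (0,) * n, bayes_error([])
    for x in product((0, 1), repeat=n):  # x[i] = 1 means "solicit source i"
        total_cost = sum(c for c, xi in zip(costs, x) if xi)
        if total_cost > budget:
            continue  # infeasible selection
        err = bayes_error([r for r, xi in zip(reliabilities, x) if xi])
        if err < best_err - 1e-12:  # keep the first selection on ties
            best_x, best_err = x, err
    return best_x, best_err


# Toy instance: one reliable but expensive source versus three cheap, weaker ones.
x, err = select_sources([0.9, 0.7, 0.7, 0.7], [3, 1, 1, 1], budget=3)
print(x, round(err, 3))  # (1, 0, 0, 0) 0.1
```

Exhaustive search over all 2^n selections is feasible only for a handful of sources, which is the scaling problem that the pruning and lossy-estimation steps described in the abstract are meant to address.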

Original language | English (US) |
---|---|
Title of host publication | Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017 |
Editors | Kisung Lee, Ling Liu |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 1157-1167 |
Number of pages | 11 |
ISBN (Electronic) | 9781538617915 |
DOIs | https://doi.org/10.1109/ICDCS.2017.275 |
State | Published - Jul 13 2017 |
Event | 37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017 - Atlanta, United States. Duration: Jun 5 2017 → Jun 8 2017 |

### Publication series

Name | Proceedings - International Conference on Distributed Computing Systems |
---|---|

### Other

Other | 37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017 |
---|---|
Country | United States |
City | Atlanta |
Period | 6/5/17 → 6/8/17 |

### Keywords

- Crowdsourcing
- Expected fusion error
- Similarity based lossy estimation (SLE) algorithm
- Social sensing
- Zero-one integer non-linear programming (INLP)

### ASJC Scopus subject areas

- Software
- Hardware and Architecture
- Computer Networks and Communications

### Cite this

Shao, H., Wang, S., Li, S., Yao, S., Zhao, Y., Amin, T., Abdelzaher, T., & Kaplan, L. (2017). Optimizing Source Selection in Social Sensing in the Presence of Influence Graphs. In K. Lee, & L. Liu (Eds.), *Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017* (pp. 1157-1167). [7980056] (Proceedings - International Conference on Distributed Computing Systems). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDCS.2017.275

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - Optimizing Source Selection in Social Sensing in the Presence of Influence Graphs

AU - Shao, Huajie

AU - Wang, Shiguang

AU - Li, Shen

AU - Yao, Shuochao

AU - Zhao, Yiran

AU - Amin, Tanvir

AU - Abdelzaher, Tarek

AU - Kaplan, Lance

PY - 2017/7/13

Y1 - 2017/7/13

AB - This paper addresses the problem of choosing the right sources to solicit data from in sensing applications involving broadcast channels, such as those crowdsensing applications where sources share their observations on social media. The goal is to select sources such that expected fusion error is minimized. We assume that soliciting data from a source incurs a cost and that the cost budget is limited. Contrary to other formulations of this problem, we focus on the case where some sources influence others. Hence, asking a source to make a claim affects the behavior of other sources as well, according to an influence model. The paper makes two contributions. First, we develop an analytic model for estimating expected fusion error, given a particular influence graph and solution to the source selection problem. Second, we use that model to search for a solution that minimizes expected fusion error, formulating it as a zero-one integer non-linear programming (INLP) problem. To scale the approach, the paper further proposes a novel reliability-based pruning heuristic (RPH) and a similarity-based lossy estimation (SLE) algorithm that significantly reduce the complexity of the INLP algorithm at the cost of a modest approximation. The analytically computed expected fusion error is validated using both simulations and real-world data from Twitter, demonstrating a good match between analytic predictions and empirical measurements. It is also shown that our method outperforms baselines in terms of resulting fusion error.

KW - Crowdsourcing

KW - Expected fusion error

KW - Similarity based lossy estimation (SLE) algorithm

KW - Social sensing

KW - Zero-one integer non-linear programming (INLP)

UR - http://www.scopus.com/inward/record.url?scp=85027259965&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85027259965&partnerID=8YFLogxK

U2 - 10.1109/ICDCS.2017.275

DO - 10.1109/ICDCS.2017.275

M3 - Conference contribution

AN - SCOPUS:85027259965

T3 - Proceedings - International Conference on Distributed Computing Systems

SP - 1157

EP - 1167

BT - Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017

A2 - Lee, Kisung

A2 - Liu, Ling

PB - Institute of Electrical and Electronics Engineers Inc.

ER -