To Seed or Not to Seed? An Empirical Analysis of Usage of Seeds for Testing in Machine Learning Projects

Saikat Dutta, Anshul Arunachalam, Sasa Misailovic

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many Machine Learning (ML) algorithms are inherently random in nature: executing them using the same inputs may lead to slightly different results across different runs. Such randomness makes it challenging for developers to write tests for their implementations of ML algorithms. A natural consequence of randomness is test flakiness: tests that both pass and fail non-deterministically for the same version of the code. Developers often choose to alleviate test flakiness in ML projects by setting seeds in the random number generators used by the code under test. However, this approach commonly serves as a 'workaround' rather than an actual solution. Instead, it may be possible to mitigate flakiness and alleviate the negative effects of setting seeds using alternative approaches. To understand the role of seeds and the feasibility of alternative solutions, we conduct the first large-scale empirical study of the usage of seeds and its implications for testing on a corpus of 114 Machine Learning projects. We identify 461 tests in these projects that fail without seeds and study their nature and root causes. We try to minimize the flakiness of a subset of 42 identified tests using alternative strategies, such as tuning algorithm hyper-parameters and adjusting assertion bounds, and send the resulting fixes to developers. So far, developers have accepted our fixes for 26 tests. We further manually analyze a subset of 56 tests and study various characteristics, such as the nature of the test oracles and how seed settings evolve over time. Finally, we provide a general set of recommendations for both researchers and developers in the context of setting seeds in tests.
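To make the contrast concrete, below is a minimal hypothetical sketch in Python (not drawn from the paper's corpus; the function stochastic_mean and both tests are illustrative assumptions) of the two testing styles the abstract compares: pinning a seed so the test exercises one deterministic run, versus choosing an assertion bound wide enough to tolerate the algorithm's inherent randomness.

    import numpy as np

    def stochastic_mean(n=10_000, rng=None):
        # Toy stand-in for a randomized ML routine: the sample mean of n
        # standard-normal draws jitters slightly from run to run.
        rng = rng if rng is not None else np.random.default_rng()
        return rng.standard_normal(n).mean()

    def test_mean_seeded():
        # Seed-based "workaround": fixing the seed makes the computation
        # deterministic, so even an exact-equality oracle always passes;
        # the test now exercises only one execution path of the algorithm.
        a = stochastic_mean(rng=np.random.default_rng(42))
        b = stochastic_mean(rng=np.random.default_rng(42))
        assert a == b

    def test_mean_unseeded():
        # Alternative style: widen the assertion bound to cover the
        # routine's sampling variance. The standard error of the mean is
        # 1/sqrt(10_000) = 0.01, so a 5-sigma bound of 0.05 flakes with
        # probability around 6e-7 regardless of the seed.
        est = stochastic_mean()
        assert abs(est) < 0.05

In the unseeded variant the bound follows from the routine's known variance; for real ML algorithms, analogous fixes of the kind the paper studies tune hyper-parameters (e.g., more iterations or samples) to shrink variance until such a bound is safe.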

Original language: English (US)
Title of host publication: Proceedings - 2022 IEEE 15th International Conference on Software Testing, Verification and Validation, ICST 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 151-161
Number of pages: 11
ISBN (Electronic): 9781665466790
State: Published - 2022
Event: 15th IEEE International Conference on Software Testing, Verification and Validation, ICST 2022 - Virtual, Online, Spain
Duration: Apr 4, 2022 - Apr 13, 2022

Publication series

Name: Proceedings - 2022 IEEE 15th International Conference on Software Testing, Verification and Validation, ICST 2022

Conference

Conference: 15th IEEE International Conference on Software Testing, Verification and Validation, ICST 2022
Country/Territory: Spain
City: Virtual, Online
Period: 4/4/22 - 4/13/22

Keywords

  • Flaky Tests
  • Machine Learning
  • Seeds
  • Testing

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Safety, Risk, Reliability and Quality
