Abstract
Researchers studying the correspondences between Deep Neural Networks (DNNs) and humans often give little consideration to severe testing when drawing conclusions from empirical findings, and this is impeding progress in building better models of minds. First, we detail what we mean by severe testing and highlight why it is especially important when working with opaque models with many free parameters that may solve a given task in multiple ways. Second, we provide several examples of researchers making strong claims about DNN-human similarities without severely testing their hypotheses. Third, we consider why severe testing is undervalued. We provide evidence that part of the fault lies with the review process. There is now widespread appreciation in many areas of science that a bias toward publishing positive results (among other practices) is leading to a credibility crisis, but there seems to be less awareness of the problem in this field.
| Original language | English (US) |
|---|---|
| Article number | 101158 |
| Journal | Cognitive Systems Research |
| Volume | 82 |
| DOIs | |
| State | Published - Dec 2023 |
Keywords
- Memory
- Neural networks
- Perception
- Psychology
- Severe testing
- Vision
ASJC Scopus subject areas
- Software
- Experimental and Cognitive Psychology
- Cognitive Neuroscience
- Artificial Intelligence