TY - JOUR
T1 - Batch equalization with a generative adversarial network
AU - Qian, Wesley Wei
AU - Xia, Cassandra
AU - Venugopalan, Subhashini
AU - Narayanaswamy, Arunachalam
AU - Dimon, Michelle
AU - Ashdown, George W
AU - Baum, Jake
AU - Peng, Jian
AU - Ando, D Michael
N1 - Funding Information:
We thank Luke Metz for his helpful comments, Joel Shor for his code reviews, and the Google Accelerated Science Team for their helpful discussions and suggestions. The malaria data were funded by a grant from the Bill & Melinda Gates Foundation [OPP1181199].
PY - 2020/12/1
Y1 - 2020/12/1
AB - Motivation: Advances in automation and imaging have made it possible to capture a large image dataset that spans multiple experimental batches of data. However, accurate biological comparison across the batches is challenged by batch-to-batch variation (i.e. batch effect) due to uncontrollable experimental noise (e.g. varying stain intensity or cell density). Previous approaches to minimize the batch effect have commonly focused on normalizing the low-dimensional image measurements such as an embedding generated by a neural network. However, normalization of the embedding could suffer from over-correction and alter true biological features (e.g. cell size) due to our limited ability to interpret the effect of the normalization on the embedding space. Although techniques like flat-field correction can be applied to normalize the image values directly, they are limited transformations that handle only simple artifacts due to batch effect. Results: We present a neural network-based batch equalization method that can transfer images from one batch to another while preserving the biological phenotype. The equalization method is trained as a generative adversarial network (GAN), using the StarGAN architecture that has shown considerable ability in style transfer. After incorporating new objectives that disentangle batch effect from biological features, we show that the equalized images have less batch information and preserve the biological information. We also demonstrate that the same model training parameters can generalize to two dramatically different types of cells, indicating this approach could be broadly applicable.
UR - http://www.scopus.com/inward/record.url?scp=85099209885&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099209885&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btaa819
DO - 10.1093/bioinformatics/btaa819
M3 - Article
C2 - 33381813
SN - 1367-4803
VL - 36
SP - i875-i883
JO - Bioinformatics (Oxford, England)
JF - Bioinformatics (Oxford, England)
IS - Supplement_2
ER -