Manually optimizing the tradeoffs between accuracy, performance and energy for resource-intensive applications with flexible accuracy or precision requirements is extremely difficult. We present ApproxTuner, an automatic framework for accuracy-aware optimization of tensor-based applications while requiring only high-level end-to-end quality specifications. ApproxTuner implements and manages approximations in algorithms, system software, and hardware. The key contribution in ApproxTuner is a novel three-phase approach to approximation-tuning that consists of development-time, install-time, and run-time phases. Our approach decouples tuning of hardware-independent and hardware-specific approximations, thus providing retargetability across devices. To enable efficient autotuning of approximation choices, we present a novel accuracy-aware tuning technique called predictive approximation-tuning, which significantly speeds up autotuning by analytically predicting the accuracy impacts of approximations. We evaluate ApproxTuner across 10 convolutional neural networks (CNNs) and a combined CNN and image processing benchmark. For the evaluated CNNs, using only hardware-independent approximation choices we achieve a mean speedup of 2.1x (max 2.7x) on a GPU, and 1.3x mean speedup (max 1.9x) on the CPU, while staying within 1 percentage point of inference accuracy loss. For two different accuracy-prediction models, ApproxTuner speeds up tuning by 12.8x and 20.4x compared to conventional empirical tuning while achieving comparable benefits.