The predictive accuracy of the Navier–Stokes equations is known to degrade at the limits of the continuum assumption, for example in rarefied or nonequilibrium gases, thereby necessitating expensive and often highly approximate solutions to the Boltzmann equation. Although Boltzmann solutions are tractable in one spatial dimension, their high dimensionality (n physical plus three velocity-space dimensions) makes multi-dimensional calculations impractical for all but canonical configurations. It is therefore desirable to improve the accuracy of the Navier–Stokes equations in these regimes. We present an application of a deep learning (DL) method that extends the validity of the Navier–Stokes equations to the transitional flow regime. The method encodes the “missing” (i.e., sub-continuum) physics in the Navier–Stokes equations via a neural network, which is trained by targeting density, velocity, and energy profiles obtained from direct Boltzmann solutions. Standard DL methods (e.g., those developed for image recognition or language processing) can be considered ad hoc because the systems they model are not governed by known partial differential equations (PDEs). In contrast, the DL framework applied here leverages Boltzmann physics that is known a priori while ensuring that the trained model is consistent with the Navier–Stokes PDEs. The online training procedure solves adjoint PDEs, which efficiently provide the gradient of the loss function with respect to the forward PDE solution. The adjoint PDEs are constructed automatically using algorithmic differentiation (AD). The model is trained and applied to predict shock thickness in low-pressure argon. The resulting DL-augmented Navier–Stokes equations achieve accuracy comparable to that of the target Boltzmann solutions. Extensions to other regimes and gas flows are discussed as future work.
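The adjoint-based training idea can be illustrated with a deliberately small sketch. It is not the method of this work: the scalar model du/dt = -u + theta*tanh(u), the single parameter `theta` standing in for the neural network weights, the hand-derived discrete adjoint (in place of AD-constructed adjoint PDEs), and all function names are assumptions chosen only for illustration. The backward sweep propagates the sensitivity of the loss to the forward solution and accumulates the gradient with respect to the closure parameter.

```python
import math

def forward(theta, u0=1.0, dt=0.01, steps=200):
    """Explicit Euler for the toy model du/dt = -u + theta*tanh(u).

    Returns the whole trajectory, which the adjoint sweep needs.
    """
    us = [u0]
    for _ in range(steps):
        u = us[-1]
        us.append(u + dt * (-u + theta * math.tanh(u)))
    return us

def loss(theta, target, u0=1.0, dt=0.01, steps=200):
    """Squared mismatch between the final state and a target value."""
    return 0.5 * (forward(theta, u0, dt, steps)[-1] - target) ** 2

def adjoint_gradient(theta, target, u0=1.0, dt=0.01, steps=200):
    """dJ/dtheta from one forward solve and one backward (adjoint) sweep."""
    us = forward(theta, u0, dt, steps)
    lam = us[-1] - target              # lam_N = dJ/du_N
    grad = 0.0
    for k in range(steps - 1, -1, -1):
        t = math.tanh(us[k])
        grad += lam * dt * t           # lam_{k+1} * (explicit d u_{k+1} / d theta)
        # lam_k = lam_{k+1} * d u_{k+1} / d u_k
        lam *= 1.0 + dt * (-1.0 + theta * (1.0 - t * t))
    return grad

def train(theta, target, lr=0.2, iters=200):
    """Plain gradient descent on theta using the adjoint gradient."""
    for _ in range(iters):
        theta -= lr * adjoint_gradient(theta, target)
    return theta
```

A possible use is to generate a target with a reference parameter, for example `target = forward(0.8)[-1]`, and then call `train(0.2, target)`. The cost of `adjoint_gradient` is one forward and one backward sweep regardless of the number of parameters, which is the property that makes adjoint methods attractive when the closure is a neural network with many weights.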