In this paper, we propose a bilevel sparse coding model for coupled feature spaces, where we aim to learn dictionaries for sparse modeling in both spaces while enforcing some desired relationships between the two signal spaces. We first present our new general sparse coding model that relates signals from the two spaces by their sparse representations and the corresponding dictionaries. The learning algorithm is formulated as a generic bilevel optimization problem, which is solved by a projected first-order stochastic gradient descent algorithm. This general sparse coding model can be applied to many specific applications involving coupled feature spaces in computer vision and signal processing. In this work, we tailor our general model to learning dictionaries for compressive sensing recovery and single image super-resolution to demonstrate its effectiveness. In both cases, the new sparse coding model remarkably outperforms previous approaches in terms of recovery accuracy.