The design of space-time codes for frequency-flat, spatially correlated MIMO fading channels is considered. The focus is on the class of space-time block codes known as Linear Dispersion (LD) codes, introduced by Hassibi and Hochwald. The LD codes are optimized with respect to the mutual information between the inputs to the space-time encoder and the output of the channel. Using the mutual information as both the design criterion and the performance measure is justified by allowing soft decisions at the output of the space-time decoder. A spatial Fourier (virtual) representation of the channel is exploited to allow the analysis of MIMO channels with quite general fading statistics. Conditions, termed Generalized Orthogonality Conditions (GOC), are derived under which an LD code achieves an upper bound on the mutual information; LD codes that achieve this bound, if they exist, are optimal. Explicit code constructions and properties of the optimal power allocation schemes are also derived. In particular, it is shown that at low SNR the optimal LD codes correspond to beamforming to a single virtual transmit angle, and a necessary and sufficient condition for beamforming to be optimal is provided. Finally, numerical results illustrate the optimal code design for two examples of sparse scattering environments. The performance of the optimal LD codes in these environments is compared with that of LD codes designed under the i.i.d. Rayleigh (rich scattering) model, and the optimal LD codes are shown to perform significantly better. The optimal LD codes are also compared with beamforming LD codes, and beamforming is shown to be nearly optimal over a range of SNRs of interest.
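
For orientation, the two objects named above can be written in a commonly used form. The notation here (T, M_T, M_R, Q, A_q, B_q, A_R, A_T, H_v) is assumed for illustration and is not taken from this abstract. It follows the usual statement of LD codes due to Hassibi and Hochwald and the usual virtual channel representation for uniform linear arrays.

```latex
% Assumed notation: T channel uses, M_T transmit and M_R receive antennas,
% Q complex symbols s_q = \alpha_q + j\beta_q per codeword.

% LD codeword: a linear combination of fixed dispersion matrices
\[
  \mathbf{S} \;=\; \sum_{q=1}^{Q} \left( \alpha_q \mathbf{A}_q + j\,\beta_q \mathbf{B}_q \right),
  \qquad \mathbf{A}_q,\ \mathbf{B}_q \in \mathbb{C}^{T \times M_T}
\]

% Virtual (spatial Fourier) representation of the M_R x M_T channel matrix,
% with A_R and A_T unitary DFT matrices (assumes uniform linear arrays)
\[
  \mathbf{H} \;=\; \mathbf{A}_R\, \mathbf{H}_v\, \mathbf{A}_T^{H}
\]
```

Under these assumptions, beamforming to a single virtual transmit angle amounts to sending all transmit power along one column of A_T, so that only the corresponding column of H_v is excited.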