The continual growth of electronic medical record (EMR) databases has paved the way for many data mining applications, including the discovery of novel disease-drug associations and the prediction of patient survival rates. However, these tasks are hindered because EMRs are usually segmented or incomplete. EMR analysis is further limited by the overabundance of medical term synonyms and morphologies, which causes existing techniques to mismatch records containing semantically similar but lexically distinct terms. Current solutions fill in missing values with techniques that tend to introduce noise rather than reduce it. In this paper, we propose to simultaneously infer missing data and solve semantic mismatching in EMRs by first integrating EMR data with molecular interaction networks and domain knowledge to build the HEMnet, a heterogeneous medical information network. We then project this network onto a low-dimensional space, and group entities in the network according to their relative distances. Lastly, we use this entity distance information to enrich the original EMRs. We evaluate the effectiveness of this method according to its ability to separate patients with dissimilar survival functions. We show that our method can obtain significant (p-value < 0.01) results for each cancer subtype in a lung cancer dataset, while the baselines cannot.