BACKGROUND Clinical decision support systems assist physicians in interpreting complex patient data. However, they typically operate on a per-patient basis and do not exploit the extensive latent medical knowledge in electronic health records (EHRs). The emergence of large EHR systems offers the opportunity to actively integrate population-level information into these tools. METHODS Here, we assess the ability of a large corpus of electronic records to predict individual discharge diagnoses. We present a method that exploits similarities between patients along multiple dimensions to predict their eventual discharge diagnoses. RESULTS Using demographics, initial blood and electrocardiography measurements, and the medical history of hospitalized patients from two independent hospitals, we obtained high performance in cross-validation (area under the curve >0.88) and correctly predicted at least one diagnosis among the top ten predictions for more than 84% of the patients tested. Importantly, our method provides accurate predictions (>0.86 precision in cross-validation) for major disease categories, including infectious and parasitic diseases, endocrine and metabolic diseases, and diseases of the circulatory system. This performance holds for both chronic and acute diagnoses. CONCLUSIONS Our results suggest that one can harness the wealth of population-based information embedded in electronic health records for patient-specific predictive tasks.
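The top-ten figure reported in RESULTS can be read as a hit rate: the fraction of patients for whom at least one true discharge diagnosis appears among the ten highest-ranked predictions. The following is a minimal sketch of that calculation under stated assumptions, not the authors' evaluation code. The function name `top_k_hit_rate`, the patient identifiers, and the diagnosis codes are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def top_k_hit_rate(
    ranked_predictions: Dict[str, List[str]],
    true_diagnoses: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Fraction of patients with at least one true diagnosis in their top-k predictions."""
    if not ranked_predictions:
        raise ValueError("no patients to evaluate")
    hits = 0
    for patient_id, ranking in ranked_predictions.items():
        truth = true_diagnoses.get(patient_id, set())
        # A patient counts as a hit if any true diagnosis is ranked within the first k.
        if truth.intersection(ranking[:k]):
            hits += 1
    return hits / len(ranked_predictions)


# Toy data (hypothetical patient IDs and diagnosis codes, best-ranked first).
predictions = {
    "p1": ["I10", "E11", "J18"],
    "p2": ["N39", "A41", "I50"],
    "p3": ["K21", "E78", "I25"],
    "p4": ["J44", "I48", "E03"],
}
truth = {
    "p1": {"E11"},
    "p2": {"I50", "I10"},
    "p3": {"J18"},
    "p4": {"J44"},
}

rate_top2 = top_k_hit_rate(predictions, truth, k=2)
rate_top3 = top_k_hit_rate(predictions, truth, k=3)
```

With `k=10` and one ranked list per test patient, the returned value corresponds to the kind of quantity quoted above as exceeding 84%. How the rankings themselves are produced is left outside this sketch.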