OBJECTIVE To evaluate the summative assessment (OSCE) of a communication training programme for dealing with challenging doctor-patient encounters in the fourth year of study. METHODS Our OSCE consists of 4 stations (breaking bad news, guilt and shame, aggressive patients, shared decision making) and uses a 4-item global rating (GR) instrument. We calculated reliability coefficients at different levels (total, station, and item), the discriminability of single items, and interrater reliability. Validity was estimated from gender differences and from the agreement between the GR and a checklist. RESULTS In a pooled sample of 456 students in 3 OSCEs over 3 terms, total reliability was α=0.64, reliability coefficients for single stations were >0.80, and discriminability in 3 of 4 stations was within the range of 0.4-0.7. Except for one station, interrater reliability was moderate to strong. Reliability at the item level was poor and pointed to some problems with the use of the GR. CONCLUSION Applied in regular undergraduate medical education, the GR shows moderate reliability that needs improvement and some evidence of validity. Ongoing development and evaluation are needed, with particular regard to the training of the examiners. PRACTICE IMPLICATIONS Our CoMeD-OSCE proved suitable for the summative assessment of communication skills in challenging doctor-patient encounters.
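
For readers unfamiliar with the coefficients named under METHODS, the following is a minimal sketch of how Cronbach's α and an item discriminability index could be computed from a students-by-items score table. It is an illustration under stated assumptions, not the authors' analysis: the function names and the example scores are hypothetical, and discriminability is assumed here to mean the corrected item-total correlation, which the abstract does not specify.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per student and one column per item."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) and of the per-student total score.
    item_vars = [variance(col) for col in zip(*scores)]
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores do not vary")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sx * sy)


def corrected_item_total(scores, item):
    """Assumed discriminability index: item score vs. sum of the remaining items."""
    x = [row[item] for row in scores]
    rest = [sum(row) - row[item] for row in scores]
    return pearson(x, rest)


# Hypothetical ratings (5 students x 4 GR items); NOT data from the study.
example = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 2, 2],
    [3, 3, 4, 4],
]
alpha = cronbach_alpha(example)
discrimination = [corrected_item_total(example, i) for i in range(4)]
```

The same `cronbach_alpha` function can be applied at each level reported under RESULTS by changing what a column represents: station totals for the overall coefficient, or the 4 GR items within one station for a station-level coefficient.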