An analysis of 244 immature teratomas (IT) was undertaken to evaluate the clinical usefulness and reproducibility of the grading system. Clinical follow-up was available for 143 stage I tumors and ranged from 7 to 204 months (mean 85 months, median 84 months). Sixteen of 91 (18%) patients with high-grade teratomas died with metastases. These results are much improved over those from the years before the use of modern (1970 or later) combination chemotherapy. In contrast, only 3 of 52 (6%) patients with grade 1 tumors had tumor progression, 1 of whom was living with tumor 6 years after surgery. None of these three neoplasms was adequately sampled: the smallest weighed > 1,500 g and the other two were huge, yet only one to five slides per tumor were available for review. The study confirmed that small foci (2 mm or less) of other germ cell elements do not adversely affect the prognosis of IT. The reproducibility between pathologists of the traditional grading system for IT, which is based on the amount of immature neuroepithelium, is only moderate when a three-tiered scale is used and is limited by the results of the least skilled observer (kappa = 0.54). Although agreement never approaches completeness, interobserver variability is reduced when a two-tiered system is used (kappa = 0.66). The microscopic patterns that were the source of disagreement were identified. Agreement could be improved with training sessions.
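The kappa values above measure agreement beyond what chance alone would produce. As an illustration only, the following minimal sketch computes a two-rater Cohen's kappa from a table of grade assignments and shows how merging grades 2 and 3 into one high-grade category can raise the statistic. The abstract does not state which kappa variant was used, so Cohen's kappa is an assumption here. The counts in `three_tier` are hypothetical and are not data from the study, and the function names are likewise invented for this example.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square table of counts (rater A rows, rater B columns)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: fraction of cases on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


def collapse_to_two_tiers(table):
    """Merge grades 2 and 3 of a 3x3 table into a single high-grade category."""
    low_low = table[0][0]
    low_high = table[0][1] + table[0][2]
    high_low = table[1][0] + table[2][0]
    high_high = sum(table[i][j] for i in (1, 2) for j in (1, 2))
    return [[low_low, low_high], [high_low, high_high]]


# HYPOTHETICAL counts for two pathologists grading 100 tumors (grades 1, 2, 3).
three_tier = [
    [30, 5, 0],
    [4, 20, 8],
    [1, 7, 25],
]
two_tier = collapse_to_two_tiers(three_tier)

kappa3 = cohen_kappa(three_tier)
kappa2 = cohen_kappa(two_tier)
print(f"three-tiered kappa: {kappa3:.2f}")
print(f"two-tiered kappa:   {kappa2:.2f}")
```

In this made-up table most disagreements fall between grades 2 and 3, so collapsing those grades removes them and kappa rises. That is the same direction of change the abstract reports, though the magnitudes here carry no meaning.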