Two experiments examined the reliability and classification accuracy of a narration-based dynamic assessment task.

PURPOSE: The first experiment evaluated whether parallel results were obtained from stories created in response to 2 different wordless picture books. If so, the tasks and measures would be appropriate for assessing pretest-to-posttest change within a dynamic assessment format. The second experiment evaluated the extent to which children with language impairments performed differently from typically developing controls on dynamic assessment of narrative language.

METHOD: In the first experiment, 58 first- and second-grade children each told 2 stories about wordless picture books. Stories were rated on macrostructural and microstructural aspects of language form and content, and the ratings were subjected to reliability analyses. In the second experiment, 71 children participated in dynamic assessment. There were 3 phases: a pretest phase, in which children created a story that corresponded to 1 of the wordless picture books from Experiment 1; a teaching phase, in which children attended 2 short mediation sessions that focused on storytelling ability; and a posttest phase, in which children created a story that corresponded to a second wordless picture book from Experiment 1. Analyses compared the pretest and posttest stories told by 2 groups of children who received mediated learning (a typically developing group and a language-impaired group) and by a no-treatment control group of typically developing children from Experiment 1.

RESULTS: The results of the first experiment indicated that the narrative measures applied to stories about 2 different wordless picture books had good internal consistency. In Experiment 2, typically developing children who received mediated learning demonstrated greater pretest-to-posttest change than children in the language-impaired and control groups. Classification analysis indicated better specificity and sensitivity values for measures of response to intervention (modifiability) and posttest storytelling than for measures of pretest storytelling. Observation of modifiability was the single best indicator of language impairment. Posttest measures and modifiability together yielded no misclassifications.

CONCLUSION: The first experiment supported the use of 2 wordless picture books as stimulus materials for collecting narratives before and after mediation within a dynamic assessment paradigm. The second experiment supported the use of dynamic assessment for accurately identifying language impairments in school-age children.
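
For readers unfamiliar with the sensitivity and specificity values reported in the classification analysis, the following is a minimal sketch of how such values are conventionally computed from a set of classification decisions. It is not the authors' analysis: the function name and the example labels are hypothetical and purely illustrative, and none of the numbers come from the study.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for two parallel lists of booleans.

    True means "language impaired"; False means "typically developing".
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # impaired, flagged as impaired
    fn = sum(1 for a, p in pairs if a and not p)      # impaired, missed
    tn = sum(1 for a, p in pairs if not a and not p)  # typical, classified as typical
    fp = sum(1 for a, p in pairs if not a and p)      # typical, flagged as impaired
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one child in each actual group")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels for 10 children (NOT data from the study).
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, True, False]

sens, spec = sensitivity_specificity(actual, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Under this convention, a measure that yields no misclassifications has sensitivity and specificity both equal to 1.0.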