Programming and computational thinking skills are promoted in schools worldwide. However, there is still a lack of tools that assist learners and educators in assessing these skills. We have implemented an assessment tool, called Dr. Scratch, that analyzes Scratch projects with the aim of assessing the level of development of several aspects of computational thinking. One of the issues to address in order to show its validity is to compare the (automatic) evaluations provided by the tool with the (manual) evaluations given by (human) experts. In this paper we compare the assessments provided by Dr. Scratch with over 450 evaluations of Scratch projects given by 16 experts in computer science education. Our results show strong correlations between automatic and manual evaluations. As there is an ample debate among educators on the use of this type of tool, we discuss its implications and limitations, and provide recommendations for further research.
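The core comparison described above, correlating automatic tool scores with manual expert scores, can be sketched as follows. This is an illustrative example only: the score values are made up, and the choice of Spearman's rank correlation is an assumption for demonstration, not a description of the paper's actual data or analysis code.

```python
# Sketch: rank-correlating hypothetical automatic scores with
# hypothetical expert scores. Spearman's rho is Pearson's r
# computed on the ranks of the two score lists.

def ranks(values):
    """Return 1-based ranks, averaging ranks across tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to cover the whole group of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman rank correlation between two equal-length score lists."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

automatic = [14, 9, 18, 11, 7, 16]   # hypothetical Dr. Scratch scores
manual    = [13, 12, 17, 10, 8, 15]  # hypothetical expert scores
rho = spearman(automatic, manual)
```

In practice one would use a library routine (e.g. `scipy.stats.spearmanr`) rather than hand-rolled ranking, but the sketch makes the mechanics of the automatic-versus-manual comparison explicit.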