We study the evolution of artificial learning systems by means of selection. Genetic programming is used to generate populations of programs that implement algorithms used by neural network classifiers to learn a rule in a supervised learning scenario. In contrast to concentrating on final results, which would be the natural aim while designing good learning algorithms, we study the evolution process. Phenotypic and genotypic entropies, which describe the distribution of fitness and of symbols, respectively, are used to monitor the dynamics. We identify significant functional structures responsible for the improvements in the learning process. In particular, some combinations of variables and operators are useful in assessing performance in rule extraction and can thus implement annealing of the learning schedule. We also find combinations that can signal surprise, measured on a single example, by the difference between predicted and correct classification. When such favorable structures appear, they are disseminated on very short time scales throughout the population. Due to such abruptness they can be thought of as dynamical transitions. But foremost, we find a strict temporal order of such discoveries. Structures that measure performance are never useful before those for measuring surprise. Invasions of the population by such structures in the reverse order were never observed. Asymptotically, the generalization ability approaches Bayesian results.