Visual fidelity and interactivity are the main goals in Computer Graphics research, but recently also audio is assuming an important role. Binaural rendering can provide extremely pleasing and realistic three-dimensional sound, but to achieve best results it’s necessary either to measure or to estimate individual Head Related Transfer Function (HRTF). This function is strictly related to the peculiar features of ears and face of the listener. Recent sound scattering simulation techniques can calculate HRTF starting from an accurate 3D model of a human head. Hence, the use of binaural rendering on large scale (i.e. video games, entertainment) could depend on the possibility to produce a sufficiently accurate 3D model of a human head, starting from the smallest possible input. In this paper we present a completely automatic system, which produces a 3D model of a head starting from simple input data (five photos and some key-points indicated by user). The geometry is generated by extracting information from images and accordingly deforming a 3D dummy to reproduce user head features. The system proves to be fast, automatic, robust and reliable: geometric validation and preliminary assessments show that it can be accurate enough for HRTF calculation.