An automatic system for creating a virtual head that is compatible with the MPEG-4 facial object specification is presented. Color classification and a valley detection filter are applied at the initialization stage to locate the face and its Facial Definition Points (FDPs). The extracted FDPs are tracked by normalized correlation, and their trajectories are fed into an extended Kalman filter (EKF) to recover the camera geometry, the facial orientation, and the depth of selected FDPs. Based on the recovered point-wise 3-D structure, Dirichlet Free-Form Deformations (DFFD) are applied to deform a generic 3-D model. Once a virtual head is created, it can be used to track FDPs through large out-of-plane rotations and to update the head model continuously based on refined depth information. A complete texture map is created by blending frontal and rotated views of the face according to the recovered face orientation.
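As an illustration of the tracking step only, the following is a minimal sketch of point tracking by normalized correlation. It assumes grayscale frames stored as 2-D NumPy arrays, a zero-mean form of the correlation score, and an exhaustive search in a small window. The function names `normalized_correlation` and `track_point`, the window radius, and the template size are hypothetical choices made for this example and are not taken from the system described above.

```python
import numpy as np


def normalized_correlation(patch, template):
    """Zero-mean normalized correlation of two equal-sized patches, in [-1, 1]."""
    p = patch.astype(np.float64) - patch.mean()
    t = template.astype(np.float64) - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    # A flat patch has no texture to correlate against; score it as 0.
    return 0.0 if denom == 0 else float((p * t).sum() / denom)


def track_point(frame, template, center, radius):
    """Search a (2*radius+1)^2 window around center=(row, col) for the best match.

    Returns the best-matching center and its correlation score.
    """
    h, w = template.shape
    hh, hw = h // 2, w // 2
    best_score, best_pos = -np.inf, center
    for r in range(center[0] - radius, center[0] + radius + 1):
        for c in range(center[1] - radius, center[1] + radius + 1):
            top, left = r - hh, c - hw
            # Skip candidates whose patch would fall outside the frame.
            if (top < 0 or left < 0
                    or top + h > frame.shape[0] or left + w > frame.shape[1]):
                continue
            candidate = frame[top:top + h, left:left + w]
            score = normalized_correlation(candidate, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

In such a sketch, the template would be cut around an FDP in one frame and `track_point` called on the next frame with the previous position as `center`; the resulting sequence of positions forms the kind of 2-D trajectory that a filter such as the EKF could take as its measurements.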