This diploma thesis investigates the use of appearance-based features for the recognition of gestures from video input. Previous work in gesture recognition usually first segmented parts of the input images, for example the hand, and then used features calculated from the segmented regions. Results from object recognition in images suggest that this intermediate segmentation step is not necessary and that we can instead employ features obtained directly from the input images, so-called appearance-based features. In this work, we show that these features, combined with appropriate models of image variability, yield excellent results on gesture recognition tasks. Very good results are obtained using a downscaled image of each video frame as the feature and tangent distance as the model of image variability. In addition, a new dynamic tracking algorithm is introduced that makes its tracking decisions at the end of a video sequence, using the information from all frames. This tracking method allows tracking under very noisy conditions. Finally, a new database of the German fingerspelling alphabet was recorded, which will be made freely available for further research.