A video sequence usually consists of separate scenes, and each scene includes many shots. For video understanding, detecting scene breaks is most important; analyzing the content of each scene additionally requires detecting shot breaks. A scene break is usually associated with a simultaneous change in image, motion, and audio characteristics, whereas a shot break is accompanied only by a change in image, motion, or both. We therefore propose to use audio information together with image and motion information to segment video at both the scene and shot levels. Promising results have been obtained on videos digitized from TV programs.
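
The distinction between the two kinds of break can be read as a simple decision rule. The sketch below is only an illustration of that rule, not the detection method itself: the names `ChangeFlags` and `classify_boundary` are hypothetical, the flags are assumed to come from some upstream change detector for each modality, and treating an audio-only change as no break is an assumption, since the text does not cover that case.

```python
from dataclasses import dataclass


@dataclass
class ChangeFlags:
    """Hypothetical per-boundary flags: did each modality change here?"""
    image: bool
    motion: bool
    audio: bool


def classify_boundary(flags: ChangeFlags) -> str:
    """Label a candidate boundary from its modality change flags.

    Scene break: image, motion, and audio all change together.
    Shot break: image or motion (or both) change, without all three changing.
    Anything else (including an audio-only change) is labeled "none";
    that last choice is an assumption made for this sketch.
    """
    if flags.image and flags.motion and flags.audio:
        return "scene_break"
    if flags.image or flags.motion:
        return "shot_break"
    return "none"


if __name__ == "__main__":
    print(classify_boundary(ChangeFlags(image=True, motion=True, audio=True)))
    print(classify_boundary(ChangeFlags(image=True, motion=False, audio=False)))
```

In practice the flags would be derived from continuous change measures with some tolerance for timing offsets between modalities; the boolean form is used here only to make the rule explicit.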