Ariel Ephrat

While egocentric video is becoming increasingly popular, browsing it is very difficult. In this paper we present a compact 3D Convolutional Neural Network (CNN) architecture for long-term activity recognition in egocentric videos. Recognizing long-term activities enables us to temporally segment (index) long and unstructured egocentric videos. Existing…
Speechreading is a notoriously difficult task for humans to perform. In this paper we present an end-to-end model based on a convolutional neural network (CNN) for generating an intelligible acoustic speech signal from silent video frames of a speaking person. The proposed CNN generates sound features for each frame based on its neighboring frames. …
We present a new, fully automatic algorithm for liver tumor segmentation in follow-up CT studies. The inputs are a baseline CT scan, a delineation of the tumors in it, and a follow-up scan; the outputs are the tumor delineations in the follow-up CT scan. The algorithm consists of four steps: 1) deformable registration of the baseline scan and tumors…
When video is recorded in a studio, the sound is free of external noise and unrelated sounds. However, most video is not shot in studios. The voices of people filmed at family events are mixed with music and with other voices. Video conferences from home or office are often disturbed by other people, ringing phones, or barking dogs. TV reporting from city streets is…