Fine-grain annotation of cricket videos

@article{Sharma2015FinegrainAO,
  title={Fine-grain annotation of cricket videos},
  author={Rahul Anand Sharma and K. Pramod Sankar and C. V. Jawahar},
  journal={2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)},
  year={2015},
  pages={421--425}
}
The recognition of human activities is one of the key problems in video understanding. Action recognition is challenging even for specific categories of videos, such as sports, that contain only a small set of actions. Interestingly, sports videos are accompanied by detailed commentaries available online, which could be used to perform action annotation in a weakly-supervised setting. For the specific case of Cricket videos, we address the challenge of temporal segmentation and annotation of… 
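The core idea in the abstract, aligning an ordered commentary feed with ordered video segments, can be sketched minimally. This is an illustrative assumption of how such a weakly-supervised pairing might look (the function name, data, and naive one-to-one alignment are hypothetical, not the paper's actual method):

```python
# Naive sketch: because both the commentary feed and the broadcast are
# ordered by delivery, segments can be annotated by a monotonic alignment.
# A one-to-one pairing is assumed here purely for illustration.
def annotate(shot_boundaries, commentary):
    """Pair the i-th detected delivery segment with the i-th commentary
    entry (a simple monotonic alignment)."""
    segments = list(zip(shot_boundaries, shot_boundaries[1:]))
    return [{"segment": seg, "text": txt}
            for seg, txt in zip(segments, commentary)]

boundaries = [0, 120, 250, 400]  # hypothetical frame indices of segment edges
commentary = ["FOUR! driven through the covers",
              "dot ball, defended",
              "single to mid-on"]
print(annotate(boundaries, commentary))
```

A real system would need a more robust alignment (e.g. allowing missed or merged segments), but the monotonic ordering of both streams is what makes the weak supervision usable.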

Figures and Tables from this paper

Cricket stroke extraction: Towards creation of a large-scale cricket actions dataset
TLDR
This paper deals with the problem of temporal action localization for a large-scale dataset of untrimmed cricket videos, applying a trained random forest model for CUT detection and two linear SVM camera models for first-frame detection, trained on HOG features of CAM1 and CAM2 video shots.
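The HOG-plus-linear-SVM pipeline mentioned in this TLDR can be sketched with standard libraries. The synthetic striped frames below stand in for CAM1/CAM2 first frames; the data and class labels are assumptions for illustration only:

```python
# Sketch of a HOG + linear SVM frame classifier, in the spirit of the
# CAM1/CAM2 shot models described above. Training data is synthetic.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def hog_features(frame):
    """Extract a flat HOG descriptor from a grayscale frame."""
    return hog(frame, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

# Synthetic stand-ins: vertical stripes for "CAM1", horizontal for "CAM2",
# so the two classes have clearly different gradient orientations.
stripes_v = np.tile([0.0, 1.0], (64, 32))
stripes_h = stripes_v.T
cam1 = [stripes_v + rng.normal(0, 0.05, (64, 64)) for _ in range(20)]
cam2 = [stripes_h + rng.normal(0, 0.05, (64, 64)) for _ in range(20)]

X = np.array([hog_features(f) for f in cam1 + cam2])
y = np.array([0] * 20 + [1] * 20)  # 0 = CAM1, 1 = CAM2

clf = LinearSVC().fit(X, y)
print(clf.score(X, y))
```

HOG is a natural fit here because broadcast camera views differ mainly in layout and edge structure, which orientation histograms capture while being robust to brightness changes.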
Scene Classification for Sports Video Summarization Using Transfer Learning
TLDR
A novel method for sports video scene classification, aimed specifically at video summarization, is proposed, using a pre-trained AlexNet convolutional neural network for scene classification and employing new fully connected layers in an encoder fashion.
Temporal Cricket Stroke Localization from Untrimmed Highlight Videos
TLDR
This work models the temporal Cricket stroke localization problem by training a recurrent neural network (RNN) and predicting the localized stroke segments using a sliding-window approach; 26 highlight videos of a single T20 tournament are collected and the Cricket stroke segments in them hand-labelled.
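The sliding-window localization step this TLDR describes can be sketched independently of the RNN. Below, a simple mean-score threshold stands in for the trained window classifier; the function name, parameters, and data are illustrative assumptions:

```python
# Sketch of sliding-window temporal localization: score fixed-size windows,
# then merge positive windows into contiguous stroke segments.
import numpy as np

def localize_strokes(frame_scores, win=16, stride=8, thresh=0.5):
    """Slide a window over per-frame stroke scores and merge positive
    windows into (start, end) frame segments."""
    n = len(frame_scores)
    positives = []
    for start in range(0, max(n - win, 0) + 1, stride):
        window = frame_scores[start:start + win]
        if np.mean(window) > thresh:  # stand-in for a trained RNN's score
            positives.append((start, start + win))
    # Merge overlapping or adjacent positive windows into segments.
    segments = []
    for s, e in positives:
        if segments and s <= segments[-1][1]:
            segments[-1] = (segments[-1][0], max(segments[-1][1], e))
        else:
            segments.append((s, e))
    return segments

# Toy score track with a single stroke spanning frames 30-60.
scores = np.zeros(100)
scores[30:60] = 1.0
print(localize_strokes(scores))
```

The merge step is what turns overlapping window detections into a single localized stroke interval; the boundaries it produces are only as precise as the window stride.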
CRNN Based Jersey-Bib Number/Text Recognition in Sports and Marathon Images
TLDR
This work proposes a new framework based on detecting human body parts so that both the jersey/bib number and text are localized reliably, and is compared with the state-of-the-art methods on all four datasets.
Analyzing Racket Sports From Broadcast Videos
TLDR
A method to analyze a large corpus of broadcast videos by segmenting the points played, tracking and recognizing the players in each point, and annotating their respective strokes is proposed, together with an end-to-end framework for automatic attribute tagging and analysis of broadcast sport videos.
Discovering Cricket Stroke Classes in Trimmed Telecast Videos
Activity recognition in sports telecast videos is challenging, especially in outdoor field events, where there is a lot of camera motion. Generally, camera motions like zoom, pan, and tilt introduce…
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment
  • Paritosh Parmar, B. Morris
  • Computer Science
    2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
TLDR
This paper proposes to learn spatio-temporal features that explain three related tasks - fine-grained action recognition, commentary generation, and estimating the AQA score - and shows that the MTL approach outperforms the STL approach using two different kinds of architectures: C3D-AVG and MSCADC.
Analytical Review on Textual Queries Semantic Search based Video Retrieval
TLDR
Textual semantic-search queries can contain temporal and spatial information about multiple objects, such as trees and buildings, present in the scene; the nouns, verbs and adverbs detected are matched in the video frame to detect the action and position of each object using a semantically meaningful graph.
CommBox: Utilizing sensors for real-time cricket shot identification and commentary generation
TLDR
A framework to automate cricket shot identification and commentary generation using sensor data as features for machine learning models is proposed.

References

SHOWING 1-10 OF 16 REFERENCES
Text Driven Temporal Segmentation of Cricket Videos
TLDR
A multi-modal approach is proposed in which clues from different information sources are merged to segment videos based on textual descriptions or commentaries of the action, providing semantic access and retrieval of video segments that is difficult to obtain with existing visual-feature-based approaches.
Learning realistic human actions from movies
TLDR
A new method for video classification that builds upon and extends several recent ideas, including local space-time features, space-time pyramids and multi-channel non-linear SVMs, is presented and shown to improve state-of-the-art results on the standard KTH action dataset.
Personalized retrieval of sports video
TLDR
A novel approach is proposed to achieve personalized retrieval of sports video, comprising two research tasks: semantic annotation of sports video and acquisition of the user's preferences; the approach shows encouraging performance.
Event based indexing of broadcasted sports video by intermodal collaboration
TLDR
The experimental results for broadcast sports video of American football games indicate that intermodal collaboration is effective for video indexing by events such as touchdown (TD) and field goal (FG).
Learning to Track and Identify Players from Broadcast Sports Videos
TLDR
A system is presented that detects and tracks multiple players, estimates the homography between video frames and the court, and identifies the players; a novel Linear Programming (LP) relaxation algorithm is proposed for predicting the best player identification in a video clip.
Segmentation of intentional human gestures for sports video annotation
TLDR
This work uses a Hidden Markov Model approach for gesture modeling, with both isolated gestures and gestures segmented from a stream, to augment video with data obtained from accelerometers worn as wrist bands by one or more officials.
Video Google: a text retrieval approach to object matching in videos
We describe an approach to object and scene retrieval which searches for and localizes all the occurrences of a user-outlined object in a video. The object is represented by a set of viewpoint…
Understanding videos, constructing plots: learning a visually grounded storyline model from annotated videos
TLDR
An Integer Programming framework for action recognition and storyline extraction using the storyline model and visual groundings learned from training data is formulated.
Live sports event detection based on broadcast video and web-casting text
TLDR
A novel approach is presented for event detection in live sports games using web-casting text and broadcast video, able to detect a live event based only on partial content captured from the web and TV, and to create a personalized summary related to a certain event, player or team according to the user's preference.
Automatically extracting highlights for TV Baseball programs
TLDR
This paper explores how to extract highlights automatically so that viewing time can be reduced, and presents results comparing algorithm output against human-selected highlights for a diverse collection of baseball games, with very encouraging results.