Matin Kamali

This paper describes recent advances in hidden Markov model (HMM) based OCR for machine-printed Arabic documents. A combination of script-independent and script-specific techniques is applied to glyph models and language models (LMs). The script-independent techniques we applied are higher-order n-gram LMs for N-best rescoring and discriminative estimation of...
Handwritten text in Arabic and other languages exhibits significant variations in the slant and baseline of characters across words and also within a single word. Since the concept of baseline does not have a precise mathematical definition, existing approaches use heuristic methods to first identify a set of baseline-relevant pixels and then fit...
Offline handwriting recognition of free-flowing Arabic text is a challenging task due to the plethora of factors that contribute to the variability in the data. In this paper, we address some of these sources of variability, and present experimental results on a large corpus of handwritten documents. Specific techniques such as the application of...
When performing handwriting recognition on natural language text, the use of a word-level language model (LM) is known to significantly improve recognition accuracy. The most common type of language model, the n-gram model, decomposes sentences into short, overlapping chunks. In this paper, we propose a new type of language model that we use in addition to...
Offline handwriting recognition (OHR) is an extremely challenging task because of many factors including variations in writing style, writing device and material, and noise in the scanning and collection process. Due to the diverse nature of the above challenges, it is highly unlikely that a single recognition technique can address all the characteristics...
Significant advances have been achieved in Speech-to-Speech (S2S) translation systems in recent years. However, rapid configuration of S2S systems for low-resource language pairs and domains remains a challenging problem due to the lack of human-translated bilingual training data. In this paper, we report on an effort to port our existing English/Iraqi S2S...
In this paper, we introduce a new operational platform for end-to-end document image analysis, recognition, and machine translation. The Raytheon BBN Document Analysis Service (BBN DAS) performs the following operations on scanned machine-print document images: (1) image pre-processing and segmentation to identify homogeneous zones of text, (2) text...
Many kinds of neuroscience data are being acquired regarding the dynamic behaviour and phenotypic diversity of nerve cells. But as the size, complexity and numbers of 3D neuroanatomical datasets grow ever larger, the need for automated detection and analysis of individual neurons takes on greater importance. We describe here a method that detects and...
Acknowledgments: I would like to express my special thanks to Prof. Dana Brooks for all his advice, comments, and inspiration during both the project and the thesis. Many thanks to Prof. ... Abstract: Vertebrate central nervous systems (CNS) contain hundreds or thousands of distinct nerve cell types with specialized morphologies and functions. As neural systems...