Adolfo Hernandez

Multiple Classifier Systems (MCSs) allow evaluation of the uncertainty of classification outcomes, which is of crucial importance for safety-critical applications. The uncertainty of classification is determined by a trade-off between the amount of data available for training, the classifier diversity and the required performance. The interpretability of MCSs …
This paper reports our participation in the text normalization shared task campaign organized by the CAW 2.0 workshop. Using a Statistical Machine Translation (SMT) system, we managed to produce syntactically correct sentences from sentences written with misspelled words and chat slang. This approach was applied on the evaluation campaign's test set …
This paper describes the UPC participation in the WMT 12 evaluation campaign. All systems presented are based on standard phrase-based Moses systems. Variations adopted several improvement techniques, such as morphology simplification and generation, and domain adaptation. The morphology simplification overcomes the data sparsity problem when translating into …
Given n independent, identically distributed random vectors in R^d, drawn from a common density f, one wishes to find out whether the support of f is convex or not. In this paper we describe a test which decides correctly for sufficiently large n, with probability 1, whenever f is bounded away from zero in its compact support. We also show that the …
The uncertainty of classification outcomes is of crucial importance for many safety-critical applications including, for example, medical diagnostics. In such applications the uncertainty of classification can be reliably estimated within a Bayesian model averaging technique that allows the use of prior information. Decision Tree (DT) classification models …
Bayesian averaging (BA) over ensembles of decision models allows evaluation of the uncertainty of decisions, which is of crucial importance for safety-critical applications such as medical diagnostics. The interpretability of the ensemble can also give useful information for experts responsible for making reliable decisions. For this reason, decision trees …
This paper gives a description of the statistical machine translation (SMT) systems developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) for our participation in the IWSLT'08 evaluation campaign. We present N-gram-based (TALPtuples) and phrase-based (TALPphrases) SMT systems. The paper explains the 2008 systems' …
This work aims to improve an N-gram-based statistical machine translation system between the Catalan and Spanish languages, trained with an aligned Spanish–Catalan parallel corpus consisting of 1.7 million sentences taken from El Periódico newspaper. Starting from a linguistic error analysis of this baseline system, orthographic, morphological, lexical, …
In this paper we experimentally compare the classification uncertainty of the randomised Decision Tree (DT) ensemble technique and the Bayesian DT technique with a restarting strategy on a synthetic dataset as well as on some datasets commonly used in the machine learning community. For quantitative evaluation of classification uncertainty, we use an …