Josep Ma Crego

We describe two methods to improve SMT accuracy using shallow syntax information. First, we use chunks to refine the set of word alignments typically used as a starting point in SMT systems. Second, we extend an N-gram-based SMT system with chunk tags to better account for long-distance reorderings. Experiments are reported on an Arabic-English task showing…
This work discusses translation results for the four Euparl data sets which were made available for the shared task "Exploiting Parallel Texts for Statistical Machine Translation". All results presented were generated by using a statistical machine translation system which implements a log-linear combination of feature functions along with a bilingual…
This paper presents a reordering framework for statistical machine translation (SMT) where source-side reorderings are integrated into SMT decoding, allowing for a highly constrained reordered search graph. The monotone search is extended by means of a set of reordering patterns (linguistically motivated rewrite patterns). Patterns are automatically learnt…
This work presents translation results for the three data sets made available in the shared task "Exploiting Parallel Texts for Statistical Machine Translation" of the HLT-NAACL 2006 Workshop on Statistical Machine Translation. All results presented were generated by using the N-gram-based statistical machine translation system which has been enhanced…
This paper describes the 2007 Ngram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the previous year's system, which are highlighted and empirically compared. Mainly, these include a novel word ordering strategy…
This paper describes the UPC's bilingual n-gram approach to statistical machine translation, which implements the log-linear combination of a bilingual n-gram translation model with six additional feature functions. A brief description of the complete system is presented and special attention is devoted to the novel features and reordering strategies that…
This article presents and describes an experimental prototype system for performing Chinese-to-Spanish and Spanish-to-Chinese machine translation. The system is based on the statistical machine translation (SMT) framework and, more specifically, it implements the bilingual n-gram SMT approach. Since, as far as we know, no large Chinese-Spanish parallel…
This paper reports on the participation of the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in the ACL WMT 2008 evaluation campaign. This year's system is the evolution of the one we employed for the 2007 campaign. The main updates and extensions involve linguistically motivated word reordering based on the reordering patterns…